Using sys_connect_by_path together with group by for per-group concatenation

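Recently I ran into a requirement: for each approval_code, take the multiple FIRST_NAME values that belong to it, sort them by line_no ascending, and merge them into one field holding the longest concatenated string.

The SQL statement that produces the source rows is: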

SELECT DISTINCT t1.FIRST_NAME,
       t2.approval_code,
       t2.line_no
  FROM K2_ACCESS_USER@k2global t1
 INNER JOIN k2_approval_path t2
    ON t1.DOMAIN_NAME = t2.USER_ID
 RIGHT JOIN k2_credit_limit_hist t4
    ON t2.approval_code = t4.approval_code
   AND t4.expired_date >= to_date('2012-01-01', 'yyyy-mm-dd')
 ORDER BY t2.approval_code, t2.line_no

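At first I planned to get the merged FIRST_NAME value for an approval_code like this (at the time I did not yet know that combining the query with group by could directly return each group's maximum merged FIRST_NAME value):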

------------------------start to combine the approver's name-------
SELECT max(substr(sys_connect_by_path(FIRST_NAME, ','), 2)) FIRST_NAME
  FROM (SELECT ltrim(APPROVAL_CODE, 'RQP') + row_number() over(ORDER BY APPROVAL_CODE) ROW_NUM,
               FIRST_NAME,
               APPROVAL_CODE
          FROM (SELECT DISTINCT FIRST_NAME, approval_code
                  FROM (SELECT DISTINCT t1.FIRST_NAME,
                               t2.approval_code,
                               t2.line_no
                          FROM K2_ACCESS_USER@k2global t1
                         INNER JOIN k2_approval_path t2
                            ON t1.DOMAIN_NAME = t2.USER_ID
                         RIGHT JOIN k2_credit_limit_hist t4
                            ON t2.approval_code = t4.approval_code
                           AND t4.expired_date >= to_date('2012-01-01', 'yyyy-mm-dd')
                         ORDER BY t2.approval_code, t2.line_no))) t3
 START WITH t3.APPROVAL_CODE = 'RQP0001105' -- RQP0001105 is used as a test value
CONNECT BY t3.ROW_NUM - 1 = PRIOR t3.ROW_NUM
------------------------end to combine the approver's name--------

But I quickly found that what I got back was not what I wanted.

I then removed the max() wrapped around sys_connect_by_path and added ROW_NUM to the select list.


The result was as follows:

You can see that sys_connect_by_path did in fact generate the value we want, 'Kenneth,Lawrence', along the way, but for some reason the final result came back as 'Lawrence' instead.
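To see every path the hierarchical query generates, the walk can be reproduced on a two-row stand-in for t3. This is only a sketch: the inline rows (Kenneth at ROW_NUM 1106, Lawrence at 1107) are assumed from the values discussed in this post, and name_path is an illustrative alias.

```sql
-- Toy stand-in for t3; the two rows are assumptions based on the values in the text.
WITH t3 AS (
  SELECT 1106 AS row_num, 'Kenneth' AS first_name, 'RQP0001105' AS approval_code FROM dual
  UNION ALL
  SELECT 1107, 'Lawrence', 'RQP0001105' FROM dual
)
SELECT t3.row_num,
       SUBSTR(SYS_CONNECT_BY_PATH(t3.first_name, ','), 2) AS name_path
  FROM t3
 START WITH t3.approval_code = 'RQP0001105'
CONNECT BY t3.row_num - 1 = PRIOR t3.row_num;
-- Paths produced (row order may vary): 'Kenneth', 'Kenneth,Lawrence', 'Lawrence'
```

Both rows satisfy the START WITH condition, so each of them becomes the root of its own walk.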

I looked again at the condition part at the end of the earlier sys_connect_by_path query:

 START WITH 
    t3.APPROVAL_CODE='RQP0001105'
    CONNECT BY t3.ROW_NUM -1   = prior t3.ROW_NUM

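My START WITH condition was t3.APPROVAL_CODE='RQP0001105' ('RQP0001105' being the test value plugged in), but the table actually contains two records with that APPROVAL_CODE, with ROW_NUM 1106 and 1107. So sys_connect_by_path is effectively evaluated in two passes: one walk starts from the ROW_NUM=1106 record and another starts from the ROW_NUM=1107 record. In other words, it behaves like this: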

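start with t3.APPROVAL_CODE='RQP0001105' and t3.ROW_NUM=1106

 CONNECT BY t3.ROW_NUM -1   = prior t3.ROW_NUM

This pass produces 'Kenneth,Lawrence'.

start with t3.APPROVAL_CODE='RQP0001105' and t3.ROW_NUM=1107

CONNECT BY t3.ROW_NUM -1   = prior t3.ROW_NUM

This pass produces 'Lawrence'.

So the cause is the second pass. Starting from ROW_NUM=1107 adds the path 'Lawrence' to the result set, and max() then returns 'Lawrence' in place of 'Kenneth,Lawrence', because max() compares the strings character by character and 'L' sorts after 'K'.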
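The comparison can be checked in isolation, without any of the real tables. In this small sketch the three candidate paths are simply written as literals, and name_path and picked are illustrative aliases.

```sql
-- The three paths are written as literals here; no tables are involved.
SELECT MAX(name_path) AS picked
  FROM (SELECT 'Kenneth' AS name_path FROM dual
        UNION ALL
        SELECT 'Kenneth,Lawrence' FROM dual
        UNION ALL
        SELECT 'Lawrence' FROM dual);
-- picked = 'Lawrence'
```

max() picks the greatest string, not the longest one, so it only returns the full path when every candidate starts from the same first name.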

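So the problem was clear. The fix is to add one more condition to START WITH so that the walk begins only from the topmost record. My approach was to add a rank column whose value equals ROW_NUM only for the first of the records that share an approval_code: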

   ltrim(APPROVAL_CODE,'RQP')+row_number() over(ORDER BY APPROVAL_CODE ) ROW_NUM,
       ltrim(APPROVAL_CODE,'RQP')+RANK() over(ORDER BY APPROVAL_CODE ) RANK_NUM,

At the same time, the condition below is changed to:

     START WITH t3.APPROVAL_CODE=hist.approval_code and  t3.ROW_NUM=t3.RANK_NUM
    CONNECT BY t3.ROW_NUM -1   = prior t3.ROW_NUM
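Why the two columns match only on the first row of each code is easier to see on toy data. The sketch below is built on assumptions: src and its three rows (including 'Alice' and code RQP0001106) are invented for illustration, and line_no is added to the ROW_NUMBER ordering as a tie-breaker so the output is deterministic.

```sql
-- Hypothetical sample rows; only the RQP0001105 names come from the text.
WITH src AS (
  SELECT 'RQP0001105' AS approval_code, 'Kenneth' AS first_name, 1 AS line_no FROM dual
  UNION ALL SELECT 'RQP0001105', 'Lawrence', 2 FROM dual
  UNION ALL SELECT 'RQP0001106', 'Alice', 1 FROM dual
)
SELECT approval_code,
       first_name,
       LTRIM(approval_code, 'RQP') + ROW_NUMBER() OVER (ORDER BY approval_code, line_no) AS row_num,
       LTRIM(approval_code, 'RQP') + RANK() OVER (ORDER BY approval_code) AS rank_num
  FROM src
 ORDER BY row_num;
-- Kenneth:  row_num 1106, rank_num 1106  (first row of its code, so the two match)
-- Lawrence: row_num 1107, rank_num 1106
-- Alice:    row_num 1109, rank_num 1109
```

RANK() gives tied rows the position of the first row in the tie, while ROW_NUMBER() keeps counting, so ROW_NUM = RANK_NUM holds for exactly one row per approval_code.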

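My team leader suggested a different approach: add a rank_num column to the original code, computed by numbering the rows within each approval_code partition, as shown below: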

SELECT
       ltrim(APPROVAL_CODE,'RQP')+row_number() over(ORDER BY APPROVAL_CODE ) ROW_NUM,
        row_number() over(partition by APPROVAL_CODE ORDER BY APPROVAL_CODE ) RANK_NUM,
      FIRST_NAME,
      APPROVAL_CODE
          from (select distinct FIRST_NAME, approval_code
    FROM
      (
      SELECT DISTINCT t1.FIRST_NAME,
        t2.approval_code,
        t2.line_no
      FROM K2_ACCESS_USER@k2global t1
      INNER JOIN k2_approval_path t2
      ON t1.DOMAIN_NAME=t2.USER_ID 
     right join k2_credit_limit_hist t4
      on t2.approval_code=t4.approval_code and t4.expired_date>=to_date('2012-01-01','yyyy-mm-dd')
      ORDER BY t2.  APPROVAL_CODE,t2.line_no
      ))

After running it, the data returned looks like this:

We change the original

START WITH t3.APPROVAL_CODE='RQP0001105'

CONNECT BY t3.ROW_NUM -1   = prior t3.ROW_NUM

to

  START WITH t3.APPROVAL_CODE='RQP0001105' and t3.RANK_NUM=1

    CONNECT BY t3.ROW_NUM -1   = prior t3.ROW_NUM

and that is all that is needed.
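The effect of the extra RANK_NUM=1 condition can be checked on toy data. This is a sketch under assumptions rather than the production query: src and its rows are invented stand-ins for the joined tables, and line_no is used as the ordering key inside both window functions (my addition) so that RANK_NUM=1 and the lowest ROW_NUM always fall on the same row.

```sql
-- Hypothetical sample rows standing in for the real join result.
WITH src AS (
  SELECT 'RQP0001105' AS approval_code, 'Kenneth' AS first_name, 1 AS line_no FROM dual
  UNION ALL SELECT 'RQP0001105', 'Lawrence', 2 FROM dual
  UNION ALL SELECT 'RQP0001106', 'Alice', 1 FROM dual
),
t3 AS (
  SELECT LTRIM(approval_code, 'RQP')
           + ROW_NUMBER() OVER (ORDER BY approval_code, line_no) AS row_num,
         ROW_NUMBER() OVER (PARTITION BY approval_code ORDER BY line_no) AS rank_num,
         first_name,
         approval_code
    FROM src
)
SELECT MAX(SUBSTR(SYS_CONNECT_BY_PATH(first_name, ','), 2)) AS first_name
  FROM t3
 START WITH approval_code = 'RQP0001105' AND rank_num = 1
CONNECT BY row_num - 1 = PRIOR row_num;
-- first_name = 'Kenneth,Lawrence'
```

With a single root, every path is a prefix of the next longer one, so max() now returns the full concatenation. The walk stops at Lawrence because the next ROW_NUM is 1109, not 1108.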

 SELECT  max(substr(sys_connect_by_path(FIRST_NAME,','),2)) FIRST_NAME
 
-- , length(FIRST_NAME),t3.ROW_NUM,t3.APPROVAL_CODE
  FROM
    (
    SELECT
       ltrim(APPROVAL_CODE,'RQP')+row_number() over(ORDER BY APPROVAL_CODE ) ROW_NUM,
      -- ltrim(APPROVAL_CODE,'RQP')+RANK() over(ORDER BY APPROVAL_CODE ) RANK_NUM,
      row_number() over(partition by APPROVAL_CODE ORDER BY APPROVAL_CODE ) RANK_NUM,
    -- row_number(),
      FIRST_NAME,
      APPROVAL_CODE
          from (select distinct FIRST_NAME, approval_code
    FROM
      (
      SELECT DISTINCT t1.FIRST_NAME,
        t2.approval_code,
        t2.line_no
      FROM K2_ACCESS_USER@k2global t1
      INNER JOIN k2_approval_path t2
      ON t1.DOMAIN_NAME=t2.USER_ID 
     right join k2_credit_limit_hist t4
      on t2.approval_code=t4.approval_code and t4.expired_date>=to_date('2012-01-01','yyyy-mm-dd')
   -- where APPROVAL_CODE='RQP0001199'
      ORDER BY t2.  APPROVAL_CODE,t2.line_no
      ))
    ) t3 
    START WITH 
  t3.APPROVAL_CODE='RQP0001105' and  t3.RANK_NUM=1
   --and t3.app
    CONNECT BY t3.ROW_NUM -1   = prior t3.ROW_NUM 


The result after running it is as follows:

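All right, everything above is the roundabout route I took at first. In fact, to get the value we want, it is enough to combine the query with group by and concatenate per group. The code is as follows: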

SELECT max(SUBSTR(SYS_CONNECT_BY_PATH(create_by, ','), 2)) create_by
  FROM
    (SELECT
       ltrim(APPROVAL_CODE,'RQP')+row_number() over(ORDER BY APPROVAL_CODE ) ROW_NUM,
      -- row_number() over(partition by APPROVAL_CODE ORDER BY APPROVAL_CODE ) RANK_NUM,
      create_by,
      approval_code
      
      from (select distinct create_by, approval_code
    FROM
      (SELECT DISTINCT t.create_by,
        t.approval_code,
        t.line_no
      FROM k2_approval_path t 
       RIGHT JOIN K2_CREDIT_LIMIT_HIST T4
      on t.approval_code=t4.approval_code and t4.expired_date>=to_date('2010-01-01','yyyy-mm-dd')
      ORDER BY t.approval_code,t.line_no
      )
    ) )T1
    START WITH  t1.approval_code= hist.approval_code 
    --t1.approval_code= hist.approval_code and t1.RANK_NUM=1
       CONNECT BY T1.ROW_NUM -1   = PRIOR T1.ROW_NUM
       group by t1.approval_code
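The grouped form can be tried end to end on toy data. This is a sketch under assumptions, not the production statement: src and its rows are invented, the line_no tie-breaker in the window functions is my addition, and first_names is an illustrative alias. Because hist (an alias from an enclosing query that this post does not show) is unavailable here, the walk starts from RANK_NUM = 1 in every group, which also keeps every path in a group anchored at that group's first row.

```sql
-- Hypothetical sample rows standing in for the real join result.
WITH src AS (
  SELECT 'RQP0001105' AS approval_code, 'Kenneth' AS first_name, 1 AS line_no FROM dual
  UNION ALL SELECT 'RQP0001105', 'Lawrence', 2 FROM dual
  UNION ALL SELECT 'RQP0001106', 'Alice', 1 FROM dual
),
t1 AS (
  SELECT LTRIM(approval_code, 'RQP')
           + ROW_NUMBER() OVER (ORDER BY approval_code, line_no) AS row_num,
         ROW_NUMBER() OVER (PARTITION BY approval_code ORDER BY line_no) AS rank_num,
         first_name,
         approval_code
    FROM src
)
SELECT approval_code,
       MAX(SUBSTR(SYS_CONNECT_BY_PATH(first_name, ','), 2)) AS first_names
  FROM t1
 START WITH rank_num = 1
CONNECT BY row_num - 1 = PRIOR row_num
 GROUP BY approval_code
 ORDER BY approval_code;
-- RQP0001105 -> 'Kenneth,Lawrence'
-- RQP0001106 -> 'Alice'
```

One pass returns a concatenated value for every approval_code, so no test value has to be plugged into START WITH.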