Oracle SQL Performance Tuning - Updating a Small Table by Joining to a Large Table

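Requirement:

  The small table holds about 200,000 rows and the large table about 40 million. Roughly 1.5 million rows have to be filtered out of the large table and joined to the small table in order to update about 5,000 of its rows.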

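Performance problem:

Even after adding indexes on the columns used in the filter conditions, the conventional UPDATE statement below still took about half an hour.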

  UPDATE WMOCDCREPORT.DM_WM_TRADINGALL A
  SET
  (
    A.RELATIONSHIPNO,
    A.PACKAGE
  )
  =
  (
    SELECT
      B.RELATIONSHIPNO,
      CASE
        WHEN (B.SEGMENTCODE = '52' OR B.SEGMENTCODE = '55' OR
              B.SEGMENTCODE = '56' OR B.SEGMENTCODE = '59') THEN 'BC'
        WHEN (B.SEGMENTCODE = '66') THEN 'PW'
        WHEN (B.SEGMENTCODE = '60') THEN 'MM'
        WHEN (B.SEGMENTCODE = '65') THEN 'EB'
        WHEN (B.SEGMENTCODE = '61') THEN 'PB'
        ELSE B.SEGMENTCODE
      END
    FROM DATACORE.DF_CUST_HISTORY B
    WHERE B.ACCOUNT_NO = A.SETTLEMENTACCOUNT
    AND B.DATA_DATE = '2018-11-30'
    AND ROWNUM = 1
  )
  WHERE A.MONTH = 'SEP'
  AND A.DATA_DATE = '2018-09-30'
  AND EXISTS
  (
    SELECT 1
    FROM DATACORE.DF_CUST_HISTORY C
    WHERE C.ACCOUNT_NO = A.SETTLEMENTACCOUNT
    AND C.DATA_DATE = '2018-11-30'
  );

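After several searches, I found that most technical blog posts about join-based updates deal with updating the large table (one example uses batch updates), which is the opposite of the case here.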

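I also analyzed the execution plan. As expected, the bottleneck comes from two things:

(1) DATACORE.DF_CUST_HISTORY is too large. I considered selecting one day's data and inserting it into a separate table in advance, but guessed the gain would be small, because inserting 1.5 million rows is itself slow.

(2) About 5,000 rows need to be updated, and each of them requires a correlated lookup against the 1.5 million rows (this is where most of the time goes).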
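
The post does not show how the execution plan was obtained. One common way in Oracle is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. The sketch below is an assumption about the workflow, not the author's actual commands, and it uses a shortened version of the UPDATE (only RELATIONSHIPNO is set) to keep the example small.

```sql
-- Sketch only: a shortened form of the slow UPDATE, used to inspect its plan.
-- EXPLAIN PLAN stores the plan in PLAN_TABLE without executing the statement.
EXPLAIN PLAN FOR
UPDATE wmocdcreport.dm_wm_tradingall a
SET a.relationshipno =
    (SELECT b.relationshipno
     FROM datacore.df_cust_history b
     WHERE b.account_no = a.settlementaccount
     AND b.data_date = '2018-11-30'
     AND ROWNUM = 1)
WHERE a.month = 'SEP'
AND a.data_date = '2018-09-30'
AND EXISTS
    (SELECT 1
     FROM datacore.df_cust_history c
     WHERE c.account_no = a.settlementaccount
     AND c.data_date = '2018-11-30');

-- Display the plan that was just explained.
SELECT plan_table_output
FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the output, the step to look for is the access to DF_CUST_HISTORY inside the correlated subquery, since it is evaluated once for each row being updated.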

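Optimization:

With about 5,000 rows on the small side and 1.5 million on the large side, the natural approach is a join that keeps only the small table's rows. The next question is how to write the joined result back to the small table without a correlated UPDATE. The answer: MERGE INTO!

There is one small complication. The ON condition of MERGE INTO must match each target row to at most one source row, and the large table may well contain duplicate ACCOUNT_NO values, so the source has to be de-duplicated. How? With ROW_NUMBER() and PARTITION BY!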
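
The de-duplication idea can be sketched in isolation. The example below uses made-up inline rows (the account numbers and relationship numbers are hypothetical, not data from the post) and keeps one row per ACCOUNT_NO, namely the one with the smallest RELATIONSHIPNO. If several source rows do match one target row, Oracle typically rejects the MERGE with ORA-30926 (unable to get a stable set of rows in the source tables).

```sql
-- Hypothetical sample rows; account 'A001' appears twice.
WITH src AS (
    SELECT 'A001' AS account_no, 'R2' AS relationshipno FROM dual UNION ALL
    SELECT 'A001', 'R1' FROM dual UNION ALL
    SELECT 'A002', 'R9' FROM dual
)
SELECT account_no, relationshipno
FROM (
    SELECT s.account_no,
           s.relationshipno,
           -- Number the rows within each account, ordered by relationshipno.
           ROW_NUMBER() OVER (PARTITION BY s.account_no
                              ORDER BY s.relationshipno) AS seq
    FROM src s
)
WHERE seq = 1          -- keep only the first row of each account
ORDER BY account_no;
-- Keeps ('A001', 'R1') and ('A002', 'R9').
```

Unlike the ROWNUM = 1 filter in the original UPDATE, which picks an arbitrary matching row, the ORDER BY inside the window makes the choice deterministic.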

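It is also worth comparing conditions placed in the ON clause of a join with conditions placed in the WHERE clause.

The optimized SQL (runs in 8-10 seconds):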
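
As a rough illustration of that comparison, the sketch below uses hypothetical inline rows (not data from the post). With an inner join, as in the optimized statement, a filter such as seq = 1 gives the same rows whether it sits in ON or in WHERE. The difference only shows up with an outer join.

```sql
-- Filter in ON: every row of t is kept; unmatched rows get NULLs.
WITH t AS (
    SELECT 'A001' AS settlementaccount FROM dual UNION ALL
    SELECT 'A003' FROM dual
),
tx AS (
    SELECT 'A001' AS account_no, 1 AS seq FROM dual UNION ALL
    SELECT 'A001', 2 FROM dual UNION ALL
    SELECT 'A003', 2 FROM dual
)
SELECT t.settlementaccount, tx.seq
FROM t
LEFT JOIN tx
    ON tx.account_no = t.settlementaccount AND tx.seq = 1;
-- Returns ('A001', 1) and ('A003', NULL).

-- Filter in WHERE: applied after the join, so the NULL-extended row is dropped.
WITH t AS (
    SELECT 'A001' AS settlementaccount FROM dual UNION ALL
    SELECT 'A003' FROM dual
),
tx AS (
    SELECT 'A001' AS account_no, 1 AS seq FROM dual UNION ALL
    SELECT 'A001', 2 FROM dual UNION ALL
    SELECT 'A003', 2 FROM dual
)
SELECT t.settlementaccount, tx.seq
FROM t
LEFT JOIN tx
    ON tx.account_no = t.settlementaccount
WHERE tx.seq = 1;
-- Returns only ('A001', 1).
```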

merge into wmocdcreport.dm_wm_tradingall a
using (
    select
       t.rid,
       t.settlementaccount,
       tx.relationshipno,
       case
         when (tx.segmentcode = '52' or tx.segmentcode = '55' or
              tx.segmentcode = '56' or tx.segmentcode = '59') then
          'BC'
         when (tx.segmentcode = '66') then
          'PW'
         when (tx.segmentcode = '60') then
          'MM'
         when (tx.segmentcode = '65') then
          'EB'
         when (tx.segmentcode = '61') then
          'PB'
         else
          tx.segmentcode
       end as package
    from (
        select rowid rid,
            dwt.settlementaccount
        from wmocdcreport.dm_wm_tradingall dwt
        where dwt.month = 'SEP'
        and dwt.data_date = '2018-09-30'
    ) t
    inner join 
    (
        select row_number() over (partition by c.account_no order by c.relationshipno) seq,
              c.account_no,
              c.relationshipno,
              c.segmentcode
        from datacore.df_cust_history c
        where c.data_date = '2018-11-30'
    ) tx
    on tx.account_no = t.settlementaccount and tx.seq = 1
) b on (a.rowid = b.rid)
when matched then
   update set a.relationshipno = b.relationshipno, 
              a.package        = b.package;