1. 程式人生 > >oracle的分析函式over(Partition by...)

oracle的分析函式over(Partition by...)

oracle的分析函式over(Partition by…)

Sql程式碼

over(Partition by...) 一個超級牛皮的ORACLE特有函式。  

最近工作中才接觸到這個功能強大而靈活的函式。

oracle的分析函式over 及開窗函式

一:分析函式over

Oracle從8.1.6開始提供分析函式,分析函式用於計算基於組的某種聚合值,它和聚合函式的不同之處是

對於每個組返回多行,而聚合函式對於每個組只返回一行。

下面通過幾個例子來說明其應用。

**1:統計某商店的營業額。**          


  ` date       sale  
     1           20  
     2           15  
     3           14  
     4           18  
     5           30  
    規則:按天統計:每天都統計前面幾天的總額  
    得到的結果:  
    DATE   SALE       SUM 
    ----- -------- ------  
    1      20        20           --1天             
    2      15        35           --1天+2天             
    3      14        49           --1天+2天+3天             
    4      18        67            .            
    5      30        97            .  
        
**2:統計各班成績第一名的同學資訊**  
    NAME   CLASS S                           
    ----- ----- ----------------------   
    fda    1      80                       
    ffd    1      78                       
    dss    1      95                       
    cfe    2      74                       
    gds    2      92                       
    gf     3      99                       
    ddd    3      99                       
    adf    3      45                       
    asdf   3      55                       
    3dd    3      78                
  
通過:     
--  
select * from                                                                        
(                                                                              
select name,class,s,rank()over(partition by class order by s desc) mm from t2  
)                                                                              
where mm=1   
--  
得到結果:  


   NAME   CLASS S                       MM                                                                                          
    ----- ----- ---------------------- ----------------------   
    dss    1      95                      1                        
    gds    2      92                      1                        
    gf     3      99                      1                        
    ddd    3      99                      1         


  
注意:  
1.在求第一名成績的時候,不能用row_number(),因為如果同班有兩個並列第一,row_number()只返回一個結果            
2.rank()和dense_rank()的區別是:  
  --rank()是跳躍排序,有兩個第二名時接下來就是第四名  
  --dense_rank()l是連續排序,有兩個第二名時仍然跟著第三名  

3.分類統計 (並顯示資訊)

    A   B   C                        
    -- -- ----------------------   
    m   a   2                        
    n   a   3                        
    m   a   2                        
    n   b   2                        
    n   b   1                        
    x   b   3                        
    x   b   2                        
    x   b   4                        
    h   b   3   
   select a,c,sum(c)over(partition by a) from t2                  
   得到結果:  
   A   B   C        SUM(C)OVER(PARTITIONBYA)        
   -- -- ------- ------------------------   
   h   b   3        3                          
   m   a   2        4                          
   m   a   2        4                          
   n   a   3        6                          
   n   b   2        6                          
   n   b   1        6                          
   x   b   3        9                          
   x   b   2        9                          
   x   b   4        9                          
     
   如果用sum,group by 則只能得到  
   A   SUM(C)                              
   -- ----------------------   
   h   3                        
   m   4                        
   n   6                        
   x   9                        
   無法得到B列值         
     
=====  
select * from test  
   
資料:  
A B C   
1 1 1   
1 2 2   
1 3 3   
2 2 5   
3 4 6   
   
   
---將B欄位值相同的對應的C 欄位值加總  
select a,b,c, SUM(C) OVER (PARTITION BY B) C_Sum  
from test  
   
A B C C_SUM   
1 1 1 1   
1 2 2 7   
2 2 5 7   
1 3 3 3   
3 4 6 6   
   
---如果不需要已某個欄位的值分割,那就要用 null  
   
eg: 就是將C的欄位值summary 放在每行後面  
   
select a,b,c, SUM(C) OVER (PARTITION BY null) C_Sum  
from test  
   
A B C C_SUM   
1 1 1 17   
1 2 2 17   
1 3 3 17   
2 2 5 17   
3 4 6 17  
   
求個人工資佔部門工資的百分比   
   
SQL> select * from salary;  
   
NAME DEPT SAL  
---------- ---- -----  
a 10 2000  
b 10 3000  
c 10 5000  
d 20 4000  
   
SQL> select name,dept,sal,sal*100/sum(sal) over(partition by dept) percent from salary;  
   
NAME DEPT SAL PERCENT  
---------- ---- ----- ----------  
a 10 2000 20  
b 10 3000 30  
c 10 5000 50  
d 20 4000 100  
   
**二:開窗函式**             
      開窗函式指定了分析函式工作的資料視窗大小,這個資料視窗大小可能會隨著行的變化而變化,舉例如下:   
1:       
   over(order by salary) 按照salary排序進行累計,order by是個預設的開窗函式  
   over(partition by deptno)按照部門分割槽  
2:  
  over(order by salary range between 5 preceding and 5 following)  
   每行對應的資料視窗是之前行幅度值不超過5,之後行幅度值不超過5  
   例如:對於以下列  
     aa  
     1  
     2  
     2  
     2  
     3  
     4  
     5  
     6  
     7  
     9  
      
   sum(aa)over(order by aa range between 2 preceding and 2 following)  
   得出的結果是  
            AA                       SUM 
            ---------------------- -------------------------------------------------------   
            1                       10                                                        
            2                       14                                                        
            2                       14                                                        
            2                       14                                                        
            3                       18                                                        
            4                       18                                                        
            5                       22                                                        
            6                       18                                                                  
            7                       22                                                                  
            9                       9                                                                   
                
   就是說,對於aa=5的一行 ,sum為   5-1<=aa<=5+2 的和  
   對於aa=2來說 ,sum=1+2+2+2+3+4=14     ;  
   又如 對於aa=9 ,9-1<=aa<=9+2 只有9一個數,所以sum=9    ;  
                 
**3:其它:**  
     over(order by salary rows between 2 preceding and 4 following)  
          每行對應的資料視窗是之前2行,之後4行   
**4:下面三條語句等效:**             
     over(order by salary rows between unbounded preceding and unbounded following)  
          每行對應的資料視窗是從第一行到最後一行,等效:  
     over(order by salary range between unbounded preceding and unbounded following)  
           等效  
     over(partition by null)  
   
**常用的分析函式如下所列:**  
   
row_number() over(partition by ... order by ...)  
rank() over(partition by ... order by ...)  
dense_rank() over(partition by ... order by ...)  
count() over(partition by ... order by ...)  
max() over(partition by ... order by ...)  
min() over(partition by ... order by ...)  
sum() over(partition by ... order by ...)  
avg() over(partition by ... order by ...)  
first_value() over(partition by ... order by ...)  
last_value() over(partition by ... order by ...)  
lag() over(partition by ... order by ...)  
lead() over(partition by ... order by ...)  
   
示例  
SQL> select type,qty from test;  
   
TYPE QTY  
---------- ----------  
1 6  
2 9  
   
 SQL> select type,qty,to_char(row_number() over(partition by type order by qty))||'/'||to_char(count(*) over(partition by type)) as cnt2 from test;  
   
TYPE QTY CNT2   
---------- ---------- ------------  
3 1/2  
1 6 2/2  
2 5 1/3  
7 2/3   
2 9 3/3  
   
 SQL> select * from test;  
---------- -------------------------------------------------  
1 11111  
2 22222  
3 33333  
4 44444  
   
SQL> select t.id,mc,to_char(b.rn)||'/'||t.id)e  
2 from test t,  
 (select rownum rn from (select max(to_number(id)) mid from test) connect by rownum <=mid ))L  
4 where b.rn<=to_number(t.id)  
order by id  
   
ID MC TO_CHAR(B.RN)||'/'||T.ID  
--------- -------------------------------------------------- ---------------------------------------------------  
1 11111 1/1  
2 22222 1/2  
2 22222 2/2  
3 33333 1/3  
3 33333 2/3  
3 33333 3/3  
 44444 1/4 44444 2/4  
4 44444 3/4CNOUG4 44444 4/4  
   
10 rows selected  
   
*******************************************************************`

關於partition by

這些都是分析函式,好像是8.0以後才有的 row_number()和rownum差不多,功能更強一點(可以在各個分組內從1開時排序) rank()是跳躍排序,有兩個第二名時接下來就是第四名(同樣是在各個分組內) dense_rank()l是連續排序,有兩個第二名時仍然跟著第三名。相比之下row_number是沒有重複值的 lag(arg1,arg2,arg3): arg1是從其他行返回的表示式 arg2是希望檢索的當前行分割槽的偏移量。是一個正的偏移量,時一個往回檢索以前的行的數目。 arg3是在arg2表示的數目超出了分組的範圍時返回的值。

  1.  
select deptno,row_number() over(partition by deptno order by sal) from emp order by deptno;  
2.  
select deptno,rank() over (partition by deptno order by sal) from emp order by deptno;  
3.  
select deptno,dense_rank() over(partition by deptno order by sal) from emp order by deptno;  
4.  
select deptno,ename,sal,lag(ename,1,null) over(partition by deptno order by ename) from emp ord er by deptno;  
5.  
select deptno,ename,sal,lag(ename,2,'example') over(partition by deptno order by ename) from em p  
order by deptno;  
6.  
select deptno, sal,sum(sal) over(partition by deptno) from emp;--每行記錄後都有總計值  select deptno, sum(sal) from emp group by deptno;  
7. 求每個部門的平均工資以及每個人與所在部門的工資差額  
   
select deptno,ename,sal ,  
     round(avg(sal) over(partition by deptno)) as dept_avg_sal,   
     round(sal-avg(sal) over(partition by deptno)) as dept_sal_diff  
from emp;