pandas_cookbook Study Notes (9): apply

Applying functions with apply (the sessions below assume `import pandas as pd` and `import numpy as np`):

In [135]: df = pd.DataFrame(data={'A': [[2, 4, 8, 16], [100, 200], [10, 20, 30]],
   .....:                         'B': [['a', 'b', 'c'], ['jj', 'kk'], ['ccc']]},
   .....:                   index=['I', 'II', 'III']); df
   .....: 
Out[135]: 
                 A          B
I    [2, 4, 8, 16]  [a, b, c]
II      [100, 200]   [jj, kk]
III   [10, 20, 30]      [ccc]

In [136]: def SeriesFromSubList(aList):
   .....:    return pd.Series(aList)
   .....: 

In [137]: df_orgz = pd.concat(dict([ (ind,row.apply(SeriesFromSubList)) for ind,row in df.iterrows() ])); df_orgz
Out[137]: 
         0    1    2     3
I   A    2    4    8  16.0
    B    a    b    c   NaN
II  A  100  200  NaN   NaN
    B   jj   kk  NaN   NaN
III A   10   20   30   NaN
    B  ccc  NaN  NaN   NaN
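To make the two steps in that one-liner easier to see, here is a minimal sketch that splits them apart. It assumes the same `df` as above; the name `row_frame` is introduced only for illustration, and `pd.Series` is passed directly in place of the `SeriesFromSubList` wrapper.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        "A": [[2, 4, 8, 16], [100, 200], [10, 20, 30]],
        "B": [["a", "b", "c"], ["jj", "kk"], ["ccc"]],
    },
    index=["I", "II", "III"],
)

# Step 1: a single row is a Series whose values are lists. Applying
# pd.Series to each value expands the lists into columns, giving a small
# DataFrame indexed by the original column names ('A', 'B').
row_frame = df.loc["II"].apply(pd.Series)  # row_frame is an illustrative name

# Step 2: pd.concat accepts a dict; its keys become the outer index level,
# and shorter rows are padded with NaN where other rows have more columns.
df_orgz = pd.concat({ind: row.apply(pd.Series) for ind, row in df.iterrows()})
print(df_orgz)
```

The dict comprehension is equivalent to the `dict([...])` call over a list of tuples used in the transcript; it is only a more compact spelling.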

Rolling apply to multiple columns, where the function calculates a Series before returning a scalar from that Series:

In [138]: df = pd.DataFrame(data=np.random.randn(2000,2)/10000,
   .....:                   index=pd.date_range('2001-01-01',periods=2000),
   .....:                   columns=['A','B']); df
   .....: 
Out[138]: 
                   A         B
2001-01-01  0.000032 -0.000004
2001-01-02 -0.000001  0.000207
2001-01-03  0.000120 -0.000220
2001-01-04 -0.000083 -0.000165
2001-01-05 -0.000047  0.000156
2001-01-06  0.000027  0.000104
2001-01-07  0.000041 -0.000101
...              ...       ...
2006-06-17 -0.000034  0.000034
2006-06-18  0.000002  0.000166
2006-06-19  0.000023 -0.000081
2006-06-20 -0.000061  0.000012
2006-06-21 -0.000111  0.000027
2006-06-22 -0.000061 -0.000009
2006-06-23  0.000074 -0.000138

[2000 rows x 2 columns]

In [139]: def gm(aDF,Const):
   .....:    v = ((((aDF.A+aDF.B)+1).cumprod())-1)*Const
   .....:    return (aDF.index[0],v.iloc[-1])
   .....: 

In [140]: S = pd.Series(dict([ gm(df.iloc[i:min(i+51,len(df)-1)],5) for i in range(len(df)-50) ])); S
Out[140]: 
2001-01-01   -0.001373
2001-01-02   -0.001705
2001-01-03   -0.002885
2001-01-04   -0.002987
2001-01-05   -0.002384
2001-01-06   -0.004700
2001-01-07   -0.005500
                ...   
2006-04-28   -0.002682
2006-04-29   -0.002436
2006-04-30   -0.002602
2006-05-01   -0.001785
2006-05-02   -0.001799
2006-05-03   -0.000605
2006-05-04   -0.000541
Length: 1950, dtype: float64
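The rolling calculation above can be restated as a small sketch, assuming a fixed window of 51 rows and a smaller, seeded data set so the run is reproducible. The names `window_growth`, `window`, and `rolled` are hypothetical and not part of the original session. One difference is worth noting: the `min(i+51, len(df)-1)` bound in the transcript shortens the final window by one row, whereas this sketch uses full windows throughout.

```python
import numpy as np
import pandas as pd

# Seeded random data, shaped like the transcript's frame but smaller.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((200, 2)) / 10000,
    index=pd.date_range("2001-01-01", periods=200),
    columns=["A", "B"],
)


def window_growth(window_df, const):
    """Compounded growth of (A + B) over one window, scaled by const.

    Hypothetical helper: the last value of the cumulative product equals
    the product of the whole window, so prod() replaces cumprod().iloc[-1].
    """
    growth = (window_df["A"] + window_df["B"] + 1).prod() - 1
    return growth * const


window = 51  # assumed window length, matching the i+51 slice above

# One result per full window, labelled by the first date in that window.
S = pd.Series(
    {
        df.index[i]: window_growth(df.iloc[i:i + window], 5)
        for i in range(len(df) - window + 1)
    }
)

# Cross-check with the built-in rolling machinery. Rolling labels each
# window by its LAST row, so only the values (not the index) line up with S.
total = df["A"] + df["B"] + 1
rolled = (total.rolling(window).apply(np.prod, raw=True) - 1) * 5
print(np.allclose(S.to_numpy(), rolled.dropna().to_numpy()))
```

The explicit loop is kept because the function needs several columns at once; when the columns can be combined into a single Series first, as here, `rolling(...).apply(...)` is usually the shorter route.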