pandas.DataFrame.apply() in practice: adding a totals row or a totals column
阿新 • Published: 2019-01-05
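Recently at work I needed to run sum() over the columns of a pandas DataFrame, which means appending a new row to hold the totals.
The approach is as follows: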
import pandas as pd
import numpy as np
df = pd.DataFrame([
{'date': '2018-12-01', 'total': 100, 'total2': 100.23},
{'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
{'date': '2018-12-03', 'total': 112, 'total2': 123.32},
{'date': '2018-12-04', 'total': 134, 'total2': 3453.23}
])
# Goal: apply np.sum() to the 'total' and 'total2' columns
df2 = df.set_index('date')  # make date the index so it is excluded from the sum
df2.loc['Sum'] = df2.apply(lambda x: np.sum(x))  # key step: append the column totals as a new row labelled 'Sum'
print(df2)
# output
            total   total2
date
2018-12-01  100.0   100.23
2018-12-02  102.0  2312.13
2018-12-03  112.0   123.32
2018-12-04  134.0  3453.23
Sum         448.0  5988.91
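The totals row can also be built without apply(). The following is a minimal sketch, assuming the same sample data as above; it relies on DataFrame.sum(), which sums each column by default. The names `frame` and `totals` are introduced here for illustration only and are not part of the original post.

```python
import pandas as pd

# Same sample data as in the post, with date moved into the index
frame = pd.DataFrame([
    {'date': '2018-12-01', 'total': 100, 'total2': 100.23},
    {'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
    {'date': '2018-12-03', 'total': 112, 'total2': 123.32},
    {'date': '2018-12-04', 'total': 134, 'total2': 3453.23},
]).set_index('date')

# DataFrame.sum() adds up each column (axis=0 is the default)
totals = frame.sum()

# Append the totals as a new row labelled 'Sum'
frame.loc['Sum'] = totals
print(frame)
```

Calling sum() directly avoids invoking a Python lambda once per column, which tends to matter more as the number of columns grows.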
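# Extension: sum across each row and store the result in a new column
df3 = df.copy()  # copy so that the original df is not modified
# Sum every field after the first one (date) in each row
df3['col_sum'] = df3.apply(lambda x: np.sum(x.iloc[1:]), axis=1)
# Equivalent to the following, which names the columns explicitly
df3['col_sum'] = df3.apply(lambda x: np.sum([x['total'], x['total2']]), axis=1)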
print(df3)
# output
         date  total   total2  col_sum
0  2018-12-01    100   100.23   200.23
1  2018-12-02    102  2312.13  2414.13
2  2018-12-03    112   123.32   235.32
3  2018-12-04    134  3453.23  3587.23
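The per-row total can likewise be computed without a row-wise apply(). This is a rough sketch, assuming the same sample data; it selects the numeric columns by name and calls sum(axis=1). The names `frame` and `value_cols` are hypothetical and used only for this example.

```python
import pandas as pd

frame = pd.DataFrame([
    {'date': '2018-12-01', 'total': 100, 'total2': 100.23},
    {'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
    {'date': '2018-12-03', 'total': 112, 'total2': 123.32},
    {'date': '2018-12-04', 'total': 134, 'total2': 3453.23},
])

# Columns to add together; date is deliberately left out
value_cols = ['total', 'total2']

# axis=1 sums across the selected columns within each row
frame['col_sum'] = frame[value_cols].sum(axis=1)
print(frame)
```

Selecting the columns by name keeps the result stable if the statement is run a second time, because the existing col_sum column is never part of the sum.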