pandas DataFrame: generating a new column from conditions on several columns

Environment: Python 3.6.4 + pandas 0.22

This mainly relies on DataFrame.apply. With axis set to 1, the function is called once per row of the DataFrame; with axis set to 0 (the default), it is called once per column.
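To make the axis distinction concrete, here is a minimal sketch. The small frame `df` and its columns `a` and `b` are made up for illustration and are not part of the example that follows.

```python
import pandas as pd

# Hypothetical two-column frame, used only to show what apply passes in.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=1: the function receives one row at a time
# (a Series indexed by column name).
row_sums = df.apply(lambda row: row['a'] + row['b'], axis=1)

# axis=0 (the default): the function receives one column at a time.
col_sums = df.apply(lambda col: col.sum(), axis=0)

print(row_sums.tolist())  # [11, 22, 33]
print(col_sums.tolist())  # [6, 60]
```

The row-wise result has one value per row, while the column-wise result has one value per column.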

As the code shows, if the city name contains the substring "ing" and the year is 2016, the new column test is set to 1; otherwise it is set to 0.

import numpy as np
import pandas as pd

data = {'city': ['Beijing', 'Shanghai', 'Guangzhou', 'Shenzhen', 'Hangzhou', 'Chongqing'],
        'year': [2016, 2016, 2015, 2017, 2016, 2016],
        'population': [2100, 2300, 1000, 700, 500, 500]}
# 'debt' is not a key in data, so that column is created and filled with NaN.
frame = pd.DataFrame(data, columns=['year', 'city', 'population', 'debt'])

def function(a, b):
    # Return 1 when the city name contains 'ing' and the year is 2016, else 0.
    if 'ing' in a and b == 2016:
        return 1
    else:
        return 0

print(frame, '\n')
# axis=1 passes each row to the lambda, which forwards its city and year values.
frame['test'] = frame.apply(lambda x: function(x.city, x.year), axis=1)
print(frame)

The result is as follows: the test column is 1 for Beijing and Chongqing and 0 for the other four cities.
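On larger frames, the same rule can often be written without calling a Python function for every row. The sketch below is an alternative to the apply-based approach, not the method used above. It assumes the same city and year columns and combines two boolean Series with `&`.

```python
import pandas as pd

frame = pd.DataFrame({
    'city': ['Beijing', 'Shanghai', 'Guangzhou', 'Shenzhen', 'Hangzhou', 'Chongqing'],
    'year': [2016, 2016, 2015, 2017, 2016, 2016],
})

# Boolean mask: city contains the literal substring 'ing' AND year equals 2016.
# The parentheses around the comparison are required because & binds tighter than ==.
mask = frame['city'].str.contains('ing', regex=False) & (frame['year'] == 2016)

# Convert True/False to 1/0 to match the apply-based version.
frame['test'] = mask.astype(int)

print(frame['test'].tolist())  # [1, 0, 0, 0, 0, 1]
```

This keeps the condition in one expression, and it usually runs faster than a row-wise apply because the comparisons are evaluated on whole columns.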


The Series type also has an apply function, which is called once per element. A usage example:

import numpy as np
import pandas as pd

data = {'city': ['Beijing', 'Shanghai', 'Guangzhou', 'Shenzhen', 'Hangzhou', 'Chongqing'],
        'year': [2016, 2016, 2015, 2017, 2016, 2016],
        'population': [2100, 2300, 1000, 700, 500, 500]}
# 'debt' is not a key in data, so that column is created and filled with NaN.
frame = pd.DataFrame(data, columns=['year', 'city', 'population', 'debt'])

print(frame, '\n')
# Series.apply passes each city name (a string) to the lambda.
frame['panduan'] = frame.city.apply(lambda x: 1 if 'ing' in x else 0)
print(frame)

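The result is as follows: the panduan column is 1 for Beijing and Chongqing and 0 for the other four cities.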
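When the per-element rule needs a parameter, Series.apply can forward extra positional arguments through `args`. A small sketch follows; the helper `has_substring` and the shortened `cities` Series are hypothetical names introduced only for this illustration.

```python
import pandas as pd

cities = pd.Series(['Beijing', 'Shanghai', 'Chongqing'])

def has_substring(value, sub):
    """Return 1 if `sub` occurs in `value`, else 0 (hypothetical helper)."""
    return 1 if sub in value else 0

# Series.apply calls the function once per element; `args` supplies the
# extra positional arguments that follow the element itself.
flags = cities.apply(has_substring, args=('ing',))

print(flags.tolist())  # [1, 0, 1]
```

Passing the substring as an argument lets the same helper be reused for other patterns without writing a new lambda each time.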