Fixing the xgboost exception AttributeError: 'DMatrix' object has no attribute 'handle'

The run first prints a pandas warning and then fails with the xgboost exception:

sys:1: DtypeWarning: Columns (65) have mixed types. Specify dtype option on import or set low_memory=False. ....

if self.handle is not None: AttributeError: 'DMatrix' object has no attribute 'handle'

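When this error appears, look at the sys:1 warning first. It says that one column (65) had mixed types when the data was read, and it suggests a fix: add this parameter when reading the data: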

low_memory=False
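import pandas as pd

# low_memory=False parses the file in one pass instead of chunk by chunk,
# so each column gets a single inferred dtype
traindata_df = pd.read_csv(train_path, sep=',', index_col='user_id', low_memory=False)
traindata_df.info()  # info() prints its report directly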
....
remove_caller_fee               714686 non-null float64
logremove_caller_fee            713857 non-null object
dtypes: float64(13), int64(50), object(1)
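
With dozens of columns, the offending one is easy to miss in the info() report. A minimal sketch of listing the non-numeric columns directly; the toy frame and its values are invented for illustration, and only the two column names are taken from the output above:

```python
import pandas as pd

# Toy stand-in for traindata_df; the values are made up for illustration.
toy_df = pd.DataFrame({
    "remove_caller_fee": [1.5, 2.0, 3.25],
    # A mix of floats and a string, so pandas stores the column as object.
    "logremove_caller_fee": [0.405, "bad", 0.693],
})

# Every column that is not numeric is a candidate for the xgboost failure.
non_numeric_cols = toy_df.select_dtypes(exclude="number").columns.tolist()
print(non_numeric_cols)
```

On the real data this would presumably report only logremove_caller_fee, since info() shows a single object column.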

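The data now loads, but the xgb model still raises the error. The reason is that the training data cannot contain an object column; every column must be float or int, and there is no way around that. Trying float(x) or traindata_df["logremove_caller_fee"].astype(float) does not work either; the exact error messages are omitted here.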

# An attempt that does not work: traindata_df["logremove_caller_fee"] = list(map(lambda x: float(x),traindata_df["logremove_caller_fee"]))
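
The post does not show why the casts fail. One plausible explanation, assumed here rather than confirmed by the source, is that a few entries in the column cannot be parsed as numbers at all. A small sketch with a hypothetical unparseable value "bad":

```python
import pandas as pd

# Hypothetical values; "bad" stands in for whatever unparseable
# entry the real column might contain.
col = pd.Series([0.405, "0.693", "bad"], name="logremove_caller_fee")

try:
    col.astype(float)
    cast_failed = False
except ValueError as exc:
    # A single unparseable entry makes the whole cast fail.
    cast_failed = True
    print("astype(float) failed:", exc)
```

A strict cast is all-or-nothing, which is why a conversion that tolerates bad entries is needed instead.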

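The trick that works: run the command below. It converts the object column and is fairly smart about it: the column used to be a mix of floats and objects, and now it is uniformly float, while the columns that were not object are left unchanged: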

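traindata_df = pd.read_csv(train_path, sep=',', index_col='user_id', low_memory=False)
# Convert after reading. Note: convert_objects was deprecated in pandas 0.21 and removed in pandas 1.0.
traindata_df = traindata_df.convert_objects(convert_numeric=True)
traindata_df.info()  # info() prints its report directly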

...
mean_service_caller_time_fee    714686 non-null float64
remove_caller_fee               714686 non-null float64
logremove_caller_fee            713855 non-null float64
dtypes: float64(14), int64(50)
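
On pandas versions where convert_objects is no longer available, pd.to_numeric with errors="coerce" gives a comparable result. This is a sketch on invented toy data, not the author's original code; unparseable entries become NaN, which is consistent with the non-null count dropping from 713857 to 713855 above:

```python
import pandas as pd

# Toy stand-in for traindata_df; the values are made up for illustration.
toy_df = pd.DataFrame({
    "remove_caller_fee": [1.5, 2.0, 3.25],
    "logremove_caller_fee": [0.405, "0.693", "bad"],
})

# Convert only the non-numeric columns; entries that cannot be
# parsed are replaced by NaN instead of raising an error.
for name in toy_df.select_dtypes(exclude="number").columns:
    toy_df[name] = pd.to_numeric(toy_df[name], errors="coerce")

print(toy_df.dtypes)
```

The coerced NaN entries are silent, so it is worth comparing non-null counts before and after, as the two info() reports in this post do.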