
Dimensionality Reduction Example: Principal Component Analysis

Dataset source: https://www.kaggle.com/psparks/instacart-market-basket-analysis

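Approach: merge the four tables (prior order products, products, orders, aisles) into one, build a cross table of users against aisles, then apply PCA to reduce the number of aisle features.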
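
To make the cross-table step concrete, here is a minimal sketch on toy data. The column names (order_id, product_id, aisle_id, user_id, aisle) are the ones the example code relies on; the tiny tables and their values are hypothetical and only stand in for the real CSV files.

```python
import pandas as pd

# Hypothetical toy tables that mimic the columns used in this example.
prior = pd.DataFrame({"order_id": [1, 1, 2, 3], "product_id": [10, 11, 10, 12]})
products = pd.DataFrame({"product_id": [10, 11, 12], "aisle_id": [100, 101, 101]})
orders = pd.DataFrame({"order_id": [1, 2, 3], "user_id": [7, 7, 8]})
aisles = pd.DataFrame({"aisle_id": [100, 101], "aisle": ["fresh fruits", "yogurt"]})

# Chain the merges so every purchased product carries its user and aisle name.
merged = (
    prior.merge(products, on="product_id")
    .merge(orders, on="order_id")
    .merge(aisles, on="aisle_id")
)

# One row per user, one column per aisle, cells count purchased products.
cross = pd.crosstab(merged["user_id"], merged["aisle"])
print(cross)
```

Each row of the cross table is a per-user feature vector with one feature per aisle, and that matrix is what PCA receives.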

 

Example code:

import pandas as pd
from sklearn.decomposition import PCA


def main():
    '''
    Dimensionality reduction example: principal component analysis
    :return: None
    '''
    # Read the data
    prior = pd.read_csv("order_products__prior.csv")
    products = pd.read_csv("products.csv")
    orders = pd.read_csv("orders.csv")
    aisles = pd.read_csv("aisles.csv")

    # Merge the four tables on their shared key columns
    _mg = pd.merge(prior, products, on='product_id')
    _mg = pd.merge(_mg, orders, on='order_id')
    mt = pd.merge(_mg, aisles, on='aisle_id')
    # print(mt.head(10))

    # Cross table: rows are users, columns are aisles, cells are purchase counts
    cross = pd.crosstab(mt['user_id'], mt['aisle'])
    # print(cross)

    # A float n_components keeps enough components to explain 90% of the variance
    pca = PCA(n_components=0.9)
    data = pca.fit_transform(cross)
    print(data)
    print(data.shape)
    return None


if __name__ == '__main__':
    main()

Output:

The output shows that PCA, set to retain 90% of the variance (n_components=0.9), reduces the data to 27 dimensions.
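
The number of dimensions is not chosen by hand here. When n_components is a float between 0 and 1, scikit-learn's PCA keeps the smallest number of components whose cumulative explained variance ratio exceeds that fraction. The sketch below illustrates this on synthetic data, not on the Instacart tables. The array sizes, the three latent factors, and the noise level are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): 200 samples, 10 features
# driven by 3 latent factors plus a small amount of noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

# Ask PCA to keep enough components to explain 90% of the variance.
pca = PCA(n_components=0.9)
reduced = pca.fit_transform(X)

kept = pca.n_components_  # number of components PCA actually kept
cumulative = np.cumsum(pca.explained_variance_ratio_)

print(reduced.shape)
print(kept, cumulative[-1])
```

Inspecting pca.n_components_ and pca.explained_variance_ratio_ in the same way on the real cross table would confirm how many components were kept and how much variance each one carries.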