Factor Analysis in R

Main uses of factor analysis

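1. Reduce the number of variables in the analysis.
2. Group the original variables by exploring the correlations among them: highly correlated variables go into one group, and a common factor stands in for the variables of that group.
3. Make the business meaning behind the problem clearer.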

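Explanation:

Variables that are highly correlated because the same underlying factor explains them are grouped together (for example, a liberal-arts factor or a science factor). If, for some factor, the variables related to liberal-arts subjects have large factor loadings, those variables can be summarized by a single factor, the liberal-arts factor. Sometimes the loadings on a factor are spread fairly evenly across the variables, with little difference between them, which makes the factor hard to name. In that case we can multiply the loading matrix by an orthogonal matrix, that is, rotate the factors. This is legitimate because the factor model is unchanged by an orthogonal rotation: the rotated loadings reproduce the same correlation structure, so the fit stays the same.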
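Why rotation is allowed can be checked numerically. The sketch below uses a small made-up loadings matrix (not taken from the data in this post) and an arbitrary rotation angle. It shows that multiplying the loadings by an orthogonal matrix leaves both the common part of the covariance, L L', and the communalities unchanged.

```r
# Hypothetical 4 x 2 loadings matrix, for illustration only
L <- matrix(c(0.7, 0.6,  0.5,  0.4,
              0.5, 0.4, -0.5, -0.6),
            nrow = 4,
            dimnames = list(paste0("x", 1:4), c("F1", "F2")))

# An orthogonal matrix: rotation by an arbitrary angle
theta <- pi / 6
Q <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), nrow = 2)

L_rot <- L %*% Q  # rotated loadings

# The common part of the covariance, L L', is the same before and after
same_fit <- isTRUE(all.equal(tcrossprod(L), tcrossprod(L_rot)))

# So are the communalities (row sums of squared loadings)
same_comm <- isTRUE(all.equal(rowSums(L^2), rowSums(L_rot^2)))

print(same_fit)
print(same_comm)
```

Since the fit cannot tell the rotations apart, the choice among them can be made on interpretability alone.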

FL APP AA LA SC LC HON SMS EXP DRV AMB GSP POT KJ SUIT
6 7 2 5 8 7 8 8 3 8 9 7 5 7 10
9 10 5 8 10 9 9 10 5 9 9 8 8 8 10
7 8 3 6 9 8 9 7 4 9 9 8 6 8 10
5 6 8 5 6 5 9 2 8 4 5 8 7 6 5
6 8 8 8 4 4 9 5 8 5 5 8 8 7 7
7 7 7 6 8 7 10 5 9 6 5 8 6 6 6
9 9 8 8 8 8 8 8 10 8 10 8 9 8 10
9 9 9 8 9 9 8 8 10 9 10 9 9 9 10
9 9 7 8 8 8 8 5 9 8 9 8 8 8 10
4 7 10 2 10 10 7 10 3 10 10 10 9 3 10
4 7 10 0 10 8 3 9 5 9 10 8 10 2 5
4 7 10 4 10 10 7 8 2 8 8 10 10 3 7
6 9 8 10 5 4 9 4 4 4 5 4 7 6 8
8 9 8 9 6 3 8 2 5 2 6 6 7 5 6
4 8 8 7 5 4 10 2 7 5 3 6 6 4 6
6 9 6 7 8 9 8 9 8 8 7 6 8 6 10
8 7 7 7 9 5 8 6 6 7 8 6 6 7 8
6 8 8 4 8 8 6 4 3 3 6 7 2 6 4
6 7 8 4 7 8 5 4 4 2 6 8 3 5 4
4 8 7 8 8 9 10 5 2 6 7 9 8 8 9
3 8 6 8 8 8 10 5 3 6 7 8 8 5 8
9 8 7 8 9 10 10 10 3 10 8 10 8 10 8
7 10 7 9 9 9 10 10 3 9 9 10 9 10 8
9 8 7 10 8 10 10 10 2 9 7 9 9 10 8
6 9 7 7 4 5 9 3 2 4 4 4 4 5 4
7 8 7 8 5 4 8 2 3 4 5 6 5 5 6
2 10 7 9 8 9 10 5 3 5 6 7 6 4 5
6 3 5 3 5 3 5 0 0 3 3 0 0 5 0
4 3 4 3 3 0 0 0 0 4 4 0 0 5 0
4 6 5 6 9 4 10 3 1 3 3 2 2 7 3
5 5 4 7 8 4 10 3 2 5 5 3 4 8 3
3 3 5 7 7 9 10 3 2 5 3 7 5 5 2
2 3 5 7 7 9 10 3 2 2 3 6 4 5 2
3 4 6 4 3 3 8 1 1 3 3 3 2 5 2
6 7 4 3 3 0 9 0 1 0 2 3 1 5 3
9 8 5 5 6 6 8 2 2 2 4 5 6 6 3
4 9 6 4 10 8 8 9 1 3 9 7 5 3 2
4 9 6 6 9 9 7 9 1 2 10 8 5 5 2
10 6 9 10 9 10 10 10 10 10 8 10 10 10 10
10 6 9 10 9 10 10 10 10 10 10 10 10 10 10
10 7 8 0 2 1 2 0 10 2 0 3 0 0 10
10 3 8 0 1 1 0 0 10 0 0 0 0 0 10
3 4 9 8 2 4 5 3 6 2 1 3 3 3 8
7 7 7 6 9 8 8 6 8 8 10 8 8 6 5
9 6 10 9 7 7 10 2 1 5 5 7 8 4 5
9 8 10 10 7 9 10 3 1 5 7 9 9 4 4
0 7 10 3 5 0 10 0 0 2 2 0 0 0 0
0 6 10 1 5 0 10 0 0 2 2 0 0 0 0
Import the data above into a data frame named applicant.
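One way to load the table, as a sketch, is `read.table()` with `header = TRUE`. Only the first three rows are inlined here through the `text` argument so that the snippet runs on its own. With the full table saved to a file you would pass the file path instead; the name `applicant.txt` in the comment is an assumption, since the post does not say where the data is stored.

```r
# First three rows of the table, inlined so the example is self-contained
applicant <- read.table(header = TRUE, text = "
FL APP AA LA SC LC HON SMS EXP DRV AMB GSP POT KJ SUIT
6 7 2 5 8 7 8 8 3 8 9 7 5 7 10
9 10 5 8 10 9 9 10 5 9 9 8 8 8 10
7 8 3 6 9 8 9 7 4 9 9 8 6 8 10
")

# With the full 48-row table in a file (hypothetical file name):
# applicant <- read.table("applicant.txt", header = TRUE)

str(applicant)  # 15 integer columns, one per rated attribute
```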


First, run a factor analysis with 5 factors.

factanal(~.,factors = 5,data = applicant)

Call:
factanal(x = ~., factors = 5, data = applicant)


Uniquenesses:
   FL   APP    AA    LA    SC    LC   HON   SMS   EXP   DRV   AMB   GSP 
0.439 0.597 0.509 0.197 0.118 0.005 0.292 0.140 0.365 0.223 0.098 0.119 
  POT    KJ  SUIT 
0.084 0.005 0.267 


Loadings:
     Factor1 Factor2 Factor3 Factor4 Factor5
FL    0.127   0.722   0.102  -0.117         
APP   0.451   0.134   0.270   0.206   0.258 
AA            0.129           0.686         
LA    0.222   0.246   0.827                 
SC    0.917           0.167                 
LC    0.851   0.125   0.279          -0.420 
HON   0.228  -0.220   0.777                 
SMS   0.880   0.266   0.111                 
EXP           0.773           0.171         
DRV   0.754   0.393   0.199           0.114 
AMB   0.909   0.187   0.112           0.165 
GSP   0.783   0.295   0.354   0.148  -0.181 
POT   0.717   0.362   0.446   0.267         
KJ    0.418   0.399   0.563  -0.585         
SUIT  0.351   0.764           0.148         


               Factor1 Factor2 Factor3 Factor4 Factor5
SS loadings      5.490   2.507   2.188   1.028   0.331
Proportion Var   0.366   0.167   0.146   0.069   0.022
Cumulative Var   0.366   0.533   0.679   0.748   0.770
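The last three rows of this output follow from simple arithmetic, which the worked check below reproduces from the printed numbers. Each `Proportion Var` entry is the `SS loadings` value divided by the number of variables (15 here, because the analysis runs on standardized variables that each contribute variance 1), and `Cumulative Var` is the running sum. The uniqueness of a variable is 1 minus the sum of its squared loadings.

```r
# SS loadings copied from the output above
ss <- c(Factor1 = 5.490, Factor2 = 2.507, Factor3 = 2.188,
        Factor4 = 1.028, Factor5 = 0.331)
p <- 15  # number of observed variables

prop_var <- ss / p          # Proportion Var
cum_var  <- cumsum(prop_var)  # Cumulative Var

print(round(prop_var, 3))
print(round(cum_var, 3))

# Uniqueness of FL from its four printed loadings. The Factor5 entry is
# blank in the printout because it is below 0.1 in absolute value.
fl <- c(0.127, 0.722, 0.102, -0.117)
print(1 - sum(fl^2))  # roughly 0.44, in line with the printed 0.439
```

The five factors together account for about 77% of the total variance, and the fifth adds only about 2%.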

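Examine the factors and look for the main ones that have a clear business interpretation. When the loadings are hard to interpret, rotating them can help. Note that factanal() applies a varimax rotation by default (rotation = "varimax"), so the loadings printed by this call are already rotated.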
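A minimal sketch of rotating loadings by hand with `varimax()` from the stats package. The loadings matrix is made up for illustration and is not taken from the fit in this post. `factanal()` exposes the same choice through its `rotation` argument, where `"none"` and `"promax"` are the other built-in options.

```r
# Hypothetical unrotated loadings: every variable loads on both factors,
# so neither factor is easy to name
L <- matrix(c(0.7, 0.6,  0.5,  0.4,
              0.5, 0.4, -0.5, -0.6),
            nrow = 4,
            dimnames = list(paste0("x", 1:4), c("Factor1", "Factor2")))

rot   <- varimax(L)             # orthogonal rotation
L_rot <- unclass(rot$loadings)  # plain matrix of rotated loadings

print(round(L_rot, 3))        # rotated loadings
print(round(rot$rotmat, 3))   # the orthogonal matrix that was applied

# Rotation changes the loadings but not the communalities
print(all.equal(rowSums(L^2), rowSums(L_rot^2)))
```

After rotation each variable tends to load strongly on one factor and weakly on the other, which is what makes the factors easier to label.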

Next, convert the original data into new variables made up of the 5 factors (the factor scores).

> fa <- factanal(~.,factors = 5,data=applicant,scores = "regression")
> fa


Call:
factanal(x = ~., factors = 5, data = applicant, scores = "regression")


Uniquenesses:
   FL   APP    AA    LA    SC    LC   HON   SMS   EXP   DRV   AMB   GSP 
0.439 0.597 0.509 0.197 0.118 0.005 0.292 0.140 0.365 0.223 0.098 0.119 
  POT    KJ  SUIT 
0.084 0.005 0.267 


Loadings:
     Factor1 Factor2 Factor3 Factor4 Factor5
FL    0.127   0.722   0.102  -0.117         
APP   0.451   0.134   0.270   0.206   0.258 
AA            0.129           0.686         
LA    0.222   0.246   0.827                 
SC    0.917           0.167                 
LC    0.851   0.125   0.279          -0.420 
HON   0.228  -0.220   0.777                 
SMS   0.880   0.266   0.111                 
EXP           0.773           0.171         
DRV   0.754   0.393   0.199           0.114 
AMB   0.909   0.187   0.112           0.165 
GSP   0.783   0.295   0.354   0.148  -0.181 
POT   0.717   0.362   0.446   0.267         
KJ    0.418   0.399   0.563  -0.585         
SUIT  0.351   0.764           0.148         


               Factor1 Factor2 Factor3 Factor4 Factor5
SS loadings      5.490   2.507   2.188   1.028   0.331
Proportion Var   0.366   0.167   0.146   0.069   0.022
Cumulative Var   0.366   0.533   0.679   0.748   0.770


Test of the hypothesis that 5 factors are sufficient.
The chi square statistic is 60.97 on 40 degrees of freedom.
The p-value is 0.0179 
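The test lines above can be reproduced from the printed statistic, as a quick check. For a maximum-likelihood factor model with p variables and k factors the degrees of freedom are ((p - k)^2 - (p + k)) / 2, and the p-value is the upper tail of the chi-square distribution at the statistic.

```r
p <- 15  # observed variables
k <- 5   # factors

dof <- ((p - k)^2 - (p + k)) / 2
print(dof)  # 40

# Upper-tail probability for the printed statistic
p_value <- pchisq(60.97, df = dof, lower.tail = FALSE)
print(round(p_value, 4))
```

A p-value of about 0.018 is below 0.05, so at the 5% level the hypothesis that 5 factors are sufficient would be rejected. With only 48 samples, how much weight to give this test is a judgment call.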
> fa$scores
        Factor1     Factor2      Factor3     Factor4     Factor5
1   0.800717544  0.18668478 -0.851460896 -1.02805665  0.52205818
2   1.116241580  0.47700243  0.001629454 -0.43629124  0.36113830
3   0.879369406  0.29478854 -0.314716179 -1.02965924  0.33082062
4  -0.523388290  0.43753019  0.560799973  0.25097714  0.40522224
5  -0.846808386  1.21550502  1.085816718  0.45502930  1.07029291
6   0.003185837  0.27885951  0.243258421  0.12109434 -0.27226717
7   0.703922279  1.33861950  0.111053822  0.01088589  0.64206809
8   0.896108099  1.37342978  0.232713178 -0.35982102  0.35349535
9   0.455395763  1.17038462  0.244111085 -0.19242716  0.17911705
10  1.843009744 -0.18285199 -1.451198021  1.43700462  0.02806712
11  1.781056933 -0.22818096 -2.089052424  1.48488398  0.95136053
12  1.403740004 -0.53727939 -0.605003245  1.66579885 -0.39726150
13 -0.838419356  0.45881416  1.103624446  0.56651271  0.93295036
14 -0.765006924  0.30471946  0.836846379  0.95059097  1.57972186
15 -0.948618470  0.14818660  0.761841309  1.18822261  0.41131508
16  0.670346434  0.62562847 -0.275204781  0.28878032 -0.58926497
17  0.308422895  0.41267618 -0.135936211 -0.42814543  1.56803728
18  0.295571360 -0.46281204 -0.728936475 -1.16500937 -1.31962760
19  0.184298026 -0.21956636 -0.825069511 -0.55179100 -1.51372977
20  0.372855990  0.03579921  1.021768751 -0.31575185 -0.57686133
21  0.402538565 -0.46066903  0.589465103  0.86890942 -0.14399060
22  0.927698958  0.60250660  0.673815357 -1.14036612 -0.33194886
23  0.931887149  0.47998903  0.990087831 -0.83545711  0.59726767
24  0.585842905  0.66244517  1.202381977 -0.86524797 -0.62640797
25 -0.798685118 -0.23749354  0.480395810  0.05836780 -0.34004466
26 -0.781584615  0.15450648  0.518109233  0.43972335  0.54299850
27  0.372094679 -1.12074511  0.595496294  0.95822481 -1.09577912
28 -0.948459790 -0.81527906 -0.922269220 -1.77326821 -0.35738899
29 -1.213604298 -0.13087983 -1.503827903 -1.92660112  1.10293874
30 -0.484101748 -1.21181187  0.367311077 -1.68214807  0.52310776
31 -0.397976947 -0.69999678  0.622508916 -1.63410675  1.01452372
32 -0.173713063 -1.10108335  0.608407659 -0.12745790 -2.26318679
33 -0.256281136 -1.33038515  0.560781422 -0.40338235 -2.53222879
34 -1.183167544 -0.49422381 -0.008024526 -0.81458656 -0.10274921
35 -1.665000629 -0.33809240  0.059885560 -0.88500404  1.19916519
36 -0.514907574 -0.19120917  0.370984444 -0.45888647 -0.61709089
37  1.321163395 -1.72531922 -1.052264142  0.39130820  0.21100177
38  1.237411113 -1.32542774 -0.635392434 -0.30674409 -0.33706339
39  0.687041364  1.55535677  0.960484537 -0.39006072 -0.30682688
40  0.923821287  1.49189358  0.792049416 -0.40120250  0.04739454
41 -1.740662685  1.55241481 -2.301529543  1.10997918 -0.54355251
42 -1.946665869  1.74562267 -2.660094728  0.70599496 -1.13003023
43 -1.601371579  0.70912444 -0.096524877  0.77685896 -1.29579679
44  0.960832458  0.10579350 -0.315491319  0.20738885  0.51793979
45 -0.309142913 -0.38559454  1.071709950  1.49287914 -0.45272845
46  0.155406236 -0.52531271  1.194207841  1.80067575 -0.93031173
47 -1.182605366 -2.06429652 -0.395197469  1.05832124  1.50773379
48 -1.099807701 -2.02977096 -0.694352061  0.86306058  1.47640173

Check which factors score highest for each sample.
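A minimal sketch of that last step. To stay self-contained it uses the scores of the first three samples copied from the table above, rounded to three decimals; with the fitted object from this post you would use `fa$scores` instead. Picking the factor with the largest absolute score per row is one possible reading of "highest", an assumption rather than something the post specifies.

```r
# Scores of samples 1-3, copied from the output above and rounded
scores <- matrix(c(0.801, 0.187, -0.851, -1.028, 0.522,
                   1.116, 0.477,  0.002, -0.436, 0.361,
                   0.879, 0.295, -0.315, -1.030, 0.331),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(as.character(1:3), paste0("Factor", 1:5)))
# scores <- fa$scores   # with the full fit from this post

# Factor with the largest absolute score for each sample
dominant <- colnames(scores)[apply(abs(scores), 1, which.max)]
print(data.frame(sample = rownames(scores), dominant = dominant))

# Samples ranked by their Factor1 score, highest first
rank_f1 <- order(scores[, "Factor1"], decreasing = TRUE)
print(rank_f1)
```

For these three samples the dominant factors come out as Factor4, Factor1 and Factor4, and sample 2 has the highest Factor1 score.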