R in Practice: k-means Clustering and Association Rule Algorithms

1. k-means clustering in R

The dataset has the following format:

,河東路與嶴東路&河東路與聚賢橋路,河東路與嶴東路&新悅路與嶴東路,河東路與嶴東路&火炬路與聚賢橋路,河東路與嶴東路&火炬路與匯智橋路,河東路與嶴東路&匯智橋與智力島路,新悅路與嶴東路&火炬路與聚賢橋路,新悅路與嶴東路&河東路與聚賢橋路,新悅路與嶴東路&河東路與嶴東路,新悅路與嶴東路&匯智橋與智力島路,新悅路與嶴東路&火炬路與匯智橋路,河東路與聚賢橋路&新悅路與嶴東路,河東路與聚賢橋路&火炬路與聚賢橋路,河東路與聚賢橋路&河東路與嶴東路,河東路與聚賢橋路&匯智橋與智力島路,河東路與聚賢橋路&火炬路與匯智橋路,火炬路與匯智橋路&新悅路與嶴東路,火炬路與匯智橋路&火炬路與聚賢橋路,火炬路與匯智橋路&匯智橋與智力島路,火炬路與匯智橋路&河東路與聚賢橋路,火炬路與匯智橋路&河東路與嶴東路,匯智橋與智力島路&新悅路與嶴東路,匯智橋與智力島路&火炬路與聚賢橋路,匯智橋與智力島路&火炬路與匯智橋路,匯智橋與智力島路&河東路與嶴東路,匯智橋與智力島路&河東路與聚賢橋路,火炬路與聚賢橋路&新悅路與嶴東路,火炬路與聚賢橋路&河東路與嶴東路,火炬路與聚賢橋路&河東路與聚賢橋路,火炬路與聚賢橋路&匯智橋與智力島路,火炬路與聚賢橋路&火炬路與匯智橋路
藍魯BP9G39,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
藍魯B7M827,1,23,0,1,0,0,2,55,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
藍魯BQ3M79,0,11,0,0,0,0,1,10,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
藍魯BU008P,0,4,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
藍魯BW6710,14,0,0,0,0,0,0,0,0,0,0,0,14,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0
藍魯BS180G,0,1,0,0,0,0,0,24,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
藍魯B3HU73,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

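Code:
library(fpc)                               # provides plotcluster()
data <- read.csv('x.csv')                  # column 1 holds the row labels
df <- na.omit(data[2:31])                  # keep the 30 count columns, drop rows containing NA
set.seed(252964)                           # make the random initial centers reproducible
km <- kmeans(df, 100)                      # k-means with 100 clusters; named km so it does not mask kmeans()
plotcluster(df, km$cluster)                # plot the clusters
km                                         # view the full clustering result
km$cluster                                 # view the cluster assigned to each row
km$centers                                 # view the cluster centers
write.csv(km$cluster, '100classes.csv')    # write the cluster assignments to a file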
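
For readers without x.csv, the same workflow can be tried on a small made-up count matrix. This is a minimal sketch, not the original data: the matrix `counts`, its row and column names, and the choice of 2 clusters are assumptions for illustration. It also passes `nstart`, an argument of `kmeans()` that reruns the algorithm from several random starts and keeps the best result, which the code above does not use.

```r
# Hypothetical count matrix: 6 vehicles x 3 route pairs (made-up values)
counts <- matrix(
  c(20, 25,  0,
    22, 30,  1,
    18, 27,  0,
     0,  1, 15,
     1,  0, 14,
     0,  0, 16),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("vehicle", 1:6), paste0("route", 1:3))
)

set.seed(252964)                                 # reproducible random starts
km <- kmeans(counts, centers = 2, nstart = 10)   # 2 clusters, best of 10 starts

km$cluster    # cluster label for each row
km$centers    # one row of column means per cluster
km$size       # number of rows in each cluster

# kmeans() needs at least as many distinct rows as clusters,
# so asking for 100 clusters requires well over 100 distinct rows
nrow(unique(counts))
```

On data this well separated, the first three vehicles should land in one cluster and the last three in the other; which cluster gets label 1 depends on the random start.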
2. Association rules in R

Dataset format:

0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0
0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
Each column represents one attribute, and a 1 means that attribute is present; each row is one record.
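
To make the 0/1 layout concrete, the measures used in the association-rule code (support, confidence and lift) can be computed by hand from such a matrix. The following is a minimal base-R sketch on a made-up 5-record matrix; the item names `A`, `B`, `C` and all values are assumptions for illustration, not the data shown above.

```r
# Hypothetical 0/1 matrix: 5 records (rows) x 3 attributes (columns)
m <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    1, 1, 1,
    0, 1, 0,
    1, 1, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("A", "B", "C"))
)

# Support of each single item: share of records that contain it
item_support <- colMeans(m)

# Support of the itemset {A, B}: share of records that contain both
support_AB <- mean(m[, "A"] == 1 & m[, "B"] == 1)

# Rule {A} => {B}
confidence <- support_AB / item_support["A"]   # share of records with A that also have B
lift       <- confidence / item_support["B"]   # confidence relative to the support of B

item_support          # A = 0.8, B = 0.8, C = 0.4
support_AB            # 0.6
unname(confidence)    # 0.75
unname(lift)          # 0.9375
```

In this toy example the lift is below 1, so a filter such as `lift >= 1.2` would discard the rule.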

Code:

library(arules)
groceries <- read.transactions("groceries.csv", sep = ",")  # one transaction per line, items separated by commas
summary(groceries)                                          # overview of the transactions

# Eclat algorithm
frequentsets <- eclat(groceries, parameter = list(support = 0.05, maxlen = 10))  # mine frequent itemsets
inspect(frequentsets[1:10])                         # view the first ten frequent itemsets
inspect(sort(frequentsets, by = "support")[1:10])   # sort the frequent itemsets by support and view them (equivalent to inspect(sort(frequentsets)[1:10]))

# Apriori algorithm
rules <- apriori(groceries, parameter = list(support = 0.01, confidence = 0.01))  # mine association rules
summary(rules)                                                     # view a summary of the rules found
x <- subset(rules, subset = rhs %in% "whole milk" & lift >= 1.2)   # keep rules with "whole milk" on the right-hand side and lift of at least 1.2
inspect(sort(x, by = "support")[1:5])                              # sort that subset by support and view the top five
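
Note that `read.transactions()` reads one transaction per line as a list of the items it contains, whereas the dataset format shown earlier is a 0/1 matrix. One way to bridge the two, relying on the arules coercion from a logical matrix to `transactions`, is sketched below. The matrix `m` and the item names `item1` to `item4` are made up for illustration, and the `requireNamespace()` guard is only there so the sketch does not fail where arules is not installed.

```r
# Hypothetical 0/1 matrix: 4 records x 4 attributes (made-up values)
m <- matrix(
  c(0, 1, 0, 0,
    0, 1, 1, 1,
    1, 0, 0, 0,
    0, 1, 1, 0),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, paste0("item", 1:4))
)

incidence <- m == 1   # logical matrix: TRUE where the attribute is present

if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  trans <- as(incidence, "transactions")   # rows become transactions, columns become items
  print(summary(trans))
  inspect(trans)
}
```

The resulting `trans` object can then be passed to `eclat()` or `apriori()` in place of `groceries`.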