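Problem: simulate a roster of statistics majors (identified by student ID), record their scores in three courses (mathematical analysis, linear algebra, and probability and statistics), then run some statistical analyses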
- > num=seq(10378001,10378100)
- > num
- [1] 10378001 10378002 10378003 10378004 10378005 10378006 10378007 10378008
- [9] 10378009 10378010 10378011 10378012 10378013 10378014 10378015 10378016
- [17] 10378017 10378018 10378019 10378020 10378021 10378022 10378023 10378024
- [25] 10378025 10378026 10378027 10378028 10378029 10378030 10378031 10378032
- [33] 10378033 10378034 10378035 10378036 10378037 10378038 10378039 10378040
- [41] 10378041 10378042 10378043 10378044 10378045 10378046 10378047 10378048
- [49] 10378049 10378050 10378051 10378052 10378053 10378054 10378055 10378056
- [57] 10378057 10378058 10378059 10378060 10378061 10378062 10378063 10378064
- [65] 10378065 10378066 10378067 10378068 10378069 10378070 10378071 10378072
- [73] 10378073 10378074 10378075 10378076 10378077 10378078 10378079 10378080
- [81] 10378081 10378082 10378083 10378084 10378085 10378086 10378087 10378088
- [89] 10378089 10378090 10378091 10378092 10378093 10378094 10378095 10378096
- [97] 10378097 10378098 10378099 10378100
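Generate the scores with runif (uniformly distributed random numbers) and rnorm (normally distributed random numbers); x1 is mathematical analysis, x2 is linear algebra, x3 is probability and statistics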
- > x1=round(runif(100,min=80,max=100))
- > x1
- [1] 81 94 98 86 86 95 88 90 93 86 87 93 93 85 85 87 84 93
- [19] 99 85 99 80 88 93 82 86 89 83 96 99 89 92 87 87 83 86
- [37] 89 88 85 92 86 84 87 86 88 94 89 93 95 99 99 92 89 100
- [55] 92 98 82 88 83 83 94 91 84 81 88 92 98 83 94 95 99 95
- [73] 81 82 86 94 85 83 81 87 98 90 81 81 90 85 80 92 98 82
- [91] 96 96 91 95 80 88 84 87 93 96
- > x2=round(rnorm(100,mean=80,sd=7))
- > x2
- [1] 72 67 83 81 82 81 73 73 74 84 72 86 87 79 85 70 76 93 73 85 89 77 75 72 82
- [26] 83 85 82 79 88 86 87 83 72 76 90 85 77 81 77 94 74 61 76 92 77 77 74 87 94
- [51] 87 81 66 76 73 75 81 84 89 70 73 86 81 80 79 81 82 74 75 65 77 75 75 87 90
- [76] 74 84 71 85 89 79 80 79 77 90 77 83 80 78 94 85 81 83 82 87 84 86 89 83 75
- > x3=round(rnorm(100,mean=83,sd=18))
- > x3
- [1] 85 107 96 83 82 60 68 106 52 78 114 78 74 80 76 121 84 90
- [19] 66 105 104 110 94 68 80 84 84 103 99 98 101 82 91 71 96 74
- [37] 82 115 77 70 84 82 74 88 83 100 92 70 77 98 103 58 79 85
- [55] 45 63 101 66 60 70 77 67 83 90 79 100 105 76 103 95 82 78
- [73] 72 54 64 83 85 92 93 120 100 98 82 73 93 110 90 102 81 98
- [91] 91 53 103 74 59 91 110 71 76 92
- > x3[which(x3>100)]=100 # replace scores above 100 with 100
- > x3
- [1] 85 100 96 83 82 60 68 100 52 78 100 78 74 80 76 100 84 90
- [19] 66 100 100 100 94 68 80 84 84 100 99 98 100 82 91 71 96 74
- [37] 82 100 77 70 84 82 74 88 83 100 92 70 77 98 100 58 79 85
- [55] 45 63 100 66 60 70 77 67 83 90 79 100 100 76 100 95 82 78
- [73] 72 54 64 83 85 92 93 100 100 98 82 73 93 100 90 100 81 98
- [91] 91 53 100 74 59 91 100 71 76 92
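The transcript above does not fix a random seed, so rerunning it gives different scores each time. A minimal sketch of a reproducible version follows. The seed value and the helper name `clamp` are assumptions made for illustration and are not part of the original session.

```r
# Reproducible variant of the simulation (sketch only).
set.seed(42)  # assumption: any fixed seed works; 42 is arbitrary

num <- seq(10378001, 10378100)                 # 100 student IDs
x1 <- round(runif(100, min = 80, max = 100))   # uniform scores in [80, 100]
x2 <- round(rnorm(100, mean = 80, sd = 7))     # normal scores, narrow spread
x3 <- round(rnorm(100, mean = 83, sd = 18))    # normal scores, wide spread

# Hypothetical helper: force every score into the valid range [lo, hi].
clamp <- function(v, lo = 0, hi = 100) pmin(pmax(v, lo), hi)
x2 <- clamp(x2)
x3 <- clamp(x3)

range(x3)  # both ends now lie inside [0, 100]
```

Clamping with `pmin`/`pmax` also guards against negative scores, which the normal distribution can produce in principle even though the original only caps the upper end.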
Combine the vectors into a data frame and save it to disk
- > x=data.frame(num,x1,x2,x3)
- > x
- num x1 x2 x3
- 1 10378001 81 72 85
- 2 10378002 94 67 100
- 3 10378003 98 83 96
- 4 10378004 86 81 83
- 5 10378005 86 82 82
- 6 10378006 95 81 60
- 7 10378007 88 73 68
- 8 10378008 90 73 100
- 9 10378009 93 74 52
- 10 10378010 86 84 78
- 11 10378011 87 72 100
- 12 10378012 93 86 78
- 13 10378013 93 87 74
- 14 10378014 85 79 80
- 15 10378015 85 85 76
- 16 10378016 87 70 100
- 17 10378017 84 76 84
- 18 10378018 93 93 90
- 19 10378019 99 73 66
- 20 10378020 85 85 100
- 21 10378021 99 89 100
- 22 10378022 80 77 100
- 23 10378023 88 75 94
- 24 10378024 93 72 68
- 25 10378025 82 82 80
- 26 10378026 86 83 84
- 27 10378027 89 85 84
- 28 10378028 83 82 100
- 29 10378029 96 79 99
- 30 10378030 99 88 98
- 31 10378031 89 86 100
- 32 10378032 92 87 82
- 33 10378033 87 83 91
- 34 10378034 87 72 71
- 35 10378035 83 76 96
- 36 10378036 86 90 74
- 37 10378037 89 85 82
- 38 10378038 88 77 100
- 39 10378039 85 81 77
- 40 10378040 92 77 70
- 41 10378041 86 94 84
- 42 10378042 84 74 82
- 43 10378043 87 61 74
- 44 10378044 86 76 88
- 45 10378045 88 92 83
- 46 10378046 94 77 100
- 47 10378047 89 77 92
- 48 10378048 93 74 70
- 49 10378049 95 87 77
- 50 10378050 99 94 98
- 51 10378051 99 87 100
- 52 10378052 92 81 58
- 53 10378053 89 66 79
- 54 10378054 100 76 85
- 55 10378055 92 73 45
- 56 10378056 98 75 63
- 57 10378057 82 81 100
- 58 10378058 88 84 66
- 59 10378059 83 89 60
- 60 10378060 83 70 70
- 61 10378061 94 73 77
- 62 10378062 91 86 67
- 63 10378063 84 81 83
- 64 10378064 81 80 90
- 65 10378065 88 79 79
- 66 10378066 92 81 100
- 67 10378067 98 82 100
- 68 10378068 83 74 76
- 69 10378069 94 75 100
- 70 10378070 95 65 95
- 71 10378071 99 77 82
- 72 10378072 95 75 78
- 73 10378073 81 75 72
- 74 10378074 82 87 54
- 75 10378075 86 90 64
- 76 10378076 94 74 83
- 77 10378077 85 84 85
- 78 10378078 83 71 92
- 79 10378079 81 85 93
- 80 10378080 87 89 100
- 81 10378081 98 79 100
- 82 10378082 90 80 98
- 83 10378083 81 79 82
- 84 10378084 81 77 73
- 85 10378085 90 90 93
- 86 10378086 85 77 100
- 87 10378087 80 83 90
- 88 10378088 92 80 100
- 89 10378089 98 78 81
- 90 10378090 82 94 98
- 91 10378091 96 85 91
- 92 10378092 96 81 53
- 93 10378093 91 83 100
- 94 10378094 95 82 74
- 95 10378095 80 87 59
- 96 10378096 88 84 91
- 97 10378097 84 86 100
- 98 10378098 87 89 71
- 99 10378099 93 83 76
- 100 10378100 96 75 92
- > write.table(x,file="mark.txt",col.names=FALSE,row.names=FALSE,sep=" ")
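Because the file is written without a header row, the column names have to be supplied again when the data are read back. The sketch below illustrates the round trip on the first three rows of the table; it writes to a temporary file rather than mark.txt, which is an assumption made so the example has no side effects.

```r
# First three rows of the data frame shown above.
x <- data.frame(num = c(10378001, 10378002, 10378003),
                x1  = c(81, 94, 98),
                x2  = c(72, 67, 83),
                x3  = c(85, 100, 96))

path <- tempfile(fileext = ".txt")  # assumption: temp file instead of mark.txt

# Write without header or row names, as in the call above.
write.table(x, file = path, col.names = FALSE, row.names = FALSE, sep = " ")

# The file has no header, so name the columns explicitly on import.
y <- read.table(path, header = FALSE,
                col.names = c("num", "x1", "x2", "x3"))
str(y)
```

Writing with `col.names = TRUE` and reading with `header = TRUE` would avoid the need to repeat the names.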
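Compute the mean score for each course (mean() does not accept a data frame, so use colMeans or apply instead)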
- > mean(x)
- [1] NA
- Warning message:
- In mean.default(x) : argument is not numeric or logical: returning NA
- > colMeans(x)
- num x1 x2 x3
- 10378050.50 89.24 80.25 83.68
- > colMeans(x)[c("x1","x2","x3")]
- x1 x2 x3
- 89.24 80.25 83.68
- > apply(x,2,mean)
- num x1 x2 x3
- 10378050.50 89.24 80.25 83.68
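The mean of the num column is meaningless, since student IDs are labels rather than measurements. One way to avoid it, sketched here on the first three rows of the table, is to select the score columns once and summarise only those. The name `scores` is introduced for illustration.

```r
# First three rows of the data frame shown above.
x <- data.frame(num = c(10378001, 10378002, 10378003),
                x1  = c(81, 94, 98),
                x2  = c(72, 67, 83),
                x3  = c(85, 100, 96))

scores <- x[c("x1", "x2", "x3")]   # drop the ID column before summarising

col_means  <- colMeans(scores)      # per-course mean, vectorised
col_means2 <- sapply(scores, mean)  # same result, one column at a time
grand_mean <- mean(as.matrix(scores))  # single mean over all scores

col_means
summary(scores)                     # min, quartiles, mean, max per course
```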
Find the highest and lowest score in each course
- > apply(x,2,max)
- num x1 x2 x3
- 10378100 100 94 100
- > apply(x,2,min)
- num x1 x2 x3
- 10378001 80 61 45
Compute each student's total score
- > apply(x[c("x1","x2","x3")],1,sum)
- [1] 238 261 277 250 250 236 229 263 219 248 259 257 254 244 246 257 244 276
- [19] 238 270 288 257 257 233 244 253 258 265 274 285 275 261 261 230 255 250
- [37] 256 265 243 239 264 240 222 250 263 271 258 237 259 291 286 231 234 261
- [55] 210 236 263 238 232 223 244 244 248 251 246 273 280 233 269 255 258 248
- [73] 228 223 240 251 254 246 259 276 277 268 242 231 273 262 253 272 257 274
- [91] 272 230 274 251 226 263 270 247 252 263
Find the student with the highest total score
- > which.max(apply(x[c("x1","x2","x3")],1,sum))
- [1] 50
- > x$num[which.max(apply(x[c("x1","x2","x3")],1,sum))]
- [1] 10378050
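The total is recomputed in each call above. A short sketch that stores it once with `rowSums` is given below, again on the first three rows of the table. Note that `which.max` returns only the first maximum, so a comparison against `max(total)` is used to list every student tied for the top score. The variable names are illustrative.

```r
# First three rows of the data frame shown above.
x <- data.frame(num = c(10378001, 10378002, 10378003),
                x1  = c(81, 94, 98),
                x2  = c(72, 67, 83),
                x3  = c(85, 100, 96))

total <- rowSums(x[c("x1", "x2", "x3")])  # same as apply(..., 1, sum)

best_first <- x$num[which.max(total)]     # first student with the top total
best_all   <- x$num[total == max(total)]  # every student tied for the top

# Ranking: rows ordered from highest to lowest total.
ranked <- cbind(x, total)[order(-total), ]
ranked
```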
Histogram of x1
- > hist(x$x1)
Explore the relationship between course scores
- > plot(x1,x2)
- > plot(x$x1,x$x2)
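A scatter plot shows the relationship visually; a correlation matrix puts a number on it. The sketch below regenerates scores in the same way as the transcript (with an arbitrary seed, which is an assumption) and computes the pairwise Pearson correlations. Since the three columns are simulated independently, the off-diagonal values should be close to zero.

```r
set.seed(2024)  # assumption: arbitrary seed for reproducibility

x1 <- round(runif(100, min = 80, max = 100))
x2 <- round(rnorm(100, mean = 80, sd = 7))
x3 <- pmin(round(rnorm(100, mean = 83, sd = 18)), 100)  # cap at 100
scores <- data.frame(x1, x2, x3)

r <- cor(scores)   # 3 x 3 Pearson correlation matrix
round(r, 2)

# Test whether the x1/x2 correlation differs from zero.
p <- cor.test(scores$x1, scores$x2)$p.value
p
```

Calling `pairs(scores)` draws all pairwise scatter plots in one figure.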
Frequency table (one-way contingency table)
- > table(x$x1)
- 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 98 99 100
- 3 6 4 6 4 6 8 7 7 5 3 2 6 7 5 5 4 5 6 1
- > barplot(table(x$x1))
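With one variable, `table` gives only frequency counts. A two-way contingency table needs two categorical variables, which can be obtained by binning the scores. The sketch below uses the first ten x1 and x2 values from the transcript; the band boundaries in `breaks` are an assumption chosen for illustration.

```r
# First ten values of x1 and x2 from the output above.
x1 <- c(81, 94, 98, 86, 86, 95, 88, 90, 93, 86)
x2 <- c(72, 67, 83, 81, 82, 81, 73, 73, 74, 84)

breaks <- c(0, 60, 70, 80, 90, 100)  # assumption: score bands

# Convert each score to a band; intervals are closed on the right.
band1 <- cut(x1, breaks = breaks, include.lowest = TRUE)
band2 <- cut(x2, breaks = breaks, include.lowest = TRUE)

tab <- table(x1 = band1, x2 = band2)  # rows: x1 band, columns: x2 band
tab
addmargins(tab)                       # append row and column totals
```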
Pie chart
- > pie(table(x$x1))
Box plot
- > boxplot(x$x1,x$x2,x$x3)
- > boxplot(x[2:4],col=c("red","green","blue"),notch=TRUE) # set the box colours
- > boxplot(x$x1,x$x2,x$x3,horizontal=TRUE) # draw the boxes horizontally
Star plot
- > stars(x[c("x1","x2","x3")])
- > stars(x[c("x1","x2","x3")],full=TRUE,draw.segments=TRUE) # segment diagram (full circle)
- > stars(x[c("x1","x2","x3")],full=FALSE,draw.segments=TRUE) # segment diagram (half circle)
Chernoff faces
- > library(aplpack)
- Loading required package: tcltk
- > faces(x[c("x1","x2","x3")])
- effect of variables:
- modified item Var
- "height of face " "x1"
- "width of face " "x2"
- "structure of face" "x3"
- "height of mouth " "x1"
- "width of mouth " "x2"
- "smiling " "x3"
- "height of eyes " "x1"
- "width of eyes " "x2"
- "height of hair " "x3"
- "width of hair " "x1"
- "style of hair " "x2"
- "height of nose " "x3"
- "width of nose " "x1"
- "width of ear " "x2"
- "height of ear " "x3"
Other Chernoff faces
- > library(TeachingDemos)
- Attaching package: ‘TeachingDemos’
- The following objects are masked from ‘package:aplpack’:
- faces, slider
- > faces2(x)
Stem-and-leaf plot
- > stem(x$x1)
- The decimal point is at the |
- 80 | 000000000
- 82 | 0000000000
- 84 | 0000000000
- 86 | 000000000000000
- 88 | 000000000000
- 90 | 00000
- 92 | 0000000000000
- 94 | 0000000000
- 96 | 0000
- 98 | 00000000000
- 100 | 0
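QQ plot
Used to judge whether a sample follows a normal distribution
For normal data, the slope of the reference line estimates the standard deviation and its intercept estimates the mean
The closer the points lie to the line, the closer the sample is to a normal distribution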
- > qqnorm(x1)
- > qqline(x1)
- > qqnorm(x3)
- > qqline(x3)
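To make the slope and intercept statement concrete, the sketch below rebuilds the reference line by hand. By default `qqline` draws the line through the first and third quartiles of the sample and of the standard normal distribution, so its slope and intercept are quartile-based estimates of the standard deviation and mean rather than `sd()` and `mean()` themselves. The seed is an assumption, and the sample here is left unrounded and uncapped so that it is genuinely normal.

```r
set.seed(1)  # assumption: arbitrary seed
x3 <- rnorm(100, mean = 83, sd = 18)  # unrounded, uncapped normal sample

# Quartiles of the sample and of the standard normal distribution.
q_sample <- quantile(x3, c(0.25, 0.75), names = FALSE)
q_theory <- qnorm(c(0.25, 0.75))

# Line through the two quartile points.
slope     <- diff(q_sample) / diff(q_theory)
intercept <- q_sample[1] - slope * q_theory[1]

# Compare the line's parameters with the usual estimates.
c(intercept = intercept, mean = mean(x3), slope = slope, sd = sd(x3))

# A formal check to accompany the plot: Shapiro-Wilk normality test.
sw <- shapiro.test(x3)
sw$p.value
```

Capping x3 at 100, as the transcript does, piles points up at the maximum, which shows in the QQ plot as a flat run at the upper right.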
Further customizing the scatter plot
- plot(x$x1,x$x2,
- main="Mathematical analysis vs. linear algebra scores",
- xlab="Mathematical analysis",
- ylab="Linear algebra",
- xlim=c(0,100),
- ylim=c(0,100),
- xaxs="i", # Set x axis style as internal
- yaxs="i", # Set y axis style as internal
- col="red", # Set the color of the plotting symbol to red
- pch=19) # Set the plotting symbol to filled dots