Problem: simulate a roster of statistics majors (identified by student ID), record their scores in three courses (Mathematical Analysis, Linear Algebra, and Probability and Statistics), and then run some statistical analyses.

    > num <- seq(10378001, 10378100)   # 100 consecutive student IDs
    > num
      [1] 10378001 10378002 10378003 10378004 10378005 10378006 10378007 10378008
      [9] 10378009 10378010 10378011 10378012 10378013 10378014 10378015 10378016
     [17] 10378017 10378018 10378019 10378020 10378021 10378022 10378023 10378024
     [25] 10378025 10378026 10378027 10378028 10378029 10378030 10378031 10378032
     [33] 10378033 10378034 10378035 10378036 10378037 10378038 10378039 10378040
     [41] 10378041 10378042 10378043 10378044 10378045 10378046 10378047 10378048
     [49] 10378049 10378050 10378051 10378052 10378053 10378054 10378055 10378056
     [57] 10378057 10378058 10378059 10378060 10378061 10378062 10378063 10378064
     [65] 10378065 10378066 10378067 10378068 10378069 10378070 10378071 10378072
     [73] 10378073 10378074 10378075 10378076 10378077 10378078 10378079 10378080
     [81] 10378081 10378082 10378083 10378084 10378085 10378086 10378087 10378088
     [89] 10378089 10378090 10378091 10378092 10378093 10378094 10378095 10378096
     [97] 10378097 10378098 10378099 10378100

Generate the scores with runif (uniformly distributed random numbers) and rnorm (normally distributed random numbers). No seed is set, so every run produces different values.

    > x1 <- round(runif(100, min=80, max=100))   # Mathematical Analysis
    > x1
      [1]  81  94  98  86  86  95  88  90  93  86  87  93  93  85  85  87  84  93
     [19]  99  85  99  80  88  93  82  86  89  83  96  99  89  92  87  87  83  86
     [37]  89  88  85  92  86  84  87  86  88  94  89  93  95  99  99  92  89 100
     [55]  92  98  82  88  83  83  94  91  84  81  88  92  98  83  94  95  99  95
     [73]  81  82  86  94  85  83  81  87  98  90  81  81  90  85  80  92  98  82
     [91]  96  96  91  95  80  88  84  87  93  96
    > x2 <- round(rnorm(100, mean=80, sd=7))     # Linear Algebra
    > x2
      [1] 72 67 83 81 82 81 73 73 74 84 72 86 87 79 85 70 76 93 73 85 89 77 75 72 82
     [26] 83 85 82 79 88 86 87 83 72 76 90 85 77 81 77 94 74 61 76 92 77 77 74 87 94
     [51] 87 81 66 76 73 75 81 84 89 70 73 86 81 80 79 81 82 74 75 65 77 75 75 87 90
     [76] 74 84 71 85 89 79 80 79 77 90 77 83 80 78 94 85 81 83 82 87 84 86 89 83 75
    > x3 <- round(rnorm(100, mean=83, sd=18))    # Probability and Statistics
    > x3
      [1]  85 107  96  83  82  60  68 106  52  78 114  78  74  80  76 121  84  90
     [19]  66 105 104 110  94  68  80  84  84 103  99  98 101  82  91  71  96  74
     [37]  82 115  77  70  84  82  74  88  83 100  92  70  77  98 103  58  79  85
     [55]  45  63 101  66  60  70  77  67  83  90  79 100 105  76 103  95  82  78
     [73]  72  54  64  83  85  92  93 120 100  98  82  73  93 110  90 102  81  98
     [91]  91  53 103  74  59  91 110  71  76  92
    > x3[x3 > 100] <- 100   # cap scores above 100 at 100
    > x3
      [1]  85 100  96  83  82  60  68 100  52  78 100  78  74  80  76 100  84  90
     [19]  66 100 100 100  94  68  80  84  84 100  99  98 100  82  91  71  96  74
     [37]  82 100  77  70  84  82  74  88  83 100  92  70  77  98 100  58  79  85
     [55]  45  63 100  66  60  70  77  67  83  90  79 100 100  76 100  95  82  78
     [73]  72  54  64  83  85  92  93 100 100  98  82  73  93 100  90 100  81  98
     [91]  91  53 100  74  59  91 100  71  76  92
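Because the transcript above sets no seed, its exact numbers cannot be reproduced. The following is a minimal sketch of a repeatable variant; the seed value 42 and the helper name `clamp` are assumptions made for illustration, not part of the original session. It also clamps at both ends, since a normal draw with a large standard deviation can in principle fall below 0 as well as above 100.

```r
# Reproducible variant of the simulation (seed and helper name are assumed).
set.seed(42)                                  # any fixed integer would do
n  <- 100
x1 <- round(runif(n, min = 80, max = 100))    # uniform on [80, 100]
x2 <- round(rnorm(n, mean = 80, sd = 7))      # normal, mean 80, sd 7
x3 <- round(rnorm(n, mean = 83, sd = 18))     # normal, mean 83, sd 18

# Hypothetical helper: force every score into the valid range [lo, hi].
clamp <- function(v, lo = 0, hi = 100) pmin(pmax(v, lo), hi)
x2 <- clamp(x2)
x3 <- clamp(x3)

range(x3)   # both ends now lie inside [0, 100]
```

`pmin` and `pmax` work element-wise, so the whole vector is clamped in one step without indexing.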

Combine the vectors into a data frame and save it to disk

    > x <- data.frame(num, x1, x2, x3)
    > x
             num  x1 x2  x3
    1   10378001  81 72  85
    2   10378002  94 67 100
    3   10378003  98 83  96
    4   10378004  86 81  83
    5   10378005  86 82  82
    6   10378006  95 81  60
    7   10378007  88 73  68
    8   10378008  90 73 100
    9   10378009  93 74  52
    10  10378010  86 84  78
    11  10378011  87 72 100
    12  10378012  93 86  78
    13  10378013  93 87  74
    14  10378014  85 79  80
    15  10378015  85 85  76
    16  10378016  87 70 100
    17  10378017  84 76  84
    18  10378018  93 93  90
    19  10378019  99 73  66
    20  10378020  85 85 100
    21  10378021  99 89 100
    22  10378022  80 77 100
    23  10378023  88 75  94
    24  10378024  93 72  68
    25  10378025  82 82  80
    26  10378026  86 83  84
    27  10378027  89 85  84
    28  10378028  83 82 100
    29  10378029  96 79  99
    30  10378030  99 88  98
    31  10378031  89 86 100
    32  10378032  92 87  82
    33  10378033  87 83  91
    34  10378034  87 72  71
    35  10378035  83 76  96
    36  10378036  86 90  74
    37  10378037  89 85  82
    38  10378038  88 77 100
    39  10378039  85 81  77
    40  10378040  92 77  70
    41  10378041  86 94  84
    42  10378042  84 74  82
    43  10378043  87 61  74
    44  10378044  86 76  88
    45  10378045  88 92  83
    46  10378046  94 77 100
    47  10378047  89 77  92
    48  10378048  93 74  70
    49  10378049  95 87  77
    50  10378050  99 94  98
    51  10378051  99 87 100
    52  10378052  92 81  58
    53  10378053  89 66  79
    54  10378054 100 76  85
    55  10378055  92 73  45
    56  10378056  98 75  63
    57  10378057  82 81 100
    58  10378058  88 84  66
    59  10378059  83 89  60
    60  10378060  83 70  70
    61  10378061  94 73  77
    62  10378062  91 86  67
    63  10378063  84 81  83
    64  10378064  81 80  90
    65  10378065  88 79  79
    66  10378066  92 81 100
    67  10378067  98 82 100
    68  10378068  83 74  76
    69  10378069  94 75 100
    70  10378070  95 65  95
    71  10378071  99 77  82
    72  10378072  95 75  78
    73  10378073  81 75  72
    74  10378074  82 87  54
    75  10378075  86 90  64
    76  10378076  94 74  83
    77  10378077  85 84  85
    78  10378078  83 71  92
    79  10378079  81 85  93
    80  10378080  87 89 100
    81  10378081  98 79 100
    82  10378082  90 80  98
    83  10378083  81 79  82
    84  10378084  81 77  73
    85  10378085  90 90  93
    86  10378086  85 77 100
    87  10378087  80 83  90
    88  10378088  92 80 100
    89  10378089  98 78  81
    90  10378090  82 94  98
    91  10378091  96 85  91
    92  10378092  96 81  53
    93  10378093  91 83 100
    94  10378094  95 82  74
    95  10378095  80 87  59
    96  10378096  88 84  91
    97  10378097  84 86 100
    98  10378098  87 89  71
    99  10378099  93 83  76
    100 10378100  96 75  92
    > write.table(x, file="mark.txt", col.names=FALSE, row.names=FALSE, sep=" ")   # no header row, no row names
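Since the file is written without a header row, the column names have to be supplied again when it is read back. The sketch below illustrates that round trip on a three-row stand-in data frame and a temporary file; the temporary path and the variable `y` are assumptions for illustration, and the original session only wrote `mark.txt`.

```r
# Three-row stand-in for the full data frame (values taken from its first rows).
x <- data.frame(num = 10378001:10378003,
                x1  = c(81, 94, 98),
                x2  = c(72, 67, 83),
                x3  = c(85, 100, 96))

path <- tempfile(fileext = ".txt")   # temporary file used instead of "mark.txt"
write.table(x, file = path, col.names = FALSE, row.names = FALSE, sep = " ")

# No header was written, so pass the column names explicitly on the way in.
y <- read.table(path, header = FALSE, sep = " ",
                col.names = c("num", "x1", "x2", "x3"))
str(y)
```

Writing with `col.names=TRUE` and reading with `header=TRUE` would avoid repeating the names, at the cost of a file that is no longer purely numeric.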

Compute the mean score for each course

    > mean(x)   # mean() expects a vector, not a whole data frame
    [1] NA
    Warning message:
    In mean.default(x) : argument is not numeric or logical: returning NA
    > colMeans(x)
            num          x1          x2          x3 
    10378050.50       89.24       80.25       83.68 
    > colMeans(x)[c("x1","x2","x3")]   # keep only the score columns
       x1    x2    x3 
    89.24 80.25 83.68 
    > apply(x, 2, mean)                # same result, column by column
            num          x1          x2          x3 
    10378050.50       89.24       80.25       83.68 
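The mean of the student IDs has no meaning, so it can be tidier to drop the `num` column before averaging rather than after. A small sketch on the first four rows of the data (the names `scores` and `col_means` are assumptions for illustration):

```r
# First four rows of the data frame shown earlier.
x <- data.frame(num = 10378001:10378004,
                x1  = c(81, 94, 98, 86),
                x2  = c(72, 67, 83, 81),
                x3  = c(85, 100, 96, 83))

scores    <- x[c("x1", "x2", "x3")]   # score columns only, IDs excluded
col_means <- colMeans(scores)         # one mean per course
col_means

sapply(scores, mean)                  # equivalent, applying mean() to each column
```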

Find the highest and lowest score in each course

    > apply(x, 2, max)
         num       x1       x2       x3 
    10378100      100       94      100 
    > apply(x, 2, min)
         num       x1       x2       x3 
    10378001       80       61       45 
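Minimum and maximum can also be collected in a single call. The sketch below uses `range` on the first four rows of scores; the name `score_range` and the row labels are assumptions added for readability.

```r
# First four rows of the three score columns.
scores <- data.frame(x1 = c(81, 94, 98, 86),
                     x2 = c(72, 67, 83, 81),
                     x3 = c(85, 100, 96, 83))

# range() returns c(min, max); sapply() stacks the results into a 2 x 3 matrix.
score_range <- sapply(scores, range)
rownames(score_range) <- c("min", "max")
score_range
```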

Compute each student's total score

    > apply(x[c("x1","x2","x3")], 1, sum)   # sum across the three score columns, row by row
      [1] 238 261 277 250 250 236 229 263 219 248 259 257 254 244 246 257 244 276
     [19] 238 270 288 257 257 233 244 253 258 265 274 285 275 261 261 230 255 250
     [37] 256 265 243 239 264 240 222 250 263 271 258 237 259 291 286 231 234 261
     [55] 210 236 263 238 232 223 244 244 248 251 246 273 280 233 269 255 258 248
     [73] 228 223 240 251 254 246 259 276 277 268 242 231 273 262 253 272 257 274
     [91] 272 230 274 251 226 263 270 247 252 263

Find the student with the highest total

    > which.max(apply(x[c("x1","x2","x3")], 1, sum))          # row index of the top total
    [1] 50
    > x$num[which.max(apply(x[c("x1","x2","x3")], 1, sum))]   # that student's ID
    [1] 10378050
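The same lookup can be written with `rowSums`, storing the total as a column so it does not have to be recomputed. This is a sketch on the first four rows of the data; the column name `total` and the variable `best` are assumptions for illustration.

```r
# First four rows of the data frame shown earlier.
x <- data.frame(num = 10378001:10378004,
                x1  = c(81, 94, 98, 86),
                x2  = c(72, 67, 83, 81),
                x3  = c(85, 100, 96, 83))

x$total <- rowSums(x[c("x1", "x2", "x3")])   # per-student total, vectorised
best    <- x$num[which.max(x$total)]         # ID of the top-scoring student
best

x[order(-x$total), ]                         # full ranking, highest total first
```

Note that `which.max` returns only the first maximum, so ties for first place would need `which(x$total == max(x$total))` instead.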

Histogram of x1

    > hist(x$x1)
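`hist` chooses its own bin boundaries by default. If fixed five-point bins are preferred, they can be passed through `breaks`, and `plot = FALSE` returns the counts without drawing. A sketch on the first ten x1 values (the bin width of 5 is an assumption):

```r
# First ten x1 values from the session above.
x1 <- c(81, 94, 98, 86, 86, 95, 88, 90, 93, 86)

# Bins are [80,85], (85,90], (90,95], (95,100] under hist()'s defaults.
h <- hist(x1, breaks = seq(80, 100, by = 5), plot = FALSE)
h$breaks
h$counts
```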

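Explore the relationships between course scores

    > plot(x1, x2)       # using the standalone vectors
    > plot(x$x1, x$x2)   # same plot, using the data frame columns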
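A scatter plot shows the relationship visually; a correlation matrix puts a number on it. The sketch below regenerates comparable data with an assumed seed, so its values will differ from the session above. Because the three courses are simulated independently, the off-diagonal correlations should usually come out close to zero.

```r
# Regenerate comparable scores (the seed is an assumption for repeatability).
set.seed(1)
scores <- data.frame(
  x1 = round(runif(100, min = 80, max = 100)),
  x2 = round(rnorm(100, mean = 80, sd = 7)),
  x3 = pmin(round(rnorm(100, mean = 83, sd = 18)), 100)   # capped at 100
)

r <- cor(scores)   # Pearson correlation between every pair of courses
round(r, 2)

# pairs(scores) would draw all pairwise scatter plots in one figure.
```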

Contingency (frequency) table analysis

    > table(x$x1)   # how many students received each x1 score

     80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  98  99 100 
      3   6   4   6   4   6   8   7   7   5   3   2   6   7   5   5   4   5   6   1 
    > barplot(table(x$x1))
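With twenty distinct scores the table is fairly wide. Grouping the scores into bands with `cut` before tabulating gives a more compact view. A sketch on the first ten x1 values; the band boundaries are an assumption chosen for illustration.

```r
# First ten x1 values from the session above.
x1 <- c(81, 94, 98, 86, 86, 95, 88, 90, 93, 86)

# Left-closed bands [80,85), [85,90), [90,95), with 100 included in the last one.
bands <- cut(x1, breaks = c(80, 85, 90, 95, 100),
             right = FALSE, include.lowest = TRUE)
tab <- table(bands)
tab
```

The result can be passed to `barplot` in the same way as the ungrouped table.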

Pie chart

    > pie(table(x$x1))

Box plots

    > boxplot(x$x1, x$x2, x$x3)

    > boxplot(x[2:4], col=c("red","green","blue"), notch=TRUE)   # set colours, draw notches

    > boxplot(x$x1, x$x2, x$x3, horizontal=TRUE)                 # horizontal layout
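The numbers behind a box plot can be inspected directly with `boxplot.stats`, which returns the five values drawn for the whiskers, hinges and median, together with any points flagged as outliers. A sketch on the first ten capped x3 values (the variable name `s` is an assumption):

```r
# First ten x3 values after capping at 100.
x3 <- c(85, 100, 96, 83, 82, 60, 68, 100, 52, 78)

s <- boxplot.stats(x3)
s$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
s$out     # values lying beyond the whiskers, if any
```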

Star plots

    > stars(x[c("x1","x2","x3")])

    > stars(x[c("x1","x2","x3")], full=TRUE, draw.segments=TRUE)    # segment (radar-style) diagram, full circle

    > stars(x[c("x1","x2","x3")], full=FALSE, draw.segments=TRUE)   # segment (radar-style) diagram, half circle

Chernoff faces

    > library(aplpack)
    Loading required package: tcltk
    > faces(x[c("x1","x2","x3")])
    effect of variables:
     modified item       Var 
     "height of face   " "x1"
     "width of face    " "x2"
     "structure of face" "x3"
     "height of mouth  " "x1"
     "width of mouth   " "x2"
     "smiling          " "x3"
     "height of eyes   " "x1"
     "width of eyes    " "x2"
     "height of hair   " "x3"
     "width of hair    " "x1"
     "style of hair    " "x2"
     "height of nose   " "x3"
     "width of nose    " "x1"
     "width of ear     " "x2"
     "height of ear    " "x3"

Other face plots

    > library(TeachingDemos)

    Attaching package: ‘TeachingDemos’

    The following objects are masked from ‘package:aplpack’:

        faces, slider

    > faces2(x)

Stem-and-leaf plot

    > stem(x$x1)

      The decimal point is at the |

       80 | 000000000
       82 | 0000000000
       84 | 0000000000
       86 | 000000000000000
       88 | 000000000000
       90 | 00000
       92 | 0000000000000
       94 | 0000000000
       96 | 0000
       98 | 00000000000
      100 | 0

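QQ plot

A QQ plot can be used to judge whether data are normally distributed.

qqline draws a reference line through the first and third quartiles. For normally distributed data, its slope approximates the standard deviation and its intercept approximates the mean.

The closer the points lie to the line, the closer the data are to a normal distribution.

    > qqnorm(x1)   # x1 was drawn from a uniform distribution
    > qqline(x1)
    > qqnorm(x3)   # x3 was drawn from a normal distribution, then capped at 100
    > qqline(x3)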
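The slope and intercept of the reference line can be computed by hand to check them against the sample standard deviation and mean, and the visual impression can be backed up with a Shapiro-Wilk test. The sketch below uses freshly simulated data with an assumed seed, so its numbers do not match the session above; the variable names are also assumptions.

```r
set.seed(7)                                   # assumed seed for repeatability
x1 <- round(runif(100, min = 80, max = 100))  # uniform scores
x2 <- round(rnorm(100, mean = 80, sd = 7))    # normal scores

# qqline() joins the sample and theoretical quartiles (probs 0.25 and 0.75).
q         <- unname(quantile(x2, c(0.25, 0.75)))
slope     <- diff(q) / diff(qnorm(c(0.25, 0.75)))
intercept <- q[1] - slope * qnorm(0.25)
c(slope = slope, sd = sd(x2), intercept = intercept, mean = mean(x2))

# Shapiro-Wilk test: a small p-value is evidence against normality.
p_unif <- shapiro.test(x1)$p.value
p_norm <- shapiro.test(x2)$p.value
c(uniform = p_unif, normal = p_norm)
```

Rounding the scores to integers introduces ties, which slightly affects the test; treat the p-values as a rough guide alongside the plot rather than a verdict.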

Further customization of the scatter plot

    plot(x$x1, x$x2,
         main="Mathematical Analysis vs. Linear Algebra scores",
         xlab="Mathematical Analysis",
         ylab="Linear Algebra",
         xlim=c(0,100),
         ylim=c(0,100),
         xaxs="i",    # set x axis style to internal
         yaxs="i",    # set y axis style to internal
         col="red",   # set the colour of the plotting symbol to red
         pch=19)      # set the plotting symbol to filled dots