Learning R

阿新 • Published: 2019-01-29

xts: Extensible Time Series

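The "R Geek Ideal" series of articles covers R's ideas, usage, tools, and innovations, and uses my own learning and experience to illustrate the power of R.

As a language for statistics, R long shone in a niche field. With the explosion of big data, it became a sought-after tool for data analysis. As more and more people with engineering backgrounds join, the R community is growing rapidly. R is now used not only in statistics but also in education, banking, e-commerce, the Internet, and more.

To become geeks with ideals, we cannot stop at syntax. We need a solid grounding in mathematics, probability, and statistics, plus a spirit of innovation, so that we can apply R in every field. Let's get moving and start pursuing the R geek ideal.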

About the author:

Zhang Dan (Conan), programmer: Java, R, PHP, JavaScript

weibo: @Conan_Z
email: [email protected]

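Preface

This article follows on from the one about zoo, R's base time series library, and covers an extension built on it. Time series look simple but hide complex patterns. As the base library, zoo is designed for general use: it can define stock data or analyze weather data. Because business needs differ, though, we need more helper functions to get tasks done efficiently.

xts extends zoo and provides more functions for data processing and data transformation.

xts introduction
xts installation
xts data structure
xts API overview
Using xts

1. xts introduction

xts is an extension of the zoo time series data type whose goal is to unify the interface for time series operations. In practice, the xts type inherits from the zoo type and adds more functions for processing time series data; its API is closer to the user, more practical, and simpler.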

2. xts installation

System environment

Win7 64bit
R: 3.0.1 x86_64-w64-mingw32/x64 64bit

Installing xts


> install.packages("xts")
also installing the dependency ‘zoo’

trying URL 'http://mirror.bjtu.edu.cn/cran/bin/windows/contrib/3.0/zoo_1.7-10.zip'
Content type 'application/zip' length 875046 bytes (854 Kb)
opened URL
downloaded 854 Kb

trying URL 'http://mirror.bjtu.edu.cn/cran/bin/windows/contrib/3.0/xts_0.9-7.zip'
Content type 'application/zip' length 661664 bytes (646 Kb)
opened URL
downloaded 646 Kb

package ‘zoo’ successfully unpacked and MD5 sums checked
package ‘xts’ successfully unpacked and MD5 sums checked

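3. xts data structure

xts extends the basic structure of zoo and consists of three parts.

Index: a vector of a time-based class
Data: based on a matrix; any type that can be converted to and from a matrix is supported
Attributes: additional information, including the time zone and the format of the index time class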
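The three parts listed above can be inspected directly. The following is a minimal sketch, assuming the xts package is installed; the dates, values, and the `descr` attribute are made up for illustration.

```r
library(xts)

# Three daily observations with fixed dates (illustrative values).
dates <- as.Date(c("2007-01-02", "2007-01-03", "2007-01-04"))
x <- xts(c(50.1, 50.4, 50.3), order.by = dates, descr = "demo object")

idx   <- index(x)          # index part: the time vector
vals  <- coredata(x)       # data part: a plain matrix without the index
attrs <- xtsAttributes(x)  # attribute part: user-defined attributes

print(idx)
print(vals)
print(attrs)
```

Even though a plain vector was passed in, the data part comes back as a one-column matrix, which matches the description of the data part above.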

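4. xts API overview

xts basics

xts: create an xts object, which inherits from zoo
coredata.xts: get or set the data part of an xts object
xtsAttributes: get or set the attributes of an xts object
[.xts: extract a subset of the data with [] syntax
dimnames.xts: get or set the dimension names of an xts object
sample_matrix: test data set with 180 records, of type matrix
xtsAPI: C language API

Type conversion

as.xts: convert an object to the xts (zoo) type
as.xts.methods: methods for converting objects to xts
plot.xts: xts method for the plot function
.parseISO8601: parse an ISO8601 string into a list holding the start time and end time as POSIXct values
firstof: create a start time, of class POSIXct
lastof: create an end time, of class POSIXct
indexClass: get the class of the index
.indexDate: date of the index
.indexday: day value of the index
.indexyday: day of the year of the index
.indexmday: day of the month of the index
.indexwday: day of the week of the index
.indexweek: week value of the index
.indexmon: month value of the index
.indexyear: year value of the index
.indexhour: hour value of the index
.indexmin: minute value of the index
.indexsec: second value of the index

Data processing

align.time: align data to the next time boundary, in seconds, minutes, or hours
endpoints: get the positions of the last observation in each time period
merge.xts: merge multiple xts objects; overrides zoo::merge.zoo
rbind.xts: combine data by row; xts method for rbind
split.xts: split data; xts method for split
na.locf.xts: replace NA values; overrides zoo::na.locf

Statistics

apply.daily: split the data by day and apply a function
apply.weekly: split the data by week and apply a function
apply.monthly: split the data by month and apply a function
apply.quarterly: split the data by quarter and apply a function
apply.yearly: split the data by year and apply a function
to.period: aggregate the data by period
period.apply: apply a custom function to each period
period.max: maximum of each period
period.min: minimum of each period
period.prod: product of each period
period.sum: sum of each period
nseconds: number of seconds covered by the data set
nminutes: number of minutes covered by the data set
nhours: number of hours covered by the data set
ndays: number of days covered by the data set
nweeks: number of weeks covered by the data set
nmonths: number of months covered by the data set
nquarters: number of quarters covered by the data set
nyears: number of years covered by the data set
periodicity: show the periodicity of the time series

Utilities

first: take a subset counted from the start, by number of rows or by period
last: take a subset counted from the end, by number of rows or by period
timeBased: check whether an object is of a time-based class
timeBasedSeq: create a sequence of times
diff.xts: compute lags and differences
isOrdered: check whether a vector is ordered
make.index.unique: force unique timestamps by adding a tiny increment to duplicates
axTicksByTime: compute X-axis tick positions by time
indexTZ: query the time zone of an xts object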

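5. Using xts

1). Basic operations on the xts type
2). Plotting xts
3). xts type conversion
4). xts data processing
5). xts statistical calculations
6). Using xts time series tools

1). Basic operations on the xts type

Test data set sample_matrix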


> library(xts)
> data(sample_matrix)
> head(sample_matrix)
               Open     High      Low    Close
2007-01-02 50.03978 50.11778 49.95041 50.11778
2007-01-03 50.23050 50.42188 50.23050 50.39767
2007-01-04 50.42096 50.42096 50.26414 50.33236
2007-01-05 50.37347 50.37347 50.22103 50.33459
2007-01-06 50.24433 50.24433 50.11121 50.18112
2007-01-07 50.13211 50.21561 49.99185 49.99185

Define an xts object


> sample.xts <- as.xts(sample_matrix, descr='my new xts object')
> class(sample.xts)
[1] "xts" "zoo"

> str(sample.xts)
An ‘xts’ object on 2007-01-02/2007-06-30 containing:
  Data: num [1:180, 1:4] 50 50.2 50.4 50.4 50.2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:4] "Open" "High" "Low" "Close"
  Indexed by objects of class: [POSIXct,POSIXt] TZ: 
  xts Attributes:  
List of 1
 $ descr: chr "my new xts object"

> head(sample.xts)
               Open     High      Low    Close
2007-01-02 50.03978 50.11778 49.95041 50.11778
2007-01-03 50.23050 50.42188 50.23050 50.39767
2007-01-04 50.42096 50.42096 50.26414 50.33236
2007-01-05 50.37347 50.37347 50.22103 50.33459
2007-01-06 50.24433 50.24433 50.11121 50.18112
2007-01-07 50.13211 50.21561 49.99185 49.99185

> attr(sample.xts,'descr')
[1] "my new xts object"

Querying xts data


> head(sample.xts['2007'])
               Open     High      Low    Close
2007-01-02 50.03978 50.11778 49.95041 50.11778
2007-01-03 50.23050 50.42188 50.23050 50.39767
2007-01-04 50.42096 50.42096 50.26414 50.33236
2007-01-05 50.37347 50.37347 50.22103 50.33459
2007-01-06 50.24433 50.24433 50.11121 50.18112
2007-01-07 50.13211 50.21561 49.99185 49.99185

> head(sample.xts['2007-03/'])
               Open     High      Low    Close
2007-03-01 50.81620 50.81620 50.56451 50.57075
2007-03-02 50.60980 50.72061 50.50808 50.61559
2007-03-03 50.73241 50.73241 50.40929 50.41033
2007-03-04 50.39273 50.40881 50.24922 50.32636
2007-03-05 50.26501 50.34050 50.26501 50.29567
2007-03-06 50.27464 50.32019 50.16380 50.16380

> head(sample.xts['2007-03-06/2007'])
               Open     High      Low    Close
2007-03-06 50.27464 50.32019 50.16380 50.16380
2007-03-07 50.14458 50.20278 49.91381 49.91381
2007-03-08 49.93149 50.00364 49.84893 49.91839
2007-03-09 49.92377 49.92377 49.74242 49.80712
2007-03-10 49.79370 49.88984 49.70385 49.88698
2007-03-11 49.83062 49.88295 49.76031 49.78806

> sample.xts['2007-01-03']
              Open     High     Low    Close
2007-01-03 50.2305 50.42188 50.2305 50.39767
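The query strings above follow an ISO8601-style pattern: a bare period such as a year or month, an open-ended range with a trailing slash, or a closed range with both ends given. A small sketch on made-up data (ten consecutive days spanning a month boundary, chosen only so the row counts are easy to verify) illustrates the three forms:

```r
library(xts)

# Ten consecutive days, 2007-01-27 through 2007-02-05 (illustrative data).
x <- xts(1:10, order.by = as.Date("2007-01-27") + 0:9)

jan      <- x["2007-01"]                # everything in January 2007
from_feb <- x["2007-02/"]               # from 1 February to the end of the data
window   <- x["2007-01-30/2007-02-02"]  # closed range, both ends included

print(jan)
print(from_feb)
print(window)
```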

2). Plotting xts

Line chart


> data(sample_matrix)
> plot(sample_matrix)

> plot(as.xts(sample_matrix))
Warning message:
In plot.xts(as.xts(sample_matrix)) :
  only the univariate series will be plotted

Candlestick chart


> plot(as.xts(sample_matrix), type='candles')

3). xts type conversion

Create start and end times separately: firstof, lastof


> firstof(2000)
[1] "2000-01-01 CST"

> firstof(2005,01,01)
[1] "2005-01-01 CST"

> lastof(2007)
[1] "2007-12-31 23:59:59.99998 CST"

> lastof(2007,10)
[1] "2007-10-31 23:59:59.99998 CST"

Create start and end times together: .parseISO8601


> .parseISO8601('2000')
$first.time
[1] "2000-01-01 CST"

$last.time
[1] "2000-12-31 23:59:59.99998 CST"

> .parseISO8601('2000-05/2001-02')
$first.time
[1] "2000-05-01 CST"

$last.time
[1] "2001-02-28 23:59:59.99998 CST"

> .parseISO8601('2000-01/02')
$first.time
[1] "2000-01-01 CST"

$last.time
[1] "2000-02-29 23:59:59.99998 CST"

> .parseISO8601('T08:30/T15:00')
$first.time
[1] "1970-01-01 08:30:00 CST"

$last.time
[1] "1970-12-31 15:00:59.99999 CST"

Get the index class


> x <- timeBasedSeq('2010-01-01/2010-01-02 12:00')
> x <- xts(1:length(x), x)

> head(x)
                    [,1]
2010-01-01 00:00:00    1
2010-01-01 00:01:00    2
2010-01-01 00:02:00    3
2010-01-01 00:03:00    4
2010-01-01 00:04:00    5
2010-01-01 00:05:00    6

> indexClass(x)
[1] "POSIXt"  "POSIXct"

Format the index time


> indexFormat(x) <- "%Y-%b-%d %H:%M:%OS3"
> head(x)
                          [,1]
2010-一月-01 00:00:00.000    1
2010-一月-01 00:01:00.000    2
2010-一月-01 00:02:00.000    3
2010-一月-01 00:03:00.000    4
2010-一月-01 00:04:00.000    5
2010-一月-01 00:05:00.000    6

Get fields of the index time


> .indexhour(head(x))
[1] 0 0 0 0 0 0

> .indexmin(head(x))
[1] 0 1 2 3 4 5

4). xts data processing

Aligning data


> x <- Sys.time() + 1:30

# align to whole 10-second boundaries
> align.time(x, 10)
 [1] "2013-11-18 15:42:30 CST" "2013-11-18 15:42:30 CST"
 [3] "2013-11-18 15:42:30 CST" "2013-11-18 15:42:40 CST"
 [5] "2013-11-18 15:42:40 CST" "2013-11-18 15:42:40 CST"
 [7] "2013-11-18 15:42:40 CST" "2013-11-18 15:42:40 CST"
 [9] "2013-11-18 15:42:40 CST" "2013-11-18 15:42:40 CST"
[11] "2013-11-18 15:42:40 CST" "2013-11-18 15:42:40 CST"
[13] "2013-11-18 15:42:40 CST" "2013-11-18 15:42:50 CST"
[15] "2013-11-18 15:42:50 CST" "2013-11-18 15:42:50 CST"
[17] "2013-11-18 15:42:50 CST" "2013-11-18 15:42:50 CST"
[19] "2013-11-18 15:42:50 CST" "2013-11-18 15:42:50 CST"
[21] "2013-11-18 15:42:50 CST" "2013-11-18 15:42:50 CST"
[23] "2013-11-18 15:42:50 CST" "2013-11-18 15:43:00 CST"
[25] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[27] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[29] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"

# align to whole 60-second boundaries
> align.time(x, 60)
 [1] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
 [3] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
 [5] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
 [7] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
 [9] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[11] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[13] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[15] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[17] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[19] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[21] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[23] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[25] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[27] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
[29] "2013-11-18 15:43:00 CST" "2013-11-18 15:43:00 CST"
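Because the transcript above starts from Sys.time(), its output changes on every run. The sketch below repeats the idea with fixed timestamps so the result can be checked by hand; the start time and the offsets are arbitrary values chosen for illustration.

```r
library(xts)

# Five fixed timestamps: 15:42:21, :26, :29, :31, and :59 (UTC).
t0 <- as.POSIXct("2013-11-18 15:42:21", tz = "UTC")
x <- t0 + c(0, 5, 8, 10, 38)

# Move every timestamp forward to the next 10-second boundary.
aligned <- align.time(x, n = 10)

print(format(x, "%H:%M:%S"))
print(format(aligned, "%H:%M:%S"))
```

Each timestamp is pushed forward, never backward, so the first three values land on 15:42:30, the fourth on 15:42:40, and the last on 15:43:00.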

Split data by time and apply a calculation


> xts.ts <- xts(rnorm(231),as.Date(13514:13744,origin="1970-01-01"))
> apply.monthly(xts.ts,mean)
                  [,1]
2007-01-31  0.17699984
2007-02-28  0.30734220
2007-03-31 -0.08757189
2007-04-30  0.18734688
2007-05-31  0.04496954
2007-06-30  0.06884836
2007-07-31  0.25081814
2007-08-19 -0.28845938

> apply.monthly(xts.ts,function(x) var(x))
                [,1]
2007-01-31 0.9533217
2007-02-28 0.9158947
2007-03-31 1.2821450
2007-04-30 1.2805976
2007-05-31 0.9725438
2007-06-30 1.5228904
2007-07-31 0.8737030
2007-08-19 0.8490521

> apply.quarterly(xts.ts,mean)
                 [,1]
2007-03-31 0.12642053
2007-06-30 0.09977926
2007-08-19 0.04589268

> apply.yearly(xts.ts,mean)
                 [,1]
2007-08-19 0.09849522
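The transcript above uses rnorm(), so its numbers cannot be reproduced. A small sketch with deterministic data (the values 1 to 59 on the first 59 days of 2007, chosen only for illustration) makes the monthly grouping easy to verify by hand:

```r
library(xts)

# 59 daily values covering January (31 days) and February (28 days) 2007.
x <- xts(as.numeric(1:59), order.by = as.Date("2007-01-01") + 0:58)

# Monthly mean; each result row is stamped with the last date of its month.
monthly_mean <- apply.monthly(x, mean)

# Same grouping with a custom function: the spread within each month.
monthly_spread <- apply.monthly(x, function(v) max(v) - min(v))

print(monthly_mean)
print(monthly_spread)
```

January holds the values 1 to 31 and February the values 32 to 59, so the means are 16 and 45.5 and the spreads are 30 and 27.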

Aggregate by period: to.period


> data(sample_matrix)
> to.period(sample_matrix)
           sample_matrix.Open sample_matrix.High sample_matrix.Low sample_matrix.Close
2007-01-31           50.03978           50.77336          49.76308            50.22578
2007-02-28           50.22448           51.32342          50.19101            50.77091
2007-03-31           50.81620           50.81620          48.23648            48.97490
2007-04-30           48.94407           50.33781          48.80962            49.33974
2007-05-31           49.34572           49.69097          47.51796            47.73780
2007-06-30           47.74432           47.94127          47.09144            47.76719
> class(to.period(sample_matrix))
[1] "matrix"

> samplexts <- as.xts(sample_matrix)
> to.period(samplexts)
           samplexts.Open samplexts.High samplexts.Low samplexts.Close
2007-01-31       50.03978       50.77336      49.76308        50.22578
2007-02-28       50.22448       51.32342      50.19101        50.77091
2007-03-31       50.81620       50.81620      48.23648        48.97490
2007-04-30       48.94407       50.33781      48.80962        49.33974
2007-05-31       49.34572       49.69097      47.51796        47.73780
2007-06-30       47.74432       47.94127      47.09144        47.76719
> class(to.period(samplexts))
[1] "xts" "zoo"

Split the index by period


> data(sample_matrix)

> endpoints(sample_matrix)
[1]   0  30  58  89 119 150 180

> endpoints(sample_matrix, 'days',k=7)
 [1]   0   6  13  20  27  34  41  48  55  62  69  76  83  90  97 104 111 118 125
[20] 132 139 146 153 160 167 174 180

> endpoints(sample_matrix, 'weeks')
 [1]   0   7  14  21  28  35  42  49  56  63  70  77  84  91  98 105 112 119 126
[20] 133 140 147 154 161 168 175 180

> endpoints(sample_matrix, 'months')
[1]   0  30  58  89 119 150 180
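The vector returned by endpoints starts with 0 and then lists the row position of the last observation in each period, so consecutive pairs delimit the blocks. The following sketch, on made-up data, sums each block by hand and compares the result with period.apply:

```r
library(xts)

# 59 daily values covering January and February 2007 (illustrative data).
x <- xts(as.numeric(1:59), order.by = as.Date("2007-01-01") + 0:58)

# Positions of the last observation of each month, with a leading 0.
ep <- endpoints(x, on = "months")

# Sum each block by hand: rows (ep[i] + 1) through ep[i + 1].
values <- as.numeric(coredata(x))
manual <- sapply(seq_len(length(ep) - 1), function(i) {
  sum(values[(ep[i] + 1):ep[i + 1]])
})

# The same grouping through period.apply().
via_apply <- period.apply(x, INDEX = ep, FUN = sum)

print(ep)
print(manual)
print(via_apply)
```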

Merging data: by column


> (x <- xts(4:10, Sys.Date()+4:10))
           [,1]
2013-11-22    4
2013-11-23    5
2013-11-24    6
2013-11-25    7
2013-11-26    8
2013-11-27    9
2013-11-28   10

> (y <- xts(1:6, Sys.Date()+1:6))
           [,1]
2013-11-19    1
2013-11-20    2
2013-11-21    3
2013-11-22    4
2013-11-23    5
2013-11-24    6

> merge(x,y)
            x  y
2013-11-19 NA  1
2013-11-20 NA  2
2013-11-21 NA  3
2013-11-22  4  4
2013-11-23  5  5
2013-11-24  6  6
2013-11-25  7 NA
2013-11-26  8 NA
2013-11-27  9 NA
2013-11-28 10 NA

# inner join: keep only the index values present in both objects
> merge(x,y, join='inner')
           x y
2013-11-22 4 4
2013-11-23 5 5
2013-11-24 6 6

# left join: keep every index value of the left object
> merge(x,y, join='left')
            x  y
2013-11-22  4  4
2013-11-23  5  5
2013-11-24  6  6
2013-11-25  7 NA
2013-11-26  8 NA
2013-11-27  9 NA
2013-11-28 10 NA
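The merges above depend on Sys.Date(). The sketch below pins the base date (an arbitrary choice matching the transcript) so the row counts of each join type can be verified, and adds join = "right" for comparison:

```r
library(xts)

base <- as.Date("2013-11-18")
x <- xts(4:10, order.by = base + 4:10)  # 2013-11-22 .. 2013-11-28
y <- xts(1:6,  order.by = base + 1:6)   # 2013-11-19 .. 2013-11-24

outer_join <- merge(x, y)                  # default: union of both indexes
inner_join <- merge(x, y, join = "inner")  # intersection of the indexes
left_join  <- merge(x, y, join = "left")   # every date of x
right_join <- merge(x, y, join = "right")  # every date of y

print(dim(outer_join))
print(dim(inner_join))
print(dim(left_join))
print(dim(right_join))
```

Dates missing from one side are filled with NA, which is why the outer join has ten rows while the inner join keeps only the three shared dates.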

Merging data: by row


> x <- xts(1:3, Sys.Date()+1:3)

> rbind(x,x)
           [,1]
2013-11-19    1
2013-11-19    1
2013-11-20    2
2013-11-20    2
2013-11-21    3
2013-11-21    3

Slicing data: by row


> data(sample_matrix)
> x <- as.xts(sample_matrix)

Slice by month
> split(x)[[1]]
               Open     High      Low    Close
2007-01-02 50.03978 50.11778 49.95041 50.11778
2007-01-03 50.23050 50.42188 50.23050 50.39767
2007-01-04 50.42096 50.42096 50.26414 50.33236
2007-01-05 50.37347 50.37347 50.22103 50.33459
2007-01-06 50.24433 50.24433 50.11121 50.18112
2007-01-07 50.13211 50.21561 49.99185 49.99185
2007-01-08 50.03555 50.10363 49.96971 49.98806
2007-01-09 49.99489 49.99489 49.80454 49.91333
2007-01-10 49.91228 50.13053 49.91228 49.97246
2007-01-11 49.88529 50.23910 49.88529 50.23910
2007-01-12 50.21258 50.35980 50.17176 50.28519
2007-01-13 50.32385 50.48000 50.32385 50.41286
2007-01-14 50.46359 50.62395 50.46359 50.60145
2007-01-15 50.61724 50.68583 50.47359 50.48912
2007-01-16 50.62024 50.73731 50.56627 50.67835
2007-01-17 50.74150 50.77336 50.44932 50.48644
2007-01-18 50.48051 50.60712 50.40269 50.57632
2007-01-19 50.41381 50.55627 50.41278 50.41278
2007-01-20 50.35323 50.35323 50.02142 50.02142
2007-01-21 50.16188 50.42090 50.16044 50.42090
2007-01-22 50.36008 50.43875 50.21129 50.21129
2007-01-23 50.03966 50.16961 50.03670 50.16961
2007-01-24 50.10953 50.26942 50.06387 50.23145
2007-01-25 50.20738 50.28268 50.12913 50.24334
2007-01-26 50.16008 50.16008 49.94052 50.07024
2007-01-27 50.06041 50.09777 49.97267 50.01091
2007-01-28 49.96586 50.00217 49.87468 49.88096
2007-01-29 49.85624 49.93038 49.76308 49.91875
2007-01-30 49.85477 50.02180 49.77242 50.02180
2007-01-31 50.07049 50.22578 50.07049 50.22578

Slice by week
> split(x, f="weeks")[[1]]
               Open     High      Low    Close
2007-01-02 50.03978 50.11778 49.95041 50.11778
2007-01-03 50.23050 50.42188 50.23050 50.39767
2007-01-04 50.42096 50.42096 50.26414 50.33236
2007-01-05 50.37347 50.37347 50.22103 50.33459
2007-01-06 50.24433 50.24433 50.11121 50.18112
2007-01-07 50.13211 50.21561 49.99185 49.99185
2007-01-08 50.03555 50.10363 49.96971 49.98806
> split(x, f="weeks")[[2]]
               Open     High      Low    Close
2007-01-09 49.99489 49.99489 49.80454 49.91333
2007-01-10 49.91228 50.13053 49.91228 49.97246
2007-01-11 49.88529 50.23910 49.88529 50.23910
2007-01-12 50.21258 50.35980 50.17176 50.28519
2007-01-13 50.32385 50.48000 50.32385 50.41286
2007-01-14 50.46359 50.62395 50.46359 50.60145
2007-01-15 50.61724 50.68583 50.47359 50.48912

Handling NA values


> x <- xts(1:10, Sys.Date()+1:10)
> x[c(1,2,5,9,10)] <- NA
> x
           [,1]
2013-11-19   NA
2013-11-20   NA
2013-11-21    3
2013-11-22    4
2013-11-23   NA
2013-11-24    6
2013-11-25    7
2013-11-26    8
2013-11-27   NA
2013-11-28   NA

# carry the previous value forward
> na.locf(x)
           [,1]
2013-11-19   NA
2013-11-20   NA
2013-11-21    3
2013-11-22    4
2013-11-23    4
2013-11-24    6
2013-11-25    7
2013-11-26    8
2013-11-27    8
2013-11-28    8

# carry the next value backward
> na.locf(x, fromLast=TRUE)
           [,1]
2013-11-19    3
2013-11-20    3
2013-11-21    3
2013-11-22    4
2013-11-23    6
2013-11-24    6
2013-11-25    7
2013-11-26    8
2013-11-27   NA
2013-11-28   NA
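Neither direction alone fills every gap: the forward fill leaves the leading NAs and the backward fill leaves the trailing ones. One possible approach, sketched here with the same values on fixed dates, is to chain the two calls:

```r
library(xts)

# Same pattern of missing values as above, on fixed dates.
x <- xts(c(NA, NA, 3, 4, NA, 6, 7, 8, NA, NA),
         order.by = as.Date("2013-11-18") + 1:10)

# Forward fill: leading NAs stay, because nothing precedes them.
forward <- na.locf(x)

# Forward fill first, then backward fill to cover the leading NAs.
both <- na.locf(na.locf(x), fromLast = TRUE)

print(forward)
print(both)
```

Whether back-filling the start of a series is acceptable depends on the analysis, since it uses values from the future.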

5). xts statistical calculations

Get the start time and end time


> xts.ts <- xts(rnorm(231),as.Date(13514:13744,origin="1970-01-01"))

> start(xts.ts)
[1] "2007-01-01"
> end(xts.ts)
[1] "2007-08-19"

> periodicity(xts.ts)
Daily periodicity from 2007-01-01 to 2007-08-19

Count the time periods covered


> data(sample_matrix)
> ndays(sample_matrix)
[1] 180
> nweeks(sample_matrix)
[1] 26
> nmonths(sample_matrix)
[1] 6
> nquarters(sample_matrix)
[1] 2
> nyears(sample_matrix)
[1] 1

Compute statistics by period


> zoo.data <- zoo(rnorm(31)+10,as.Date(13514:13744,origin="1970-01-01"))

# get the weekly endpoints
> ep <- endpoints(zoo.data,'weeks')
> ep
 [1]   0   7  14  21  28  35  42  49  56  63  70  77  84  91  98 105 112 119
[19] 126 133 140 147 154 161 168 175 182 189 196 203 210 217 224 231

# weekly mean
> period.apply(zoo.data, INDEX=ep, FUN=function(x) mean(x))
2007-01-07 2007-01-14 2007-01-21 2007-01-28 2007-02-04 2007-02-11 2007-02-18 
 10.200488   9.649387  10.304151   9.864847  10.382943   9.660175   9.857894 
2007-02-25 2007-03-04 2007-03-11 2007-03-18 2007-03-25 2007-04-01 2007-04-08 
 10.495037   9.569531  10.292899   9.651616  10.089103   9.961048  10.304860 
2007-04-15 2007-04-22 2007-04-29 2007-05-06 2007-05-13 2007-05-20 2007-05-27 
  9.658432   9.887531  10.608082   9.747787  10.052955   9.625730  10.430030 
2007-06-03 2007-06-10 2007-06-17 2007-06-24 2007-07-01 2007-07-08 2007-07-15 
  9.814703  10.224869   9.509881  10.187905  10.229310  10.261725   9.855776 
2007-07-22 2007-07-29 2007-08-05 2007-08-12 2007-08-19 
  9.445072  10.482020   9.844531  10.200488   9.649387 

# weekly maximum
> head(period.max(zoo.data, INDEX=ep))
               [,1]
2007-01-07 12.05912
2007-01-14 10.79286
2007-01-21 11.60658
2007-01-28 11.63455
2007-02-04 12.05912
2007-02-11 10.67887

# weekly minimum
> head(period.min(zoo.data, INDEX=ep))
               [,1]
2007-01-07 8.874509
2007-01-14 8.534655
2007-01-21 9.069773
2007-01-28 8.461555
2007-02-04 9.421085
2007-02-11 8.534655

# weekly product
> head(period.prod(zoo.data, INDEX=ep))
               [,1]
2007-01-07 11140398
2007-01-14  7582350
2007-01-21 11930334
2007-01-28  8658933
2007-02-04 12702505
2007-02-11  7702767

6). Using xts time series tools

Check for time-based classes


> timeBased(Sys.time())
[1] TRUE
> timeBased(Sys.Date())
[1] TRUE
> timeBased(200701)
[1] FALSE

Create time sequences


# by year
> timeBasedSeq('1999/2008')
 [1] "1999-01-01" "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01"
 [6] "2004-01-01" "2005-01-01" "2006-01-01" "2007-01-01" "2008-01-01"

# by month
> head(timeBasedSeq('199901/2008'))
[1] "十二月 1998" "一月 1999"   "二月 1999"   "三月 1999"   "四月 1999"  
[6] "五月 1999" 

# by day
> head(timeBasedSeq('199901/2008/d'),40)
 [1] "十二月 1998" "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"  
 [6] "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"  
[11] "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"  
[16] "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"  
[21] "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"  
[26] "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"   "一月 1999"  
[31] "一月 1999"   "一月 1999"   "二月 1999"   "二月 1999"   "二月 1999"  
[36] "二月 1999"   "二月 1999"   "二月 1999"   "二月 1999"   "二月 1999" 

# by count: a data set of 100 minutes
> timeBasedSeq('20080101 0830',length=100)
$from
[1] "2008-01-01 08:30:00 CST"
$to
[1] NA
$by
[1] "mins"
$length.out
[1] 100

Take data by index: first, last


> x <- xts(1:100, Sys.Date()+1:100)

> head(x)
           [,1]
2013-11-19    1
2013-11-20    2
2013-11-21    3
2013-11-22    4
2013-11-23    5
2013-11-24    6

> first(x, 10)
           [,1]
2013-11-19    1
2013-11-20    2
2013-11-21    3
2013-11-22    4
2013-11-23    5
2013-11-24    6
2013-11-25    7
2013-11-26    8
2013-11-27    9
2013-11-28   10

> first(x, '1 day')
           [,1]
2013-11-19    1

> last(x, '1 weeks')
           [,1]
2014-02-24   98
2014-02-25   99
2014-02-26  100
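When the second argument is a string such as '1 weeks', first and last work on calendar periods rather than on a fixed number of rows, which is why the last week above contains only three rows. A sketch with a pinned start date (chosen to match the transcript) shows the same effect for months:

```r
library(xts)

# 100 daily values from 2013-11-19 through 2014-02-26.
x <- xts(1:100, order.by = as.Date("2013-11-18") + 1:100)

head3       <- first(x, 3)          # first three rows
first_month <- first(x, "1 month")  # rows in the first calendar month
last_month  <- last(x, "1 month")   # rows in the last calendar month

print(nrow(head3))
print(range(index(first_month)))
print(range(index(last_month)))
```

The first calendar month here is the partial November (19 to 30) and the last is the partial February (1 to 26), so the two subsets differ in length.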

Compute lags and differences


> x <- xts(1:5, Sys.Date()+1:5)
# forward (lag by one step)
> lag(x)
           [,1]
2013-11-19   NA
2013-11-20    1
2013-11-21    2
2013-11-22    3
2013-11-23    4

# backward (lead by one step)
> lag(x, k=-1, na.pad=FALSE) 
           [,1]
2013-11-19    2
2013-11-20    3
2013-11-21    4
2013-11-22    5

# first difference
> diff(x)
           [,1]
2013-11-19   NA
2013-11-20    1
2013-11-21    1
2013-11-22    1
2013-11-23    1

# difference at lag 2 (a second-order difference would use differences=2)
> diff(x, lag=2)
           [,1]
2013-11-19   NA
2013-11-20   NA
2013-11-21    2
2013-11-22    2
2013-11-23    2
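The lag and differences arguments of diff are easy to confuse: lag = 2 subtracts the value two steps back, while differences = 2 applies the first difference twice. On the linear series above the distinction is hard to see, so this sketch uses squares (an arbitrary choice for illustration):

```r
library(xts)

# Squares 1, 4, 9, 16, 25 on five fixed dates.
x <- xts(c(1, 4, 9, 16, 25), order.by = as.Date("2013-11-18") + 1:5)

lag2   <- diff(x, lag = 2)          # x[t] - x[t-2]
order2 <- diff(x, differences = 2)  # difference of the first differences

print(lag2)
print(order2)
```

Here the lag-2 difference gives 8, 12, 16, while the second-order difference is the constant 2, as expected for a quadratic sequence.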

Check whether a vector is ordered


> isOrdered(1:10, increasing=TRUE)
[1] TRUE

> isOrdered(1:10, increasing=FALSE)
[1] FALSE

> isOrdered(c(1,1:10), increasing=TRUE)
[1] FALSE

> isOrdered(c(1,1:10), increasing=TRUE, strictly=FALSE)
[1] TRUE

Force a unique index


> x <- xts(1:5, as.POSIXct("2011-01-21") + c(1,1,1,2,3)/1e3)
> x
                        [,1]
2011-01-21 00:00:00.000    1
2011-01-21 00:00:00.000    2
2011-01-21 00:00:00.000    3
2011-01-21 00:00:00.002    4
2011-01-21 00:00:00.003    5

> make.index.unique(x)
                           [,1]
2011-01-21 00:00:00.000999    1
2011-01-21 00:00:00.001000    2
2011-01-21 00:00:00.001001    3
2011-01-21 00:00:00.002000    4
2011-01-21 00:00:00.003000    5
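By default make.index.unique keeps every row and nudges duplicated timestamps apart by a very small increment. It also accepts drop = TRUE, which discards the duplicates instead. A sketch on made-up data with whole-second timestamps compares the two:

```r
library(xts)

# Three observations share the same timestamp (illustrative data).
times <- as.POSIXct("2011-01-21", tz = "UTC") + c(1, 1, 1, 2, 3)
x <- xts(1:5, order.by = times)

# Keep all rows; duplicated timestamps are shifted by a tiny amount.
spread <- make.index.unique(x)

# Keep only the first observation for each timestamp.
dropped <- make.index.unique(x, drop = TRUE)

print(nrow(spread))
print(dropped)
```

Dropping is the simpler choice when duplicates are genuine repeats; spreading keeps all the data when each row is a distinct event.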

Query the time zone of an xts object


> x <- xts(1:10, Sys.Date()+1:10)

> indexTZ(x)
[1] "UTC"
> tzone(x)
[1] "UTC"

> str(x)
An ‘xts’ object on 2013-11-19/2013-11-28 containing:
  Data: int [1:10, 1] 1 2 3 4 5 6 7 8 9 10
  Indexed by objects of class: [Date] TZ: UTC
  xts Attributes:  
 NULL

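xts gives zoo time series more API support, so we have more convenient tools for all kinds of time series conversion and transformation.

Constructing an xts object: suppose data.csv has two columns, the first holding dates such as 20140506 and the second holding the corresponding data. Then:
ret <- read.csv(file = "data.csv",header = TRUE)
ret <- xts(ret[, -1], order.by=as.Date(as.character(ret[, 1]),format="%Y%m%d"))
For details, enter ?"xts" in the R console to view the help.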
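The construction described above can be tried without a file on disk. The sketch below feeds read.csv an inline string in place of data.csv; the column names and the three rows are made up for illustration.

```r
library(xts)

# Inline stand-in for data.csv: a yyyymmdd date column and a value column.
csv_text <- "date,value
20140506,1.5
20140507,2.5
20140508,3.5"

ret <- read.csv(text = csv_text, header = TRUE)

# The date column is read as a number, so convert it to character
# before parsing it with the %Y%m%d format.
dates <- as.Date(as.character(ret[, 1]), format = "%Y%m%d")
ret_xts <- xts(ret[, -1], order.by = dates)
colnames(ret_xts) <- "value"

print(ret_xts)
```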
