Time series example: how to make out-of-sample forecasts with ARIMA, one-step or multi-step

https://machinelearningmastery.com/make-sample-forecasts-arima-python/

1. Split the data into a training set and a test set; here the last 7 days of temperatures are held out as the test set

# split the dataset
# Series.from_csv requires pandas < 1.0 (it was removed in pandas 1.0)
from pandas import Series
series = Series.from_csv('daily-minimum-temperatures.csv', header=0)
split_point = len(series) - 7
dataset, validation = series[0:split_point], series[split_point:]
print('Dataset %d, Validation %d' % (len(dataset), len(validation)))
dataset.to_csv('dataset.csv')
validation.to_csv('validation.csv')
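
The split is only a positional slice, so it can be checked on a small series before touching the real file. A minimal sketch, assuming a plain Python list in place of the temperature Series; `split_holdout` and the toy values are hypothetical and introduced only for illustration:

```python
def split_holdout(values, n_holdout):
    """Return (train, holdout), keeping the last n_holdout points for validation."""
    if not 0 < n_holdout < len(values):
        raise ValueError("n_holdout must be between 1 and len(values) - 1")
    split_point = len(values) - n_holdout
    return values[:split_point], values[split_point:]

# hypothetical toy series standing in for the daily temperatures
toy = list(range(20))
train, holdout = split_holdout(toy, 7)
print('Dataset %d, Validation %d' % (len(train), len(holdout)))
```

Holding out the tail rather than a random sample keeps the time order intact, which is what an out-of-sample forecast needs.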

2.1 One-step forecast with forecast()

In the statsmodels.tsa.arima_model API used here (removed in statsmodels 0.13), forecast() returns three arrays: the forecast values, the standard errors of the forecast, and the confidence intervals. We are only interested in the first of these, hence the [0] in the code below.

from pandas import Series
from statsmodels.tsa.arima_model import ARIMA
import numpy

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

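# invert differenced value: add back what was subtracted, so the forecast
# can be scored (e.g. with MSE) on the original scale
# history[-interval] is the value `interval` positions from the end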
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]


# load dataset
series = Series.from_csv('dataset.csv', header=None)
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# one-step out-of-sample forecast
forecast = model_fit.forecast()[0]
# invert the differenced forecast to something usable
forecast = inverse_difference(X, forecast, days_in_year)
print('Forecast: %f' % forecast)

Result: Forecast: 14.861669

This value can then be compared against the test set.
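
Comparing forecasts with the held-out values usually comes down to an error metric such as RMSE. A minimal sketch in plain Python; `rmse` is an illustrative helper, and the observed and predicted values below are made up for the example rather than taken from validation.csv:

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

# hypothetical values; replace them with validation.csv and the model forecasts
observed = [13.0, 15.0, 14.0]
predicted = [14.0, 15.0, 12.0]
print('RMSE: %.3f' % rmse(observed, predicted))
```

The forecasts must be inverted back to the original temperature scale before this comparison, otherwise the error is measured on the differenced series.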

2.2 One-step forecast with predict()

The statsmodels ARIMAResults object also provides a predict() function for making forecasts.

The predict function can be used to predict arbitrary in-sample and out-of-sample time steps, including the next out-of-sample forecast time step.

The predict function requires a start and an end to be specified; these can be the indexes of the time steps relative to the beginning of the training data used to fit the model.

 

# one-step out of sample forecast
start_index = len(differenced)
end_index = len(differenced)
forecast = model_fit.predict(start=start_index, end=end_index)
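
Because the indexes count from the start of the training data, the first out-of-sample step sits at position len(differenced), and a horizon of n steps ends at len(differenced) + n - 1 (the end index is inclusive). A small sketch of that arithmetic; `out_of_sample_range` is a hypothetical helper and not part of statsmodels:

```python
def out_of_sample_range(n_train, steps=1):
    """Return (start, end) indexes, both inclusive, for the next `steps` forecasts.

    n_train is the number of observations the model was fitted on;
    in-sample indexes run from 0 to n_train - 1.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    start = n_train            # first index past the training data
    end = n_train + steps - 1  # inclusive end index
    return start, end

print(out_of_sample_range(100))     # one-step forecast
print(out_of_sample_range(100, 7))  # seven-step forecast
```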

The start and end can also be a datetime string or a “datetime” type; for example:

 

start_index = '1990-12-25'
end_index = '1990-12-25'
forecast = model_fit.predict(start=start_index, end=end_index)

 

# or pass datetime objects instead of strings
from datetime import datetime
start_index = datetime(1990, 12, 25)
end_index = datetime(1990, 12, 26)
forecast = model_fit.predict(start=start_index, end=end_index)
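
Date-based start and end values depend on the model having been fitted on a date-indexed series. The listings in this post fit on a plain NumPy array, so converting a date to a time-step index by hand may be the safer route. A sketch assuming exactly one observation per day with no gaps; `date_to_step`, the start date, and the offset are all hypothetical:

```python
from datetime import date

def date_to_step(target, first_obs, offset=0):
    """Map a calendar date to a time-step index, assuming one observation per day.

    first_obs is the date of the first raw observation; offset is the number
    of leading observations dropped before fitting (e.g. by differencing).
    """
    step = (target - first_obs).days - offset
    if step < 0:
        raise ValueError("target date falls before the first usable observation")
    return step

# hypothetical series starting 2020-01-01, with 7 points lost to differencing
print(date_to_step(date(2020, 1, 31), date(2020, 1, 1), offset=7))
```

If the real data skips days, this calendar arithmetic no longer matches the row positions and the index should be looked up in the data instead.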

from pandas import Series
from statsmodels.tsa.arima_model import ARIMA
import numpy
from datetime import datetime

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]

# load dataset
series = Series.from_csv('dataset.csv', header=None)
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# one-step out of sample forecast
start_index = len(differenced)
end_index = len(differenced)
forecast = model_fit.predict(start=start_index, end=end_index)
# invert the differenced forecast to something usable
forecast = inverse_difference(X, forecast, days_in_year)
print('Forecast: %f' % forecast)

Forecast: 14.861669

As this shows, predict() is more flexible because you can choose which time steps to forecast.

3.1 Multi-step forecast with forecast()

The inversion step needs a small change here:

# multi-step out-of-sample forecast
forecast = model_fit.forecast(steps=7)[0]

# invert the differenced forecast to something usable
history = [x for x in X]
day = 1
for yhat in forecast:
	inverted = inverse_difference(history, yhat, days_in_year)
	print('Day %d: %f' % (day, inverted))
	history.append(inverted)
	day += 1

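Explanation: history[-interval] is the value `interval` positions from the end. When only the next step is forecast, adding history[-interval] to the forecast is enough.

With multiple steps and a fixed history, each later forecast would need a different offset: the second step would need history[-(interval - 1)], the third history[-(interval - 2)], and so on.

Because every inverted forecast is appended to history, history[-interval] always points at the right value, so inverse_difference can stay unchanged.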
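
This bookkeeping can be checked on a toy series with a short season. The sketch below assumes an interval of 3 and uses the true future differences in place of model forecasts, so the inverted values should reproduce the held-back observations exactly; the series values are made up for illustration:

```python
def difference(dataset, interval=1):
    """Seasonal difference: value[i] - value[i - interval]."""
    return [dataset[i] - dataset[i - interval] for i in range(interval, len(dataset))]

def inverse_difference(history, yhat, interval=1):
    """Undo the difference by adding back the value one interval earlier."""
    return yhat + history[-interval]

interval = 3
full = [10, 12, 11, 13, 16, 12, 15, 17, 14, 18, 19, 15]  # made-up series
train, future = full[:9], full[9:]

# stand-in for model forecasts: the true differenced values of the future points
perfect_forecast = difference(full, interval)[-len(future):]

history = list(train)
recovered = []
for yhat in perfect_forecast:
    inverted = inverse_difference(history, yhat, interval)
    recovered.append(inverted)
    history.append(inverted)  # keeps history[-interval] aligned with the next step

print(recovered)
print(recovered == future)
```

Dropping the history.append line makes every step add the same base value, which is the misalignment the explanation above describes.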

Complete code:

from pandas import Series
from statsmodels.tsa.arima_model import ARIMA
import numpy

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]

# load dataset
series = Series.from_csv('dataset.csv', header=None)
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# multi-step out-of-sample forecast
forecast = model_fit.forecast(steps=7)[0]
# invert the differenced forecast to something usable
history = [x for x in X]
day = 1
for yhat in forecast:
	inverted = inverse_difference(history, yhat, days_in_year)
	print('Day %d: %f' % (day, inverted))
	history.append(inverted)
	day += 1

Day 1: 14.861669
Day 2: 15.628784
Day 3: 13.331349
Day 4: 11.722413
Day 5: 10.421523
Day 6: 14.415549
Day 7: 12.674711

3.2 Multi-step forecast with predict()

from pandas import Series
from statsmodels.tsa.arima_model import ARIMA
import numpy

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]

# load dataset
series = Series.from_csv('dataset.csv', header=None)
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# multi-step out-of-sample forecast
start_index = len(differenced)
end_index = start_index + 6
forecast = model_fit.predict(start=start_index, end=end_index)
# invert the differenced forecast to something usable
history = [x for x in X]
day = 1
for yhat in forecast:
	inverted = inverse_difference(history, yhat, days_in_year)
	print('Day %d: %f' % (day, inverted))
	history.append(inverted)
	day += 1

Using time step indexes, we can specify the end index as 6 more time steps in the future; for example:

 

# multi-step out-of-sample forecast
start_index = len(differenced)
end_index = start_index + 6
forecast = model_fit.predict(start=start_index, end=end_index)

Day 1: 14.861669
Day 2: 15.628784
Day 3: 13.331349
Day 4: 11.722413
Day 5: 10.421523
Day 6: 14.415549
Day 7: 12.674711

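Note: I do not fully understand how this multi-step forecast works. My guess is that it follows model 2 discussed earlier.

For the second forecast, the real value at time t-1 is unknown, so rolling forward on observations is no longer possible; the model probably feeds its own earlier predictions back in as inputs.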
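
That guess matches how recursive multi-step forecasting is usually described: each step uses earlier forecasts where real observations are missing, and future error terms are replaced by their expected value of zero. A minimal sketch with a hand-written AR(2) recursion; the coefficients and data are hypothetical, and this is not a copy of statsmodels' internal implementation:

```python
def recursive_forecast(history, coefs, intercept=0.0, steps=1):
    """Forecast `steps` values of an AR(p) process by feeding forecasts back in.

    coefs[0] multiplies the most recent value, coefs[1] the one before it, etc.
    """
    p = len(coefs)
    if len(history) < p:
        raise ValueError("need at least p observations")
    window = list(history[-p:])  # most recent p values, oldest first
    forecasts = []
    for _ in range(steps):
        # future noise is unknown, so it contributes its expected value (zero)
        yhat = intercept + sum(c * v for c, v in zip(coefs, reversed(window)))
        forecasts.append(yhat)
        window = window[1:] + [yhat]  # the forecast becomes an input for the next step
    return forecasts

# hypothetical AR(2) coefficients and observations
print(recursive_forecast([1.0, 2.0], coefs=[0.5, 0.25], steps=3))
```

Since later steps are built on earlier forecasts rather than observations, errors can compound as the horizon grows.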