ARIMA model

ARMA to ARIMA

  • When the data has a trend, we difference the series to remove it
  • ARIMA: Autoregressive Integrated Moving Average
  • The integrated term is the order of differencing d; in the example below, d = 2
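
To make the "integrated" part concrete, here is a small sketch using the BJsales series. The variable names are only illustrative. It differences the series twice (d = 2) and then shows that taking cumulative sums twice brings the original series back.

```r
# BJsales ships with the 'datasets' package (loaded by default in R)
x  <- as.numeric(BJsales)

d1 <- diff(x)    # first difference:  x[t] - x[t-1]
d2 <- diff(d1)   # second difference, i.e. d = 2

# "Integrating" is the reverse: cumulative sums, seeded with the first values
d1_back <- cumsum(c(d1[1], d2))
x_back  <- cumsum(c(x[1], d1_back))

isTRUE(all.equal(x_back, x))   # the original series is recovered
```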

Below is the sample GitHub gist; the output PDF is available at ARIMA model.pdf


---
title: "ARIMA model"
author: "Archit Vora"
date: "April 3, 2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
We will fit an ARIMA model to the BJsales dataset from the 'datasets' package in R.
Here is a plot of it.
```{r}
library(datasets)
plot(BJsales)
```
The mean is changing over time, so the time series does not look stationary. Let's take the first difference.
```{r}
plot(diff(BJsales))
```
It still does not look stationary, so let's difference once more.
```{r}
plot(diff(diff(BJsales)))
```
Now it seems stationary. Let's plot acf and pacf of this doubly differenced series.
```{r}
focus <- diff(diff(BJsales))
acf(focus)
pacf(focus)
```
In the ACF, lags 1, 8 and 11 seem significant, while in the PACF lags 1, 2, 3, 10 and 19 seem significant.
Keeping the principle of parsimony in mind, we shall consider orders 0 and 1 for the MA terms and orders 0, 1, 2 and 3 for the AR terms.
Now let's try different models and check their AIC.
```{r}
d <- 2  # order of differencing chosen above
for (p in 0:3) {
  for (q in 0:1) {
    if (p + d + q < 6) {  # keep the model simple
      mm <- arima(x = BJsales, order = c(p, d, q))
      # Box test on the residuals, using a whole-number lag
      pval <- Box.test(mm$residuals, lag = round(log(length(mm$residuals))))
      sse <- sum(mm$residuals^2)
      aic <- mm$aic
      cat(p, d, q, "AIC = ", aic, "SSE = ", sse, "P-value = ", pval$p.value, "\n")
    }
  }
}
```
The (0, 2, 1) model has the smallest AIC. Its Box test p-value is not significant, which means there is no evidence of leftover autocorrelation in the residuals.
Let's settle on it for this post and plot the residuals.
```{r}
model <- arima(x = BJsales, order = c(0, 2, 1))
par(mfrow = c(2, 2))
plot(model$residuals)
acf(model$residuals)
pacf(model$residuals)
qqnorm(model$residuals)
```
There seems to be no significant autocorrelation, and the Q-Q plot also looks okay.
Now let's plot the forecast.
```{r}
library(forecast)
fc <- forecast(model, h = 20)
plot(fc)
```
All of this can also be done with the auto.arima routine of the forecast package.
```{r}
model <- auto.arima(BJsales, seasonal = FALSE)
model
fc <- forecast(model, h = 20)
plot(fc)
```
auto.arima has come up with a different model, but that is okay.

Fitting AR Processes

Yule Walker Equation in Matrix Form

 

[Figure: Yule-Walker equations]

  • If we write the above equation for k = 1, 2, . . ., n and use the fact that ρ(k) = ρ(-k), we can write it in matrix form
  • Using the data we have, we can estimate the values of ρ (the autocorrelation coefficients)
  • The acf() routine in R gives us these estimates
  • Using the values of ρ, we can then estimate the values of Φ (the parameters of the AR process)

 

[Figure: Yule-Walker equations in matrix form for an AR(3) process]

  • Above is an example for an AR(3) process
  • We can solve these equations for the values of Φ1, Φ2 and Φ3
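
The steps above can be sketched in R as follows. This is only an illustration under assumptions: the AR(3) coefficients in phi_true and the simulated series are made up for the example. The idea is to estimate ρ with acf(), build the matrix of autocorrelations, and solve the linear system for Φ.

```r
set.seed(1)
# Simulate an AR(3) process with known (assumed) coefficients
phi_true <- c(0.5, 0.2, 0.1)
x <- arima.sim(model = list(ar = phi_true), n = 5000)

p   <- 3
rho <- acf(x, lag.max = p, plot = FALSE)$acf[, 1, 1]  # rho(0), rho(1), ..., rho(p)

R_mat <- toeplitz(rho[1:p])      # p x p matrix with entries rho(|i - j|)
r_vec <- rho[2:(p + 1)]          # right-hand side: rho(1), ..., rho(p)

phi_hat <- solve(R_mat, r_vec)   # Yule-Walker estimate of the AR coefficients
phi_hat
```

With a long enough series the estimates should land close to phi_true. The ar() function in R offers a Yule-Walker method as well, which is probably the better choice in practice.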

 

Moving average and Auto-regressive Processes

Moving Average Processes MA(q)

  • Example: a stock price that depends on the announcements of the last two days
  • The autocorrelation function cuts off after lag q

[Figure: MA(q) process]
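
A quick way to see the cut-off is to look at the theoretical ACF of an MA(2) process. The coefficients 0.7 and 0.2 below are assumptions picked for the example, not values from the course.

```r
# Theoretical ACF of an MA(2) process with assumed coefficients 0.7 and 0.2
rho <- ARMAacf(ma = c(0.7, 0.2), lag.max = 6)
round(rho, 3)   # non-zero at lags 1 and 2, zero from lag 3 onwards

# Sample ACF of a simulated MA(2) series for comparison
set.seed(42)
x <- arima.sim(model = list(ma = c(0.7, 0.2)), n = 1000)
acf(x, lag.max = 6, plot = FALSE)
```

The sample ACF will not be exactly zero beyond lag 2, but the values there should be small compared with those at lags 1 and 2.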

 

Auto regressive Processes AR(p)

 

 

  • Below are the plots for an AR(2) process
  • Depending on the values of phi1 and phi2, the ACF can alternate between positive and negative values

 

[Figure: ACF plots for AR(2) processes]
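
The two behaviours can be reproduced with the theoretical ACF. The coefficient pairs below are assumptions chosen only to show one decaying and one alternating case.

```r
# Theoretical ACFs for two AR(2) processes (coefficients chosen for illustration)
acf_pos <- ARMAacf(ar = c(0.5, 0.3),  lag.max = 8)  # decays while staying positive
acf_alt <- ARMAacf(ar = c(-0.5, 0.3), lag.max = 8)  # alternates in sign

round(rbind(acf_pos, acf_alt), 3)
```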

 

An AR process can be written as an MA process by repeatedly substituting for X(t-1). Here phi is a single constant, so we don't need phi1, phi2 anymore.

[Figure: AR process rewritten as an MA process by repeated substitution]
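
For the single-coefficient case, X(t) = phi * X(t-1) + Z(t), repeated substitution gives MA weights phi, phi^2, phi^3 and so on. A small check, assuming phi = 0.6 purely for illustration:

```r
phi <- 0.6   # assumed AR(1) coefficient, |phi| < 1

# MA(infinity) weights psi_1, psi_2, ... of X(t) = phi * X(t-1) + Z(t)
psi <- ARMAtoMA(ar = phi, lag.max = 8)

# Repeated substitution gives X(t) = Z(t) + phi Z(t-1) + phi^2 Z(t-2) + ...
cbind(psi, phi^(1:8))
```

The two columns should agree, which is the substitution argument done numerically.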

 

Mean, variance and autocorrelation of the AR process, assuming Z(t) ~ Normal(0, sigma2)

[Figure: mean, variance and autocorrelation of the AR process]
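
As a sanity check for the single-coefficient AR(1) case, the standard results are mean 0, variance sigma2 / (1 - phi^2) and rho(k) = phi^k. The sketch below verifies the last two numerically. The values of phi and sigma2 are assumptions for the example.

```r
phi    <- 0.6   # assumed AR(1) coefficient
sigma2 <- 2     # assumed noise variance

# Variance from the MA(infinity) form: sigma2 * (1 + psi_1^2 + psi_2^2 + ...)
psi        <- ARMAtoMA(ar = phi, lag.max = 200)
var_series <- sigma2 * (1 + sum(psi^2))
var_closed <- sigma2 / (1 - phi^2)   # closed form for AR(1)

# Autocorrelation: rho(k) = phi^k for AR(1)
rho <- ARMAacf(ar = phi, lag.max = 5)

c(var_series = var_series, var_closed = var_closed)
rbind(rho, phi^(0:5))
```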

 

 

ACF of AR(p) using Yule-Walker Equation

  • It is a method for solving a difference equation given as a recursive relation
  • We first obtain the auxiliary equation (also known as the characteristic equation), which is a polynomial, and find its roots
  • Using these roots we get a weighted sum of geometric terms, and we find the weights from the initial conditions
  • This way of solving difference equations is closely related to the way linear differential equations are solved
  • In the course this was worked through for the Fibonacci sequence, where one root came out to be the golden ratio
  • For AR(p), the ACF satisfies a difference equation; solving it gives the ACF at any lag
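
The Fibonacci example can be reproduced in a few lines. This is a sketch of the general recipe (roots of the auxiliary equation, then weights from the initial conditions), not the course's own code. The same recipe applies to the ACF recursion of an AR(p) process.

```r
# Fibonacci recursion: F(n) = F(n-1) + F(n-2)
# Auxiliary (characteristic) equation: r^2 - r - 1 = 0
roots <- Re(polyroot(c(-1, -1, 1)))   # coefficients in increasing order of power
roots                                  # one root is the golden ratio (1 + sqrt(5)) / 2

# General solution: F(n) = w1 * r1^n + w2 * r2^n
# Initial conditions F(0) = 0, F(1) = 1 fix the weights
A <- rbind(roots^0, roots^1)
w <- solve(A, c(0, 1))

n <- 0:10
fib_closed <- round(w[1] * roots[1]^n + w[2] * roots[2]^n)
fib_closed   # 0 1 1 2 3 5 8 13 21 34 55
```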

 

 

Reference

https://www.coursera.org/learn/practical-time-series-analysis/home/welcome

https://en.wikipedia.org/wiki/Autoregressive%E2%80%93moving-average_model

Stationarity Conditions for MA(q) and AR(p) Processes

Sequence and Series

Convergent Sequence

1/2, 2/3, 3/4, . . . , n/(n+1)

Divergent Sequence

3, 9, 27, . . . . , 3^n

Series => sum of the terms of a sequence (its partial sums form a new sequence)

Convergent Series => the partial sums converge to a finite limit

Convergence Test

  • Integral Test
  • Comparison Test
  • Limit comparison test
  • Alternating Series Test
  • Ratio test
  • Root test

Geometric Series

  • a, ar, ar^2, . . . , ar^n
  • The series converges if |r| < 1

Representing function as (geometric) series

[Figure: representing a function as a geometric series]
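
The standard example is 1 / (1 - r) = 1 + r + r^2 + . . . for |r| < 1. A small numerical check, with r = 0.5 assumed for illustration:

```r
r <- 0.5   # assumed common ratio with |r| < 1

# Partial sums of 1 + r + r^2 + ... approach 1 / (1 - r)
partial_sums <- cumsum(r^(0:30))
tail(partial_sums, 1)
1 / (1 - r)
```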

Backward shift operator

  • B^k X(t) = X(t - k)

[Figure: backward shift operator]
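
In R there is no built-in B operator for plain vectors, so the sketch below defines a hypothetical helper, backshift(), just for illustration. It then checks that (1 - B)^2 applied to a series is the same as differencing twice, which is the d = 2 used in the ARIMA example.

```r
# Hypothetical helper: apply B^k to a numeric series, padding the start with NA
backshift <- function(x, k = 1) c(rep(NA, k), head(x, -k))

x <- as.numeric(BJsales)

# (1 - B)^2 X(t) = X(t) - 2 X(t-1) + X(t-2)
by_hand <- x - 2 * backshift(x, 1) + backshift(x, 2)

# Same thing via diff(); drop the two leading NAs before comparing
isTRUE(all.equal(by_hand[-(1:2)], diff(x, differences = 2)))
```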

Invertibility

  • Two different MA models can have the same ACF
  • Given an ACF, how do we decide which model produced it?
  • We choose the model that is invertible
  • An invertible MA(1) can be rewritten as an AR(∞)
  • Inverting amounts to expanding a function as a geometric series
  • This is possible only when the common ratio satisfies |r| < 1
  • Of the two candidate models, only one satisfies this condition
  • Given the ACF, that is the model we select
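
Here is a small illustration with two MA(1) models, X(t) = Z(t) + beta * Z(t-1), one with beta = 0.5 and one with beta = 2. The values are assumptions for the example. Both give the same ACF, but only the first has AR(∞) weights that die out.

```r
# Two MA(1) models, X(t) = Z(t) + beta * Z(t-1), with beta and 1 / beta
acf_a <- ARMAacf(ma = 0.5, lag.max = 4)
acf_b <- ARMAacf(ma = 2,   lag.max = 4)
isTRUE(all.equal(acf_a, acf_b))   # the two ACFs agree

# Only |beta| < 1 is invertible: its AR(infinity) weights (-beta)^j die out
j <- 1:10
pi_invertible     <- (-0.5)^j   # shrinks towards 0
pi_non_invertible <- (-2)^j     # blows up
rbind(pi_invertible, pi_non_invertible)
```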

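Conditions for Invertibility [MA(q)] and Stationarity [AR(p)]

[Figure: duality between the invertibility condition for MA(q) and the stationarity condition for AR(p)]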

How to check if series is both invertible and stationary

  • Check AR(p) polynomial for stationarity
  • Check MA(q) polynomial for invertibility
  • Both should hold
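
Both checks come down to whether the roots of a polynomial lie outside the unit circle. A minimal sketch, assuming the sign convention used by R's arima(); the two helper functions are hypothetical names introduced here:

```r
# Hypothetical helpers; coefficients follow R's arima() sign convention:
#   AR polynomial: 1 - phi_1 z - ... - phi_p z^p
#   MA polynomial: 1 + theta_1 z + ... + theta_q z^q
is_stationary <- function(ar) all(Mod(polyroot(c(1, -ar))) > 1)
is_invertible <- function(ma) all(Mod(polyroot(c(1,  ma))) > 1)

is_stationary(c(0.5, 0.3))   # TRUE
is_stationary(1.1)           # FALSE
is_invertible(0.5)           # TRUE
is_invertible(2)             # FALSE
```

For an ARMA model, one would call both helpers and require both to return TRUE.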

Reference

https://www.coursera.org/learn/practical-time-series-analysis/exam/ITocA/series-backward-shift-operator-invertibility-and-duality

 

[Time Series] Correlation and Stationarity

Co-variance vs Correlation

  • Correlation is covariance divided by the product of the standard deviations of the two variables
  • Hence it is independent of units and always lies between -1 and 1, which makes comparison easier
  • The formula on the right is specific to time series
    • It is the autocorrelation coefficient at lag k
    • It is defined as the ratio of the autocovariance at lag k to the autocovariance at lag 0
    • These values are plotted on a correlogram (see the one for an MA(2) process below)

[Figure: correlogram of an MA(2) process]
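
The definition can be checked directly with acf(), which returns autocovariances when asked. The series used below (differenced BJsales) is just a convenient built-in example.

```r
x <- as.numeric(diff(BJsales))   # any series works; this one ships with R

gamma <- acf(x, lag.max = 10, type = "covariance", plot = FALSE)$acf[, 1, 1]
rho   <- acf(x, lag.max = 10, plot = FALSE)$acf[, 1, 1]

# rho(k) = gamma(k) / gamma(0), so rho(0) is 1
isTRUE(all.equal(gamma / gamma[1], rho))
```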

 

Stationary Time Series

  • No systematic change in mean (No trend)
  • No systematic change in Variance
  • No periodic variation (Seasonality)

If a time series is not stationary, we can apply transformations to make it stationary.

For example, applying the difference operator to a random walk makes it stationary.

 

 

Random Walk

  • The current value is the previous value plus noise
  • If the first value is zero, the current value is the sum of all the noise terms so far
  • X(t) = X(t-1) + Z(t)
  • Z(t) ~ Normal(mu, sigma2)
  • If X(0) = 0 then X(t) = sum of Z(k) for k from 1 to t
  • Expectation[X(t)] = t*mu (changes with time)
  • Variance[X(t)] = t*sigma2 (changes with time)
  • Hence it is not a stationary process
  • Let Y(t) = X(t) - X(t-1) = Z(t); then Y(t) is a stationary process
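
A short simulation makes the last point concrete. The drift mu, the noise level sigma and the length n are assumptions chosen for the example.

```r
set.seed(123)
n  <- 500
mu <- 0.2; sigma <- 1            # assumed drift and noise standard deviation

z <- rnorm(n, mean = mu, sd = sigma)   # Z(1), ..., Z(n)
x <- cumsum(z)                         # X(t) = Z(1) + ... + Z(t), with X(0) = 0

# Differencing recovers the noise, which is stationary
y <- diff(c(0, x))
isTRUE(all.equal(y, z))
```

Plotting x shows the wandering level of a random walk, while plotting y shows noise around a constant mean.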

 

Example of Stationary Process

The moving average and autoregressive processes described earlier can be stationary under the conditions given in the notes on stationarity conditions.

 

 

Time series week 1

  • Plotting in R
  • Checking whether a linear regression is properly fitted
    • Residuals are the important thing to observe
    • Q-Q plots to test for normality
    • Residuals over time
      • Zoomed-in residuals over time
  • Hypothesis tests
    • One- and two-sided t tests
    • Confidence interval
      • Where we think the mean lies
      • If it does not contain 0, we tend to reject the null hypothesis (a very broad statement, but it conveys the idea)
  • Correlation function
    • Which quarter data false
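
The regression checks in this list can be sketched as below. The built-in cars dataset is used only as a stand-in, since the course data is not reproduced here.

```r
# 'cars' (speed vs stopping distance) ships with R; used here only as an example
fit <- lm(dist ~ speed, data = cars)
res <- residuals(fit)

# Confidence interval for the slope; if it excludes 0 we reject H0: slope = 0
ci <- confint(fit, "speed", level = 0.95)
ci

# One-sample, two-sided t test that the mean residual is 0
t.test(res, mu = 0)$p.value

# Diagnostic plots (uncomment to draw them)
# plot(res, type = "b"); qqnorm(res); qqline(res)
```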

 

Ref : https://www.coursera.org/learn/practical-time-series-analysis/home/welcome