ARMA to ARIMA
- When the data has a trend, we difference the series to remove it
- ARIMA stands for AutoRegressive Integrated Moving Average
- The "Integrated" term is the order of differencing, d. In the example below, d = 2
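
To make the role of d concrete, here is a minimal sketch on a made-up series (the vector `x` is purely illustrative and is not part of the BJsales example). It assumes a purely quadratic trend, for which one difference still trends and a second difference is flat.

```r
# Synthetic series with a purely quadratic trend (illustrative, not BJsales).
x <- (1:10)^2

d1 <- diff(x)                    # first difference: still trending upwards
d2 <- diff(x, differences = 2)   # second difference: constant, the trend is gone

print(d1)
print(d2)

# Differencing twice in one call matches nesting diff(), as done in the gist below.
identical(d2, diff(diff(x)))
```

Real data is noisier than this, so in practice the order of differencing is usually judged from plots of the differenced series, as in the gist.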

Below is a sample GitHub gist; the output PDF is available at ARIMA model.pdf
| --- | |
| title: "ARIMA model" | |
| author: "Archit Vora" | |
| date: "April 3, 2018" | |
| output: html_document | |
| --- | |
| ```{r setup, include=FALSE} | |
| knitr::opts_chunk$set(echo = TRUE) | |
| ``` | |
| We will try to fit an ARIMA model to the BJsales dataset from the 'datasets' package in R. | |
| Here is a plot of it. | |
| ```{r} | |
| library(datasets) | |
| plot(BJsales) | |
| ``` | |
| The mean is changing over time, so the time series does not seem stationary. Let's take the first difference. | |
| ```{r} | |
| plot(diff(BJsales)) | |
| ``` | |
| It still does not seem stationary, so let's difference once more. | |
| ```{r} | |
| plot(diff(diff(BJsales))) | |
| ``` | |
| Now it seems stationary. Let's plot the ACF and PACF of this doubly differenced series. | |
| ```{r} | |
| focus <- diff(diff(BJsales)) | |
| acf(focus) | |
| pacf(focus) | |
| ``` | |
| From the ACF, lags 1, 8 and 11 seem significant, while from the PACF, lags 1, 2, 3, 10 and 19 seem significant. | |
| Keeping the principle of parsimony in mind, we shall consider orders 0 and 1 for the MA terms and orders 0, 1, 2 and 3 for the AR terms. | |
| Now let's try different models and check their AIC. | |
| ```{r} | |
| d <- 2  # order of differencing chosen above | |
| for (p in 0:3) { | |
|   for (q in 0:1) { | |
|     if (p + d + q < 6) {  # keep the model simple | |
|       mm <- arima(x = BJsales, order = c(p, d, q)) | |
|       # Box test on the residuals; lag is log(n) rounded to a whole number | |
|       pval <- Box.test(mm$residuals, lag = round(log(length(mm$residuals)))) | |
|       sse <- sum(mm$residuals^2) | |
|       aic <- mm$aic | |
|       cat(p, d, q, "AIC = ", aic, "SSE = ", sse, "P-value = ", pval$p.value, "\n") | |
|     } | |
|   } | |
| } | |
| ``` | |
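| It seems (0, 2, 1) has the smallest AIC. Its Box test p-value is not significant, which is what we want here: there is no evidence of autocorrelation left in the residuals. | |
| Let's go with this model for this post and plot its residuals. | |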
| ```{r} | |
| model <- arima(x = BJsales, order = c(0, 2, 1)) | |
| par(mfrow = c(2, 2)) | |
| plot(model$residuals) | |
| acf(model$residuals) | |
| pacf(model$residuals) | |
| qqnorm(model$residuals) | |
| ``` | |
| There seems to be no significant autocorrelation in the residuals, and the QQ plot also looks okay. | |
| Now let's plot the forecast. | |
| ```{r} | |
| library(forecast) | |
| fc <- forecast(model, h = 20) | |
| plot(fc) | |
| ``` | |
| All of this can also be done with the auto.arima routine of the forecast package. | |
| ```{r} | |
| model <- auto.arima(BJsales, seasonal = FALSE) | |
| model | |
| fc <- forecast(model, h = 20) | |
| plot(fc) | |
| ``` | |
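| auto.arima has come up with a different model, but that is okay: by default it chooses d with a unit-root test and then searches over p and q using AICc, so it need not agree with a manual choice. |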
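
As a follow-up to the residual check in the gist, here is a hedged sketch of a slightly stricter version of that test. `Box.test` defaults to the Box-Pierce statistic; the variant below asks for the Ljung-Box statistic and uses `fitdf` to subtract the number of estimated ARMA coefficients from the degrees of freedom. The names `fit`, `n_lags`, `n_coef` and `lb` are illustrative and do not appear in the original gist.

```r
library(datasets)  # provides BJsales

# Refit the manually chosen model: one MA term on the twice-differenced series.
fit <- arima(BJsales, order = c(0, 2, 1))

n_lags <- round(log(length(BJsales)))  # same log(n) rule of thumb, as a whole number
n_coef <- length(coef(fit))            # number of estimated ARMA coefficients

# Ljung-Box test on the residuals, with degrees of freedom reduced by n_coef.
lb <- Box.test(fit$residuals, lag = n_lags, type = "Ljung-Box", fitdf = n_coef)

lb$parameter  # degrees of freedom: n_lags - n_coef
lb$p.value    # a large value means no evidence of leftover autocorrelation
```

The null hypothesis of this test is that the residuals are uncorrelated, so a p-value above the usual 0.05 threshold supports the fitted model rather than counting against it.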

