%% Cell type:markdown id: tags:
(AR)=
# AR process
The code on this page can be used interactively: click {fa}`rocket` --> {guilabel}`Live Code` in the top right corner, then wait until the message {guilabel}`Python interaction ready!` appears.
The main goal is to introduce the AutoRegressive (AR) model to describe a **stationary stochastic process**. Hence the AR model can be applied to time series in which, for example, trend and seasonality are absent or have been removed and only noise remains, or after applying other methods [to obtain a stationary time series](stationarize).
## Process definition
In an AR model, we forecast the variable of interest using a linear combination of its past values. A zero-mean AR process of order $p$ can be written as follows:
$$S_t = \overbrace{\phi_1S_{t-1}+...+\phi_pS_{t-p}}^{\text{AR process}} + e_t $$
or as
$$S_t = \sum_{i=1}^p \phi_iS_{t-i}+e_t$$
Each observation is made up of a **random error** $e_t$ at that epoch and a linear combination of **past observations**. The errors $e_t$ form an uncorrelated, purely random noise process, also known as white noise. We note the process should still be stationary, satisfying
$$\mathbb{E}(S_t)=0, \hspace{20px} \mathbb{D}(S_t)=\sigma^2,\quad \forall t$$
This indicates that part of the total variability of the process comes from the signal and noise of past epochs, and only a (small) portion belongs to the noise of that epoch (denoted as $e_t$).
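As a minimal sketch of this recursion (not taken from the text), the snippet below simulates a zero-mean AR(2) process; the coefficients $\phi_1=0.5$, $\phi_2=0.3$ are illustrative choices that keep the process stationary.

``` python
import numpy as np

# Minimal sketch: simulate a zero-mean AR(2) process S_t = phi_1*S_{t-1} + phi_2*S_{t-2} + e_t
rng = np.random.default_rng(0)
phi = [0.5, 0.3]          # assumed example coefficients (chosen so the process is stationary)
p, m = len(phi), 200
e = rng.normal(0, 1, m)   # white noise e_t
S = np.zeros(m)
for t in range(m):
    # add the available past terms phi_i * S_{t-i}, then the noise of this epoch
    S[t] = e[t] + sum(phi[i] * S[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
```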
### First-order AR(1) process
We will focus on explaining the case $p=1$, i.e. the AR(1) process. A **zero-mean first-order autoregressive** process can be written as follows
$$ S_t = \phi S_{t-1}+e_t, \hspace{20px} -1<\phi<1, \hspace{20px} t=2,...,m $$
where $e_t$ is an i.i.d. noise process, e.g. distributed as $e_t\sim N(0,\sigma_{e}^2)$. The definition of $\sigma_{e}^2$ follows later.
:::{card} Exercise
In a zero-mean first order autoregressive process, abbreviated as AR(1), we have $m=3$ observations, $\phi=0.8$, and the generated white noise errors are $e = [e_1,\, e_2,\, e_3]^T=[1,\, 2,\, -1]^T$. What is the generated AR(1) process $S = [S_1,\, S_2,\, S_3]^T$?
a. $S = \begin{bmatrix}1 & 2.8 & 1.24\end{bmatrix}^T$
b. $S = \begin{bmatrix} 0 & 2 & 0.6 \end{bmatrix}^T$
c. $S = \begin{bmatrix} 1 & 2 & -1 \end{bmatrix}^T$
```{admonition} Solution
:class: tip, dropdown
The correct answer is **a**. The AR(1) process can be initialized as $S_1=e_1=1$. The next values can be obtained through:
$$ S_t = \phi S_{t-1} + e_t $$
Giving $S_2=0.8 S_1 + e_2 = 0.8\cdot 1 + 2 = 2.8$ and $S_3=0.8 S_2 + e_3 = 0.8\cdot 2.8 - 1= 1.24$, so we have:
$$S = \begin{bmatrix}1 & 2.8 & 1.24\end{bmatrix}^T $$
```
:::
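The recursion is also easy to check numerically; a quick sketch that reproduces the values of the exercise:

``` python
import numpy as np

phi = 0.8
e = np.array([1.0, 2.0, -1.0])   # white noise errors from the exercise
S = np.zeros_like(e)
S[0] = e[0]                      # initialization: S_1 = e_1
for t in range(1, len(e)):
    S[t] = phi * S[t-1] + e[t]   # S_t = phi*S_{t-1} + e_t
print(S)                         # [1.   2.8  1.24]
```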
### Formulation
We initialize with $S_1=e_1$, so that $\mathbb{E}(S_1)=\mathbb{E}(e_1)=0$ and $\mathbb{D}(S_1)=\mathbb{D}(e_1)=\sigma^2$. Repeated application of the "autoregressive" formula ($S_t = \phi S_{t-1} + e_t$) then gives:
$$ \begin{align*}
S_1&=e_1\\
S_2&=\phi S_1+e_2\\
S_3 &= \phi S_2+e_3 = \phi^2S_1+\phi e_2+e_3\\
&\vdots\\
S_m &= \phi S_{m-1} + e_m = \phi^{m-1}S_1+\phi^{m-2}e_2+...+\phi e_{m-1}+e_m
\end{align*} $$
for which, in order to impose *stationarity*, we still require:
$$\mathbb{E}(S_t)=0 \hspace{5px}\text{and}\hspace{5px} \mathbb{D}(S_t)=\sigma^2, \hspace{10px} t=1,...,m$$
All the error components $e_t$ are uncorrelated, such that $Cov(e_t,e_{t+\tau})=0$ for $\tau \neq 0$, and have variance $\sigma_{e}^2$, which still needs to be determined.
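As a small numerical check (with illustrative values for $\phi$ and $m$), the unrolled expression $S_m=\phi^{m-1}S_1+\phi^{m-2}e_2+...+e_m$ indeed matches the recursion:

``` python
import numpy as np

rng = np.random.default_rng(1)
phi, m = 0.8, 10                  # illustrative values, not from the text
e = rng.normal(0, 1, m)           # e_1, ..., e_m
# Recursion: S_1 = e_1, then S_t = phi*S_{t-1} + e_t
S = np.zeros(m)
S[0] = e[0]
for t in range(1, m):
    S[t] = phi * S[t-1] + e[t]
# Closed form: S_m = phi^(m-1) e_1 + phi^(m-2) e_2 + ... + e_m
closed = sum(phi**(m - 1 - k) * e[k] for k in range(m))
print(np.isclose(S[-1], closed))  # True
```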
### Autocovariance
The mean of the process is zero and, therefore:
$$\mathbb{E}(S_t) = \mathbb{E}(\phi S_{t-1}+e_t) = \phi\mathbb{E}(S_{t-1})+\mathbb{E}(e_t) = 0$$
The variance of the process should remain constant, so:
$$\mathbb{D}(S_t) = \mathbb{D}(\phi S_{t-1} +e_t) \Leftrightarrow \sigma^2=\phi^2\sigma^2+\sigma_{e}^2, \hspace{10px} t\geq 2$$
resulting in
$$\sigma_{e}^2 = \sigma^2 (1-\phi^2)$$
indicating that $\sigma_{e}^2$ is smaller than $\sigma^2$ (whenever $\phi\neq 0$).
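To see the relation $\sigma_{e}^2 = \sigma^2 (1-\phi^2)$ at work, a minimal sketch (the values of $\sigma$, $\phi$, and the series length are assumptions for illustration): if the noise is drawn with variance $\sigma_e^2$, the sample variance of a long AR(1) realization approaches $\sigma^2$.

``` python
import numpy as np

rng = np.random.default_rng(2)
sigma, phi = 1.0, 0.8                   # illustrative values
sigma_e = sigma * np.sqrt(1 - phi**2)   # sigma_e^2 = sigma^2 (1 - phi^2) = 0.36
n = 100_000
S = np.zeros(n)
S[0] = rng.normal(0, sigma)             # initialize at the stationary variance
for t in range(1, n):
    S[t] = phi * S[t-1] + rng.normal(0, sigma_e)
print(np.var(S))                        # close to sigma^2 = 1
```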
The autocovariance (covariance between $S_t$ and $S_{t+\tau}$) is
$$ \begin{align*}
c_{\tau}&=\mathbb{E}(S_t S_{t+\tau})-\mu^2 =\mathbb{E}(S_t S_{t+\tau})\\
&= \mathbb{E}(S_t(\phi^\tau S_t + \phi^{\tau-1} e_{t+1}+...)) = \phi^\tau\mathbb{E}(S_t^2)=\sigma^2\phi^\tau
\end{align*} $$
In the derivation above we used that:
$$ \begin{align*}
S_{t+\tau}=\phi^\tau S_t + \phi^{\tau-1} e_{t+1}+...+e_{t+\tau}
\end{align*} $$
and the fact that $S_t$ and $e_{t+\tau}$ are uncorrelated for $\tau \neq 0$.
```{admonition} Derivation (optional)
:class: tip, dropdown
$$ \begin{align*}
S_{t+\tau}&= \phi^{t+\tau-1}S_1 + \phi^{t+\tau-2}e_2+...+ \phi^{\tau} e_{t}+ \phi^{\tau-1} e_{t+1}+...+e_{t+\tau}\\
&= \phi^{\tau} \left(\phi^{t-1}S_1 + \phi^{t-2}e_2+...+ e_{t}\right)+ \phi^{\tau-1} e_{t+1}+...+e_{t+\tau}\\
&=\phi^\tau S_t + \phi^{\tau-1} e_{t+1}+...+e_{t+\tau}
\end{align*} $$
```
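The result $c_{\tau}=\sigma^2\phi^\tau$ can also be verified empirically; a minimal sketch, with the simulation settings as assumptions:

``` python
import numpy as np

rng = np.random.default_rng(3)
sigma, phi, n = 1.0, 0.8, 200_000       # illustrative values
S = np.zeros(n)
S[0] = rng.normal(0, sigma)
for t in range(1, n):
    S[t] = phi * S[t-1] + rng.normal(0, sigma * np.sqrt(1 - phi**2))
for tau in range(4):
    # empirical autocovariance of a zero-mean process vs. the theoretical sigma^2 * phi^tau
    c_emp = np.mean(S[:n - tau] * S[tau:])
    print(tau, round(c_emp, 3), round(sigma**2 * phi**tau, 3))
```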
### Model structure of AR(1)
$$\mathbb{E}(S) = \mathbb{E}\begin{bmatrix}S_1\\ S_2\\ \vdots\\ S_m\end{bmatrix} = \begin{bmatrix}0\\ 0\\ \vdots\\ 0\end{bmatrix}, \hspace{15px} \mathbb{D}(S)=\Sigma_{S}=\sigma^2 \begin{bmatrix}1&\phi&...&\phi^{m-1}\\ \phi&1&...&\phi^{m-2}\\ \vdots&\vdots&\ddots&\vdots\\ \phi^{m-1}&\phi^{m-2}&...&1\end{bmatrix}$$
* Autocovariance function $\implies$ $c_{\tau}=\sigma^2\phi^\tau$
* Normalized autocovariance function (ACF) $\implies$ $\rho_\tau=c_{\tau}/c_0=\phi^\tau$
* A larger value of $\phi$ indicates a longer-memory random process
* If $\phi=0$, the process is a *purely random process* (white noise)
* The ACF is even, $c_{\tau}=c_{-\tau}=c_{|\tau|}$, and so is $\rho_{\tau}=\rho_{-\tau}=\rho_{|\tau|}$
Later in this section we will see how the coefficient $\phi$ can be estimated.
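Since the entries of $\Sigma_{S}$ depend only on the lag $|i-j|$, the matrix is easy to assemble; a small sketch, with $\sigma$, $\phi$, and $m$ as assumed example values:

``` python
import numpy as np

sigma, phi, m = 1.0, 0.8, 5   # illustrative values
# |i - j| gives the lag between epochs i and j
tau = np.abs(np.arange(m)[:, None] - np.arange(m)[None, :])
Sigma_S = sigma**2 * phi**tau  # Sigma_S[i, j] = sigma^2 * phi^|i-j|
print(Sigma_S)
```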
## Simulated example
If you have run the Python code on this page, an interactive plot will be displayed below. You can change the value of $\phi$ to see how the AR(1) process and its ACF change. At the start, the process is initialized with $\phi = 0.8$. Try moving the slider and observe the response of the ACF; pay special attention to what happens when $\phi=0$ and when $\phi$ becomes negative.
Lastly, focus on the cases $\phi=1$ and $\phi=-1$. What do you observe? You will notice that the function "explodes". This makes intuitive sense, since the effect of the previous epoch is not dampened, but rather amplified. This also means that the process is no longer stationary. So, the AR(1) process is stationary if $|\phi|<1$.
%% Cell type:code id: tags:auto-execute-page,thebe-remove-input-init
``` python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf
import ipywidgets as widgets
from ipywidgets import interact

def acfplot(phi=0.8):
    # Parameters for the AR(1) process
    sigma = 1  # Standard deviation of the noise
    n = 500    # Length of the time series
    # Initialize the process
    np.random.seed(42)  # For reproducibility
    X = np.zeros(n)
    X[0] = np.random.normal(0, sigma)  # Initial value
    # Generate the AR(1) process from a pre-generated white noise series
    noise = np.random.normal(0, sigma, n)
    for t in range(1, n):
        X[t] = phi * X[t-1] + noise[t]
    # Create the 2x1 subplot
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 6), sharex=False)
    # Plot the AR(1) process in the first subplot
    ax1.plot(X, label=f"AR(1) Process with φ={round(phi, 2)}")
    ax1.set_ylabel("Value")
    ax1.set_title("Simulated AR(1) Process")
    ax1.set_xlabel("Time")
    ax1.set_ylim(-4, 4)
    ax1.legend()
    ax1.grid(True)
    # Plot the ACF of the AR(1) process in the second subplot
    lags = 20
    plot_acf(X, ax=ax2, lags=lags, title="ACF of AR(1) Process")
    ax2.set_xlabel("Lag")
    ax2.set_ylabel("ACF Value")
    ax2.set_xlim(-0.5, lags)
    ax2.grid(True)
    # Display the plot
    plt.tight_layout()
    plt.show()

interact(acfplot, phi=widgets.FloatSlider(value=0.8, min=-1, max=1.0, step=0.1, description='Phi:'));
```
%% Cell type:markdown id: tags:
## Estimation of coefficients of AR process
If the order $p$ and the values of the AR($p$) process are known, the question is: **how can we estimate the coefficients $\phi_1,...,\phi_p$?**
Here, we only elaborate on AR(1), using best linear unbiased estimation (BLUE) to estimate $\phi_1$. The method can be generalized to estimate the parameters of an AR($p$) process.
**Example: Parameter estimation of AR(1)**
The AR(1) process is of the form
$$S_t=\phi_1 S_{t-1}+e_t$$
In order to estimate $\phi_1$ we can set up the following linear model of observation equations (starting from $t=2$):
$$\begin{bmatrix}S_2 \\ S_3 \\ \vdots \\ S_m \end{bmatrix} = \begin{bmatrix}S_1 \\S_2 \\ \vdots\\ S_{m-1} \end{bmatrix}\begin{bmatrix}\phi_1 \end{bmatrix} + \begin{bmatrix}e_{2} \\ e_{3}\\ \vdots \\ e_{m} \end{bmatrix}$$
The BLUE estimator of $\phi$ is given by:
$$\hat{\phi}=(\mathrm{A}^T\mathrm{A})^{-1}\mathrm{A}^TS$$
Where $\mathrm{A}=\begin{bmatrix}S_1 & S_2 & \cdots & S_{m-1}\end{bmatrix}^T$ and $S=\begin{bmatrix}S_2 & S_3 & \cdots & S_m\end{bmatrix}^T$.
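A minimal sketch of this estimator on simulated data (the true $\phi=0.8$ and the series length are assumptions for illustration):

``` python
import numpy as np

rng = np.random.default_rng(4)
phi_true, m = 0.8, 2000            # assumed true coefficient and series length
S = np.zeros(m)
S[0] = rng.normal(0, 1)            # initialization S_1 = e_1
for t in range(1, m):
    S[t] = phi_true * S[t-1] + rng.normal(0, 1)
A = S[:-1, None]                   # design matrix: S_1, ..., S_{m-1}
y = S[1:]                          # observation vector: S_2, ..., S_m
# BLUE / least-squares estimate: phi_hat = (A^T A)^{-1} A^T S
phi_hat = np.linalg.solve(A.T @ A, A.T @ y)
print(phi_hat)                     # close to phi_true = 0.8
```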