Before we can look into modelling a stochastic process with an Autoregressive (AR) model, we first need to introduce the autocovariance function (ACF) for a stationary time series and describe the relationship between the ACF and the power spectral density (PSD).
As in [Observation theory](../observation_theory/01_Introduction.md), the variance component is often determined based on the precision of an observation (at a given epoch), and the covariance components quantitatively indicate the statistical dependence (or independence) between observations. In this case, dependence is inherently introduced by the physical processes that produce the signal (of which our time series is a sample), and in fact our time series methods seek to (mathematically) account for this.
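To give a first, minimal illustration of such time correlation, the sketch below simulates an artificially correlated series and estimates its sample autocovariance at a few lags; the coefficient 0.8, the series length and the chosen lags are arbitrary demonstration values, not taken from this chapter, and the formal definitions follow below.

```python
import numpy as np

def sample_autocovariance(x, tau):
    """Biased sample autocovariance of a series x at lag tau."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xm = x - x.mean()
    return np.sum(xm[:n - tau] * xm[tau:]) / n

# Simulate a time-correlated series: each value depends on the previous one
rng = np.random.default_rng(0)
x = np.zeros(2000)
for t in range(1, x.size):
    x[t] = 0.8 * x[t - 1] + rng.normal()

# The autocovariance decays with the lag, reflecting the dependence between epochs
print([round(sample_autocovariance(x, tau), 2) for tau in (0, 1, 2, 5, 20)])
```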
## Autocovariance and autocorrelation
...
...
(noiseandstoch)=
# Noise and stochastic model
The code on this page can be used interactively: click {fa}`rocket` --> {guilabel}`Live Code` in the top right corner, then wait until the message {guilabel}`Python interaction ready!` appears.
In the previous section you learned about the different components that can be present in a time series. Removing all these components, i.e. the functional model, leaves us with the residual term $\epsilon(t)$. In this section we will take a closer look at the difference between signal and noise, and introduce types of noise other than the traditional white noise.
## Additional concepts
In [Signal Processing](SP) the data is just considered to be the signal of interest, whereas here we assume the data is "contaminated" with noise, i.e.
$$Y = \text{signal} + \text{noise} $$
Time series analysis means understanding patterns and, hence, extracting the **signal of interest** from the noisy data.
### Signal and noise
How can we describe both signal and noise?
* **Signal** - the meaningful information that we want to detect: deterministic characteristics, described by mathematical expressions, that capture for example trend, seasonality and offsets.
* **Noise** - random and undesired fluctuations that interfere with the signal: stochastic processes are needed to describe this. Part of the time-correlated noise needs to be accounted for in predictions, see later {ref}`AR`.
The example in {numref}`signal_noise` shows a *signal* that can be described by $\cos(2\pi t f) + \sin(2\pi t f)$. The stochastic model (assuming independent normally distributed observations) would be a scaled identity matrix with variance equal to 1 (middle panel) and 9 (bottom panel), respectively. In the bottom panel the signal of interest is entirely hidden in the background noise. Techniques from signal processing can be used to detect the frequency.
```{figure} ./figs/signal_noise.png
:name: signal_noise
:width: 600px
:align: center
Example of a time series (top panel) affected by noise with different strengths (middle and bottom panels). Note the different scales on the vertical axes.
```
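A minimal sketch of how such an example could be generated is shown below; the frequency $f$, the number of epochs and the random seed are arbitrary choices, since they are not specified above.

```python
import numpy as np
import matplotlib.pyplot as plt

f = 0.1                      # assumed signal frequency
t = np.arange(0, 100, 1.0)   # assumed observation epochs (unit sampling interval)

rng = np.random.default_rng(42)
signal = np.cos(2 * np.pi * f * t) + np.sin(2 * np.pi * f * t)

# Independent normally distributed noise with variance 1 and variance 9
noisy_1 = signal + rng.normal(scale=1.0, size=t.size)
noisy_9 = signal + rng.normal(scale=3.0, size=t.size)

fig, axes = plt.subplots(3, 1, figsize=(8, 6), sharex=True)
axes[0].plot(t, signal)
axes[0].set_title("signal")
axes[1].plot(t, noisy_1)
axes[1].set_title("signal + noise (variance 1)")
axes[2].plot(t, noisy_9)
axes[2].set_title("signal + noise (variance 9)")
axes[2].set_xlabel("time")
plt.tight_layout()
plt.show()
```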
#### Signal-to-noise ratio
In signal processing the signal-to-noise ratio is commonly used to report on the amount of noise present in the model. If we analyze the model $Y = \text{signal} + \text{noise}$, then $Y$ is a random variable with $E[Y] = E[\text{signal}] = \mu$ and variance $D(Y) = D(\text{noise}) = \sigma^2$. Using this, the signal-to-noise ratio is often defined as:
$$ SNR = \frac{\mu}{\sigma}$$
<!-- or alternatively as:
$$ SNR = \frac{\mu^2}{\sigma^2}$$ -->
The signal-to-noise ratio is a measure of how much the signal stands out from the noise: the higher the ratio, the more clearly the signal can be distinguished from the noise. Better equipment or more data can increase the signal-to-noise ratio.
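As a small numerical illustration (a sketch only; the values of $\mu$ and $\sigma$ below are arbitrary choices), the SNR of a sample can be estimated by the ratio of its sample mean and sample standard deviation:

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 5.0, 2.0                          # assumed signal level and noise standard deviation
y = mu + rng.normal(scale=sigma, size=1000)   # observations Y = signal + noise

snr_est = y.mean() / y.std(ddof=1)            # sample estimate of mu / sigma
print(f"theoretical SNR: {mu / sigma:.2f}, estimated SNR: {snr_est:.2f}")
```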
## Different types of noise
In the ideal case, when the signal is removed, you are left with white noise. A white noise stochastic model has zero mean and a constant variance $\sigma^2$, and all observations are uncorrelated (the off-diagonal elements of the covariance matrix are equal to 0), so that the covariance matrix is a scaled identity matrix.
When we compute the PSD of such a process, the resulting density will be flat over the entire range of frequencies. In other words, a white noise process has equal power at all frequencies, just like white light. We will show this in the interactive plot at the bottom of this page.
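This can be checked numerically. The sketch below, which assumes `scipy` is available and uses an arbitrary sampling frequency and sample size, estimates the PSD of simulated white noise with Welch's method; the estimate is approximately flat:

```python
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
fs = 1.0                                    # assumed sampling frequency
white = rng.normal(scale=1.0, size=10_000)  # uncorrelated, zero-mean noise

f, psd = signal.welch(white, fs=fs, nperseg=1024)  # Welch PSD estimate

plt.semilogy(f, psd)
plt.xlabel("frequency")
plt.ylabel("PSD")
plt.title("White noise: approximately flat PSD")
plt.show()
```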
### Colored noise
In time series it is not guaranteed that the individual observations are uncorrelated. At the bottom of this page you will find an interactive plot in which you can select four different types of noise: white, pink, red and blue. The noise processes are plotted in combination with the PSD. The [PSD](../signal/spectral_est.md#power-spectral-density-psd) is a measure of the power of the signal at different frequencies. The white noise process has a flat PSD, while the other noise processes have a different shape: the pink noise process has a PSD that decreases with frequency, the red noise process has a PSD that decreases quadratically with frequency, and the blue noise process has a PSD that increases with frequency.
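For illustration, one common way to simulate such colored noise (a sketch; not necessarily the method used for the interactive plot at the bottom of this page) is to shape the spectrum of white noise so that its PSD becomes proportional to $1/f^{\alpha}$, with $\alpha = 0$ for white, $\alpha = 1$ for pink, $\alpha = 2$ for red and $\alpha = -1$ for blue noise:

```python
import numpy as np

def colored_noise(n, alpha, rng=None):
    """Simulate noise with PSD proportional to 1/f**alpha via spectral shaping."""
    if rng is None:
        rng = np.random.default_rng()
    white = rng.normal(size=n)
    spectrum = np.fft.rfft(white)        # spectrum of white noise (flat on average)
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]                  # avoid division by zero at f = 0
    spectrum *= freqs ** (-alpha / 2)    # scale amplitudes so that PSD ~ 1/f**alpha
    noise = np.fft.irfft(spectrum, n)
    return noise / noise.std()           # normalize to unit variance

pink = colored_noise(10_000, alpha=1)    # PSD decreases with frequency
red  = colored_noise(10_000, alpha=2)    # PSD decreases quadratically with frequency
blue = colored_noise(10_000, alpha=-1)   # PSD increases with frequency
```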
If you are interested, you can read more about the different types of noise in the [Wikipedia article](https://en.wikipedia.org/wiki/Colors_of_noise). There you can also listen to the different types of noise, which might give you a better understanding of the differences.