Before we can look into modelling a stochastic process with an Autoregressive (AR) model, we first need to introduce the autocovariance function (ACF) for a stationary time series and describe the relationship between the ACF and the power spectral density (PSD).
As in [Observation theory](../observation_theory/01_Introduction.md), the variance component is often determined based on the precision of an observation (at a given epoch), and the covariance components quantitatively indicate the statistical dependence (or independence) between observations. In this case, dependence is inherently introduced by the physical processes that produce the signal (of which our time series is a sample), and in fact our time series methods seek to (mathematically) account for this.
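To give a first, minimal illustration of such time correlation, the sketch below simulates an artificially correlated series and estimates its sample autocovariance at a few lags; the coefficient 0.8, the series length and the chosen lags are arbitrary demonstration values, not taken from this chapter, and the formal definitions follow below.

```python
import numpy as np

def sample_autocovariance(x, tau):
    """Biased sample autocovariance of a series x at lag tau."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xm = x - x.mean()
    return np.sum(xm[:n - tau] * xm[tau:]) / n

# Simulate a time-correlated series: each value depends on the previous one
rng = np.random.default_rng(0)
x = np.zeros(2000)
for t in range(1, x.size):
    x[t] = 0.8 * x[t - 1] + rng.normal()

# The autocovariance decays with the lag, reflecting the dependence between epochs
print([round(sample_autocovariance(x, tau), 2) for tau in (0, 1, 2, 5, 20)])
```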
## Autocovariance and autocorrelation
...
...
(noiseandstoch)=
# Noise and stochastic model
The code on this page can be used interactively: click {fa}`rocket` --> {guilabel}`Live Code` in the top right corner, then wait until the message {guilabel}`Python interaction ready!` appears.
In the previous section you learned about the different components that can be present in a time series. Removing all these components, i.e. the functional model, leaves us with the residual term $\epsilon(t)$. In this section we will take a closer look at the difference between signal and noise, and introduce types of noise other than the traditional white noise.
## Additional concepts
In [Signal Processing](SP) the data is just considered to be the signal of interest, whereas here we assume the data is "contaminated" with noise, i.e.
$$Y = \text{signal} + \text{noise} $$
Time series analysis means understanding patterns and, hence, extracting the **signal of interest** from the noisy data.
### Signal and noise
How can we describe both signal and noise?
* **Signal** - the meaningful information that we want to detect: deterministic characteristics, described by mathematical expressions, that capture for example trend, seasonality and offsets.
* **Noise** - random and undesired fluctuations that interfere with the signal: stochastic processes are needed to describe this. Part of the time-correlated noise needs to be accounted for in predictions, see later {ref}`AR`.
The example in {numref}`signal_noise` shows a *signal* that can be described by $\cos(2\pi t f) + \sin(2\pi t f)$. The stochastic model (assuming independent normally distributed observations) would be a scaled identity matrix with variance equal to 1 (middle panel) and 9 (bottom panel), respectively. In the bottom panel the signal of interest is entirely hidden in the background noise. Techniques from signal processing can be used to detect the frequency.
```{figure} ./figs/signal_noise.png
:name: signal_noise
:width: 600px
:align: center
Example of a time series (top panel) affected by noise with different strengths (middle and bottom panels). Note the different scales on the vertical axes.
```
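A minimal sketch of how such an example could be generated is shown below; the frequency $f$, the number of epochs and the random seed are arbitrary choices, since they are not specified above.

```python
import numpy as np
import matplotlib.pyplot as plt

f = 0.1                      # assumed signal frequency
t = np.arange(0, 100, 1.0)   # assumed observation epochs (unit sampling interval)

rng = np.random.default_rng(42)
signal = np.cos(2 * np.pi * f * t) + np.sin(2 * np.pi * f * t)

# Independent normally distributed noise with variance 1 and variance 9
noisy_1 = signal + rng.normal(scale=1.0, size=t.size)
noisy_9 = signal + rng.normal(scale=3.0, size=t.size)

fig, axes = plt.subplots(3, 1, figsize=(8, 6), sharex=True)
axes[0].plot(t, signal)
axes[0].set_title("signal")
axes[1].plot(t, noisy_1)
axes[1].set_title("signal + noise (variance 1)")
axes[2].plot(t, noisy_9)
axes[2].set_title("signal + noise (variance 9)")
axes[2].set_xlabel("time")
plt.tight_layout()
plt.show()
```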
#### Signal-to-noise ratio
In signal processing the signal-to-noise ratio is commonly used to report on the amount of noise present in the model. If we analyze the model $Y = \text{signal} + \text{noise}$, then $Y$ is a random variable with $E[Y] = E[\text{signal}] = \mu$ and variance $D(Y) = D(\text{noise}) = \sigma^2$. Using this, the signal-to-noise ratio is often defined as:
$$ SNR = \frac{\mu}{\sigma}$$
<!-- or alternatively as:
$$ SNR = \frac{\mu^2}{\sigma^2}$$ -->
The signal-to-noise ratio is a measure of how much the signal stands out from the noise: the higher the ratio, the more clearly the signal can be distinguished from the noise. Better equipment or more data can increase the signal-to-noise ratio.
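As a small numerical illustration (a sketch only; the values of $\mu$ and $\sigma$ below are arbitrary choices), the SNR of a sample can be estimated by the ratio of its sample mean and sample standard deviation:

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 5.0, 2.0                          # assumed signal level and noise standard deviation
y = mu + rng.normal(scale=sigma, size=1000)   # observations Y = signal + noise

snr_est = y.mean() / y.std(ddof=1)            # sample estimate of mu / sigma
print(f"theoretical SNR: {mu / sigma:.2f}, estimated SNR: {snr_est:.2f}")
```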
## Different types of noise
In the ideal case, when the signal is removed, you are left with white noise. A white noise stochastic model has zero mean and a constant variance $\sigma^2$, and all observations are uncorrelated (the off-diagonal elements of the covariance matrix are equal to 0), so that the covariance matrix is a scaled identity matrix.
When we compute the PSD of such a process, the resulting density will be flat over the entire range of frequencies. In other words, a white noise process has equal power at all frequencies, just like white light. We will show this in the interactive plot at the bottom of this page.
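This can be checked numerically. The sketch below, which assumes `scipy` is available and uses an arbitrary sampling frequency and sample size, estimates the PSD of simulated white noise with Welch's method; the estimate is approximately flat:

```python
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
fs = 1.0                                    # assumed sampling frequency
white = rng.normal(scale=1.0, size=10_000)  # uncorrelated, zero-mean noise

f, psd = signal.welch(white, fs=fs, nperseg=1024)  # Welch PSD estimate

plt.semilogy(f, psd)
plt.xlabel("frequency")
plt.ylabel("PSD")
plt.title("White noise: approximately flat PSD")
plt.show()
```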
### Colored noise
In time series it is not guaranteed that the individual observations are uncorrelated. At the bottom of this page you will find an interactive plot in which you can select four different types of noise: white, pink, red and blue. The noise processes are plotted in combination with the PSD. The [PSD](../signal/spectral_est.md#power-spectral-density-psd) is a measure of the power of the signal at different frequencies. The white noise process has a flat PSD, while the other noise processes have a different shape: the pink noise process has a PSD that decreases with frequency, the red noise process has a PSD that decreases quadratically with frequency, and the blue noise process has a PSD that increases with frequency.
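For illustration, one common way to simulate such colored noise (a sketch; not necessarily the method used for the interactive plot at the bottom of this page) is to shape the spectrum of white noise so that its PSD becomes proportional to $1/f^{\alpha}$, with $\alpha = 0$ for white, $\alpha = 1$ for pink, $\alpha = 2$ for red and $\alpha = -1$ for blue noise:

```python
import numpy as np

def colored_noise(n, alpha, rng=None):
    """Simulate noise with PSD proportional to 1/f**alpha via spectral shaping."""
    if rng is None:
        rng = np.random.default_rng()
    white = rng.normal(size=n)
    spectrum = np.fft.rfft(white)        # spectrum of white noise (flat on average)
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]                  # avoid division by zero at f = 0
    spectrum *= freqs ** (-alpha / 2)    # scale amplitudes so that PSD ~ 1/f**alpha
    noise = np.fft.irfft(spectrum, n)
    return noise / noise.std()           # normalize to unit variance

pink = colored_noise(10_000, alpha=1)    # PSD decreases with frequency
red  = colored_noise(10_000, alpha=2)    # PSD decreases quadratically with frequency
blue = colored_noise(10_000, alpha=-1)   # PSD increases with frequency
```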
If you are interested, you can read more about the different types of noise in the [Wikipedia article](https://en.wikipedia.org/wiki/Colors_of_noise). There you can also listen to the different types of noise, which might give you a better understanding of the differences.