Skip to content
Snippets Groups Projects
Commit c4981926 authored by Robert Lanzafame's avatar Robert Lanzafame
Browse files

WS 1.3

parent a19b51ce
No related branches found
No related tags found
No related merge requests found
Pipeline #247499 passed
%% Cell type:markdown id: tags:
# Workshop 2: Is it Melting?
<h1 style="position: absolute; display: flex; flex-grow: 0; flex-shrink: 0; flex-direction: row-reverse; top: 60px;right: 30px; margin: 0; border: 0">
<style>
.markdown {width:100%; position: relative}
article { position: relative }
</style>
<img src="https://gitlab.tudelft.nl/mude/public/-/raw/main/tu-logo/TU_P1_full-color.png" style="width:100px; height: auto; margin: 0" />
<img src="https://gitlab.tudelft.nl/mude/public/-/raw/main/mude-logo/MUDE_Logo-small.png" style="width:100px; height: auto; margin: 0" />
</h1>
<h2 style="height: 10px">
</h2>
*[CEGM1000 MUDE](http://mude.citg.tudelft.nl/): Week 1.3. Wednesday, Sep 18, 2024.*
%% Cell type:markdown id: tags:
In this notebook you will fit a model to a time series of height observations of a point on a glacier, to assess whether it is melting.
**Learning objectives:**
- apply least-squares (LS) and best linear unbiased (BLU) estimation
- evaluate the precision of the estimated parameters
- discuss the differences between least-squares and best linear unbiased estimation
%% Cell type:markdown id: tags:
You have 12 monthly measurements of the height of a point on a glacier. The measurements are obtained from a satellite laser altimeter. The observations are given in the code below.
We will fit a model assuming a linear trend (constant velocity) to account for the melting, plus an annual signal due to the seasonal impact on ice.
The precision (1 $\sigma$) of the first six observations is 0.7 meter, the last six observations have a precision of 2 meter due to a change in the settings of the instrument. All observations are mutually independent.
%% Cell type:code id: tags:
``` python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt
# this will print all float arrays with 3 decimal places
float_formatter = "{:.3f}".format
np.set_printoptions(formatter={'float_kind':float_formatter})
```
%% Cell type:markdown id: tags:
## Part 1: Observation model
First we will construct the observation model, based on the following information that is provided: times of observations, observed heights and number of observations.
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
$\textbf{Task 1.1}:$
Complete the definition of the observation time array <code>times</code> and define the other variables such that the print statements correctly describe the design matrix <code>A</code>.
</p>
</div>
%% Cell type:code id: tags:
``` python
y = [59.82, 57.20, 59.09, 59.49, 59.68, 59.34, 60.95, 59.34, 55.38, 54.33, 48.71, 48.47]
t = YOUR_CODE_HERE
number_of_observations = YOUR_CODE_HERE
number_of_parameters = YOUR_CODE_HERE
print(f'Dimensions of the design matrix A:')
print(f' {YOUR_CODE_HERE} rows')
print(f' {YOUR_CODE_HERE} columns')
print(f'Dimensions of the design matrix A:')
print(f' {number_of_observations:3d} rows')
print(f' {number_of_parameters:3d} columns')
```
%% Cell type:markdown id: tags:
Next, we need to assemble the design matrix in Python. Rather than enter each value manually, we have a few tricks to make this easier. Here is an example Python cell that illustrates how one-dimensional arrays (i.e., columns or rows) can be assembled into a matrix in two ways:
1. As diagonal elements, and
2. As parallel elements (columns, in this case)
Note also the use of `np.ones()` to quickly make a row or column of any size with the same values in each element.
%% Cell type:code id: tags:
``` python
column_1 = np.array([33333, 0, 0])
column_2 = 99999*np.ones(3)
example_1 = np.diagflat([column_1, column_2])
example_2 = np.column_stack((column_1, column_2))
print(example_1, '\n\n', example_2)
```
%% Cell type:markdown id: tags:
<div style="background-color:#facb8e; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 95%"> <p>See documentation on <a href="https://numpy.org/doc/stable/reference/generated/numpy.diagflat.html#numpy.diagflat" target="_blank">np.diagflat</a> and <a href="https://numpy.org/doc/stable/reference/generated/numpy.column_stack.html" target="_blank">np.column_stack</a> if you would like to understand more about these Numpy methods.</p></div>
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
$\textbf{Task 1.2}:$
Complete the $\mathrm{A}$-matrix (design matrix) and covariance matrix $\Sigma_Y$ (stochastic model) as a linear trend with an annual signal. The <code>assert</code> statement is used to confirm that the dimensions of your design matrix are correct (an error will occur if not).
</p>
</div>
%% Cell type:code id: tags:
``` python
A = YOUR_CODE_HERE
Sigma_Y = YOUR_CODE_HERE
assert A.shape == (number_of_observations, number_of_parameters)
```
%% Cell type:markdown id: tags:
## 2. Least-squares and Best linear unbiased (BLU) estimation
The BLU estimator is a *linear* and *unbiased* estimator, which provides the *best* precision, where:
- linear estimation: $\hat{X}$ is a linear function of the observables $Y$,
- unbiased estimation: $\mathbb{E}(\hat{X})=\mathrm{x}$,
- best precision: sum of variances, $\sum_i \sigma_{\hat{x}_i}^2$, is minimized.
The solution $\hat{X}$ is obtained as:
$$
\hat{X} = \left(\mathrm{A}^T\Sigma_Y^{-1} \mathrm{A} \right)^{-1} \mathrm{A}^T\Sigma_Y^{-1}Y
$$
Note that here we are looking at the *estimator*, which is random, since it is expressed as a function of the observable vector $Y$. Once we have a realization $y$ of $Y$, the estimate $\hat{\mathrm{x}}$ can be computed.
It can be shown that the covariance matrix of $\hat{X}$ is given by:
$$
\Sigma_{\hat{X}} = \left(\mathrm{A}^T\Sigma_Y^{-1} \mathrm{A} \right)^{-1}
$$
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
$\textbf{Task 2.1}$
Apply least-squares, and best linear unbiased estimation to estimate $\mathrm{x}$. The code cell below outlines how you should compute the inverse of $\Sigma_Y$. Compute the least squares estimates then the best linear unbiased estimate.
</p>
</div>
%% Cell type:markdown id: tags:
<div style="background-color:#facb8e; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 95%"> <p>Remember that the <code>@</code> operator (matmul operator) can be used to easily carry out matrix multiplication for Numpy arrays. The transpose method of an array, <code>my_array.T</code>, is also useful.</p></div>
%% Cell type:code id: tags:
``` python
inv_Sigma_Y = np.linalg.inv(Sigma_Y)
xhat_LS = YOUR_CODE_HERE
xhat_BLU = YOUR_CODE_HERE
print('LS estimates in [m], [m/month], [m], resp.:\t', xhat_LS)
print('BLU estimates in [m], [m/month], [m], resp.:\t', xhat_BLU)
```
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
<b>Task 2.2:</b>
The covariance matrix of least-squares can be obtained as well by applying the covariance propagation law.
Calculate the covariance matrix and vector with standard deviations of both the least-squares and BLU estimates. What is the precision of the estimated parameters? The diagonal of a matrix can be extracted with <a href="https://numpy.org/doc/stable/reference/generated/numpy.diagonal.html#numpy.diagonal" target="_blank">np.diagonal</a>.
</p>
</div>
%% Cell type:markdown id: tags:
<div style="background-color:#facb8e; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 95%"> <p><em>Hint: define an intermediate variable first to collect some of the repetitive matrix terms.</em></p></div>
%% Cell type:code id: tags:
``` python
LT = YOUR_CODE_HERE
Sigma_xhat_LS = LT YOUR_CODE_HERE
std_xhat_LS = YOUR_CODE_HERE
Sigma_xhat_BLU = YOUR_CODE_HERE
std_xhat_BLU = YOUR_CODE_HERE
print(f'Precision of LS estimates in [m], [m/month], [m], resp.:', std_xhat_LS)
print(f'Precision of BLU estimates in [m], [m/month], [m], resp.:', std_xhat_BLU)
```
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
<b>Task 2.3:</b>
We are mostly interested in the melting rate. Discuss the estimated melting rate with respect to its precision.
</p>
</div>
%% Cell type:markdown id: tags:
_Write your answer here._
%% Cell type:markdown id: tags:
## 3. Residuals and plot with observations and fitted models.
Run the code below (no changes needed) to calculate the weighted squared norm of residuals with both estimation methods, and create plots of the observations and fitted models.
%% Cell type:code id: tags:
``` python
eTe_LS = (y - A @ xhat_LS).T @ (y - A @ xhat_LS)
eTe_BLU = (y - A @ xhat_BLU).T @ inv_Sigma_Y @ (y - A @ xhat_BLU)
print(f'Weighted squared norm of residuals with LS estimation: {eTe_LS:.3f}')
print(f'Weighted squared norm of residuals with BLU estimation: {eTe_BLU:.3f}')
plt.figure()
plt.rc('font', size=14)
plt.plot(times, y, 'kx', label='observations')
plt.plot(times, A @ xhat_LS, color='r', label='LS')
plt.plot(times, A @ xhat_BLU, color='b', label='BLU')
plt.xlim(-0.2, (number_of_observations - 1) + 0.2)
plt.xlabel('time [months]')
plt.ylabel('height [meters]')
plt.legend(loc='best');
```
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
<b>Task 3.1:</b>
<ul>
<li>Explain the difference between the fitted models.</li>
<li>Can we see from this figure which model fits better (without information about the stochastic model)?</li>
</ul>
</p>
</div>
%% Cell type:markdown id: tags:
_Write your answer here._
%% Cell type:markdown id: tags:
## 4. Confidence bounds
In the code below you need to calculate the confidence bounds, e.g., ```CI_yhat_BLU``` is a vector with the values $k\cdot\sigma_{\hat{y}_i}$ for BLUE. This will then be used to plot the confidence intervals:
$$
\hat{y}_i \pm k\cdot\sigma_{\hat{y}_i}
$$
Recall that $k$ can be calculated from $P(Z < k) = 1-\frac{1}{2}\alpha$.
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
$\textbf{Task 4.1}$
Complete the code below to calculate the 98% confidence intervals of both the observations $y$ <b>and</b> the adjusted observations $\hat{y}$.
Use <code>norm.ppf</code> to compute $k_{98}$. Also try different percentages.
</p>
</div>
%% Cell type:code id: tags:
``` python
yhat_LS = YOUR_CODE_HERE
Sigma_Yhat_LS = YOUR_CODE_HERE
yhat_BLU = YOUR_CODE_HERE
Sigma_Yhat_BLU = YOUR_CODE_HERE
alpha = YOUR_CODE_HERE
k98 = YOUR_CODE_HERE
CI_y_LS = k98
CI_y_BLU = YOUR_CODE_HERE
CI_yhat_LS = YOUR_CODE_HERE
CI_yhat_BLU = YOUR_CODE_HERE
```
%% Cell type:markdown id: tags:
You can directly run the code below to create the plots.
%% Cell type:code id: tags:
``` python
plt.figure(figsize = (10,4))
plt.rc('font', size=14)
plt.subplot(121)
plt.errorbar(times, y, yerr = CI_y_LS, fmt='', capsize=5, linestyle='', color='red', alpha=0.6)
plt.plot(times, y, 'kx', label='observations')
plt.plot(times, yhat_LS, color='r', label='LS')
plt.plot(times, yhat_LS + CI_yhat_LS, 'r:', label=f'{100*(1-alpha):.1f}% conf.')
plt.plot(times, yhat_LS - CI_yhat_LS, 'r:')
plt.xlabel('time [months]')
plt.ylabel('height [meters]')
plt.legend(loc='best')
plt.subplot(122)
plt.plot(times, y, 'kx', label='observations')
plt.errorbar(times, y, yerr = CI_y_BLU, fmt='', capsize=5, linestyle='', color='blue', alpha=0.6)
plt.plot(times, yhat_BLU, color='b', label='BLU')
plt.plot(times, yhat_BLU + CI_yhat_BLU, 'b:', label=f'{100*(1-alpha):.1f}% conf.')
plt.plot(times, yhat_BLU - CI_yhat_BLU, 'b:')
plt.xlim(-0.2, (number_of_observations - 1) + 0.2)
plt.xlabel('time [months]')
plt.legend(loc='best');
```
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
<b>Task 4.2:</b>
Discuss the shape of the confidence bounds. Do you think the model (linear trend + annual signal) is a good choice?
</p>
</div>
%% Cell type:markdown id: tags:
_Write your answer here._
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
<b>Task 4.3:</b>
What is the BLU-estimated melting rate and the amplitude of the annual signal and their 98% confidence interval? Hint: extract the estimated values and standard deviations from <code>xhat_BLU</code> and <code>std_xhat_BLU</code>, respectively.
</p>
</div>
%% Cell type:code id: tags:
``` python
rate =YOUR_CODE_HERE
CI_rate = YOUR_CODE_HERE
amplitude = YOUR_CODE_HERE
CI_amplitude = YOUR_CODE_HERE
print(f'The melting rate is {rate:.3f} ± {CI_rate:.3f} m/month (98% confidence level)')
print(f'The amplitude of the annual signal is {amplitude:.3f} ± {CI_amplitude:.3f} m (98% confidence level)')
```
%% Cell type:markdown id: tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<p>
<b>Task 4.4:</b>
Can we conclude the glacier is melting due to climate change?
</p>
</div>
%% Cell type:markdown id: tags:
_Write your answer here._
%% Cell type:markdown id: tags:
**End of notebook.**
<h2 style="height: 60px">
</h2>
<h3 style="position: absolute; display: flex; flex-grow: 0; flex-shrink: 0; flex-direction: row-reverse; bottom: 60px; right: 50px; margin: 0; border: 0">
<style>
.markdown {width:100%; position: relative}
article { position: relative }
</style>
<a rel="license" href="http://creativecommons.org/licenses/by/4.0/">
<img alt="Creative Commons License" style="border-width:; width:88px; height:auto; padding-top:10px" src="https://i.creativecommons.org/l/by/4.0/88x31.png" />
</a>
<a rel="TU Delft" href="https://www.tudelft.nl/en/ceg">
<img alt="TU Delft" style="border-width:0; width:100px; height:auto; padding-bottom:0px" src="https://gitlab.tudelft.nl/mude/public/-/raw/main/tu-logo/TU_P1_full-color.png" />
</a>
<a rel="MUDE" href="http://mude.citg.tudelft.nl/">
<img alt="MUDE" style="border-width:0; width:100px; height:auto; padding-bottom:0px" src="https://gitlab.tudelft.nl/mude/public/-/raw/main/mude-logo/MUDE_Logo-small.png" />
</a>
</h3>
<span style="font-size: 75%">
&copy; Copyright 2024 <a rel="MUDE" href="http://mude.citg.tudelft.nl/">MUDE</a> TU Delft. This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY 4.0 License</a>.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment