Commit c796977d authored by Kwangjin Lee

Update Analysis_Solution.ipynb

parent 3518a4a2
Pipeline #261121 failed
%% Cell type:markdown id:785b4e1d tags:
# GA 1.3: Modelling Road Deformation using Non-Linear Least-Squares
*[CEGM1000 MUDE]( Week 1.3. Due: Friday, September 20, 2024.*
%% Cell type:markdown id:4ad8b9cb tags:
<div style="background-color:#ffa6a6; color: black; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px; width: 95%"><p><b>Note:</b> don't forget to read the "Assignment Context" section of the README, it contains important information to understand this analysis.</p></div>
%% Cell type:code id:181ccfd5 tags:
``` python
import numpy as np
from scipy import interpolate
from scipy.stats import norm
import pandas as pd
import matplotlib.pyplot as plt
from functions import *
%% Output
<Token var=<ContextVar name='format_options' default={'edgeitems': 3, 'threshold': 1000, 'floatmode': 'maxprec', 'precision': 8, 'suppress': False, 'linewidth': 75, 'nanstr': 'nan', 'infstr': 'inf', 'sign': '-', 'formatter': None, 'legacy': 9223372036854775807, 'override_repr': None} at 0x0000026B5AF069D0> at 0x0000026B0DBBC4C0>
%% Cell type:markdown id:5ca94e0e tags:
## Part 0: Dictionary Review
As described above, several functions in this assignment require the use of a Python dictionary to make it easier to keep track of important data, variables and results for the various _models_ we will be constructing and validating.
_It may be useful to revisit PA 1.1, where there was a brief introduction to dictionaries. That PA contains all the dictionary info you need for GA 1.3. A [read-only copy is here]( and [the source code (notebook) is here](
%% Cell type:markdown id:d8c39791 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
$\textbf{Task 0.1}$
Read and run the cell below to make sure you remember how to use a dictionary.
Modify the function to print some of the other key-value pairs of the dictionary.
<em>It may also be useful to use the cell below when working on later tasks in this assignment.</em>
%% Cell type:code id:8683b5ed tags:
``` python
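my_dictionary = {'key1': 'value1',
                 'key2': 'value2',
                 'name': 'Dictionary Example',
                 'a_list': [1, 2, 3],
                 'an_array': np.array([1, 2, 3]),
                 'a_string': 'hello'
                 }

def function_that_uses_my_dictionary(d):
    '''Print a few of the key-value pairs stored in the dictionary.'''
    print(d['name'])
    print(d['a_list'])
    print(d['an_array'])
    # Only printed once 'new_key' has been added to the dictionary
    if 'new_key' in d:
        print('new_key exists and has value:', d['new_key'])

function_that_uses_my_dictionary(my_dictionary)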
%% Output
Dictionary Example
[1, 2, 3]
[1 2 3]
%% Cell type:markdown id:86bc7f97 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
$\textbf{Task 0.2}$
Test your knowledge by adding a new key <code>new_key</code> and then executing the function to print the value.
%% Cell type:code id:41c56f43 tags:
``` python
# Add the new key first, then call the function so that it can print the value
my_dictionary['new_key'] = 'new_value'
function_that_uses_my_dictionary(my_dictionary)
%% Output
Dictionary Example
[1, 2, 3]
[1 2 3]
new_key exists and has value: new_value
%% Cell type:markdown id:160d6250 tags:
## Task 1: Preparing the data
Within this assignment you will work with two types of data: InSAR data and GNSS data. The cell below will load the data and visualize the observed displacements over time. In this task we use the package `pandas`, which is really useful for handling time series. We will learn how to use it later in the quarter; for now, you only need to recognize that it imports the data as a `dataframe` object, which we then convert into a numpy array using the code below.
%% Cell type:markdown id:02b12781 tags:
<div style="background-color:#facb8e; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px"> <p>Tip: note that we have converted all observations to millimeters.</p></div>
%% Cell type:code id:1f28eba3 tags:
``` python
gnss = pd.read_csv('./data/gnss_observations.csv')
times_gnss = pd.to_datetime(gnss['times'])
y_gnss = (gnss['observations[m]']).to_numpy()*1000
insar = pd.read_csv('./data/insar_observations.csv')
times_insar = pd.to_datetime(insar['times'])
y_insar = (insar['observations[m]']).to_numpy()*1000
gw = pd.read_csv('./data/groundwater_levels.csv')
times_gw = pd.to_datetime(gw['times'])
y_gw = (gw['observations[mm]']).to_numpy()
%% Cell type:markdown id:aa8906a9-2ebe-4432-b4f3-8bf074d6b181 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 1.1:</b>
Once you have used the cell above to import the data, investigate the data sets using the code cell below. Then provide some relevant summary information in the Markdown cell.
<em>Hint: at the least, you should be able to tell how many data points are in each data set and get an understanding of the mean and standard deviation of each. Make sure you compare the different datasets and use consistent units.</em>
%% Cell type:markdown id:12321724-dc58-42ee-9bd5-44265d3bc921 tags:
<div style="background-color:#facb8e; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px"> <p>The code below gives some examples of the quantitative and qualitative ways you could have looked at the data. It is more than you were expected to do; the important thing is that you showed the ability to learn something about the data and describe aspects that are relevant to our problem. We use a dictionary to easily access the different data series using their names, which are entered as the dictionary keys (also not expected of you, but it's hopefully fun to learn useful tricks).</p></div>
%% Cell type:code id:9f025cfc-4f89-492d-ac26-f5b6381d0c70 tags:
``` python
data_list = ['y_gnss', 'y_insar', 'y_gw']
data_dict = {data_list[0]: y_gnss,
             data_list[1]: y_insar,
             data_list[2]: y_gw}

def print_summary(data):
    '''Summarize an array with simple print statements.'''
    print('Minimum = ', data.min())
    print('Maximum = ', data.max())
    print('Mean = ', data.mean())
    print('Std dev = ', data.std())
    print('Shape = ', data.shape)
    print('First value = ', data[0])
    print('Last value = ', data[-1])

# Look up each array by its name and print the summary
for item in data_list:
    print('Summary for array: ', item)
    print_summary(data_dict[item])
%% Output
Summary for array: y_gnss
Minimum = -77.85967600765021
Maximum = 29.432302555465
Mean = -26.998775875445148
Std dev = 16.2218064476615
Shape = (730,)
First value = -13.980633493923001
Last value = -38.6733705713608
Summary for array: y_insar
Minimum = -37.339155096180406
Maximum = -3.7915269917409
Mean = -25.459757789872686
Std dev = 6.8998022892131585
Shape = (61,)
First value = -3.7915269917409
Last value = -30.2754656176263
Summary for array: y_gw
Minimum = -166.784
Maximum = -102.044
Mean = -127.70472
Std dev = 16.822297827633417
Shape = (25,)
First value = -109.698
Last value = -117.268
%% Cell type:code id:71cf2133-37a6-4536-82a6-42a46b8a1c66 tags:
``` python
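times_dict = {data_list[0]: times_gnss,
              data_list[1]: times_insar,
              data_list[2]: times_gw}

def plot_data(times, data, label):
    '''Plot one data series against its observation times.'''
    plt.figure(figsize=(15, 4))  # one figure per data series
    plt.plot(times, data, 'co', mec='black')
    plt.title(label)
    plt.ylabel('Data [mm]')
    plt.show()

for i in range(3):
    plot_data(times_dict[data_list[i]], data_dict[data_list[i]], data_list[i])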
%% Output
%% Cell type:markdown id:a9c02e8f-81f9-41a3-b894-c23dd9617207 tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
There are a lot more GNSS data points than InSAR or groundwater. The GNSS observations also have more noise, and what seem to be outliers. In this case the mean and standard deviation do not mean much, because there is clearly a trend with time. We can at least confirm that the time periods of measurements overlap, although the intervals between measurements are certainly not uniform (note that you don't need to do anything with the times, since they are pandas time series and we have not covered them yet).
%% Cell type:markdown id:9fe5a729 tags:
You may have noticed that the groundwater data is available for different times than the GNSS and InSAR data. You will therefore have to *interpolate* the data to the same times for a further analysis. You can use the SciPy function ```interpolate.interp1d``` (read its [documentation](
The cells below do the following:
1. Define a function to convert the time unit
2. Convert the time stamps for all data
3. Use `interp1d` to interpolate the groundwater measurements at the time of the satellite measurements
%% Cell type:code id:f02ed4c4 tags:
``` python
def to_days_years(times):
    '''Convert the observation times to days and years.'''
    times_datetime = pd.to_datetime(times)
    # Elapsed time since the first observation
    time_diff = (times_datetime - times_datetime[0])
    days_diff = (time_diff / np.timedelta64(1,'D')).astype(int)
    days = days_diff.to_numpy()
    years = days/365
    return days, years
%% Cell type:code id:edf14892 tags:
``` python
days_gnss, years_gnss = to_days_years(times_gnss)
days_insar, years_insar = to_days_years(times_insar)
days_gw, years_gw = to_days_years(times_gw)
interp = interpolate.interp1d(days_gw, y_gw)
GW_at_GNSS_times = interp(days_gnss)
GW_at_InSAR_times = interp(days_insar)
%% Cell type:markdown id:16827704-4fc5-4afb-879e-0ceeac45eb18 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 1.2:</b>
Answer/complete the code and Markdown cells below:
<li>What is <code>interp</code>? (what kind of object is it, and how does it work?)</li>
<li>How did the groundwater observation array change? Be quantitative. </li>
%% Cell type:code id:6cdfb46b-1324-4c2b-8148-5a6a102ede2e tags:
``` python
print('array size of GW_at_GNSS_times', len(GW_at_GNSS_times))
print('array size of GW_at_InSAR_times', len(GW_at_InSAR_times))
print('array size of GW before interpolation', len(y_gw))
print('\nFirst values of times_gw:')
print(times_gw[0:2])
print('\nFirst values of y_gw:')
print(y_gw[0:2])
print('\nFirst values of times_gnss:')
print(times_gnss[0:2])
print('\nFirst values of GW_at_GNSS_times:')
print(GW_at_GNSS_times[0:2])
%% Output
array size of GW_at_GNSS_times 730
array size of GW_at_InSAR_times 61
array size of GW before interpolation 25
First values of times_gw:
0 2017-01-01
1 2017-02-01
Name: times, dtype: datetime64[ns]
First values of y_gw:
[-109.698 -102.044]
First values of times_gnss:
0 2017-01-01
1 2017-01-02
Name: times, dtype: datetime64[ns]
First values of GW_at_GNSS_times:
[-109.698 -109.451]
%% Cell type:markdown id:3b8c68eb-9774-4c3c-91da-f29f035b178c tags:
**Write your answer in this Markdown cell.**
%% Cell type:markdown id:dcc0e2a3-3f34-4bde-af54-37923cb803cd tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<li><code>interp</code> is a callable object (an instance of SciPy's <code>interp1d</code>) that behaves like a function: it returns a value (gw level) for the input(s) (time(s), here in days). The interpolated value is found by linearly interpolating between the two nearest times in the gw observations.</li>
<li>The groundwater observations changed in size from 25 values to 730 in <code>GW_at_GNSS_times</code> and 61 in <code>GW_at_InSAR_times</code>, matching the number of GNSS and InSAR observations, respectively.</li>
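To make the linear interpolation concrete, here is a minimal sketch that uses only the first two groundwater observations printed above (day 0 and day 31, i.e. 2017-01-01 and 2017-02-01). The variable names are made up for this illustration and are not part of the assignment code.

```python
import numpy as np
from scipy import interpolate

# First two groundwater observations from the output above:
# day 0 (2017-01-01) and day 31 (2017-02-01), levels in mm
days_known = np.array([0, 31])
levels_known = np.array([-109.698, -102.044])

# interp1d returns a callable object; the default kind is linear
interp_toy = interpolate.interp1d(days_known, levels_known)

# Evaluate at day 1, i.e. the second GNSS epoch
level_day_1 = float(interp_toy(1))

# The same value computed by hand: straight line between the two points
slope = (levels_known[1] - levels_known[0]) / (days_known[1] - days_known[0])
by_hand = levels_known[0] + slope * 1

# np.interp is a NumPy alternative that also interpolates linearly
with_numpy = float(np.interp(1, days_known, levels_known))

print(round(level_day_1, 3), round(by_hand, 3), round(with_numpy, 3))
```

All three values should agree with the second entry of `GW_at_GNSS_times` printed above.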
%% Cell type:markdown id:349ebd38-d9e1-49b4-b6bd-f5dc399727e0 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 1.3:</b>
Create a single plot to compare observed displacement for the GNSS and InSAR data sets.
%% Cell type:code id:e868e488 tags:
``` python
plt.figure(figsize=(15,5))
plt.plot(times_gnss, y_gnss, 'o', mec='black', label = 'GNSS')
plt.plot(times_insar, y_insar, 'o', mec='black', label = 'InSAR')
plt.legend()
plt.ylabel('Displacement [mm]')
plt.xlabel('Time')
plt.show()
%% Output
%% Cell type:markdown id:c9b45b8e tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 1.4:</b>
Describe the datasets based on the figure above and your observations from the previous tasks. What kind of deformation do you see? And what are the differences between both datasets? Be quantitative.
%% Cell type:markdown id:48bae179 tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
The points clearly show subsidence, and the displacement follows a similar pattern in both datasets. The GNSS data is much noisier than InSAR (range is around 60 mm versus only a few mm), but has a higher sampling rate (daily). There also seem to be more outliers in the GNSS data compared to InSAR, especially at the start of the observation period. InSAR only has an observation every 12 days but is less noisy.
%% Cell type:markdown id:f9a7bdd4 tags:
Before we move on, it is time to do a little bit of housekeeping.
Have you found it confusing to keep track of two sets of variables---one for each data type? Let's use a dictionary to store relevant information about each model. We will use this in the plotting functions for this task (and again next week), so make sure you take the time to see what is happening. Review also Part 0 at the top of this notebook if you need a refresher on dictionaries.
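As a small illustration of why this helps, the sketch below stores toy data for two models in dictionaries and passes both through the same function. The helper `add_mean`, the key `'y_mean'` and the numbers are hypothetical and only serve to show the pattern.

```python
import numpy as np

def add_mean(d):
    """Hypothetical helper: read 'y' from the dictionary and store its mean."""
    y = d['y']                  # read the input from the dictionary
    y_mean = np.mean(y)         # do the computation with plain variables
    d['y_mean'] = y_mean        # store the result under a new key
    return d

# Toy data only; the real models hold the full observation arrays
toy_models = [{'data_type': 'InSAR', 'y': np.array([-3.0, -5.0, -10.0])},
              {'data_type': 'GNSS', 'y': np.array([-14.0, 10.0, -17.0])}]

# One function call per model, no separate variable names to keep track of
for m in toy_models:
    add_mean(m)
    print(m['data_type'], m['y_mean'])
```

Because the function only depends on the dictionary keys, the same call works for either data type.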
%% Cell type:markdown id:cdcf20f7 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 1.5:</b>
Run the cell below to define a dictionary for storing information about the two (future) models.
%% Cell type:code id:2c27b4f3 tags:
``` python
model_insar = {'data_type': 'InSAR',
               'y': y_insar,
               'times': times_insar,
               'groundwater': GW_at_InSAR_times}
model_gnss = {'data_type': 'GNSS',
              'y': y_gnss,
              'times': times_gnss,
              'groundwater': GW_at_GNSS_times}
%% Cell type:markdown id:76c9115b tags:
## Task 2: Set-up linear functional model
We want to investigate how we could model the observed displacements of the road. Because the road is built in the Green Heart we expect that the observed displacements are related to the groundwater level. Furthermore, we assume that the displacements can be modeled using a constant velocity. The model is defined as
$$
d = d_0 + vt + k \ \textrm{GW},
$$
where $d$ is the displacement, $t$ is time and $\textrm{GW}$ is the groundwater level (that we assume to be deterministic).
Therefore, the model has 3 unknowns:
1. $d_0$, as the initial displacement at $t_0$;
2. $v$, as the displacement velocity;
3. $k$, as the 'groundwater factor', which can be seen as the response of the soil to changes in the groundwater level.
As a group you will construct the **functional model** that is defined as
$$
\mathbb{E}(Y) = \mathrm{A x}.
$$
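Written out for $m$ observations, and assuming the unknowns are ordered as in the list above ($d_0$, $v$, $k$), the functional model takes the following form, where $t_i$ and $\textrm{GW}_i$ denote the time and the (interpolated) groundwater level at observation $i$:

$$
\mathbb{E}\left(\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_m \end{bmatrix}\right)
=
\begin{bmatrix}
1 & t_1 & \textrm{GW}_1 \\
1 & t_2 & \textrm{GW}_2 \\
\vdots & \vdots & \vdots \\
1 & t_m & \textrm{GW}_m
\end{bmatrix}
\begin{bmatrix} d_0 \\ v \\ k \end{bmatrix}
$$

Each row is the model $d = d_0 + vt + k \ \textrm{GW}$ evaluated at one observation time, so the design matrix has $m$ rows and 3 columns.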
%% Cell type:markdown id:f6aca691 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 2.1:</b>
Construct the design matrix $A$ (for both InSAR and GNSS observations), then show the first 5 observations and confirm the dimensions of $A$.
%% Cell type:code id:3a3eb1a1 tags:
``` python
A_insar = np.ones((len(times_insar), 3))
A_insar[:,1] = days_insar
A_insar[:,2] = GW_at_InSAR_times
print ('The first 5 rows of the A matrix (InSAR) are:')
print (A_insar[0:5, :])
print ('The first 5 observations [mm] of y_insar are:')
print (y_insar[0:5])
m_insar = np.shape(A_insar)[0]
n_insar = np.shape(A_insar)[1]
print(f'm = {m_insar} and n = {n_insar}')
%% Output
The first 5 rows of the A matrix (InSAR) are:
[[ 1. 0. -109.698]
[ 1. 12. -106.735]
[ 1. 24. -103.772]
[ 1. 36. -106.536]
[ 1. 48. -117.316]]
The first 5 observations [mm] of y_insar are:
[ -3.792 -5.999 -11.403 -9.92 -11.283]
m = 61 and n = 3
%% Cell type:code id:4bcd395d tags:
``` python
A_gnss = np.ones((len(times_gnss), 3))
A_gnss[:,1] = days_gnss
A_gnss[:,2] = GW_at_GNSS_times
print ('The first 5 rows of the A matrix (GNSS) are:')
print (A_gnss[0:5, :])
print ('\nThe first 5 observations [mm] of y_gnss are:')
print (y_gnss[0:5])
m_gnss = np.shape(A_gnss)[0]
n_gnss = np.shape(A_gnss)[1]
print(f'm = {m_gnss} and n = {n_gnss}')
%% Output
The first 5 rows of the A matrix (GNSS) are:
[[ 1. 0. -109.698]
[ 1. 1. -109.451]
[ 1. 2. -109.204]
[ 1. 3. -108.957]
[ 1. 4. -108.71 ]]
The first 5 observations [mm] of y_gnss are:
[-13.981 10.392 -17.091 -7.924 -14.729]
m = 730 and n = 3
%% Cell type:markdown id:d390f466 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
$\textbf{Task 2.2}$
Answer the following questions:
- What is the dimension of the observables' vector $Y$?
- What are the unknowns of the functional model?
- What is the redundancy for this model?
%% Cell type:markdown id:d40a0ecf tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
For InSAR:
<li>The number of observations is 61.</li>
<li>The number of unknowns is 3.</li>
<li>The redundancy is 58.</li>
For GNSS:
<li>The number of observations is 730.</li>
<li>The number of unknowns is 3.</li>
<li>The redundancy is 727.</li>
%% Cell type:markdown id:cde2f4db tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 2.3:</b>
Add the A matrix to the dictionaries for each model. This will be used to plot results later in the notebook.
%% Cell type:code id:396ac3a5 tags:
``` python
# model_insar['A'] = YOUR_CODE_HERE
# model_gnss['A'] = YOUR_CODE_HERE
model_insar['A'] = A_insar
model_gnss['A'] = A_gnss
print("Keys and Values (type) for model_insar:")
for key, value in model_insar.items():
    print(f"{key:16s} --> {type(value)}")
print("\nKeys and Values (type) for model_gnss:")
for key, value in model_gnss.items():
    print(f"{key:16s} --> {type(value)}")
%% Output
Keys and Values (type) for model_insar:
data_type --> <class 'str'>
y --> <class 'numpy.ndarray'>
times --> <class 'pandas.core.series.Series'>
groundwater --> <class 'numpy.ndarray'>
A --> <class 'numpy.ndarray'>
Keys and Values (type) for model_gnss:
data_type --> <class 'str'>
y --> <class 'numpy.ndarray'>
times --> <class 'pandas.core.series.Series'>
groundwater --> <class 'numpy.ndarray'>
A --> <class 'numpy.ndarray'>
%% Cell type:markdown id:9325d32b tags:
## 3. Set-up stochastic model
We will use the Best Linear Unbiased Estimator (BLUE) to solve for the unknown parameters. Therefore we also need a stochastic model, which is defined as
$$
\mathbb{D}(Y) = \Sigma_{Y},
$$
where $\Sigma_{Y}$ is the covariance matrix of the observables' vector.
%% Cell type:markdown id:dc3aec4c tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 3.1:</b>
Construct the covariance matrix for each type of data and assume that
- the observables are independent
- the observables are normally distributed
- the observables' standard deviation is
    - $\sigma_\textrm{InSAR} = 2$ mm
    - $\sigma_\textrm{GNSS} = 15$ mm
%% Cell type:code id:163acdb3 tags:
``` python
std_insar = 2 #mm
Sigma_Y_insar = np.identity(len(times_insar))*std_insar**2
print ('Sigma_Y (InSAR) is defined as:')
print (Sigma_Y_insar)
%% Output
Sigma_Y (InSAR) is defined as:
[[4. 0. 0. ... 0. 0. 0.]
[0. 4. 0. ... 0. 0. 0.]
[0. 0. 4. ... 0. 0. 0.]
[0. 0. 0. ... 4. 0. 0.]
[0. 0. 0. ... 0. 4. 0.]
[0. 0. 0. ... 0. 0. 4.]]
%% Cell type:code id:5d583bd8 tags:
``` python
std_gnss = 15 #mm (corrected from original value of 5 mm)
Sigma_Y_gnss = np.identity(len(times_gnss))*std_gnss**2
print ('\nSigma_Y (GNSS) is defined as:')
print (Sigma_Y_gnss)
%% Output
Sigma_Y (GNSS) is defined as:
[[225. 0. 0. ... 0. 0. 0.]
[ 0. 225. 0. ... 0. 0. 0.]
[ 0. 0. 225. ... 0. 0. 0.]
[ 0. 0. 0. ... 225. 0. 0.]
[ 0. 0. 0. ... 0. 225. 0.]
[ 0. 0. 0. ... 0. 0. 225.]]
%% Cell type:markdown id:b2665071 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
$\textbf{Task 3.2}$
Answer the following questions:
- What information is contained in the covariance matrix?
- How do you implement the assumption that all observations are independent?
- What is the dimension of $\Sigma_{Y}$?
- How do you create $\Sigma_{Y}$?
%% Cell type:markdown id:4322a5b5 tags:
_Write your answer in this cell._
%% Cell type:markdown id:124ddd77 tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
- The covariance matrix contains information on the quality of the observations, where an entry on the diagonal represents the variance of one observation at a particular epoch. If there is an indication that for instance the quality for a particular time interval differs, different $\sigma$ values can be put in the stochastic model for these epochs.
- The off-diagonal terms in the matrix are related to the correlation between observations at different epochs, where a zero value on the off-diagonal indicates zero correlation.
- The dimension of the matrix is 61x61 for InSAR and 730x730 for GNSS.
- See code.
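The first point mentions that different $\sigma$ values can be used for particular epochs. A minimal sketch of how that could look is given below; the number of epochs and the noisier first two epochs are assumptions made purely for illustration, and are not part of this assignment.

```python
import numpy as np

m_toy = 6                        # hypothetical number of epochs
sigma = np.full(m_toy, 2.0)      # 2 mm for every epoch (the InSAR value of Task 3.1)
sigma[:2] = 4.0                  # assumption: the first two epochs are twice as noisy

# Independent observables: variances on the diagonal, zeros elsewhere
Sigma_Y_toy = np.diag(sigma**2)
print(Sigma_Y_toy)
```

With a single $\sigma$ for all epochs this reduces to the scaled identity matrix used in the code above.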
%% Cell type:markdown id:8e87d1fe tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 3.3:</b>
Add <code>Sigma_Y</code> to the dictionaries for each model.
%% Cell type:code id:13cffb4c tags:
``` python
# model_insar['Sigma_Y'] = YOUR_CODE_HERE
# model_gnss['Sigma_Y'] = YOUR_CODE_HERE
model_insar['Sigma_Y'] = Sigma_Y_insar
model_gnss['Sigma_Y'] = Sigma_Y_gnss
%% Cell type:markdown id:09e965bf tags:
## 4. Apply best linear unbiased estimation
%% Cell type:markdown id:83ec4a48 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 4.1:</b>
Write a function to apply BLUE in the cell below and use the function to estimate the unknowns for the model using the data.
Compute the modeled displacements ($\hat{\mathrm{y}}$), and corresponding residuals ($\hat{\mathrm{\epsilon}}$), as well as associated values (as requested by the blank code lines).
%% Cell type:markdown id:936d6b0c tags:
<div style="background-color:#facb8e; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px"> <p><strong>Note on code implementation</strong>: you'll see that the functions in this assignment use a dictionary; this greatly reduces the number of input/output variables needed in a function. However, it can make the code inside the function more difficult to read due to the key syntax (e.g., <code>dict['variable_1']</code> versus <code>variable_1</code>). To make this assignment easier for you to implement we have split these functions into three parts: 1) define variables from the dictionary, 2) perform analysis, 3) add results to the dictionary. Note that this is not the most efficient way to write this code; it is done here specifically for clarity and to help you focus on writing the equations properly and understanding the meaning of each term.</p></div>
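For reference, the quantities requested in the function can be written with the standard BLUE expressions below. The subscript notation (e.g., $\Sigma_{\hat{X}}$) is chosen here to mirror the variable names used in the code and is an assumption about notation, not a quote from the course material.

$$
\begin{aligned}
\Sigma_{\hat{X}} &= \left(\mathrm{A}^T \Sigma_{Y}^{-1} \mathrm{A}\right)^{-1} \\
\hat{\mathrm{x}} &= \Sigma_{\hat{X}} \, \mathrm{A}^T \Sigma_{Y}^{-1} \, \mathrm{y} \\
\hat{\mathrm{y}} &= \mathrm{A} \hat{\mathrm{x}} \\
\hat{\mathrm{\epsilon}} &= \mathrm{y} - \hat{\mathrm{y}} \\
\Sigma_{\hat{Y}} &= \mathrm{A} \, \Sigma_{\hat{X}} \, \mathrm{A}^T \\
\Sigma_{\hat{\epsilon}} &= \Sigma_{Y} - \Sigma_{\hat{Y}}
\end{aligned}
$$

The standard deviations of the modeled displacements and of the residuals are the square roots of the diagonal elements of $\Sigma_{\hat{Y}}$ and $\Sigma_{\hat{\epsilon}}$, respectively.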
%% Cell type:code id:d85b1826 tags:
``` python
def BLUE(d):
    """Calculate the Best Linear Unbiased Estimator

    Uses dict as input/output:
      - inputs defined from existing values in dict
      - outputs defined as new values in dict
    """

    # 1) define variables from the dictionary
    y = d['y']
    A = d['A']
    Sigma_Y = d['Sigma_Y']

    # Sigma_X_hat = YOUR_CODE_HERE
    # x_hat = YOUR_CODE_HERE
    # y_hat = YOUR_CODE_HERE
    # e_hat = YOUR_CODE_HERE
    # Sigma_Y_hat = YOUR_CODE_HERE
    # std_y = YOUR_CODE_HERE
    # Sigma_e_hat = YOUR_CODE_HERE
    # std_e_hat = YOUR_CODE_HERE

    # 2) perform the analysis
    Sigma_X_hat = np.linalg.inv(A.T @ np.linalg.inv(Sigma_Y) @ A)
    x_hat = Sigma_X_hat @ A.T @ np.linalg.inv(Sigma_Y) @ y
    y_hat = A @ x_hat
    e_hat = y - y_hat
    Sigma_Y_hat = A @ Sigma_X_hat @ A.T
    std_y = np.sqrt(Sigma_Y_hat.diagonal())
    Sigma_e_hat = Sigma_Y - Sigma_Y_hat
    std_e_hat = np.sqrt(Sigma_e_hat.diagonal())

    # 3) add the results to the dictionary
    d['Sigma_X_hat'] = Sigma_X_hat
    d['x_hat'] = x_hat
    d['y_hat'] = y_hat
    d['e_hat'] = e_hat
    d['Sigma_Y_hat'] = Sigma_Y_hat
    d['std_y'] = std_y
    d['Sigma_e_hat'] = Sigma_e_hat
    d['std_e_hat'] = std_e_hat

    return d
%% Cell type:markdown id:f6c74941 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 4.2:</b>
Now that you have completed the function, apply it to our two models and then print values for the estimated parameters.
%% Cell type:code id:4a592ac1 tags:
``` python
model_insar = BLUE(model_insar)
x_hat_insar = model_insar['x_hat']
print ('The InSAR-estimated offset is', np.round(x_hat_insar[0],3), 'mm')
print ('The InSAR-estimated velocity is', np.round(x_hat_insar[1],4), 'mm/day')
print ('The InSAR-estimated velocity is', np.round(x_hat_insar[1]*365,4), 'mm/year')
print ('The InSAR-estimated GW factor is', np.round(x_hat_insar[2],3), '[-]\n')
model_gnss = BLUE(model_gnss)
x_hat_gnss = model_gnss['x_hat']
print ('The GNSS-estimated offset is', np.round(x_hat_gnss[0],3), 'mm')
print ('The GNSS-estimated velocity is', np.round(x_hat_gnss[1],4), 'mm/day')
print ('The GNSS-estimated velocity is', np.round(x_hat_gnss[1]*365,4), 'mm/year')
print ('The GNSS-estimated GW factor is', np.round(x_hat_gnss[2],3), '[-]')
%% Output
The InSAR-estimated offset is 9.174 mm
The InSAR-estimated velocity is -0.0243 mm/day
The InSAR-estimated velocity is -8.8667 mm/year
The InSAR-estimated GW factor is 0.202 [-]
The GNSS-estimated offset is 1.181 mm
The GNSS-estimated velocity is -0.0209 mm/day
The GNSS-estimated velocity is -7.615 mm/year
The GNSS-estimated GW factor is 0.16 [-]
%% Cell type:markdown id:bef2f3be tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 4.3:</b>
Do the values that you just estimated make sense? Explain, using quantitative results.
<em>Hint: all you need to do is use the figures created above to verify that the parameter values printed above are reasonable (e.g., order of magnitude, units, etc).</em>
%% Cell type:markdown id:f6740791 tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
As long as the velocity is negative and around -0.02 mm/day (roughly -7 to -10 mm/yr), it makes sense when compared with the plots of the observations. Since load is applied on soil layers we expect the road to subside. We also expect to see a positive value for the GW factor: with a positive factor, a lower groundwater level goes together with a lower (more negative) displacement.
%% Cell type:markdown id:65e42a43 tags:
## 5. Evaluate the precision
%% Cell type:markdown id:68f79fcb tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 5:</b>
What is the precision of the final estimates?
Print the full covariance matrix of your estimates, and give an interpretation of the numbers in the covariance matrix.
%% Cell type:code id:835eefc8 tags:
``` python
Sigma_X_hat_insar = model_insar['Sigma_X_hat']
print ('Covariance matrix of estimated parameters (InSAR):')
print (Sigma_X_hat_insar)
print ('\nThe standard deviation for the InSAR-estimated offset is',
np.round(np.sqrt(Sigma_X_hat_insar[0,0]),3), 'mm')
print ('The standard deviation for the InSAR-estimated velocity is',
np.round(np.sqrt(Sigma_X_hat_insar[1,1]),4), 'mm/day')
print ('The standard deviation for the InSAR-estimated GW factor is',
np.round(np.sqrt(Sigma_X_hat_insar[2,2]),3), '[-]\n')
Sigma_X_hat_gnss = model_gnss['Sigma_X_hat']
print ('Covariance matrix of estimated parameters (GNSS):')
print (Sigma_X_hat_gnss)
print ('\nThe standard deviation for the GNSS-estimated offset is',
np.round(np.sqrt(Sigma_X_hat_gnss[0,0]),3), 'mm')
print ('The standard deviation for the GNSS-estimated velocity is',
np.round(np.sqrt(Sigma_X_hat_gnss[1,1]),4), 'mm/day')
print ('The standard deviation for the GNSS-estimated GW factor is',
np.round(np.sqrt(Sigma_X_hat_gnss[2,2]),3), '[-]')
%% Output
Covariance matrix of estimated parameters (InSAR):
[[ 4.530e+00 -4.173e-04 3.363e-02]
[-4.173e-04 1.472e-06 8.776e-07]
[ 3.363e-02 8.776e-07 2.646e-04]]
The standard deviation for the InSAR-estimated offset is 2.128 mm
The standard deviation for the InSAR-estimated velocity is 0.0012 mm/day
The standard deviation for the InSAR-estimated GW factor is 0.016 [-]
Covariance matrix of estimated parameters (GNSS):
[[ 2.160e+01 -2.244e-03 1.595e-01]
[-2.244e-03 6.945e-06 2.238e-06]
[ 1.595e-01 2.238e-06 1.249e-03]]
The standard deviation for the GNSS-estimated offset is 4.647 mm
The standard deviation for the GNSS-estimated velocity is 0.0026 mm/day
The standard deviation for the GNSS-estimated GW factor is 0.035 [-]
%% Cell type:markdown id:8e62592d tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
As shown above, the standard deviations of the estimated parameters are equal to the square root of the diagonal elements. Compared with the estimated values, the standard deviations seem quite small, except for the estimated offsets, meaning that the complete estimated model can be shifted up or down.
The off-diagonal elements show the covariances between the estimated parameters, which are non-zero since the estimates are all computed as functions of the same vector of observations and the same model. A different value for the estimated velocity would imply a different value for the GW factor and offset.
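Covariances are hard to compare directly because the parameters have different units and magnitudes. One way to make them comparable, sketched below, is to scale the covariance matrix into a correlation matrix. The numbers are the rounded InSAR values from the printed output above; the variable names are chosen for this illustration only.

```python
import numpy as np

# Covariance matrix of the InSAR estimates, copied (rounded) from the output above
Sigma_X_hat_example = np.array([[ 4.530e+00, -4.173e-04, 3.363e-02],
                                [-4.173e-04,  1.472e-06, 8.776e-07],
                                [ 3.363e-02,  8.776e-07, 2.646e-04]])

# Standard deviations are the square roots of the diagonal elements
std_example = np.sqrt(np.diag(Sigma_X_hat_example))

# Correlation coefficient: covariance divided by the product of the standard deviations
corr = Sigma_X_hat_example / np.outer(std_example, std_example)
print(np.round(corr, 2))
```

With these rounded values the offset and the GW factor should come out strongly positively correlated, while the velocity is only weakly correlated with the other two parameters.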
%% Cell type:markdown id:886efe26 tags:
## 6. Present and reflect on estimation results
%% Cell type:markdown id:7ef41ce8 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 6.1:</b>
Complete the function below to help us compute the confidence intervals, then apply the function. Use a confidence level of 96% in your analysis.
<em>Hint: it can be used in exactly the same way as the <code>BLUE</code> function above, although it has one extra input.</em>
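As a sketch of the arithmetic involved, assuming a two-sided interval around a normally distributed quantity: a 96% confidence level corresponds to $\alpha = 0.04$, and the half-width of the interval is the standard deviation multiplied by the standard normal quantile at $1 - \alpha/2$. The standard deviation used below is a hypothetical value.

```python
from scipy.stats import norm

alpha_example = 0.04                        # 1 - 0.96
k_example = norm.ppf(1 - 0.5*alpha_example) # standard normal quantile at 0.98

std_toy = 2.0                               # hypothetical standard deviation [mm]
half_width = k_example * std_toy            # interval is: estimate +/- half_width

print('k =', round(k_example, 3))
print('half-width =', round(half_width, 3), 'mm')
```

The factor is a little larger than 2, so the interval is slightly wider than the familiar "plus or minus two standard deviations".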
%% Cell type:code id:2711da12 tags:
``` python
def get_CI(d, alpha):
    """Compute the confidence intervals.

    Uses dict as input/output:
      - inputs defined from existing values in dict
      - outputs defined as new values in dict
    """

    std_e_hat = d['std_e_hat']
    std_y = d['std_y']

    # Two-sided interval: quantile of the standard normal at 1 - alpha/2
    k = norm.ppf(1 - 0.5*alpha)
    CI_y = k*std_y
    CI_res = k*std_e_hat
    CI_y_hat = k*np.sqrt(d['Sigma_Y_hat'].diagonal())

    d['alpha'] = alpha
    d['CI_y'] = CI_y
    d['CI_res'] = CI_res
    d['CI_Y_hat'] = CI_y_hat

    return d
%% Cell type:code id:d9a41ea5 tags:
``` python
# model_insar = YOUR_CODE_HERE
# model_gnss = YOUR_CODE_HERE
model_insar = get_CI(model_insar, 0.04)
model_gnss = get_CI(model_gnss, 0.04)
%% Cell type:markdown id:53cf3663 tags:
At this point we have all the important results entered in our dictionary and we will be able to use the plots that have been written for you in the next Tasks. In case you would like to easily see all of the key-value pairs that have been added to the dictionary, you can run the cell below:
%% Cell type:code id:b3bb808e tags:
``` python
print("Keys and Values (type) for model_insar:")
for key, value in model_insar.items():
    print(f"{key:16s} --> {type(value)}")
print("\nKeys and Values (type) for model_gnss:")
for key, value in model_gnss.items():
    print(f"{key:16s} --> {type(value)}")
%% Output
Keys and Values (type) for model_insar:
data_type --> <class 'str'>
y --> <class 'numpy.ndarray'>
times --> <class 'pandas.core.series.Series'>
groundwater --> <class 'numpy.ndarray'>
A --> <class 'numpy.ndarray'>
Sigma_Y --> <class 'numpy.ndarray'>
Sigma_X_hat --> <class 'numpy.ndarray'>
x_hat --> <class 'numpy.ndarray'>
y_hat --> <class 'numpy.ndarray'>
e_hat --> <class 'numpy.ndarray'>
Sigma_Y_hat --> <class 'numpy.ndarray'>
std_y --> <class 'numpy.ndarray'>
Sigma_e_hat --> <class 'numpy.ndarray'>
std_e_hat --> <class 'numpy.ndarray'>
alpha --> <class 'float'>
CI_y --> <class 'numpy.ndarray'>
CI_res --> <class 'numpy.ndarray'>
CI_Y_hat --> <class 'numpy.ndarray'>
Keys and Values (type) for model_gnss:
data_type --> <class 'str'>
y --> <class 'numpy.ndarray'>
times --> <class 'pandas.core.series.Series'>
groundwater --> <class 'numpy.ndarray'>
A --> <class 'numpy.ndarray'>
Sigma_Y --> <class 'numpy.ndarray'>
Sigma_X_hat --> <class 'numpy.ndarray'>
x_hat --> <class 'numpy.ndarray'>
y_hat --> <class 'numpy.ndarray'>
e_hat --> <class 'numpy.ndarray'>
Sigma_Y_hat --> <class 'numpy.ndarray'>
std_y --> <class 'numpy.ndarray'>
Sigma_e_hat --> <class 'numpy.ndarray'>
std_e_hat --> <class 'numpy.ndarray'>
alpha --> <class 'float'>
CI_y --> <class 'numpy.ndarray'>
CI_res --> <class 'numpy.ndarray'>
CI_Y_hat --> <class 'numpy.ndarray'>
%% Cell type:markdown id:aaf72e41 tags:
<div style="background-color:#AABAB2; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
<b>Task 6.2:</b>
Read the contents of file <code></code> and identify what it is doing: you should be able to recognize that the functions use our model dictionary as an input and create three different figures. Note also that the functions to create the figures have already been imported at the top of this notebook.
Use the functions provided to visualize the results of our two models.
%% Cell type:markdown id:e8da1f6c-23de-4f80-a76c-5f0a9b4020b4 tags:
<div style="background-color:#facb8e; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px"> <p><strong>Note</strong>: remember that you will have to use the same function to look at <em>both</em> models when writing your interpretation in the Report.</p></div>
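
When comparing both models, one number that can support the interpretation is the share of residuals that fall inside the confidence bounds. The helper below is a hypothetical sketch (`fraction_inside_CI` is not one of the provided functions), and it runs on synthetic residuals rather than the assignment data; it only assumes that a dictionary holds `e_hat` and `CI_res`, as produced in the earlier tasks.

```python
import numpy as np
from scipy.stats import norm

def fraction_inside_CI(d):
    """Return the fraction of residuals with |e_hat| <= CI_res (hypothetical helper)."""
    return np.mean(np.abs(d['e_hat']) <= d['CI_res'])

# Synthetic example (assumption): zero-mean normal residuals with a std of 3 mm
rng = np.random.default_rng(42)
std_e_hat = np.full(5000, 3.0)
toy = {'e_hat': rng.normal(0.0, 3.0, size=5000),
       'CI_res': norm.ppf(1 - 0.5*0.04)*std_e_hat}

frac = fraction_inside_CI(toy)
print(f"Fraction of residuals inside the 96% bounds: {frac:.3f}")
```

For normally distributed residuals and a suitable stochastic model this fraction should be close to 0.96; a clearly different value is a hint that either the functional model or the assumed precision deserves a closer look.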
%% Cell type:code id:ec7c8bef tags:
``` python
# _, _ = plot_model(YOUR_CODE_HERE)
_, _ = plot_model(model_insar)
_, _ = plot_model(model_gnss)
%% Output
%% Cell type:code id:104d155d tags:
``` python
# _, _ = plot_residual(YOUR_CODE_HERE)
_, _ = plot_residual(model_insar)
_, _ = plot_residual(model_gnss)
%% Output
%% Cell type:code id:1dc93ce9 tags:
``` python
# _, _ = plot_residual_histogram(YOUR_CODE_HERE)
_, _ = plot_residual_histogram(model_insar)
_, _ = plot_residual_histogram(model_gnss)
%% Output
The mean value of the InSAR residuals is 0.0 mm
The standard deviation of the InSAR residuals is 3.115 mm
The mean value of the GNSS residuals is -0.0 mm
The standard deviation of the GNSS residuals is 15.393 mm
%% Cell type:markdown id:1ae74dd4 tags:
<div style="background-color:#FAE99E; color: black; width:95%; vertical-align: middle; padding:15px; margin: 10px; border-radius: 10px">
$\textbf{Solution: the True Model}$
The data used in this exercise was generated using Monte Carlo Simulation. It is added to the plots here to illustrate where and how our models differ (it is your job to interpret "why").
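
To illustrate what generating data with Monte Carlo Simulation can look like, the sketch below draws noisy observations around a known displacement curve. Everything here is an assumption for illustration: the functional form (an offset, an exponential term and a linear groundwater term), the synthetic groundwater series and the noise level are placeholders, and the code does not reproduce how the assignment data was actually created.

```python
import numpy as np

def displacement(days, gw, d0, R, a, k):
    """Assumed model: offset + exponential response + linear groundwater term."""
    return d0 + R*(1 - np.exp(-days/a)) + k*gw

rng = np.random.default_rng(0)
days = np.arange(0, 1000, 10.0)
gw = -100 + 20*np.sin(2*np.pi*days/365)   # synthetic groundwater series (assumption)

# Noise-free curve for one set of "true" parameters
d_true = displacement(days, gw, d0=10, R=-22, a=180, k=0.15)

# One Monte Carlo realization: add independent normal noise to every epoch
sigma = 2.0                                # assumed observation std [mm]
y_sim = d_true + rng.normal(0.0, sigma, size=days.size)

print(f"Std of simulated noise: {np.std(y_sim - d_true):.2f} mm")
```

Because the parameters behind the simulated observations are known, estimates from any fitted model can be compared directly against them.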
%% Cell type:code id:113f3809 tags:
``` python
k_true = 0.15
R_true = -22
a_true = 180
d0_true = 10
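# The groundwater term is assumed here: it uses the 'groundwater' entry of each model dictionary
disp_insar = (d0_true + R_true*(1 - np.exp(-days_insar/a_true)) +
              k_true*model_insar['groundwater'])
disp_gnss = (d0_true + R_true*(1 - np.exp(-days_gnss/a_true)) +
             k_true*model_gnss['groundwater'])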
plot_model(model_insar, alt_model=('True model', times_insar, disp_insar));
plot_model(model_gnss, alt_model=('True model', times_gnss, disp_gnss));
%% Output
%% Cell type:code id:aa877dd5 tags:
``` python
import ipywidgets as widgets
from ipywidgets import interact
# Function to update the plot based on slider values
def update_plot(x0, x1, x2):
    for m in [model_gnss]: #[model_insar, model_gnss]:
        plt.plot(m['times'], m['y'], 'o', label=m['data_type'])
    plt.ylabel('Displacement [mm]')

    y_fit = model_gnss['A'] @ [x0, x1, x2]
    # x = (0, 0, 1) selects only the third column of A, which is labelled as the groundwater data
    if (x0 == 0) and (x1 == 0) and (x2 == 1):
        plt.plot(model_gnss['times'], y_fit, 'r', label='Groundwater data', linewidth=2)
    else:
        plt.plot(model_gnss['times'], y_fit, 'r', label='Fit (GNSS)', linewidth=2)

    # Weighted sum of squared residuals, with W the inverse of the observation covariance matrix
    W = np.linalg.inv(model_gnss['Sigma_Y'])
    ss_res = (model_gnss['y'] - y_fit).T @ W @ (model_gnss['y'] - y_fit)
    plt.title(f'Weighted sum of squared residuals: {ss_res:.0f}')
    plt.legend()

# Create sliders for x0, x1, and x2
x0_slider = widgets.FloatSlider(value=0, min=-10, max=10, step=0.1, description='x0')
x1_slider = widgets.FloatSlider(value=0, min=-0.1, max=0.1, step=0.001, description='x1')
x2_slider = widgets.FloatSlider(value=0, min=-1, max=1, step=0.01, description='x2')

# Use interact to create the interactive plot
interact(update_plot, x0=x0_slider, x1=x1_slider, x2=x2_slider)
%% Output
<function __main__.update_plot(x0, x1, x2)>
%% Cell type:code id:d8b31540 tags:
``` python
xhat_slider_plot(model_gnss['A'], model_gnss['y'], model_gnss['times'], model_gnss['Sigma_Y'])
%% Output
%% Cell type:markdown id:3203d779 tags:
**End of notebook.**
<h2 style="height: 60px">
<h3 style="position: absolute; display: flex; flex-grow: 0; flex-shrink: 0; flex-direction: row-reverse; bottom: 60px; right: 50px; margin: 0; border: 0">
<a rel="license" href="">
<img alt="Creative Commons License" style="border-width:; width:88px; height:auto; padding-top:10px" src="" />
<a rel="TU Delft" href="">
<img alt="TU Delft" style="border-width:0; width:100px; height:auto; padding-bottom:0px" src="" />
<a rel="MUDE" href="">
<img alt="MUDE" style="border-width:0; width:100px; height:auto; padding-bottom:0px" src="" />
<span style="font-size: 75%">
&copy; Copyright 2024 <a rel="MUDE" href="">MUDE</a> TU Delft. This work is licensed under a <a rel="license" href="">CC BY 4.0 License</a>.