Commit c97970a8 authored by Robert Lanzafame's avatar Robert Lanzafame

delete files from add-numpy-reference branch

parent 8a12e09d
Showing
with 0 additions and 3566 deletions
%% Cell type:markdown id: tags:
# Numpy: How it Works
Introduction
[theory](https://tudelft-citg.github.io/learn-python/05/Theory/01.html)
[quick reference](https://tudelft-citg.github.io/learn-python/05/In_a_Nutshell/01.html)
Exercises: [airplane velocity](https://tudelft-citg.github.io/learn-python/05/Exercises/01.html) and [bending moment](https://tudelft-citg.github.io/learn-python/05/Exercises/02.html).
%% Cell type:markdown id: tags:
This notebook is based on the Numpy lesson from [Aalto Scientific Computing: Python for Scientific Computing](https://github.com/AaltoSciComp/python-for-scicomp/) and [W3Schools](https://www.w3schools.com/python/numpy/).
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
## See also
* NumPy manual <https://numpy.org/doc/stable/reference/>
* Basic array class reference <https://numpy.org/doc/stable/reference/arrays.html>
* Indexing <https://numpy.org/doc/stable/reference/arrays.indexing.html>
* ufuncs <https://numpy.org/doc/stable/reference/ufuncs.html>
* 2020 Nature paper on NumPy's role and basic concepts <https://www.nature.com/articles/s41586-020-2649-2>
%% Cell type:markdown id: tags:
## What is an array?
For example, consider `[1, 2.5, 'asdf', False, [1.5, True]]` - this is a Python list but it has different types for every element. When you do math on this, every element has to be handled separately.
Lists may serve the purpose of arrays, but they are slow to process. Numpy aims to provide an array object that is up to 50x faster than traditional Python lists. Numpy is the most used library for scientific computing. Even if you are not using it directly, chances are high that some library uses it in the background.
The array data structure in NumPy is called `ndarray`. It comes with many supporting functions that make working with arrays easy.
An array is a ‘grid’ of values that all have the same type. It is indexed by tuples of non-negative integers and provides the framework for multiple dimensions. An array has:
- `dtype` - data type. Arrays always contain one type
- `shape` - shape of the data, for example 3×2 or 3×2×500 or even 500 (one dimensional) or `()` (zero dimensional).
- `data` - raw data storage in memory. This can be passed to C or Fortran code for efficient calculations.
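A minimal sketch of how these attributes can be inspected. The arrays `grid` and `scalar` are hypothetical examples and not part of the original lesson.

```python
import numpy as np

# Hypothetical example array: 3 rows, 2 columns, all floats
grid = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

print(grid.dtype)   # data type shared by every element
print(grid.shape)   # (rows, columns)
print(grid.ndim)    # number of dimensions

# A zero-dimensional array holds a single value and has an empty shape
scalar = np.array(7)
print(scalar.shape)
```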
%% Cell type:markdown id: tags:
## Performance check
To quickly show the performance of NumPy arrays, we can compare the time taken by a basic operation on a list and on an array. In particular, we will compute the square of 10000 elements.
%% Cell type:markdown id: tags:
We first do this using Python lists, by creating a list with values from 0 to 9999, and one ‘empty’ list, to store the result in.
%% Cell type:code id: tags:
``` python
a = list(range(10000))
b = [ 0 ] * 10000
```
%% Cell type:code id: tags:
``` python
%%timeit
for i in range(len(a)):
    b[i] = a[i]**2
```
%% Cell type:markdown id: tags:
That looks and feels quite fast. But let’s take a look at how NumPy performs for the same task. We first import the `numpy` module, then we create our *a* and *b* containers again, which are now `ndarray` objects. Finally we perform the square operation.
%% Cell type:code id: tags:
``` python
import numpy as np
a = np.arange(10000)
b = np.zeros(10000)
```
%% Cell type:code id: tags:
``` python
%%timeit
b = a ** 2
```
%% Cell type:markdown id: tags:
We see that working with numpy arrays provides substantial performance improvements.
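The same comparison can be sketched outside a notebook with Python's standard-library `timeit` module. The function names `square_list` and `square_array` are assumptions made for this illustration, and the measured times depend on the machine, so no expected output is given.

```python
import timeit
import numpy as np

values_list = list(range(10000))
values_array = np.arange(10000)

def square_list():
    # Square every element with a plain Python loop (list comprehension)
    return [v ** 2 for v in values_list]

def square_array():
    # Square every element with a single vectorized NumPy operation
    return values_array ** 2

# Run each function 100 times and report the total time in seconds
t_list = timeit.timeit(square_list, number=100)
t_array = timeit.timeit(square_array, number=100)
print(f'list:  {t_list:.4f} s')
print(f'array: {t_array:.4f} s')
```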
%% Cell type:markdown id: tags:
> **Note**: To evaluate the time of the computation we used the `%%timeit` command. `%%timeit` is a so-called Jupyter notebook *magic command*, which is initiated with a `%` or `%%` prefix for line and cell commands, respectively. This `%%` cell magic has to be the first thing in the Jupyter cell, otherwise it will not work. There are many other interesting magic commands available, such as shown [here](https://towardsdatascience.com/top-8-magic-commands-in-jupyter-notebook-c1582e813560).
%% Cell type:markdown id: tags:
## Creating arrays
%% Cell type:markdown id: tags:
Arrays can be created using many different functions. This section provides an overview of the most useful ones.
%% Cell type:markdown id: tags:
You can create an array from a Python list by using `np.array` and passing a Python list:
%% Cell type:markdown id: tags:
>**Note**: To print the values of variables, we will make use of *f-strings*. F-strings have been introduced in Python 3.6, and they are recommended for print formatting since they improve code readability and are less prone to errors. We use f-strings by adding the letter *f* before the string we want to print, and then entering the name of the variables within curly brackets `{` and `}`. More info can be found [here](https://www.geeksforgeeks.org/formatted-string-literals-f-strings-python/)
%% Cell type:code id: tags:
``` python
a = np.array([1,2,3]) # 1-dimensional array (rank 1)
b = np.array([[1,2,3],[4,5,6]]) # 2-dimensional array (rank 2)
# the print statements use f-strings to format the print output.
print(f'a:{a}\n') # \n creates a new line
print(f'a:\t{a}\n') # \t adds a tab, a specific character for indentation
print(f'b:\n{b}\n')
print(f'shape of a: {a.shape}') # the shape of a 1-dimensional array: (# elements,)
print(f'shape of b: {b.shape}') # the shape (# rows, # columns)
print(f'size of a: {a.size}') # number of elements in the array a
print(f'size of b: {b.size}') # number of elements in the array b
```
%% Cell type:markdown id: tags:
Often it is useful to create an array with constant values; the following functions can be used to achieve this:
%% Cell type:code id: tags:
``` python
print(np.zeros((2, 3)), '\n') # Create a 2x3 array with all elements set to 0
print(np.ones((1,2)), '\n') # Create a 1x2 array with all elements set to 1
print(np.full((2,2),7), '\n') # Create a 2x2 array with all elements set to 7
print(np.eye(2), '\n') # Create a 2x2 identity matrix
```
%% Cell type:markdown id: tags:
Other common ways to create an array include using evenly spaced values in an interval or specifying the data type:
%% Cell type:code id: tags:
``` python
a = np.arange(10) # Evenly spaced values in an interval, with default stepsize 1
b = np.linspace(0,9,10) # An array with 10 values between 0 and 9
# (check the difference with np.arange in the next section)
c = np.ones((3, 2), bool) # 3x2 boolean array
print(f'a:\n{a}\n')
print(f'b:\n{b}\n')
print(f'c:\n{c}')
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
---
## Array Data types
What exactly is the difference between `np.arange(10)` and `np.linspace(0,9,10)`?
- ``np.arange(10)`` results in ``array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])`` with an integer dtype (**int64** on most platforms),
- while ``np.linspace(0,9,10)`` results in ``array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])`` with dtype **float64**.
Both ``np.linspace`` and ``np.arange`` take dtype (data type) as an argument and can be adjusted to match each other in that way:
%% Cell type:code id: tags:
``` python
print('As int64:')
print(np.arange(10))
print(np.linspace(0,9,10, dtype=np.int64))
print('\n')
print('As float64:')
print(np.arange(10, dtype=np.float64))
print(np.linspace(0,9,10))
```
%% Cell type:markdown id: tags:
---
On many occasions (especially when something goes differently than expected) it is useful to check, control, or change the datatype of the array:
%% Cell type:code id: tags:
``` python
d = np.ones((3, 2), bool)
print(f'd:\n{d}\n')
print(f'datatype of d:\n{d.dtype}\n')
e = d.astype(int)
print(f'e:\n{e}\n')
print(f'datatype of e:\n{e.dtype}\n')
```
%% Cell type:markdown id: tags:
When converting floats to integers using `.astype()`, the fractional part of every float is discarded (truncation toward zero). For non-negative floats this gives the largest integer lower than or equal to the float:
%% Cell type:code id: tags:
``` python
nums = np.linspace(0,2,11)
print(f'nums:\n{nums}\n')
numsint = nums.astype(np.int64)
print(f'nums as integer:\n{numsint}\n')
```
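%% Cell type:markdown id: tags:
The cell above only contains non-negative values. As a small sketch with a hypothetical array `signed`, negative floats show that `.astype()` drops the fractional part rather than rounding down, which differs from `np.floor`:

```python
import numpy as np

signed = np.array([-1.7, -0.5, 0.5, 1.7])

truncated = signed.astype(np.int64)           # drops the fractional part
floored = np.floor(signed).astype(np.int64)   # rounds down first, then converts

print(f'astype: {truncated}')
print(f'floor:  {floored}')
```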
%% Cell type:markdown id: tags:
Did you notice anything in the previous two cells?
Right! We called the `astype` function not from the `np` module, but from the `ndarray` objects themselves. These are indeed *methods*, rather than *functions*. The main differences are highlighted in the table below.
|Method | Function|
| :----------- | :-----------|
| is associated with the objects of the class they belong to | is not associated with any object|
| is called 'on' an object and we cannot invoke it just by its name | we can invoke a function just by its name.|
Nearly all the method versions do the same thing as the function versions. Choosing the method or the function will usually depend on which one is easier to type or read. Some examples will be provided later in this notebook.
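As a first small sketch of this equivalence (the array `values` is a hypothetical example), the function form and the method form of `sum` and `max` give the same result:

```python
import numpy as np

values = np.array([[3, 1], [2, 4]])

# Function form: called from the np module, with the array as argument
total_function = np.sum(values)
# Method form: called on the array itself
total_method = values.sum()

print(total_function, total_method)
print(np.max(values), values.max())
```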
%% Cell type:markdown id: tags:
---
### <font color='red'>Exercise</font>
%% Cell type:markdown id: tags:
Create an array with elements ranging from 10 up to 15 (inclusive), with an unsigned 8-bit integer data type.
Do this in three ways:
- Creating a python list and converting it to an array using `np.array()`
- using `np.linspace()`
- using `np.arange()`
%% Cell type:code id: tags:
``` python
print('Your code here')
print('Your code here')
print('Your code here')
```
%% Cell type:markdown id: tags:
## Types of operations
There are different types of standard operations in NumPy:
**ufuncs**, or universal functions, operate on ndarrays in an element-by-element fashion. They can be *unary*, operating on a single input, or *binary*, operating on two inputs.
They are used to implement vectorization in NumPy, which is much faster than iterating over elements. They also provide broadcasting and additional methods like `reduce` and `accumulate` that are very helpful for computation.
ufuncs also take additional arguments, like:
- `where` - boolean array or condition defining where the operations should take place.
- `dtype` - defining the return type of elements.
- `out` - output array where the return value should be copied.
A thorough explanation and list of ufunc is available at [W3Schools](https://www.w3schools.com/python/numpy/numpy_ufunc.asp)
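The arguments and methods listed above can be sketched with `np.add`. The arrays and variable names here are hypothetical and only serve as an illustration:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])

# out: write the result into an existing array instead of allocating a new one
result = np.zeros(4)
np.add(a, b, out=result)
print(result)

# where: only compute where the mask is True; other entries of `out` keep their value
masked = np.full(4, -1.0)
np.add(a, b, out=masked, where=a > 2)
print(masked)

# dtype: request the type used for the computation and the result
as_float32 = np.add(a, b, dtype=np.float32)
print(as_float32.dtype)

# reduce and accumulate: apply the ufunc along the array
print(np.add.reduce(a))       # sum of all elements
print(np.add.accumulate(a))   # running sum
```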
%% Cell type:markdown id: tags:
There are ufunc equivalents for Python's native arithmetic operators, e.g., the standard addition, subtraction, multiplication, division, negation, exponentiation, and so on. The ufunc however allows for more control, for instance we can use the `out` argument to specify the array where the result of the calculation will be stored (rather than creating a temporary array). This turns out to be particularly useful for large computations.
%% Cell type:markdown id: tags:
Example: in-place addition. Create an array, add it to itself using a ufunc.
%% Cell type:code id: tags:
``` python
x = np.array([1, 2, 3])
print(f'x before addition: {x}')
print(f'id before addition: {id(x)}') # get the memory-ID of x
np.add(x, x, x) # Third argument is output array
np.add(x, x, x) # Add in place once more: x is now 4 times its original value
print(f'x after addition: {x}')
print(f'id after addition: {id(x)}') # get the memory-ID of x
# - notice it is the same!
```
%% Cell type:markdown id: tags:
Example: broadcasting. Can you add a 1-dimensional array of shape `(3,)`
to a 2-dimensional array of shape `(2, 3)`? With broadcasting you
can, and most of the time it happens 'under the hood'.
%% Cell type:code id: tags:
``` python
a = np.array([[1, 2, 3],
[4, 5, 6]])
print(f'a:\n{a}\n') # Print a
b = np.array([10, 10, 10])
print(f'b:\n{b}\n') # Print b
print(f'np.add(a, b):\n{np.add(a, b)}\n') # add arrays a and b
```
%% Cell type:markdown id: tags:
Broadcasting is smart and consistent about what it does. The basics of broadcasting are [documented here](https://numpy.org/doc/stable/user/basics.broadcasting.html). The basic idea is that it expands dimensions of the smaller array so that they are compatible in shape.
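A short sketch of when shapes are compatible, using hypothetical arrays `row` and `col`. Shapes are compared starting from the last axis, and an axis of length 1 can be stretched to match:

```python
import numpy as np

a = np.ones((2, 3))
row = np.array([10, 20, 30])   # shape (3,)
col = np.array([100, 200])     # shape (2,)

# (2, 3) and (3,) match on the last axis, so row is added to every row of a
print((a + row).shape)

# (2, 3) and (2,) do not match on the last axis, so this raises a ValueError
try:
    a + col
except ValueError as err:
    print(f'broadcast failed: {err}')

# Adding a new axis gives col the shape (2, 1), which can be stretched to (2, 3)
stretched = a + col[:, np.newaxis]
print(stretched)
```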
%% Cell type:markdown id: tags:
### Array methods
Array methods also implement useful operations, sometimes similar to the ufuncs.
Remember that array methods are called on the `ndarray` object. You can find the full list of methods [here](https://numpy.org/doc/stable/reference/arrays.ndarray.html) along with all other important information on `ndarray`.
%% Cell type:code id: tags:
``` python
x = np.arange(12)
x.shape = (3, 4)
x # array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]])
x.max() # 11
```
%% Cell type:markdown id: tags:
# How to Use Arrays (ndarray)
text
%% Cell type:markdown id: tags:
This notebook is based on the Numpy lesson from [Aalto Scientific Computing: Python for Scientific Computing](https://github.com/AaltoSciComp/python-for-scicomp/) and [W3Schools](https://www.w3schools.com/python/numpy/).
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
## Indexing and Slicing
NumPy has many ways to extract values out of arrays:
- You can select a single element
- You can select rows or columns
- You can select ranges where a condition is true.
An example of some ways of indexing is shown in the following image (credits GeeksForGeeks):
<img src="https://media.geeksforgeeks.org/wp-content/uploads/Numpy1.jpg" alt="indexing" style="width:400px;"/>
Clever and efficient use of these operations is a key to NumPy's speed.
%% Cell type:markdown id: tags:
<font color='red'>Reminder: In Python, all indexing starts at zero, so the first element of a list or NumPy array has index 0!</font>
%% Cell type:code id: tags:
``` python
a = np.arange(16).reshape(4, 4) # 4x4 matrix from 0 to 15
print(f'a:\n{a}\n')
print(f'a[0]:\n{a[0]}\n') # first row
print(f'a[:,0]:\n{a[:,0]}\n') # first column
print(f'a[1:3,1:3]:\n{a[1:3,1:3]}\n') # middle 2x2 array
print(f'a[(0, 1), (1, 1)]:\n{a[(0, 1), (1, 1)]}') # second element of first and second row as array
```
%% Cell type:markdown id: tags:
You can also perform *boolean indexing* on arrays, such as shown below:
%% Cell type:code id: tags:
``` python
print(f'a > 7:\n{a > 7}\n') # creates boolean matrix of same size as a
print(f'a[a > 7]:\n{a[a > 7]}\n') # array with matching values of above criterion
```
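%% Cell type:markdown id: tags:
Boolean conditions can also be combined. The sketch below rebuilds the same 4x4 array and uses hypothetical names `mask` and `b`; it is one possible illustration, not part of the original lesson:

```python
import numpy as np

a = np.arange(16).reshape(4, 4)

# Combine conditions with & (and) and | (or); the parentheses are required
mask = (a > 3) & (a % 2 == 0)
print(a[mask])

# Boolean masks can also be used to assign values in place
b = a.copy()
b[b > 12] = 0
print(b)
```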
%% Cell type:markdown id: tags:
---
### <font color='red'>Exercise</font>
%% Cell type:markdown id: tags:
For the reshaped taxi ride duration array `taxi_weeks`, create the following arrays using slicing:
- An array containing only daily total durations of *fridays*
- An array containing *monday's* total durations from week 2 up to week 5
- An array containing only entries with a total duration of more than 6000 minutes
%% Cell type:code id: tags:
``` python
fridays = 'Your code here'
print(fridays)
mondays_week_2_to_5 = 'Your code here'
print(mondays_week_2_to_5)
total_duration_over_6000 = 'Your code here'
print(total_duration_over_6000)
```
%% Cell type:markdown id: tags:
### <font color='red'>Exercise</font>
%% Cell type:markdown id: tags:
The reshaped array `taxi_weeks` currently starts on a friday because this is the first day of the year. People often prefer to have the first column of the array corresponding to a monday instead. Using *slicing* and *reshaping*, create a new version of `taxi_weeks` from the `durations` array where the first column represents monday and chronological order is maintained.
%% Cell type:markdown id: tags:
> Hint: It is easier if you remove some observations at the beginning and the end because they are not part of a full week of observations.
%% Cell type:code id: tags:
``` python
taxi_weeks_monday = 'Your code here'
```
%% Cell type:markdown id: tags:
Again, we visualise the result:
%% Cell type:code id: tags:
``` python
labels = ['monday', 'tuesday', 'wednesday', 'thursday','friday', 'saturday', 'sunday']
plot_taxi_weeks(taxi_weeks_monday,labels)
```
%% Cell type:markdown id: tags:
---
## Array reshaping
Arrays can be [reshaped](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html) in many different ways, as long as the number of entries in the new shape does not differ from the number of entries in the original array.
For example, the following array can be reshaped into a 3 by 3 array:
<img src="./1dim.png" alt="drawing" width="600"/>
By reshaping this array into a 3 by 3 array using the default reading order, the following array is created:
<img src="./2dim.png" alt="drawing" style="width:200px;"/>
%% Cell type:code id: tags:allow_errors
``` python
arr = np.arange(10)
print(f'original:\n{arr}')
print(f'\n5 rows and 2 columns:\n{arr.reshape((5, 2))}')
print(f'\n2 rows and 5 columns:\n{arr.reshape((2, -1))}') # -1 lets NumPy infer the fitting length of the dimension
print(f'\n1 row and 5 columns:\n{arr.reshape((1, 5))}') # This action will cause an error because
# 10 entries do not fit in a 1 by 5 array
```
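%% Cell type:markdown id: tags:
The "default reading order" fills the new shape row by row. As a sketch (the names `by_rows` and `by_columns` are assumptions for this example), the `order` argument of `reshape` switches to column-by-column filling, and a mismatching size can be caught as a `ValueError`:

```python
import numpy as np

arr = np.arange(6)

by_rows = arr.reshape((2, 3))                 # default order='C': fill row by row
by_columns = arr.reshape((2, 3), order='F')   # order='F': fill column by column

print(by_rows)
print(by_columns)

# A shape whose number of entries does not match raises a ValueError
try:
    arr.reshape((4, 2))
except ValueError as err:
    print(f'reshape failed: {err}')
```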
%% Cell type:markdown id: tags:
---
### <font color='red'>Exercise</font>
%% Cell type:markdown id: tags:
Reshape the Taxi array as loaded in the previous exercise such that the array columns represent weekdays and the array rows represent different weeks in the period of the data set. Note that the first day of the year 2016 was a *friday*, so the week representation in the columns will start at *friday*.
%% Cell type:code id: tags:
``` python
taxi_weeks = 'Your code here'
```
%% Cell type:markdown id: tags:
A visualization of the reshaped array:
%% Cell type:code id: tags:
``` python
from numpy_files.plotting_functions import plot_taxi_weeks
plot_taxi_weeks(taxi_weeks, labels = ['friday','saturday','sunday','monday','tuesday','wednesday','thursday'])
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
---
## View vs copy
See the cell below:
%% Cell type:code id: tags:
``` python
a = np.eye(4) # Create an array
print(f'a:\n{a}\n') # Print a
b = a[:,0] # Set variable b as the first column of a
b[0] = 5 # Set the first element of b to 5
print(f'b:\n{b}\n') # print b
print(f'a:\n{a}\n') # print a again
```
%% Cell type:markdown id: tags:
The change in ``b`` has also changed the array ``a``!
This is because ``b`` is merely a *view* of a part of array ``a``. Both
variables point to the same memory. Hence, if one is changed, the other
one also changes! If you need to keep the original array as is, use `np.copy(a)` or `a.copy()`.
%% Cell type:code id: tags:
``` python
a = np.eye(4) # Create an array
print(f'a:\n{a}\n') # Print a
b = np.copy(a)[:,0] # Set variable b as a copy of the first column of a
b[0] = 5 # Set the first element of b to 5
print(f'b:\n{b}\n') # print b
print(f'a:\n{a}\n') # print a again
```
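%% Cell type:markdown id: tags:
Whether two arrays point to the same memory can be checked with `np.shares_memory`. The sketch below uses hypothetical names `view`, `copy` and `picked`:

```python
import numpy as np

a = np.eye(4)

view = a[:, 0]          # basic slicing returns a view
copy = a[:, 0].copy()   # explicit copy with its own memory

print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, copy))   # False

# Indexing with a list of indices returns a copy, not a view
picked = a[[0, 1]]
print(np.shares_memory(a, picked))
```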
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
---
## Saving and loading arrays
When working with arrays, it might be useful to save or load an array to a file on your computer. This can be done using the `np.save()` and `np.load()` functions respectively:
%% Cell type:code id: tags:
``` python
arr = np.linspace(0, 10, 11) # Create an array
print(f'arr:\n{arr}')
np.save('arr.npy', arr) # Save the array to a file on your computer
arr = None # Set the arr variable to None
print(f'arr:\n{arr}')
arr = np.load('arr.npy') # Load the array from the created .npy file
print(arr)
```
%% Cell type:markdown id: tags:
You have now saved `arr.npy` so that you can use it later and in different scripts! It is also possible to load csv or txt files using the `np.loadtxt()` function. By passing the correct string representing the delimiter character, a txt or csv file can be loaded as an array:
%% Cell type:code id: tags:
``` python
arr_from_csv = np.loadtxt('./numpy_files/example_data.csv', delimiter=',') # This file uses the comma as the separating character
arr_from_txt = np.loadtxt('./numpy_files/example_data.txt', delimiter='\t') # This file uses a tab as the separating character
print(f'array from csv file:\n{arr_from_csv}\n')
print(f'array from txt file:\n{arr_from_txt}')
```
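%% Cell type:markdown id: tags:
To try `np.loadtxt` without any files on disk, the sketch below reads from an in-memory text buffer. The string `csv_text` is a hypothetical stand-in for a file's contents, and `np.savetxt` is shown as the text counterpart of `np.save`:

```python
import io
import numpy as np

# Hypothetical in-memory file contents; a real file path works the same way
csv_text = "0.1, 5.0\n0.2, 6.0\n0.3, 5.7\n"

data = np.loadtxt(io.StringIO(csv_text), delimiter=',')
print(data.shape)

# unpack=True returns one array per column
x, y = np.loadtxt(io.StringIO(csv_text), delimiter=',', unpack=True)
print(x, y)

# np.savetxt writes an array back out as text
buffer = io.StringIO()
np.savetxt(buffer, data, delimiter=',', fmt='%.1f')
print(buffer.getvalue())
```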
%% Cell type:markdown id: tags:
---
### <font color='red'>Exercise</font>
%% Cell type:markdown id: tags:
Load the provided csv `taxi_duration.txt` using the `np.loadtxt` function. The text file contains two columns: one representing the day of the year, and the other representing the daily total duration of taxi rides corresponding to the day of the year. Check the number of days in your loaded dataset. You can preview the file in a text editor if you want.
%% Cell type:code id: tags:
``` python
taxis = 'Your code here'
# print(f'The dataset is {len(taxis)} days long.')
```
%% Cell type:markdown id: tags:
Now, visualize the dataset by running the cell below
%% Cell type:code id: tags:
``` python
from numpy_files.plotting_functions import plot_taxi_time_series
plot_taxi_time_series(taxis)
```
%% Cell type:markdown id: tags:
# Using Numpy for Mathematics
using it more
%% Cell type:markdown id: tags:
This notebook is based on the Numpy lesson from [Aalto Scientific Computing: Python for Scientific Computing](https://github.com/AaltoSciComp/python-for-scicomp/) and [W3Schools](https://www.w3schools.com/python/numpy/).
%% Cell type:markdown id: tags:
Clearly, you can do math on arrays. Math in NumPy is fast because it is implemented in C or Fortran, just like in most other high-level languages such as R and Matlab.
By default, in NumPy all math is performed element-by-element.
%% Cell type:code id: tags:
``` python
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
c = a + b
d = np.add(a,b)
print('a\n', a, '\n')
print('b:\n', b, '\n')
print('a + b:\n', a + b, '\n')
print('a * b:\n', a * b, '\n')
print('a / b:\n', a / b, '\n')
print('square root of a:\n', np.sqrt(a), '\n')
```
%% Cell type:markdown id: tags:
The sum or mean of an array can be obtained with the `np.sum` and `np.mean` functions:
%% Cell type:code id: tags:
``` python
print('sum of a:\n', np.sum(a), '\n')
print('mean of a:\n', np.mean(a), '\n')
```
%% Cell type:markdown id: tags:
In the above cell we see that `np.sum(a)` provides the sum of all elements in a. If we wish to get the sum per row or per column, we can specify the *axis* over which to sum: summing over axis 0 (down the rows) gives one sum per column, and summing over axis 1 (across the columns) gives one sum per row:
%% Cell type:code id: tags:
``` python
a = np.array([[1,2],[3,4]])
print('a\n', a, '\n')
print('sum of a:\n', np.sum(a), '\n') # No specified axis
print('sum of a per column:\n', np.sum(a, axis = 0), '\n') # sum over axis 0
print('sum of a per row:\n', np.sum(a, axis = 1)) # sum over axis 1
```
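%% Cell type:markdown id: tags:
With a non-square array the effect of `axis` on the output shape is easier to see. This is a sketch with hypothetical names; `keepdims=True` is an optional argument that keeps the collapsed axis with length 1:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)

col_sums = np.sum(a, axis=0)   # collapses axis 0: one value per column
row_sums = np.sum(a, axis=1)   # collapses axis 1: one value per row

print(col_sums, col_sums.shape)
print(row_sums, row_sums.shape)

# keepdims=True keeps the result 2-dimensional, which is handy for broadcasting
row_means = np.mean(a, axis=1, keepdims=True)
centered = a - row_means
print(centered)
```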
%% Cell type:markdown id: tags:
---
### <font color='red'>Exercise</font>
%% Cell type:markdown id: tags:
Compute the standard deviation of the trip *duration* data, using the following functions: `np.sqrt()`, `np.mean()`, `np.sum()`, `np.size()`, and the mathematical operators `-` and `/`.
The standard deviation is defined as $\sqrt{\frac{\sum^{n}_{i=1} \left( x_i - \bar{x} \right)^2}{n}}$ for a vector x with size n. Compare the result with the output of the `np.std()` function.
The array to use for this exercise is the `durations` array as defined below, which contains only the Taxi ride durations of the original imported array (no day of the year column).
%% Cell type:code id: tags:
``` python
taxis = np.loadtxt('./numpy_files/taxi_duration.txt', delimiter=',')
```
%% Cell type:code id: tags:
``` python
durations = taxis[:,1]
mean_duration = 'Your code here'
std_duration = 'Your code here'
print(f'Mean duration: {mean_duration}\nStandard deviation of the duration: {std_duration}')
```
%% Cell type:markdown id: tags:
Next, compute the mean and standard deviation per weekday using the `taxi_weeks` array:
- On which weekday, on average, does the highest taxi ride duration occur?
%% Cell type:code id: tags:
``` python
mean_duration = 'Your code here'
std_duration = 'Your code here'
print(f'Mean duration: {mean_duration}\nStandard deviation of the duration: {std_duration}')
print(np.std(taxi_weeks, axis=0))
```
%% Cell type:markdown id: tags:
---
## Dot product and matrix multiplication
As we saw in the previous example, the `*` operator or the `np.multiply()` function performs an element-wise multiplication. To perform matrix multiplication, the `@` operator can be used:
%% Cell type:code id: tags:
``` python
a = np.eye(3) * 2
b = np.arange(1,10, dtype=np.float64).reshape((3,3))
print(f'a\n{a}\n')
print(f'b:\n{b}\n')
print(f'a * b:\n{a * b}\n') # Element-wise multiplication
print(f'a @ b:\n{a @ b}\n') # dot product or matrix multiplication
print(f'np.dot(a, b):\n{np.dot(a, b)}\n') # dot product or matrix multiplication
```
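%% Cell type:markdown id: tags:
The `@` operator also works with 1-dimensional arrays. A small sketch with hypothetical arrays `M`, `v` and `w` contrasts it with the element-wise `*`:

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
v = np.array([1, 10])
w = np.array([2, 3])

inner = v @ w          # two 1-D arrays: scalar (inner) product
mat_vec = M @ v        # 2-D with 1-D: matrix-vector product
elementwise = M * v    # element-wise: v is broadcast across the rows of M

print(inner)
print(mat_vec)
print(elementwise)
```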
%% Cell type:markdown id: tags:
To transpose an array representing a vector or matrix, the `np.transpose()` function can be used. Alternatively, an array can be transposed by accessing its `.T` attribute.
%% Cell type:code id: tags:
``` python
a = np.arange(6)
a = a.reshape((3,2)) # a now has 3 rows and 2 columns
print(f'a:\n{a}\n')
print(f'np.transpose(a):\n{np.transpose(a)}\n') # the transpose has 2 rows and 3 columns
print(f'a.T:\n{a.T}') # the transpose has 2 rows and 3 columns (same outcome as line above)
```
%% Cell type:markdown id: tags:
---
### <font color='red'>Exercise</font>
%% Cell type:markdown id: tags:
Create the two matrices A and B as numpy arrays: $A = \begin{bmatrix} 1&4&2\\0&2&1\\3&7&6 \end{bmatrix}$, $B = \begin{bmatrix} 2&0&1\\0&3&0\\1&2&0 \end{bmatrix}$.
Next, perform the following operations:
- Compute $C = A + B$
- Compute $D = A \cdot B$
- Compute $D^T$
%% Cell type:code id: tags:
``` python
A = 'Your code here'
B = 'Your code here'
C = 'Your code here'
D = 'Your code here'
print(f'{A}\n')
print(f'{B}\n')
print(f'{C}\n')
print(f'{D}\n')
```
%% Cell type:markdown id: tags:
## Example: Linear algebra using Numpy
In this short example, we will solve a linear system of equations using numpy.
Let's say we want to fit a polynomial $y = a_0 x^2 + a_1 x + a_2$ through the points $(1,0)$, $(2,2)$, and $(3,1)$.
We can obtain the variables $a_0$, $a_1$, and $a_2$ by solving the following linear system of equations:
$\begin{bmatrix} 1 & 1 & 1\\ 4 & 2 & 1\\ 9 & 3 & 1 \end{bmatrix} \begin{bmatrix} a_0\\ a_1\\ a_2 \end{bmatrix} = \begin{bmatrix} 0\\ 2\\ 1 \end{bmatrix}$
This is a system of linear equations of the form $\mathbf{A}\mathbf{x} = \mathbf{b}$, where A and b are given. If A is invertible, the equation can be solved by multiplying both sides by the inverse of A: $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$
%% Cell type:code id: tags:
``` python
A = np.array([[1, 1, 1],
[4, 2, 1],
[9, 3, 1]])
b = np.array([0, 2, 1]).T
x = np.linalg.inv(A) @ b
print(f'A:\n{A}\n')
print(f'b:\n{b}\n')
print(f'x:\n{x}\n')
```
%% Cell type:markdown id: tags:
Checking the specified conditions:
%% Cell type:code id: tags:
``` python
a0, a1, a2 = x
print(f'a0 * 1**2 + a1 * 1 + a2 = {a0 * 1**2 + a1 * 1 + a2:.2f}') # Check solution at x = 1
print(f'a0 * 2**2 + a1 * 2 + a2 = {a0 * 2**2 + a1 * 2 + a2:.2f}') # Check solution at x = 2
print(f'a0 * 3**2 + a1 * 3 + a2 = {a0 * 3**2 + a1 * 3 + a2:.2f}') # Check solution at x = 3
```
%% Cell type:markdown id: tags:
It can be seen that the solution is nearly correct... The values used in these calculations are floats, which cannot represent every number exactly. Therefore, when performing calculations, the outcome might differ by a very small amount, in the order of 1e-15 times the magnitude of the number.
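Because of such rounding differences, floats are usually compared within a tolerance rather than exactly. The sketch below repeats the system from above and checks it with `np.allclose`; whether the exact comparison returns `True` for every entry may depend on the platform, so no output is shown for it:

```python
import numpy as np

A = np.array([[1, 1, 1],
              [4, 2, 1],
              [9, 3, 1]])
b = np.array([0, 2, 1])

x = np.linalg.inv(A) @ b

print(A @ x == b)               # exact comparison, may contain False
print(np.allclose(A @ x, b))    # comparison within a small tolerance
```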
%% Cell type:markdown id: tags:
---
Alternatively, we could have used the `np.linalg.solve()` function to solve the equation $\mathbf{A}\mathbf{x} = \mathbf{b}$ given **A** and **b**:
%% Cell type:code id: tags:
``` python
x = np.linalg.solve(A, b)
print(x)
```
%% Cell type:markdown id: tags:
---
Another way of obtaining the parameters of a polynomial fitted to a number of points (utilizing the least squares method) is through the `np.polyfit()` function, where the x and y coordinates of the points must be passed in two separate arrays:
%% Cell type:code id: tags:
``` python
coordinates = np.array([[1, 0], # Define an array containing the required coordinates
[2, 2],
[3, 1]])
x = np.polyfit(coordinates[:,0], coordinates[:,1], deg=2) # Use the np.polyfit function specifying the coordinates and the degree of polynomial
print(x)
```
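%% Cell type:markdown id: tags:
The coefficients returned by `np.polyfit` are ordered from the highest power to the lowest. As a sketch (the names `x_points`, `y_points` and `fitted` are assumptions for this example), `np.polyval` evaluates the fitted polynomial so the result can be checked against the original points:

```python
import numpy as np

x_points = np.array([1, 2, 3])
y_points = np.array([0, 2, 1])

coefficients = np.polyfit(x_points, y_points, deg=2)   # highest power first

# np.polyval evaluates the polynomial at the given x values
fitted = np.polyval(coefficients, x_points)
print(fitted)
print(np.allclose(fitted, y_points))
```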
%% Cell type:markdown id: tags:
---
This shows that there are often multiple ways to tackle a problem using numpy, and for a lot of scenarios there is likely already a numpy function which can be used to reduce the amount of code needed to perform a task.
%% Cell type:markdown id: tags:
## More linear algebra and other advanced math
In general, you use `arrays` (n-dimensions), not `matrices`
(specialized 2-dimensional) in NumPy.
Internally, NumPy doesn't invent its own math routines: it relies on
[BLAS](https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms)
and [LAPACK](https://en.wikipedia.org/wiki/LAPACK) to do this kind
of math - the same as many other languages.
- [Linear algebra in numpy documentation](https://numpy.org/doc/stable/reference/routines.linalg.html)
- [Scipy](https://docs.scipy.org/doc/scipy/reference/) has
more useful functions
- Many other libraries use NumPy arrays as the standard data
structure: they take data in this format, and return it similarly.
Thus, all the other packages you may want to use are compatible.
# Exercises
%% Cell type:markdown id: tags:
## Optional Exercises
If you have extra time, try these out.
1. Reverse a vector: given a vector, reverse it such that the last
element becomes the first, e.g. ``[1, 2, 3]`` => ``[3, 2, 1]``
2. Create a 2D array with zeros on the borders and 1 inside.
3. Create a random array of length 20 with elements [0, 1), then add 10 to all
elements in the range [0.2, 0.7).
4. What is `np.round(0.5)`? What is ``np.round(1.5)``? Why?
5. In addition to ``np.round``, explore `numpy.ceil`, `numpy.floor`,
`numpy.trunc`. In particular, take note of how they behave with
negative numbers.
6. Recall the identity $\sin^2(x) + \cos^2(x) = 1$. Create a
random 4x4 array with values in the range [0, 10). Now test the
equality with `numpy.equal`. What result do you get with
`numpy.allclose` instead of ``np.equal``?
7. Create a 1D array with 10 random elements. Sort it.
8. What's the difference between `np_array.sort()` and
`np.sort(np_array)`?
9. For the random array in question 7, instead of sorting it, perform
an indirect sort. That is, return the list of indices which would
index the array in sorted order.
10. Create a 4x4 array of zeros, and another 4x4 array of ones. Next
combine them into a single 8x4 array with the content of the zeros
array on top and the ones on the bottom. Finally, do the same,
but create a 4x8 array with the zeros on the left and the ones on
the right.
%% Cell type:code id: tags:
``` python
# Answer for Ex. 1
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 2
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 3
# YOUR SOLUTION MAY BE DIFFERENT IF YOU USE A DIFFERENT SEED
np.random.seed(42)
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 4
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 5
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 6
# YOUR SOLUTION MAY BE DIFFERENT IF YOU USE A DIFFERENT SEED
np.random.seed(42)
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 7
# YOUR SOLUTION MAY BE DIFFERENT IF YOU USE A DIFFERENT SEED
np.random.seed(42)
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 8
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 9
# YOUR SOLUTION MAY BE DIFFERENT IF YOU USE A DIFFERENT SEED
np.random.seed(42)
```
%% Cell type:code id: tags:
``` python
# Answer for Ex. 10.
```
File deleted
book/programming/code/numpy_files/barcharts.png

53.4 KiB

0.1, 5.0
0.2, 6.0
0.3, 5.7
0.4, 6.7
0.5, 5.8
0.6, 4.5
0.7, 4.9
0.8, 5.6
0.9, 5.7
1.0, 5.3
\ No newline at end of file
0.1 5.0
0.2 6.0
0.3 5.7
0.4 6.7
0.5 5.8
0.6 4.5
0.7 4.9
0.8 5.6
0.9 5.7
1.0 5.3
\ No newline at end of file
import numpy as np
import matplotlib.pyplot as plt


def plot_taxi_time_series(array):
    plt.figure(figsize=(10, 10))
    plt.plot(array[:,0], array[:,1])
    plt.xlabel('Day of the year')
    plt.ylabel('Daily total duration of taxi rides [minutes]')
    plt.show()


def plot_taxi_weeks(array, labels):
    plt.figure(figsize=(5, 8))
    plt.imshow(array)
    plt.xticks([0, 1, 2, 3, 4, 5, 6], labels, rotation=60)
    plt.ylabel('Week number')
    plt.yticks(np.arange(26), np.arange(1,27))
    bar = plt.colorbar()
    bar.set_label('Total daily taxi ride duration [minutes]')
\ No newline at end of file
1,2975
2,2412
3,2523
4,2805
5,2856
6,3447
7,3546
8,3908
9,3505
10,3371
11,2996
12,3138
13,3458
14,3541
15,3686
16,5049
17,2746
18,2639
19,2999
20,3794
21,4069
22,3793
23,948
24,1410
25,3661
26,5528
27,3787
28,3934
29,4304
30,5325
31,3256
32,3370
33,3260
34,4189
35,5326
36,3440
37,3362
38,3209
39,3005
40,3378
41,4977
42,4161
43,4618
44,4900
45,5319
46,2715
47,3843
48,5568
49,3926
50,3948
51,6351
52,4516
53,3543
54,4258
55,3799
56,4430
57,4498
58,5527
59,3681
60,4562
61,3437
62,4249
63,5609
64,9037
65,4214
66,3203
67,3059
68,3512
69,4237
70,4365
71,4296
72,5610
73,3309
74,5634
75,7116
76,3968
77,4662
78,4471
79,5910
80,3328
81,4724
82,3546
83,3625
84,5433
85,4161
86,3362
87,4382
88,3140
89,5174
90,3832
91,6672
92,4251
93,4009
94,5259
95,4956
96,4362
97,3866
98,6665
99,5761
100,4226
101,3376
102,3412
103,4365
104,4058
105,5140
106,5195
107,5920
108,4069
109,3182
110,3867
111,5206
112,3593
113,4563
114,4758
115,3331
116,3227
117,4671
118,4052
119,4025
120,5847
121,4136
122,4802
123,3080
124,4217
125,3796
126,4709
127,5155
128,4247
129,4697
130,4922
131,3849
132,3358
133,6704
134,6826
135,3940
136,5711
137,5282
138,4880
139,5752
140,4630
141,5028
142,4476
143,3683
144,3141
145,3865
146,5943
147,4996
148,3846
149,4437
150,3188
151,3444
152,3505
153,4176
154,5929
155,4126
156,4093
157,3929
158,4869
159,3794
160,4278
161,6276
162,4529
163,4842
164,4637
165,3019
166,4050
167,4497
168,5103
169,5242
170,3530
171,3362
172,3577
173,3843
174,4206
175,3849
176,5426
177,4815
178,3282
179,4356
180,3528
181,3540
182,3724