PA 1.6: Boxes and Bugs

No description has been provided for this image No description has been provided for this image

CEGM1000 MUDE: Week 1.6. Due: before Friday, Oct 4, 2024.

The purpose of this notebook is to introduce a few useful Python topics:

  1. Visualize a matrix with plt.matshow (Matplotlib)
  2. Filling in the contents of a matrix ($m$ x $n$ Numpy array) with specific patterns and values
  3. Illustrate the difference between range and np.arange

Context

For many scientific computing applications, in particular the field of numerical analysis, we formulate and solve our problems using matrices. The matrix itself is an arbitrary collection of values, however, the formulation and discretization of the problem will dictate a specific structure and meaning to the arrangement of the values inside a given matrix. When solving ordinary and partial differential equations with numerical schemes, we discretize space and time into discrete points or intervals, and the values of interest are specified by the elements of a matrix or vector. For example, a vector of the quantity $y$ can be discretized as y = [y0, y1, y2, ... , yn], where each element yi refers to the $n$ spatial coordinate of $y_i$. For 2D problems, or perhaps problem with a temporal component (a dependence on time), we need to encode this information in matrices. Thus, when implementing numerical schemes, it is important to be able to fill in the values of a matrix in an efficient and reliable way.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

Topic 1: Visualizing a Matrix

At this point you should be familiar with the Numpy library and its key data type the ndarray. In other words, it should be very obvious why executing something like this:

import numpy as np
x = np.array([1, 4, 7, 9])

returns something like this:

numpy.ndarray

We have already also used Numpy to create 2D arrays to represent matrices. Often one of the challenges of working with matrices is visualizing their contents, especially when the matrices become very large. Fortunately there is a Matplotlib method that makes visualizing matrices very easy: matshow. When using the conventional import statement import matplotlib.pyplot as plt, we can use this method as plt.matshow.

Task 1.1: Use the Python help function to view the docstring (documentation) of the matrix visualization method.

In [2]:
# help(plt.matshow)

Task 1.2: Run the cell below to visualize the A matrix. Change the values and rerun the cell to see the effect, especially noting that each "square" corresponds to an element in the matrix. Simple, right?!

In [3]:
A = [[1, 2, 1, 1],
     [2, 3, 2, 2],
     [1, 2, 1, 1],
     [1, 2, 1, 1]]

plt.matshow(A)
plt.show()
No description has been provided for this image

Task 1.3: Run the cell below to see how a 100x100 matrix filled with random values looks.

In [4]:
A = np.random.rand(100, 100)
plt.matshow(A)
plt.show()
No description has been provided for this image

That's pretty much all there is to it. Note that the axes indicate the row and column indices.

Topic 2: Filling a Matrix

Now that we can visualize the contents of a matrix, lets find an efficient way to fill it with specific values, focusing on creating specific patterns in an efficient way with our Python code. First, let's recall a few more important things about Numpy arrays, focusing on the particular case of making a 2-dimensional arrays to represent 2-dimensional matrices.

One of the first things to remember is that Numpy uses a parameter shape to define the dimension and length of each axis of an array. For the 2D case, this means an $m$-by-$n$ matrix is specified with a tuple containing two elements: (m, n).

Second, Numpy has many methods that make it easy to create a matrix and fill it with specific values. Check out a cool list here: Numpy array creation routines. Some commonly used methods are:

  • np.zeros
  • np.ones
  • np.full

Third there are many Numpy methods that can modify an existing matrix (see the same list linked above), for example: np.fill_diagonal.

Finally, remember that arrays are quite smart when it comes to indexing. For example, we can use the range method (part of the standard Python library) to things to specific indices in an array.

With these tips in mind, let's go over a few warm-up exercises to see how to easily manipulate matrices.

Task 2.1:

Refresh your memory on the range function by printing the documentation. Then comment the help line and confirm that you can use the function by using a list comprehension to print: a) values from 1 to 5, then b) values 2, 4, 6, 8, 10.

In [5]:
# # help(range)
# 
# print('Part a:')
# [print(i) for i in YOUR_CODE_HERE]
# print('Part b:')
# [print(i) for i in YOUR_CODE_HERE]

# SOLUTION
print('Part 1:')
[print(i) for i in range(6)]

print('Part 2:')
[print(i) for i in range(2, 11, 2)]
Part 1:
0
1
2
3
4
5
Part 2:
2
4
6
8
10
Out[5]:
[None, None, None, None, None]

Task 2.2:

Use a Numpy method to create a 3x3 matrix filled with value 1.

In [6]:
# A = YOUR_CODE_HERE
# 
# assert np.all(A==1)
# assert A.shape==(3, 3)

# SOLUTION
A = np.ones((3,3))

assert np.all(A==1)
assert A.shape==(3, 3)

Task 2.3:

Use a Numpy method to create a 3x3 matrix filled with value 3 on the diagonal and 0 elsewhere.

In [7]:
# A = YOUR_CODE_HERE
# np.YOUR_CODE_HERE
# 
# assert np.all(A.diagonal()==3)
# assert A.sum()==9
# assert A.shape==(3, 3)

# SOLUTION
A = np.zeros((3,3))
np.fill_diagonal(A, 3)

assert np.all(A.diagonal()==3)
assert A.sum()==9
assert A.shape==(3, 3)

Task 2.4:

Use a Numpy method to create a 10x10 matrix, then assign every other element in the diagonal of the matrix to the value 1 using range and indexing. Use plt.matshow() to confirm that the matrix plot looks like a checkerboard.

In [8]:
# A = YOUR_CODE_HERE
# A[YOUR_CODE_HERE, YOUR_CODE_HERE] = YOUR_CODE_HERE

# plt.matshow(A)
# plt.show()

# assert A.shape==(10, 10)
# assert A.sum()==5
# assert np.sum(A==1)==5
# assert np.sum(A==0)==95

# SOLUTION
A = np.zeros((10,10))
A[range(1, 10, 2), range(1, 10, 2)] = 1

plt.matshow(A)
plt.show()

assert A.shape==(10, 10)
assert A.sum()==5
assert np.sum(A==1)==5
assert np.sum(A==0)==95
No description has been provided for this image

Task 2.5:

Use a Numpy method to create a 5x5 matrix, fill the diagonal with value 5, then use range and indexing to assign the diagonal above and below the center diagonal to the value 1. The solution is illustrated in the imported Markdown figure below.

solution for matrix 1

In [9]:
# A = YOUR_CODE_HERE
# YOUR_CODE_HERE
# A[YOUR_CODE_HERE, YOUR_CODE_HERE] = YOUR_CODE_HERE
# A[YOUR_CODE_HERE, YOUR_CODE_HERE] = YOUR_CODE_HERE

# plt.matshow(A)
# plt.show()

# assert A.shape==(5, 5)
# assert A.sum()==(5*5 + 2*4)

# SOLUTION
A = np.zeros((5, 5))
np.fill_diagonal(A, 5)
A[range(4), range(1, 5)] = 1
A[range(1, 5), range(4)] = 1

plt.matshow(A)
plt.show()

assert A.shape==(5, 5)
assert A.sum()==(5*5 + 2*4)
No description has been provided for this image

Task 2.6: Create the matrix illustrated in the figure below, where the values are either 0 or 1.

solution for matrix 1

In [10]:
# A = YOUR_CODE_HERE
# for i in YOUR_CODE_HERE:
#     YOUR_CODE_HERE

# plt.matshow(A)
# plt.show()

# assert A.shape==(10, 10)
# assert A.sum()==25

# SOLUTION
A = np.zeros((10, 10))
for i in range(0, 10, 2):
    A[i, range(0, 10, 2)] = 1

plt.matshow(A)
plt.show()

assert A.shape==(10, 10)
assert A.sum()==25
No description has been provided for this image

Topic 3: a range and arange

The previous part used range to fill in the items of a matrix. However, you may also be familiar with a method from the Numpy library called arange. On the one hand, both methods do similar things, which can roughly be described as follows:

  • if one input, a, is given, count integers from 0 to a
  • if two inputs, a and b, are given, count integers from a to b
  • if three inputs, a, b and c, are given, count from a to b by (integer!) increment c
  • in all cases, exclude b

Despite these similarities they return different object types, which often leads to confusion or errors if used without explicitly accounting for this difference. Let's take a closer look to find out more.

Task 3.1:

Print the documentation for np.arange and compare it to range until you can identify the differences.

In [11]:
help(np.arange)
Help on built-in function arange in module numpy:

arange(...)
    arange([start,] stop[, step,], dtype=None, *, like=None)

    Return evenly spaced values within a given interval.

    ``arange`` can be called with a varying number of positional arguments:

    * ``arange(stop)``: Values are generated within the half-open interval
      ``[0, stop)`` (in other words, the interval including `start` but
      excluding `stop`).
    * ``arange(start, stop)``: Values are generated within the half-open
      interval ``[start, stop)``.
    * ``arange(start, stop, step)`` Values are generated within the half-open
      interval ``[start, stop)``, with spacing between values given by
      ``step``.

    For integer arguments the function is roughly equivalent to the Python
    built-in :py:class:`range`, but returns an ndarray rather than a ``range``
    instance.

    When using a non-integer step, such as 0.1, it is often better to use
    `numpy.linspace`.

    See the Warning sections below for more information.

    Parameters
    ----------
    start : integer or real, optional
        Start of interval.  The interval includes this value.  The default
        start value is 0.
    stop : integer or real
        End of interval.  The interval does not include this value, except
        in some cases where `step` is not an integer and floating point
        round-off affects the length of `out`.
    step : integer or real, optional
        Spacing between values.  For any output `out`, this is the distance
        between two adjacent values, ``out[i+1] - out[i]``.  The default
        step size is 1.  If `step` is specified as a position argument,
        `start` must also be given.
    dtype : dtype, optional
        The type of the output array.  If `dtype` is not given, infer the data
        type from the other input arguments.
    like : array_like, optional
        Reference object to allow the creation of arrays which are not
        NumPy arrays. If an array-like passed in as ``like`` supports
        the ``__array_function__`` protocol, the result will be defined
        by it. In this case, it ensures the creation of an array object
        compatible with that passed in via this argument.

        .. versionadded:: 1.20.0

    Returns
    -------
    arange : ndarray
        Array of evenly spaced values.

        For floating point arguments, the length of the result is
        ``ceil((stop - start)/step)``.  Because of floating point overflow,
        this rule may result in the last element of `out` being greater
        than `stop`.

    Warnings
    --------
    The length of the output might not be numerically stable.

    Another stability issue is due to the internal implementation of
    `numpy.arange`.
    The actual step value used to populate the array is
    ``dtype(start + step) - dtype(start)`` and not `step`. Precision loss
    can occur here, due to casting or due to using floating points when
    `start` is much larger than `step`. This can lead to unexpected
    behaviour. For example::

      >>> np.arange(0, 5, 0.5, dtype=int)
      array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
      >>> np.arange(-3, 3, 0.5, dtype=int)
      array([-3, -2, -1,  0,  1,  2,  3,  4,  5,  6,  7,  8])

    In such cases, the use of `numpy.linspace` should be preferred.

    The built-in :py:class:`range` generates :std:doc:`Python built-in integers
    that have arbitrary size <python:c-api/long>`, while `numpy.arange`
    produces `numpy.int32` or `numpy.int64` numbers. This may result in
    incorrect results for large integer values::

      >>> power = 40
      >>> modulo = 10000
      >>> x1 = [(n ** power) % modulo for n in range(8)]
      >>> x2 = [(n ** power) % modulo for n in np.arange(8)]
      >>> print(x1)
      [0, 1, 7776, 8801, 6176, 625, 6576, 4001]  # correct
      >>> print(x2)
      [0, 1, 7776, 7185, 0, 5969, 4816, 3361]  # incorrect

    See Also
    --------
    numpy.linspace : Evenly spaced numbers with careful handling of endpoints.
    numpy.ogrid: Arrays of evenly spaced numbers in N-dimensions.
    numpy.mgrid: Grid-shaped arrays of evenly spaced numbers in N-dimensions.
    :ref:`how-to-partition`

    Examples
    --------
    >>> np.arange(3)
    array([0, 1, 2])
    >>> np.arange(3.0)
    array([ 0.,  1.,  2.])
    >>> np.arange(3,7)
    array([3, 4, 5, 6])
    >>> np.arange(3,7,2)
    array([3, 5])

In particular, note the following sentences in the docstring for np.arange:

For integer arguments the function is roughly equivalent to the Python
built-in :py:class:`range`, but returns an ndarray rather than a ``range``
instance.

When using a non-integer step, such as 0.1, it is often better to use
`numpy.linspace`.

The main difference is that np.arange returns an array!

Task 3.2:

Confirm that you understand the usage of np.arange by creating the same two sets of integer values as in Task 2.1 (integers 0 through 5 and 2 through 10 by 2's), except this time you will produce Numpy arrays in addition the printing the indices.

In [12]:
# x = YOUR_CODE_HERE
# print('Part 1:')
# [print(i) for i in YOUR_CODE_HERE]
# 
# assert type(x) == np.ndarray
# assert np.all(x == [0, 1, 2, 3, 4, 5])
# 
# x = YOUR_CODE_HERE
# print('Part 2:')
# [print(i) for i in YOUR_CODE_HERE]
# 
# assert type(x) == np.ndarray
# assert np.all(x == [2, 4, 6, 8, 10])

# SOLUTION
x = np.arange(6)
print('Part 1:')
[print(i) for i in x]

assert type(x) == np.ndarray
assert np.all(x == [0, 1, 2, 3, 4, 5])

x = np.arange(2, 11, 2)
print('Part 2:')
[print(i) for i in x]

assert type(x) == np.ndarray
assert np.all(x == [2, 4, 6, 8, 10])
Part 1:
0
1
2
3
4
5
Part 2:
2
4
6
8
10

This Part is not meant to be complicated; rather, it is meant to explicitly indicate the difference between range and np.arange to help you debug your code more easily. The main takeaway is that you should use range when you are iterating through indices and don't need to use the indices as values, whereas np.arange is necessary when the indices are needed as values. It is also good to recognize that range is part of the standard Python library, whereas np.arange is not (it is part of Numpy). This is because range returns a range object, whereas np.arange returns a Numpy array.

End of notebook.

© Copyright 2024 MUDE TU Delft. This work is licensed under a CC BY 4.0 License.