Homework 2¶

Due: Monday, 2/2/26, 23:59:59

Problem 1 (double points) — Gauss-Jordan elimination

Write a Python function gauss_jordan using the basic arrays defined by NumPy to perform Gauss-Jordan elimination on a square matrix. The routine should take in the matrix and return its inverse. It should have an optional parameter that controls whether it uses partial pivoting, so that you can compare the numerical stability of the method when it uses pivoting and when it doesn’t. Be sure to check that your program is working by multiplying the inverse it produces by the original matrix and comparing to the identity matrix.

Suggestions¶

  1. Matrix multiplication in NumPy uses the @ symbol: A @ B
  2. As you check your routine, it can be useful to have NumPy show array values as fractions, rather than the default floating point numbers. To display fractions, use the following code:
from fractions import Fraction
np.set_printoptions(formatter={'float': lambda x: str(Fraction(x).limit_denominator())});

If you want to revert to floating-point output, call np.set_printoptions(precision=5).

  3. You may need to ensure that the data type of the matrix passed to your routine is floating point (not integer). You can use a call such as
m = np.asarray(m, dtype=np.float64)

Solution¶

In [1]:
import numpy as np
from numpy.random import default_rng; rng = default_rng()
import matplotlib.pyplot as plt
%matplotlib widget
from scipy.stats import sem      # standard error of the mean
from time import time            # for timing
from fractions import Fraction   # represent real numbers as fractions
import pandas as pd
from pathlib import Path
from matplotlib.ticker import LogLocator, FuncFormatter, NullFormatter

# The following imports some fitting code I wrote
import sys
sys.path.insert(0, '/Users/saeta/.python')
from fit.powerlaw import PowerLaw
In [2]:
# The following tells numpy to express output in the form of simplified fractions
np.set_printoptions(formatter={'float':lambda x: str(Fraction(x).limit_denominator())});
In [3]:
def gauss_jordan_aug(m, pivoting=True, showall=False):
    """
    Use Gauss-Jordan elimination, optionally with partial pivoting, to compute
    the inverse of square matrix m by first augmenting it with an identity matrix.
    Inputs:
        m:         NxN matrix
        pivoting:  if True, use partial pivoting
        showall:   if True, print intermediate steps (helps for debugging!)
    
    Output:
        the inverse of matrix m
    """
    # First check that we are working with a square matrix
    nrows, ncols = np.asarray(m).shape
    assert nrows == ncols, "gauss_jordan_aug requires a square matrix"

    # Create the augmented matrix [m | I], with nrows rows and 2*nrows columns
    am = np.hstack((m, np.identity(nrows, dtype=np.float64)))
    
    # Define some helper functions
    def exchange(r1, r2):
        "Exchange rows r1 and r2"
        if showall:
            show(f"\nBefore exchanging {r1} and {r2}\n")
        am[[r1, r2]] = am[[r2, r1]]
        if showall:
            show(f"\nAfter exchanging {r1} and {r2}\n")
            print("\n-----------------------------\n")

    def show(message):
        if showall:
            print(message)
            print(am)

    show("Starting matrix")

    # We'll use [i,j] indexing, so i represents the row and j the column
    for j in range(ncols):
        if pivoting:
            # Identify the best pivot
            rbest = j + np.argmax(np.abs(am[j:, j]))
            if showall:
                if rbest != j:
                    print(f"Exchanging {rbest} and {j} to pivot on {Fraction(am[rbest,j]).limit_denominator()}")
                else:
                    print(f"Pivoting at {rbest} on {Fraction(am[rbest,j]).limit_denominator()}")
            if rbest != j:
                # exchange rows rbest and j
                exchange(rbest, j)
        # Normalize row j so that the pivot element becomes 1
        scale = 1.0 / float(am[j, j])
        if scale != 1.0:
            # Scale row j from the pivot column rightward
            am[j, j:] *= scale
        
        # Now wipe out column j in rows i > j
        for i in range(j+1, nrows):
            multiple = -am[i, j]
            am[i, j:] += multiple * am[j, j:]
        show(f"After pivoting row {j}")
    
    # Now proceed from bottom to top
    for j in range(ncols-1, -1, -1):
        for i in range(j):
            multiple = -float(am[i, j])
            am[i, :] += multiple * am[j, :]
        show(f"After clearing column {j}")

    # Return the rightmost N columns, which are the inverse matrix
    return am[:, ncols:]
In [4]:
mat = np.array([[1,2,-3],[2,-1,4],[-2,1,3]])
inv = gauss_jordan_aug(mat, True, True)
Starting matrix
[[1 2 -3 1 0 0]
 [2 -1 4 0 1 0]
 [-2 1 3 0 0 1]]
Exchanging 1 and 0 to pivot on 2

Before exchanging 1 and 0

[[1 2 -3 1 0 0]
 [2 -1 4 0 1 0]
 [-2 1 3 0 0 1]]

After exchanging 1 and 0

[[2 -1 4 0 1 0]
 [1 2 -3 1 0 0]
 [-2 1 3 0 0 1]]

-----------------------------

After pivoting row 0
[[1 -1/2 2 0 1/2 0]
 [0 5/2 -5 1 -1/2 0]
 [0 0 7 0 1 1]]
Pivoting at 1 on 5/2
After pivoting row 1
[[1 -1/2 2 0 1/2 0]
 [0 1 -2 2/5 -1/5 0]
 [0 0 7 0 1 1]]
Pivoting at 2 on 7
After pivoting row 2
[[1 -1/2 2 0 1/2 0]
 [0 1 -2 2/5 -1/5 0]
 [0 0 1 0 1/7 1/7]]
After clearing column 2
[[1 -1/2 0 0 3/14 -2/7]
 [0 1 0 2/5 3/35 2/7]
 [0 0 1 0 1/7 1/7]]
After clearing column 1
[[1 0 0 1/5 9/35 -1/7]
 [0 1 0 2/5 3/35 2/7]
 [0 0 1 0 1/7 1/7]]
After clearing column 0
[[1 0 0 1/5 9/35 -1/7]
 [0 1 0 2/5 3/35 2/7]
 [0 0 1 0 1/7 1/7]]
In [5]:
mat @ inv
Out[5]:
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])

So, the routine does appear to work properly, at least for a simple small input.
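A stricter, automated version of this check uses np.allclose, which tolerates the tiny floating-point residuals that an exact comparison would reject. A minimal sketch, using np.linalg.inv here as a stand-in for gauss_jordan_aug:

```python
import numpy as np

mat = np.array([[1, 2, -3], [2, -1, 4], [-2, 1, 3]], dtype=np.float64)
inv = np.linalg.inv(mat)  # stand-in here for gauss_jordan_aug(mat)

# True when every entry of mat @ inv is within tolerance of the identity
ok = np.allclose(mat @ inv, np.identity(3))
```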

In [6]:
mat2 = np.array([[1,2,-3,-5],[2,-1,4,7],[-2,1,3,-3],[5,4,3,2]])
inv2 = gauss_jordan_aug(mat2, True)
mat2 @ inv2
Out[6]:
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])
In [7]:
inv2
Out[7]:
array([[134/111, 39/37, 7/37, -43/111],
       [-161/111, -51/37, -12/37, 79/111],
       [52/111, 19/37, 11/37, -20/111],
       [-91/111, -24/37, -10/37, 35/111]])
Problem 2 (double points) — Numerical stability

Use your gauss_jordan function to investigate the numerical instability of Gauss-Jordan elimination when you do and you don’t use pivoting. For test cases, use matrices of pseudorandom numbers generated by a call such as

from numpy.random import default_rng
rng = default_rng()

rng.uniform(<lower_bound>, <upper_bound>, size=(<dim>, <dim>))
# For example
rng.uniform(-5, 5, size=(5,5))
array([[ 4.23959642,  3.60732447, -0.16149884,  3.6900891 ,  2.65585838],
       [ 1.97695378, -3.09057092, -4.10553997, -4.78785912,  1.44810866],
       [-4.27088306, -4.36821066, -2.14689027, -0.29851331, -4.93396243],
       [-1.99115728, -1.63366415, -3.04232138, -1.73824459,  3.46747273],
       [-4.71288883, -1.94345835, -1.96308442, -1.07039782,  2.43547367]])

One way to explore the numerical error of the inverse is to multiply the matrix by its inverse, subtract the identity matrix, and then report some statistic on the elements of the remaining matrix (which would be all zeros if there were no error). Be sure to explain your choice for figure of merit (although it is really a figure of demerit!).

Prepare a plot showing your figure of merit as a function of matrix dimension when you use pivoting and when you don't and summarize your conclusion. (Your plot should make very clear why you need to use pivoting!)

Suggestions¶

  1. Errors typically depend on the matrix dimension $N$ to some power. Therefore, it is appropriate to plot on logarithmic axes and to space values of $N$ suitably. Think carefully about how to do this.
  2. The smallest value of $N$ for your plot might be something like 256.
  3. For each value of $N$ that you choose, you should perform multiple trials (at least 5) and compute the mean and standard deviation of the mean of the values of your chosen figure of merit. Note that scipy.stats has the function sem (standard error of the mean) that you can import with the statement from scipy.stats import sem. Or, you can compute it yourself from the standard deviation.
  4. As $N$ gets large, it can take quite some time to run your gauss_jordan routine. You might wish to have an indicator to show you how the run is progressing. A simple one to use is tqdm, which you can install using pip. Once installed, import with the statement from tqdm.notebook import tqdm. To use tqdm in a for loop, just surround the iterator with tqdm(). For example,
for n in tqdm(range(nreps), f'Testing inversion for N = {dim}'):
  5. To plot with error bars in matplotlib, use something like ax.errorbar(x, y, yerr=err, fmt='o', capsize=3), where ax is an existing axes object from plt.subplots().
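As a quick check on suggestion 3, scipy's sem agrees with the sample standard deviation divided by $\sqrt{n}$ (note that sem uses the $n-1$ denominator by default, while np.std defaults to $n$):

```python
import numpy as np
from scipy.stats import sem

x = np.array([1.2, 0.9, 1.1, 1.4, 1.0])
s = sem(x)
# ddof=1 matches sem's default (n - 1) denominator
manual = np.std(x, ddof=1) / np.sqrt(len(x))
```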
In [8]:
np.set_printoptions(precision=5)
In [9]:
def FOM(m, minv):
    """
    Evaluate the quality of the inverse matrix by subtracting
    the identity matrix from m @ minv, and then assessing how much
    the result differs from zeros everywhere.
    """
    diff = m @ minv - np.identity(m.shape[0])
    std = np.std(diff)  # compute the standard deviation of all N^2 values
    absmax = np.max(np.abs(diff))
    return dict(std=std, absmax=absmax)

Strategy¶

Because runtimes for large matrices can be long, and because it is convenient to log data to a file so that parallel processes can cooperate, I separate the gathering of data from its analysis. The routine test_inverse(N, metrics) generates an $N\times N$ matrix of random numbers in the range from $-N$ to $N$, computes its inverse in three ways, and then uses the metrics function, operating on m @ inv - identity, to evaluate the quality of each computed inverse. It also records the time taken to compute each inverse. The results are returned in a dictionary.

In [10]:
def test_inverse(N:int, metrics=FOM):
    """
    Time and evaluate three ways of inverting a random N x N matrix.

    Inputs:
        N:       the size of the square matrix; the matrix will be constructed from
                 random numbers drawn uniformly from the range -N to N
        metrics: a function that takes the original matrix and the computed inverse,
                 computes `diff = (m @ inv) - identity`, and then returns various statistics
                 on what should be an N x N matrix of zeros

    Output: a dictionary with the following fields:
        N:          the dimension of the square matrix
        t_nopivot:  the time taken by gauss_jordan_aug when not pivoting
        t_pivot:    the time taken by gauss_jordan_aug when pivoting
        t_numpy:    the time taken by np.linalg.inv

        for each statistic key returned by metrics, it includes
            nopivot_key, pivot_key, numpy_key
    """

    # Create the matrix of random numbers
    m = rng.uniform(-N, N, size=(N,N))
    res = dict(N=N)
    t = -time()
    inopivot = gauss_jordan_aug(m, pivoting=False)
    t += time()
    res['t_nopivot'] = t
    # now analyze the quality of the inverse
    # and add results to the results dictionary
    for k, v in metrics(m, inopivot).items():
        res['nopivot_' + k] = v
    # Now the same but with pivoting
    t = -time()
    ipivot = gauss_jordan_aug(m, pivoting=True)
    t += time()
    res['t_pivot'] = t
    for k, v in metrics(m, ipivot).items():
        res['pivot_' + k] = v
    # Finally, use the NumPy routine
    t = -time()
    inp = np.linalg.inv(m)
    t += time()
    res['t_numpy'] = t
    for k, v in metrics(m, inp).items():
        res['numpy_' + k] = v
    return res
    
In [11]:
r = test_inverse(1000)
r
Out[11]:
{'N': 1000,
 't_nopivot': 1.4365119934082031,
 'nopivot_std': np.float64(0.00010754514571186064),
 'nopivot_absmax': np.float64(0.002330039917135546),
 't_pivot': 1.433528184890747,
 'pivot_std': np.float64(8.58590884586706e-11),
 'pivot_absmax': np.float64(9.939626194025484e-10),
 't_numpy': 0.01861119270324707,
 'numpy_std': np.float64(6.752487654943789e-14),
 'numpy_absmax': np.float64(9.01339655579108e-13)}

This example shows the times and error statistics for a single run of test_inverse. Indeed, the errors are much greater without pivoting, and the NumPy routine is far faster than my routine written in pure Python.
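An aside on the timing pattern used in test_inverse: time.time has limited resolution on some platforms, and time.perf_counter is the clock intended for measuring intervals. It drops into the same accumulate-the-negative pattern:

```python
from time import perf_counter

t = -perf_counter()
_ = sum(range(10**6))  # stand-in for the work being timed
t += perf_counter()    # t now holds the elapsed time in seconds
```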

In [12]:
def log_test_inverse(N:int, path:Path):
    r = test_inverse(N)
    if not path.exists():
        # Create the file at path and write the header row
        with open(path, 'w') as f:
            f.write("\t".join(r.keys()))
    s = [f'{v:.4g}' if k != 'N' else f'{v}' for k, v in r.items()]
    with open(path, 'a') as f:
        f.write("\n" + "\t".join(s))
    return r

Parallel computation¶

To speed the computation, I wrote a short program that uses the above routines to run tests in parallel; it is included at the end of this notebook. Because of the way the Python multiprocessing module works, the code managing the parallel operation must reside in __main__, so it had to be run separately from the command line. The results were stored in the file run.tsv, which I now read:

In [13]:
df = pd.read_table("run.tsv")
df
Out[13]:
N t_nopivot nopivot_std nopivot_absmax t_pivot pivot_std pivot_absmax t_numpy numpy_std numpy_absmax
0 512 0.4878 5.352000e-08 6.922000e-07 0.4541 4.477000e-12 4.473000e-11 0.006452 6.679000e-15 6.064000e-14
1 512 0.4859 3.643000e-06 5.154000e-05 0.4491 7.070000e-11 9.261000e-10 0.009392 8.077000e-14 1.242000e-12
2 512 0.4942 4.496000e-07 7.032000e-06 0.4723 1.123000e-11 1.063000e-10 0.006192 1.427000e-14 1.331000e-13
3 512 0.4749 1.540000e-08 2.042000e-07 0.5018 1.139000e-11 1.003000e-10 0.006631 1.602000e-14 1.560000e-13
4 512 0.4773 1.953000e-07 2.388000e-06 0.4828 1.979000e-11 2.148000e-10 0.008087 2.246000e-14 3.693000e-13
... ... ... ... ... ... ... ... ... ... ...
129 6888 349.8000 1.137000e-02 2.030000e-01 3019.0000 3.897000e-09 6.560000e-08 911.500000 4.120000e-13 7.007000e-12
130 5792 251.4000 1.275000e-02 3.375000e-01 5129.0000 4.295000e-09 6.940000e-08 33.890000 4.895000e-13 8.264000e-12
131 6888 3881.0000 5.475000e-02 1.117000e+00 380.7000 4.017000e-08 7.383000e-07 6.902000 3.972000e-12 7.487000e-11
132 6888 377.9000 5.258000e+01 7.932000e+02 388.8000 9.287000e-09 1.911000e-07 7.347000 9.776000e-13 1.875000e-11
133 6888 374.3000 1.944000e-02 5.012000e-01 384.6000 5.866000e-09 1.349000e-07 5.510000 5.766000e-13 1.172000e-11

134 rows × 10 columns

In [14]:
[(int(x), f"{np.log2(x):.2f}") for x in sorted(df.N.unique())]
Out[14]:
[(512, '9.00'),
 (608, '9.25'),
 (724, '9.50'),
 (861, '9.75'),
 (1024, '10.00'),
 (1217, '10.25'),
 (1448, '10.50'),
 (1722, '10.75'),
 (2048, '11.00'),
 (2435, '11.25'),
 (2896, '11.50'),
 (3444, '11.75'),
 (4096, '12.00'),
 (4870, '12.25'),
 (5792, '12.50'),
 (6888, '12.75'),
 (8192, '13.00')]
In [15]:
def run_stats(df:pd.DataFrame, column:str):
    """
    Compute the mean, standard deviation, and standard error of
    the data in the `column` field of the DataFrame, grouped by
    value of `N`. Return the results in a new DataFrame.
    """
    Nvals = sorted(df.N.unique())
    res = []        
    for N in Nvals:
        stats = dict(N=N)
        rows = df[df.N==N][column]
        stats['mean'] = np.mean(rows)
        stats['std'] = np.std(rows)
        stats['sem'] = sem(rows)
        res.append(stats)
    out = pd.DataFrame(res)
    return out
In [16]:
run_stats(df, 't_nopivot')
Out[16]:
N mean std sem
0 512 0.424530 0.060121 0.020040
1 608 0.539740 0.004938 0.002469
2 724 0.821300 0.026738 0.013369
3 861 1.210200 0.026589 0.013294
4 1024 2.244600 0.389776 0.129925
5 1217 3.046200 0.104765 0.052383
6 1448 6.110600 1.155414 0.385138
7 1722 8.994800 0.903750 0.451875
8 2048 17.721000 2.998344 0.999448
9 2435 28.184000 0.616266 0.308133
10 2896 54.279000 8.624003 2.874668
11 3444 68.910000 8.085671 4.042836
12 4096 110.225000 22.492814 7.497605
13 4870 148.470000 12.003170 4.001057
14 5792 357.880000 177.936293 47.555475
15 6888 763.422222 1102.270459 389.711458
16 8192 2080.800000 120.859257 60.429628
In [17]:
s = run_stats(df, 't_nopivot')
In [19]:
def power_of_two(x):
    "This routine will be used to label the N axis"
    n = int(np.round(np.log2(x)))
    k = n - 10
    equals = (" = %d" % 2**n if k < 0 else r" = %d\;\mathrm{k}" % 2**k)
    return r"$2^{%d}%s$" % (n, equals)

myloglocator = LogLocator(base=2)
mylogformatter = FuncFormatter(lambda x, _: power_of_two(x))
mylogminorlocator = LogLocator(base=2, subs=(2**-0.75, 2**-0.5, 2**-0.25))
In [20]:
def plot_trend(df:pd.DataFrame, fields, **kwargs):
    """
    Given the DataFrame loaded above, and the names of three columns, 
    in the order (No Pivot, Pivot, NumPy), produce a log-log graph.
    """
    fig, ax = plt.subplots()
    for field, fmt, label in zip(fields, ('b.', 'r.', 'g.'), ('No Pivot', 'Pivot', 'NumPy')):
        stats = run_stats(df, field)
        # Let's estimate the fit parameters, which are amplitude, power, x0
        means = stats['mean'].to_numpy()
        yratio = means[-1] / means[0]
        Ns = stats.N.to_numpy()
        xratio = Ns[-1] / Ns[0]
        power = np.log(yratio) / np.log(xratio)
        amp = means[-1] / (Ns[-1]**power)
        p0 = (power, amp)
        #print(p0)
        trend = PowerLaw(Ns, means, yunc=stats['sem'], p0=p0)
        if trend.valid:
            slope = f" ({trend.params[0]:.1f})"
            trendy = trend(Ns) # compute the y values for the trend line
            ax.plot(Ns, trendy, c=fmt[0])
        else:
            slope = f" ({power:.1f})"
            guess = amp * Ns ** power
            ax.plot(Ns, guess, ':', c=fmt[0])
            print(trend.error)
        ax.errorbar(stats.N, stats['mean'], yerr=stats['sem'],
          fmt=fmt, capsize=3, label=f"{label}{slope}")
       
    ax.set_xscale('log')
    ax.set_yscale('log')
    ax.set_xlabel("$N$")
    ax.xaxis.set_major_locator(myloglocator)
    ax.xaxis.set_major_formatter(mylogformatter)
    ax.xaxis.set_minor_locator(mylogminorlocator)
    ax.xaxis.set_minor_formatter(NullFormatter())
    ax.legend()
    ax.grid()
    
    if 'ylabel' in kwargs:
        ax.set_ylabel(kwargs['ylabel'])
    if 'title' in kwargs:
        ax.set_title(kwargs['title'])
    ax.tick_params(axis='x', which='both', pad=8)
    return fig, ax
In [21]:
plt.close('all')
fig, ax = plot_trend(df, ('t_nopivot', 't_pivot', 't_numpy'), ylabel='Runtime (s)')
[Figure: log-log plot of runtime (s) vs. $N$ for No Pivot, Pivot, and NumPy, with power-law trend lines]

So, the time to invert a matrix seems to scale like $N^3$, and the NumPy routine is roughly 100 times faster than my Python routine.
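That slope can be sanity-checked from the endpoints of the no-pivot timing table above (mean runtimes of about 0.425 s at $N = 512$ and 2081 s at $N = 8192$), using the same endpoint estimate that plot_trend uses to seed its fit:

```python
import numpy as np

# Mean no-pivot runtimes taken from the run_stats table above
N1, t1 = 512, 0.425
N2, t2 = 8192, 2080.8

# Endpoint estimate of p in t ~ A * N**p
p = np.log(t2 / t1) / np.log(N2 / N1)  # comes out close to 3
```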

In [24]:
plt.close('all')
fig, ax = plot_trend(df, ('nopivot_std', 'pivot_std', 'numpy_std'),
           ylabel="RMS Error", title="Standard Deviation Error")
[Figure: log-log plot of RMS error vs. $N$ for the three methods, titled "Standard Deviation Error"]

This plot shows the root-mean-square value of the entries in the matrix obtained by multiplying the original matrix by a computed inverse and subtracting the identity matrix. When we don't use pivoting, the RMS error is orders of magnitude larger and grows quickly with $N$. With partial pivoting, the error rises much more slowly. The NumPy routine does better still, accumulating error more slowly than our partial-pivoting routine; it presumably relies on highly optimized LAPACK factorization routines, which trade a more complicated implementation for much better stability.
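A complementary way to read these error curves is through conditioning: for a backward-stable inversion, the residual m @ inv - I is expected to be on the order of machine epsilon times the condition number of the matrix. A minimal sketch with NumPy (the comparison scale is a loose rule of thumb, not a theorem):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
m = rng.uniform(-N, N, size=(N, N))
inv = np.linalg.inv(m)

# Largest entry of the residual matrix
err = np.max(np.abs(m @ inv - np.identity(N)))
# Rule-of-thumb scale: machine epsilon times the condition number
scale = np.finfo(np.float64).eps * np.linalg.cond(m)
```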

In [23]:
fig, ax = plot_trend(df, ('nopivot_absmax', 'pivot_absmax', 'numpy_absmax'),
           ylabel="Maximum Error", title="Maximum Absolute Error")
[Figure: log-log plot of maximum absolute error vs. $N$ for the three methods, titled "Maximum Absolute Error"]

External Code¶

Here’s the code that generates the data. It references log_test_inverse, which is defined above and must also be included in the external script, of course.

def run_one(args):
    N, path = args                       # unpack the arguments
    log_test_inverse(N, path)            # run the test and log the results

if __name__ == '__main__':
    import multiprocessing as mp
    PROCESSES = 6
    reps = 5
    dims = [int(2**x) for x in np.arange(9, 13.01, 0.25)]
    all_args = []
    path = Path('run.tsv')
    # Compute the arguments for all runs
    for N in dims:
        for j in range(reps):
            all_args.append( (N, path) )
    
    with mp.Pool(PROCESSES) as pool:        # create a pool of processes
        res = pool.map(run_one, all_args)   # run the tests