Homework 2¶
Due: Monday, 2/2/26, 23:59:59
Write a Python function gauss_jordan using the basic arrays defined by NumPy to perform Gauss-Jordan elimination on a square matrix. The routine should take in the matrix and return its inverse. It should have an optional parameter that controls whether it uses partial pivoting, so that you can compare the numerical stability of the method when it uses pivoting and when it doesn’t. Be sure to check that your program is working by multiplying the inverse it produces by the original matrix and comparing to the identity matrix.
Suggestions¶
- Matrix multiplication in NumPy uses the `@` symbol: `A @ B`
- As you check your routine, it can be useful to have NumPy show array values as fractions, rather than the default floating-point numbers. To display fractions, use the following code:
from fractions import Fraction
np.set_printoptions(formatter={'float': lambda x: str(Fraction(x).limit_denominator())});
If you want to revert to floating-point output, call np.set_printoptions(precision=5).
- You may need to ensure that the data type of the matrix passed to your routine is floating point (not integer). You can use a call such as
m = np.asarray(m, dtype=np.float64)
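As a quick check of the fraction display suggested above, this short sketch shows the formatter in action and then reverts to the default output:

```python
import numpy as np
from fractions import Fraction

# Show floats as simplified fractions (as suggested above)
np.set_printoptions(formatter={'float': lambda x: str(Fraction(x).limit_denominator())})
frac_repr = str(np.array([0.5, 0.25, -0.75]))
print(frac_repr)  # [1/2 1/4 -3/4]

# Any subsequent call to set_printoptions resets the formatter
np.set_printoptions(precision=5)
```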
Solution¶
import numpy as np
from numpy.random import default_rng; rng = default_rng()
import matplotlib.pyplot as plt
%matplotlib widget
from scipy.stats import sem # standard error of the mean
from time import time # for timing
from fractions import Fraction # represent real numbers as fractions
import pandas as pd
from pathlib import Path
from matplotlib.ticker import LogLocator, FuncFormatter, NullFormatter
# The following imports some fitting code I wrote
import sys
sys.path.insert(0, '/Users/saeta/.python')
from fit.powerlaw import PowerLaw
# The following tells numpy to express output in the form of simplified fractions
np.set_printoptions(formatter={'float':lambda x: str(Fraction(x).limit_denominator())});
def gauss_jordan_aug(m, pivoting=True, showall=False):
"""
Use Gauss-Jordan elimination with partial pivoting to compute the inverse
of square matrix m by first augmenting with an identity matrix.
Inputs:
m: NxN matrix
pivoting: if True, use partial pivoting
showall: if True, print intermediate steps (helps for debugging!)
Output:
the inverse of matrix m
"""
# First check that we are working with a square matrix
nrows, ncols = np.asarray(m).shape
    assert nrows == ncols, "gauss_jordan_aug requires a square matrix"
    # Create the augmented matrix [m | I], with N rows and 2N columns
    am = np.hstack((m, np.identity(nrows, dtype=np.float64)))
# Define some helper functions
def exchange(r1, r2):
"Exchange rows r1 and r2"
if showall:
show(f"\nBefore exchanging {r1} and {r2}\n")
am[[r1, r2]] = am[[r2, r1]]
if showall:
show(f"\nAfter exchanging {r1} and {r2}\n")
print("\n-----------------------------\n")
def show(message):
if showall:
print(message)
print(am)
show("Starting matrix")
# We'll use [i,j] indexing, so i represents the row and j the column
for j in range(ncols):
if pivoting:
# Identify the best pivot
rbest = j + np.argmax(np.abs(am[j:, j]))
if showall:
if rbest != j:
print(f"Exchanging {rbest} and {j} to pivot on {Fraction(am[rbest,j]).limit_denominator()}")
else:
print(f"Pivoting at {rbest} on {Fraction(am[rbest,j]).limit_denominator()}")
            if rbest != j:
                # Exchange rows rbest and j
                exchange(rbest, j)
        # Normalize row j so the pivot element becomes 1
        scale = 1.0 / float(am[j, j])
        if scale != 1.0:
            # Scale row j from the pivot column rightward
            am[j, j:] *= scale
# Now wipe out column j in rows i > j
for i in range(j+1, nrows):
multiple = -am[i, j]
am[i, j:] += multiple * am[j, j:]
show(f"After pivoting row {j}")
# Now proceed from bottom to top
for j in range(ncols-1, -1, -1):
for i in range(j):
multiple = -float(am[i, j])
am[i, :] += multiple * am[j, :]
show(f"After clearing column {j}")
# Return the rightmost N columns, which are the inverse matrix
return am[:, ncols:]
mat = np.array([[1,2,-3],[2,-1,4],[-2,1,3]])
inv = gauss_jordan_aug(mat, True, True)
Starting matrix
[[1 2 -3 1 0 0]
 [2 -1 4 0 1 0]
 [-2 1 3 0 0 1]]
Exchanging 1 and 0 to pivot on 2

Before exchanging 1 and 0

[[1 2 -3 1 0 0]
 [2 -1 4 0 1 0]
 [-2 1 3 0 0 1]]

After exchanging 1 and 0

[[2 -1 4 0 1 0]
 [1 2 -3 1 0 0]
 [-2 1 3 0 0 1]]

-----------------------------

After pivoting row 0
[[1 -1/2 2 0 1/2 0]
 [0 5/2 -5 1 -1/2 0]
 [0 0 7 0 1 1]]
Pivoting at 1 on 5/2
After pivoting row 1
[[1 -1/2 2 0 1/2 0]
 [0 1 -2 2/5 -1/5 0]
 [0 0 7 0 1 1]]
Pivoting at 2 on 7
After pivoting row 2
[[1 -1/2 2 0 1/2 0]
 [0 1 -2 2/5 -1/5 0]
 [0 0 1 0 1/7 1/7]]
After clearing column 2
[[1 -1/2 0 0 3/14 -2/7]
 [0 1 0 2/5 3/35 2/7]
 [0 0 1 0 1/7 1/7]]
After clearing column 1
[[1 0 0 1/5 9/35 -1/7]
 [0 1 0 2/5 3/35 2/7]
 [0 0 1 0 1/7 1/7]]
After clearing column 0
[[1 0 0 1/5 9/35 -1/7]
 [0 1 0 2/5 3/35 2/7]
 [0 0 1 0 1/7 1/7]]
mat @ inv
array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
So, the routine does appear to work properly, at least for a simple small input.
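Beyond eyeballing the fraction output, a programmatic check with `np.allclose` guards against small floating-point discrepancies. A brief sketch (using `np.linalg.inv` as a stand-in, since `gauss_jordan_aug` is defined in the notebook above):

```python
import numpy as np

mat = np.array([[1, 2, -3], [2, -1, 4], [-2, 1, 3]], dtype=np.float64)
inv = np.linalg.inv(mat)  # stand-in for gauss_jordan_aug(mat)

# Both products should match the identity to within roundoff
ok = np.allclose(mat @ inv, np.identity(3)) and np.allclose(inv @ mat, np.identity(3))
print(ok)  # True
```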
mat2 = np.array([[1,2,-3,-5],[2,-1,4,7],[-2,1,3,-3],[5,4,3,2]])
inv2 = gauss_jordan_aug(mat2, True)
mat2 @ inv2
array([[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])
inv2
array([[134/111, 39/37, 7/37, -43/111],
[-161/111, -51/37, -12/37, 79/111],
[52/111, 19/37, 11/37, -20/111],
[-91/111, -24/37, -10/37, 35/111]])
Use your gauss_jordan function to investigate the numerical instability of Gauss-Jordan elimination when you do and you don’t use pivoting. For test cases, use matrices of pseudorandom numbers generated by a call such as
from numpy.random import default_rng
rng = default_rng()
rng.uniform(<lower_bound>, <upper_bound>, size=(<dim>, <dim>))
# For example
rng.uniform(-5, 5, size=(5,5))
array([[ 4.23959642, 3.60732447, -0.16149884, 3.6900891 , 2.65585838],
[ 1.97695378, -3.09057092, -4.10553997, -4.78785912, 1.44810866],
[-4.27088306, -4.36821066, -2.14689027, -0.29851331, -4.93396243],
[-1.99115728, -1.63366415, -3.04232138, -1.73824459, 3.46747273],
[-4.71288883, -1.94345835, -1.96308442, -1.07039782, 2.43547367]])
One way to explore the numerical error of the inverse is to multiply the matrix by its inverse, subtract the identity matrix, and then report some statistic on the elements of the remaining matrix (which would be all zeros if there were no error). Be sure to explain your choice for figure of merit (although it is really a figure of demerit!).
Prepare a plot showing your figure of merit as a function of matrix dimension when you use pivoting and when you don't and summarize your conclusion. (Your plot should make very clear why you need to use pivoting!)
Suggestions¶
- Errors typically depend on the matrix dimension $N$ to some power. Therefore, it is appropriate to plot on logarithmic axes and to space values of $N$ suitably. Think carefully about how to do this.
- The smallest value of $N$ for your plot might be something like 256.
- For each value of $N$ that you choose, you should perform multiple trials (at least 5) and compute the mean and standard error of the mean of the values of your chosen figure of merit. Note that `scipy.stats` has the function `sem` (standard error of the mean), which you can import with the statement `from scipy.stats import sem`. Or, you can compute it yourself from the standard deviation.
- As $N$ gets large, it can take quite some time to run your `gauss_jordan` routine. You might wish to have an indicator to show you how the run is progressing. A simple one to use is `tqdm`, which you can install using pip. Once installed, import it with the statement `from tqdm.notebook import tqdm`. To use tqdm in a for loop, just surround the iterator with `tqdm()`. For example,
for n in tqdm(range(nreps), f'Testing inversion for N = {dim}'):
- To plot using error bars in matplotlib, use something like `ax.errorbar(x, y, yerr=err, fmt='o', capsize=3)`, where `ax` is an existing axes object from `plt.subplots()`.
np.set_printoptions(precision=5)
def FOM(m, minv):
"""
Evaluate the quality of the inverse matrix by subtracting
the identity matrix from m @ minv, and then assessing how much
the result differs from zeros everywhere.
"""
diff = m @ minv - np.identity(m.shape[0])
std = np.std(diff) # compute the standard deviation of all N^2 values
absmax = np.max(np.abs(diff))
return dict(std=std, absmax=absmax)
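A quick sanity check of the figure of merit: a matrix paired with its exact inverse should score (nearly) zero, while a poor "inverse" should not. This snippet restates `FOM` from above so it stands alone:

```python
import numpy as np

def FOM(m, minv):  # restated from above so the snippet stands alone
    diff = m @ minv - np.identity(m.shape[0])
    return dict(std=np.std(diff), absmax=np.max(np.abs(diff)))

m = np.array([[2.0, 0.0], [0.0, 4.0]])
good = FOM(m, np.array([[0.5, 0.0], [0.0, 0.25]]))  # exact inverse
bad = FOM(m, np.identity(2))                        # not an inverse of m

print(good['absmax'], bad['absmax'])  # 0.0 3.0
```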
Strategy¶
Because runtimes for large matrices can be long, and it is convenient to log data to a file so that parallel processes can cooperate, I will separate the gathering of data from its analysis. The routine test_inverse(N, metrics) generates an $N\times N$ matrix of random numbers in the range from $-N$ to $N$, computes its inverse in three ways (my routine without pivoting, my routine with pivoting, and np.linalg.inv), and then uses the metrics function operating on m @ inv - identity to evaluate the quality of each computed inverse. It also records the time taken to compute each inverse. The results are returned in a dictionary.
def test_inverse(N:int, metrics=FOM):
"""
Inputs:
N: the size of the square matrix; the matrix will be constructed from
random numbers drawn uniformly from the range -N to N
metrics: a function that takes the original matrix and the computed inverse,
computes `diff = (m @ inv) - identity`, and then returns various statistics
on what should be a N x N matrix of zeros
Output: a dictionary with the following fields:
N: the dimension of the square matrix
t_nopivot: the time taken by GaussJordanAug when not pivoting
t_pivot: the time taken by GaussJordanAug when pivoting
t_numpy: the time taken by np.linalg.inv
for each method in metrics, it includes
nopivot_method, pivot_method, numpy_method
"""
# Create the matrix of random numbers
m = rng.uniform(-N, N, size=(N,N))
res = dict(N=N)
t = -time()
inopivot = gauss_jordan_aug(m, pivoting=False)
t += time()
res['t_nopivot'] = t
# now analyze the quality of the inverse
# and add results to the results dictionary
for k, v in metrics(m, inopivot).items():
res['nopivot_' + k] = v
# Now the same but with pivoting
t = -time()
ipivot = gauss_jordan_aug(m, pivoting=True)
t += time()
res['t_pivot'] = t
for k, v in metrics(m, ipivot).items():
res['pivot_' + k] = v
# Finally, use the NumPy routine
t = -time()
inp = np.linalg.inv(m)
t += time()
res['t_numpy'] = t
for k, v in metrics(m, inp).items():
res['numpy_' + k] = v
return res
r = test_inverse(1000)
r
{'N': 1000,
't_nopivot': 1.4365119934082031,
'nopivot_std': np.float64(0.00010754514571186064),
'nopivot_absmax': np.float64(0.002330039917135546),
't_pivot': 1.433528184890747,
'pivot_std': np.float64(8.58590884586706e-11),
'pivot_absmax': np.float64(9.939626194025484e-10),
't_numpy': 0.01861119270324707,
'numpy_std': np.float64(6.752487654943789e-14),
'numpy_absmax': np.float64(9.01339655579108e-13)}
This single run of test_inverse already shows that errors are much greater without pivoting, and that the NumPy routine is much faster than my routine written in pure Python.
def log_test_inverse(N:int, path:Path):
    r = test_inverse(N)
    if not path.exists():
        # Create the file at path and write the header row
        with open(path, 'w') as f:
            f.write("\t".join(r.keys()))
    s = [f'{v:.4g}' if k != 'N' else f'{v}' for k, v in r.items()]
    with open(path, 'a') as f:
        f.write("\n" + "\t".join(s))
    return r
Parallel computation¶
To speed the computation, I wrote a short program that uses the above routines to run trials in parallel. I include that code below. However, because of the way the Python multiprocessing module works, the code managing the parallel operation must reside in a __main__ guard, so it had to be run separately from the command line. The results were stored in the file run.tsv, which I now read:
df = pd.read_table("run.tsv")
df
| | N | t_nopivot | nopivot_std | nopivot_absmax | t_pivot | pivot_std | pivot_absmax | t_numpy | numpy_std | numpy_absmax |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 512 | 0.4878 | 5.352000e-08 | 6.922000e-07 | 0.4541 | 4.477000e-12 | 4.473000e-11 | 0.006452 | 6.679000e-15 | 6.064000e-14 |
| 1 | 512 | 0.4859 | 3.643000e-06 | 5.154000e-05 | 0.4491 | 7.070000e-11 | 9.261000e-10 | 0.009392 | 8.077000e-14 | 1.242000e-12 |
| 2 | 512 | 0.4942 | 4.496000e-07 | 7.032000e-06 | 0.4723 | 1.123000e-11 | 1.063000e-10 | 0.006192 | 1.427000e-14 | 1.331000e-13 |
| 3 | 512 | 0.4749 | 1.540000e-08 | 2.042000e-07 | 0.5018 | 1.139000e-11 | 1.003000e-10 | 0.006631 | 1.602000e-14 | 1.560000e-13 |
| 4 | 512 | 0.4773 | 1.953000e-07 | 2.388000e-06 | 0.4828 | 1.979000e-11 | 2.148000e-10 | 0.008087 | 2.246000e-14 | 3.693000e-13 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 129 | 6888 | 349.8000 | 1.137000e-02 | 2.030000e-01 | 3019.0000 | 3.897000e-09 | 6.560000e-08 | 911.500000 | 4.120000e-13 | 7.007000e-12 |
| 130 | 5792 | 251.4000 | 1.275000e-02 | 3.375000e-01 | 5129.0000 | 4.295000e-09 | 6.940000e-08 | 33.890000 | 4.895000e-13 | 8.264000e-12 |
| 131 | 6888 | 3881.0000 | 5.475000e-02 | 1.117000e+00 | 380.7000 | 4.017000e-08 | 7.383000e-07 | 6.902000 | 3.972000e-12 | 7.487000e-11 |
| 132 | 6888 | 377.9000 | 5.258000e+01 | 7.932000e+02 | 388.8000 | 9.287000e-09 | 1.911000e-07 | 7.347000 | 9.776000e-13 | 1.875000e-11 |
| 133 | 6888 | 374.3000 | 1.944000e-02 | 5.012000e-01 | 384.6000 | 5.866000e-09 | 1.349000e-07 | 5.510000 | 5.766000e-13 | 1.172000e-11 |
134 rows × 10 columns
[(int(x), f"{np.log2(x):.2f}") for x in sorted(df.N.unique())]
[(512, '9.00'), (608, '9.25'), (724, '9.50'), (861, '9.75'), (1024, '10.00'), (1217, '10.25'), (1448, '10.50'), (1722, '10.75'), (2048, '11.00'), (2435, '11.25'), (2896, '11.50'), (3444, '11.75'), (4096, '12.00'), (4870, '12.25'), (5792, '12.50'), (6888, '12.75'), (8192, '13.00')]
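These dimensions are quarter-powers of two, which space the points evenly on a logarithmic axis; the driver script below generates them with a one-liner:

```python
import numpy as np

# Quarter-powers of two from 2^9 = 512 up to 2^13 = 8192, truncated to ints
dims = [int(2**x) for x in np.arange(9, 13.01, 0.25)]
print(dims)  # [512, 608, 724, 861, 1024, ..., 6888, 8192]
```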
def run_stats(df:pd.DataFrame, column:str):
"""
Compute the mean, standard deviation, and standard error of
the data in the `column` field of the DataFrame, grouped by
value of `N`. Return the results in a new DataFrame.
"""
Nvals = sorted(df.N.unique())
res = []
for N in Nvals:
stats = dict(N=N)
rows = df[df.N==N][column]
stats['mean'] = np.mean(rows)
stats['std'] = np.std(rows)
stats['sem'] = sem(rows)
res.append(stats)
out = pd.DataFrame(res)
return out
run_stats(df, 't_nopivot')
| | N | mean | std | sem |
|---|---|---|---|---|
| 0 | 512 | 0.424530 | 0.060121 | 0.020040 |
| 1 | 608 | 0.539740 | 0.004938 | 0.002469 |
| 2 | 724 | 0.821300 | 0.026738 | 0.013369 |
| 3 | 861 | 1.210200 | 0.026589 | 0.013294 |
| 4 | 1024 | 2.244600 | 0.389776 | 0.129925 |
| 5 | 1217 | 3.046200 | 0.104765 | 0.052383 |
| 6 | 1448 | 6.110600 | 1.155414 | 0.385138 |
| 7 | 1722 | 8.994800 | 0.903750 | 0.451875 |
| 8 | 2048 | 17.721000 | 2.998344 | 0.999448 |
| 9 | 2435 | 28.184000 | 0.616266 | 0.308133 |
| 10 | 2896 | 54.279000 | 8.624003 | 2.874668 |
| 11 | 3444 | 68.910000 | 8.085671 | 4.042836 |
| 12 | 4096 | 110.225000 | 22.492814 | 7.497605 |
| 13 | 4870 | 148.470000 | 12.003170 | 4.001057 |
| 14 | 5792 | 357.880000 | 177.936293 | 47.555475 |
| 15 | 6888 | 763.422222 | 1102.270459 | 389.711458 |
| 16 | 8192 | 2080.800000 | 120.859257 | 60.429628 |
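A design note: the explicit loop in run_stats could equivalently be written with pandas' groupby/agg, which handles the per-$N$ grouping internally. A sketch with a toy frame standing in for run.tsv (beware that pandas' 'std' uses ddof=1, whereas the np.std call in run_stats defaults to ddof=0):

```python
import numpy as np
import pandas as pd
from scipy.stats import sem

# Toy frame standing in for a few rows of run.tsv
df_demo = pd.DataFrame({'N': [512, 512, 1024, 1024],
                        't_nopivot': [0.48, 0.50, 2.2, 2.3]})

# Mean, sample standard deviation, and standard error, grouped by N
stats = (df_demo.groupby('N')['t_nopivot']
         .agg(['mean', 'std', sem])
         .reset_index())
print(stats)
```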
s = run_stats(df, 't_nopivot')
def power_of_two(x):
"This routine will be used to label the N axis"
n = int(np.round(np.log2(x)))
k = n - 10
equals = (" = %d" % 2**n if k < 0 else r" = %d\;\mathrm{k}" % 2**k)
return r"$2^{%d}%s$" % (n, equals)
myloglocator = LogLocator(base=2)
mylogformatter = FuncFormatter(lambda x, _: power_of_two(x))
mylogminorlocator = LogLocator(base=2, subs=(2**-0.75, 2**-0.5, 2**-0.25))
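These tick helpers label values below $2^{10}$ in full and larger values in units of "k". A quick check of the labeling logic (restating power_of_two so the snippet stands alone):

```python
import numpy as np

def power_of_two(x):  # restated from above so the snippet stands alone
    n = int(np.round(np.log2(x)))
    k = n - 10
    equals = (" = %d" % 2**n if k < 0 else r" = %d\;\mathrm{k}" % 2**k)
    return r"$2^{%d}%s$" % (n, equals)

print(power_of_two(512))   # $2^{9} = 512$
print(power_of_two(2048))  # $2^{11} = 2\;\mathrm{k}$
```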
def plot_trend(df:pd.DataFrame, fields, **kwargs):
"""
Given the DataFrame loaded above, and the names of three columns,
in the order (No Pivot, Pivot, NumPy), produce a log-log graph.
"""
fig, ax = plt.subplots()
for field, fmt, label in zip(fields, ('b.', 'r.', 'g.'), ('No Pivot', 'Pivot', 'NumPy')):
stats = run_stats(df, field)
        # Estimate starting values for the fit parameters (power and amplitude)
means = stats['mean'].to_numpy()
yratio = means[-1] / means[0]
Ns = stats.N.to_numpy()
xratio = Ns[-1] / Ns[0]
power = np.log(yratio) / np.log(xratio)
amp = means[-1] / (Ns[-1]**power)
p0 = (power, amp)
#print(p0)
trend = PowerLaw(Ns, means, yunc=stats['sem'], p0=p0)
if trend.valid:
slope = f" ({trend.params[0]:.1f})"
trendy = trend(Ns) # compute the y values for the trend line
ax.plot(Ns, trendy, c=fmt[0])
else:
slope = f" ({power:.1f})"
guess = amp * Ns ** power
ax.plot(Ns, guess, ':', c=fmt[0])
print(trend.error)
ax.errorbar(stats.N, stats['mean'], yerr=stats['sem'],
fmt=fmt, capsize=3, label=f"{label}{slope}")
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel("$N$")
ax.xaxis.set_major_locator(myloglocator)
ax.xaxis.set_major_formatter(mylogformatter)
ax.xaxis.set_minor_locator(mylogminorlocator)
ax.xaxis.set_minor_formatter(NullFormatter())
ax.legend()
ax.grid()
if 'ylabel' in kwargs:
ax.set_ylabel(kwargs['ylabel'])
if 'title' in kwargs:
ax.set_title(kwargs['title'])
ax.tick_params(axis='x', which='both', pad=8)
return fig, ax
plt.close('all')
fig, ax = plot_trend(df, ('t_nopivot', 't_pivot', 't_numpy'), ylabel='Runtime (s)')
So, the time to invert a matrix seems to scale like $N^3$, and the NumPy routine is roughly 100 times faster than my Python routine.
plt.close('all')
fig, ax = plot_trend(df, ('nopivot_std', 'pivot_std', 'numpy_std'),
ylabel="RMS Error", title="Standard Deviation Error")
This plot shows the root-mean-square value of the entries in the matrix obtained by multiplying the original matrix by a computed inverse and subtracting the identity matrix. When we don't use pivoting, the RMS error is orders of magnitude larger and grows quickly with $N$. With partial pivoting, the error rises much more slowly. The NumPy routine does better still, accumulating error more slowly than our partial-pivoting routine; I suspect it uses a more sophisticated factorization and pivoting strategy, which makes for a more complicated routine but much better stability.
fig, ax = plot_trend(df, ('nopivot_absmax', 'pivot_absmax', 'numpy_absmax'),
ylabel="Maximum Error", title="Maximum Absolute Error")
External Code¶
Here's the code that generated the data. It references log_test_inverse, which is defined above and, of course, also had to be included in the external script.
def run_one(args):
N, path = args # unpack the arguments
log_test_inverse(N, path) # run the test and log the results
if __name__ == '__main__':
import multiprocessing as mp
PROCESSES = 6
reps = 5
dims = [int(2**x) for x in np.arange(9, 13.01, 0.25)]
all_args = []
path = Path('run.tsv')
# Compute the arguments for all runs
for N in dims:
for j in range(reps):
all_args.append( (N, path) )
with mp.Pool(PROCESSES) as pool: # create a pool of processes
res = pool.map(run_one, all_args) # run the tests