Exercises Notebook
Converted from exercises.ipynb for web reading.
Vector Spaces and Subspaces - Exercises
10 graded exercises covering the full linear algebra basics arc, from computation to ML-facing matrix workflows.
| Format | Description |
|---|---|
| Problem | Markdown cell with task description |
| Your Solution | Code cell for learner work |
| Solution | Reference solution with checks |
Difficulty: * straightforward -> ** moderate -> *** challenging.
Code cell 2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
try:
import seaborn as sns
sns.set_theme(style="whitegrid", palette="colorblind")
HAS_SNS = True
except ImportError:
plt.style.use("seaborn-v0_8-whitegrid")
HAS_SNS = False
mpl.rcParams.update({
"figure.figsize": (10, 6),
"figure.dpi": 120,
"font.size": 13,
"axes.titlesize": 15,
"axes.labelsize": 13,
"xtick.labelsize": 11,
"ytick.labelsize": 11,
"legend.fontsize": 11,
"legend.framealpha": 0.85,
"lines.linewidth": 2.0,
"axes.spines.top": False,
"axes.spines.right": False,
"savefig.bbox": "tight",
"savefig.dpi": 150,
})
np.random.seed(42)
print("Plot setup complete.")
Code cell 3
import numpy as np
import numpy.linalg as la
import scipy.linalg as sla
from scipy import stats
COLORS = {
"primary": "#0077BB",
"secondary": "#EE7733",
"tertiary": "#009988",
"error": "#CC3311",
"neutral": "#555555",
"highlight": "#EE3377",
}
HAS_MPL = True
np.set_printoptions(precision=8, suppress=True)
np.random.seed(42)
def header(title):
print("\n" + "=" * len(title))
print(title)
print("=" * len(title))
def check_true(name, cond):
ok = bool(cond)
print(f"{'PASS' if ok else 'FAIL'} - {name}")
return ok
def check_close(name, got, expected, tol=1e-8):
ok = np.allclose(got, expected, atol=tol, rtol=tol)
print(f"{'PASS' if ok else 'FAIL'} - {name}: got {got}, expected {expected}")
return ok
def check(name, got, expected, tol=1e-8):
return check_close(name, got, expected, tol=tol)
def softmax(z, axis=-1, tau=1.0):
z = np.asarray(z, dtype=float) / float(tau)
z = z - np.max(z, axis=axis, keepdims=True)
e = np.exp(z)
return e / np.sum(e, axis=axis, keepdims=True)
def cosine_similarity(a, b):
a = np.asarray(a, dtype=float); b = np.asarray(b, dtype=float)
return float(a @ b / (la.norm(a) * la.norm(b) + 1e-12))
def numerical_rank(A, tol=1e-10):
return int(np.sum(la.svd(A, compute_uv=False) > tol))
def orthonormal_basis(A, tol=1e-10):
Q, R = la.qr(A)
keep = np.abs(np.diag(R)) > tol
return Q[:, keep]
# Compatibility helpers used by the Chapter 02 theory and exercise cells.
def null_space(A, tol=1e-10):
A = np.asarray(A, dtype=float)
U, S, Vt = la.svd(A, full_matrices=True)
rank = int(np.sum(S > tol))
return Vt[rank:].T
svd_null_space = null_space
def gram_schmidt(vectors, tol=1e-10):
A = np.asarray(vectors, dtype=float)
if A.ndim == 1:
A = A.reshape(1, -1)
basis = []
for v in A:
w = v.astype(float).copy()
for q in basis:
w = w - np.dot(w, q) * q
norm = la.norm(w)
if norm > tol:
basis.append(w / norm)
return np.array(basis)
def projection_matrix_from_columns(A, tol=1e-10):
Q = orthonormal_basis(np.asarray(A, dtype=float), tol=tol)
return Q @ Q.T
def random_unit_vectors(n, d):
X = np.random.randn(n, d)
return X / np.maximum(la.norm(X, axis=1, keepdims=True), 1e-12)
def pairwise_distances(X):
X = np.asarray(X, dtype=float)
diff = X[:, None, :] - X[None, :, :]
return la.norm(diff, axis=-1)
def normalize(x, axis=None, tol=1e-12):
x = np.asarray(x, dtype=float)
norm = la.norm(x, axis=axis, keepdims=True)
return x / np.maximum(norm, tol)
def frobenius_inner(A, B):
return float(np.sum(np.asarray(A, dtype=float) * np.asarray(B, dtype=float)))
def outer_sum_product(A, B):
A = np.asarray(A, dtype=float)
B = np.asarray(B, dtype=float)
return sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))
def softmax_rows(X):
return softmax(X, axis=1)
def col_space(A, tol=1e-10):
return orthonormal_basis(np.asarray(A, dtype=float), tol=tol)
def row_space(A, tol=1e-10):
return orthonormal_basis(np.asarray(A, dtype=float).T, tol=tol).T
def rref(A, tol=1e-10):
R = np.array(A, dtype=float, copy=True)
m, n = R.shape
pivots = []
row = 0
for col in range(n):
pivot = row + int(np.argmax(np.abs(R[row:, col]))) if row < m else row
if row >= m or abs(R[pivot, col]) <= tol:
continue
if pivot != row:
R[[row, pivot]] = R[[pivot, row]]
R[row] = R[row] / R[row, col]
for r in range(m):
if r != row:
R[r] = R[r] - R[r, col] * R[row]
pivots.append(col)
row += 1
if row == m:
break
R[np.abs(R) < tol] = 0.0
return R, pivots
def nullspace_basis(A, tol=1e-10):
A = np.asarray(A, dtype=float)
U, S, Vt = la.svd(A, full_matrices=True)
rank = int(np.sum(S > tol))
return Vt[rank:].T, rank
print("Chapter helper setup complete.")
Exercise 1: Verifying Vector Space Axioms *
The eight axioms are the complete definition of a vector space. A single axiom failure disqualifies the entire structure. Working through non-examples is equally important — it builds the instinct for which axiom is the weakest link in a given candidate.
For each of the five candidates below, determine whether it is a vector space with the given operations. If it is a vector space, identify the zero vector and one additive inverse. If it is not, identify the specific axiom(s) that fail and supply a concrete counterexample.
(a) $\mathbb{R}^2$ with standard addition but modified scalar multiplication $\alpha \cdot (x, y) = (\alpha x, y)$ (scalar multiplication only scales the first coordinate).
(b) $\mathbb{R}_{>0}$ (positive reals) with operations $u \oplus v = uv$ and $\alpha \cdot u = u^\alpha$.
(c) $V = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 1\}$ with the standard operations inherited from $\mathbb{R}^3$.
(d) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 0\}$ with standard function addition and scalar multiplication.
(e) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 1\}$ with standard operations.
Approach: for each case, work through the three-condition test (zero vector, closure under addition, closure under scalar multiplication) and check the eight axioms systematically. Use concrete vectors and scalars to confirm each axiom or to construct a counterexample.
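A randomized spot-check can complement the hand derivations. The sketch below (helper name `axiom8_holds` is ours, not part of the notebook's setup cell) tests axiom 8 on random data; it previews why candidate (a) fails, which the reference solution also shows. A failure is a proof of non-vector-space-ness, while passing all trials is only evidence.

```python
import numpy as np

rng = np.random.default_rng(0)

def axiom8_holds(scale, n_trials=100, tol=1e-9):
    """Spot-check axiom 8, (alpha+beta).v == alpha.v + beta.v, on random data."""
    for _ in range(n_trials):
        v = rng.standard_normal(2)
        alpha, beta = rng.standard_normal(2)
        lhs = scale(alpha + beta, v)
        rhs = scale(alpha, v) + scale(beta, v)
        if not np.allclose(lhs, rhs, atol=tol):
            return False
    return True

standard   = lambda a, v: a * v                       # usual scalar multiplication
first_only = lambda a, v: np.array([a * v[0], v[1]])  # candidate (a)'s rule

print(axiom8_holds(standard))     # R^2 with the usual rule passes
print(axiom8_holds(first_only))   # scaling only one coordinate breaks axiom 8
```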
Code cell 5
# Your Solution
# Exercise 1 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 1.")
Code cell 6
# Solution
# Exercise 1 - reference solution
header('Exercise 1(a): Modified scalar multiplication')
# Axiom 7: alpha*(u+v) = alpha*u + alpha*v?
# alpha*(u1+v1, u2+v2) = (alpha*(u1+v1), u2+v2)
# alpha*u + alpha*v = (alpha*u1, u2) + (alpha*v1, v2) = (alpha*(u1+v1), u2+v2) -- OK here
# Axiom 8: (alpha+beta)*v = alpha*v + beta*v?
# (alpha+beta)*(u1,u2) = ((alpha+beta)*u1, u2)
# alpha*(u1,u2) + beta*(u1,u2) = (alpha*u1, u2) + (beta*u1, u2) = ((alpha+beta)*u1, u2+u2) = ((alpha+beta)*u1, 2*u2)
# FAIL when u2 != 0: u2 != 2*u2
u1, u2 = 3.0, 5.0
alpha, beta = 2.0, 3.0
lhs = ((alpha + beta) * u1, u2) # (alpha+beta) scalar mult
rhs = (alpha * u1 + beta * u1, u2 + u2) # alpha*v + beta*v
print(f'(alpha+beta)*(u1,u2) = {lhs}')
print(f'alpha*(u1,u2) + beta*(u1,u2) = {rhs}')
check_true('Axiom 8 (vector distributivity) fails', lhs[1] != rhs[1])
print('VERDICT: NOT a vector space. Axiom 8 fails because the second coordinate is unaffected')
print('by scalar multiplication, so scaling and adding disagree.')
header('Exercise 1(b): (R_>0, multiplication, exponentiation)')
# "Zero" vector: e = 1 (since u*1 = u for all u > 0)
# Additive inverse: u^{-1} = 1/u (since u*(1/u) = 1 = e)
# Scalar mult identity: 1 . u = u^1 = u (Axiom 5 OK)
# Axiom 8: (alpha+beta) . u = u^{alpha+beta}; alpha.u (+) beta.u = u^alpha * u^beta = u^{alpha+beta} OK
# Axiom 7: alpha.(u (+) v) = (uv)^alpha; alpha.u (+) alpha.v = u^alpha * v^alpha = (uv)^alpha OK
# Axiom 6: alpha.(beta.u) = alpha.(u^beta) = (u^beta)^alpha = u^{alpha*beta}; (alpha*beta).u = u^{alpha*beta} OK
u, v, al, be = 4.0, 9.0, 2.0, 3.0
e = 1.0 # zero vector
check_close('Additive identity: u (+) 1 = u', u * e, u)
check_close('Additive inverse: u (+) u^-1 = 1', u * (1.0 / u), e)
check_close('Axiom 8: (al+be).u = al.u (+) be.u', u**(al + be), (u**al) * (u**be))
check_close('Axiom 7: al.(u (+) v) = al.u (+) al.v', (u * v)**al, (u**al) * (v**al))
check_close('Axiom 6: al.(be.u) = (al*be).u', (u**be)**al, u**(al * be))
check_close('Commutativity: u (+) v = v (+) u', u * v, v * u)
check_true('Zero vector is 1 (not 0)', e == 1.0)
print('VERDICT: IS a vector space. Every axiom holds; the structure is isomorphic to (R, +, *) via log/exp.')
header('Exercise 1(c): {x+y+z=1} with standard operations')
p = np.array([1.0, 0.0, 0.0]) # p[0]+p[1]+p[2] = 1
q = np.array([0.0, 1.0, 0.0]) # q[0]+q[1]+q[2] = 1
zero = np.array([0.0, 0.0, 0.0])
check_true('Zero vector in V? (0+0+0=1?)', np.isclose(zero.sum(), 1.0))
print(f'p + q = {p + q}, sum = {(p+q).sum()} (should be 1 to stay in V, is 2)')
check_true('p+q NOT in V (sum=2 != 1)', not np.isclose((p + q).sum(), 1.0))
print('VERDICT: NOT a vector space. Does not contain the zero vector (0+0+0=0 != 1).')
print('Also not closed under addition (sum would be 2). It is an affine subspace.')
header('Exercise 1(d): {f: f(0)=0} with standard function ops')
# Represent functions by their values on a grid
x = np.linspace(-2, 2, 9)
f = np.sin(x) # f(0) = sin(0) = 0 OK
g = x**3 - x # g(0) = 0 OK
alpha = 3.7
check_close('f(0) = 0', f[4], 0.0) # x[4] = 0
check_close('g(0) = 0', g[4], 0.0)
check_close('(f+g)(0) = 0', (f + g)[4], 0.0)
check_close('(alpha*f)(0) = 0', (alpha * f)[4], 0.0)
check_close('zero function in V: 0(0)=0', 0.0, 0.0)
print('VERDICT: IS a vector space (subspace of all functions R->R).')
print('All three conditions hold: zero function, closure under addition, closure under scaling.')
header('Exercise 1(e): {f: f(0)=1} with standard function ops')
f = 1 + 0 * x # f(0) = 1
g = np.cos(x) # g(0) = 1
zero_fn = 0 * x # zero function: 0(0) = 0
check_true('Zero function NOT in V (0(0)=0 != 1)', not np.isclose(zero_fn[4], 1.0))
print(f'(f+g)(0) = {(f+g)[4]} (should be 1 to stay in V, is 2)')
check_true('f+g NOT in V ((f+g)(0)=2 != 1)', not np.isclose((f + g)[4], 1.0))
print('VERDICT: NOT a vector space. Does not contain the zero function. Not closed under addition.')
print('This is an affine subspace (a coset of {f: f(0)=0}) — offset by the constant 1 at 0.')
print("Exercise 1 solution complete.")
Exercise 2: Subspace Verification — Proof and Counterexample **
Knowing the three-condition test is not enough; you must also know when the test succeeds and why one condition can fail while others hold. For each subset below, determine whether it is a subspace of the given vector space. If it is a subspace: find a basis and state the dimension. If it is not: identify which condition fails and produce a concrete counterexample.
(a) $W = \{(x, y, z) \in \mathbb{R}^3 : 2x - y + 3z = 0\} \subseteq \mathbb{R}^3$.
(b) $W = \{(x, y) \in \mathbb{R}^2 : xy = 0\}$ (the union of the two coordinate axes) $\subseteq \mathbb{R}^2$.
(c) $W = \{A \in \mathbb{R}^{2 \times 2} : \operatorname{tr}(A) = 0\}$ (traceless matrices) $\subseteq \mathbb{R}^{2 \times 2}$.
(d) $W = \{p \in P_3 : p(1) = 0\}$ (degree-$\leq 3$ polynomials that vanish at $1$) $\subseteq P_3$.
(e) $W = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = z^2\}$ (the double cone) $\subseteq \mathbb{R}^3$.
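The three-condition test itself can be mechanised. A minimal sketch (the helper names are ours; the verdicts it prints preview candidates (a) and (b), which the reference solution confirms): a single random failure disproves subspace-ness, whereas passing every trial is only evidence.

```python
import numpy as np

rng = np.random.default_rng(1)

def spot_check_subspace(contains, sample, n_trials=200):
    """Randomized three-condition test.

    contains(v) -> bool decides membership; sample() draws a random member.
    One failure disproves subspace-ness; passing every trial is evidence only.
    """
    if not contains(np.zeros_like(sample())):
        return False
    for _ in range(n_trials):
        u, v = sample(), sample()
        alpha = rng.standard_normal()
        if not (contains(u + v) and contains(alpha * u)):
            return False
    return True

# Candidate (a): the plane 2x - y + 3z = 0; sample members by solving for x.
plane = lambda v: np.isclose(2 * v[0] - v[1] + 3 * v[2], 0.0)
def sample_plane():
    y, z = rng.standard_normal(2)
    return np.array([(y - 3 * z) / 2, y, z])

# Candidate (b): the union of the axes, xy = 0; sample one axis at random.
axes = lambda v: np.isclose(v[0] * v[1], 0.0)
def sample_axes():
    t = rng.standard_normal()
    return np.array([t, 0.0]) if rng.random() < 0.5 else np.array([0.0, t])

print(spot_check_subspace(plane, sample_plane))   # the plane passes
print(spot_check_subspace(axes, sample_axes))     # closure under + fails
```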
Code cell 8
# Your Solution
# Exercise 2 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 2.")
Code cell 9
# Solution
# Exercise 2 - reference solution
header('Exercise 2(a): W = {2x - y + 3z = 0}')
# This is the null space of A = [2, -1, 3]. Always a subspace.
A2a = np.array([[2.0, -1.0, 3.0]])
R2a, pivots2a = rref(A2a)
# 3 variables, 1 pivot => 2 free variables => dim = 2
# Free vars: y=s (col1), z=t (col2); then x = (y - 3z)/2 = (s - 3t)/2
# Basis vectors:
b1 = np.array([0.5, 1.0, 0.0]) # s=1, t=0: x=1/2
b2 = np.array([-1.5, 0.0, 1.0]) # s=0, t=1: x=-3/2
check_close('b1 in W (2x-y+3z=0)', 2*b1[0] - b1[1] + 3*b1[2], 0.0)
check_close('b2 in W (2x-y+3z=0)', 2*b2[0] - b2[1] + 3*b2[2], 0.0)
check_close('b1+b2 in W', 2*(b1+b2)[0] - (b1+b2)[1] + 3*(b1+b2)[2], 0.0)
check_close('zero in W', 2*0 - 0 + 3*0, 0.0)
print(f'Basis: {b1}, {b2}')
print('VERDICT: IS a subspace (null space of [2,-1,3]). dim = 2.')
header('Exercise 2(b): W = {xy = 0} (coordinate axes)')
u2b = np.array([1.0, 0.0]) # on x-axis: xy = 0
v2b = np.array([0.0, 1.0]) # on y-axis: xy = 0
s2b = u2b + v2b # sum = (1,1)
print(f'u = {u2b}, u[0]*u[1] = {u2b[0]*u2b[1]}')
print(f'v = {v2b}, v[0]*v[1] = {v2b[0]*v2b[1]}')
print(f'u+v = {s2b}, (u+v)[0]*(u+v)[1] = {s2b[0]*s2b[1]}')
check_true('u+v NOT in W (product != 0)', not np.isclose(s2b[0] * s2b[1], 0.0))
print('VERDICT: NOT a subspace. Closure under addition fails: (1,0)+(0,1)=(1,1), and 1*1=1≠0.')
print('The union of two subspaces is almost never a subspace.')
header('Exercise 2(c): W = {tr(A) = 0} in R^{2x2}')
# Basis: three linearly independent traceless matrices
E12 = np.array([[0.0, 1.0], [0.0, 0.0]]) # tr=0
E21 = np.array([[0.0, 0.0], [1.0, 0.0]]) # tr=0
D = np.array([[1.0, 0.0], [0.0, -1.0]]) # tr=0
for name2c, M2c in [('E12', E12), ('E21', E21), ('D', D)]:
check_close(f'tr({name2c}) = 0', np.trace(M2c), 0.0)
check_close('tr(E12 + E21) = 0', np.trace(E12 + E21), 0.0)
check_close('tr(3.7*D) = 0', np.trace(3.7 * D), 0.0)
# Linear independence: form matrix of flattened basis vectors
basis_matrix = np.vstack([E12.ravel(), E21.ravel(), D.ravel()])
check_true('Three basis matrices are linearly independent', np.linalg.matrix_rank(basis_matrix) == 3)
print('Basis: {E12, E21, D} (three traceless 2x2 matrices)')
print('VERDICT: IS a subspace. dim = 3. (Full R^{2x2} has dim 4; one trace constraint reduces by 1.)')
header('Exercise 2(d): W = {p in P3 : p(1) = 0}')
# p(1) = a0 + a1 + a2 + a3 = 0 => a0 = -a1 - a2 - a3
# Free variables: a1, a2, a3 => dim = 3
# Basis: set one free variable to 1, others to 0:
# a1=1: p(t) = -1 + t = t - 1 => coefficients [-1, 1, 0, 0]; p(1) = -1+1 = 0 OK
# a2=1: p(t) = -1 + t^2 = t^2 - 1 => coefficients [-1, 0, 1, 0]; p(1) = -1+1 = 0 OK
# a3=1: p(t) = -1 + t^3 = t^3 - 1 => coefficients [-1, 0, 0, 1]; p(1) = -1+1 = 0 OK
def poly_eval(coeffs, t):
"""Evaluate polynomial with coefficients [a0,a1,...] at t."""
return sum(c * t**i for i, c in enumerate(coeffs))
b2d_1 = [-1.0, 1.0, 0.0, 0.0] # t - 1
b2d_2 = [-1.0, 0.0, 1.0, 0.0] # t^2 - 1
b2d_3 = [-1.0, 0.0, 0.0, 1.0] # t^3 - 1
for name2d, b2d in [('t-1', b2d_1), ('t^2-1', b2d_2), ('t^3-1', b2d_3)]:
check_close(f'p(1) = 0 for {name2d}', poly_eval(b2d, 1.0), 0.0)
# Closure: sum of two basis elements
s2d = [b2d_1[i] + b2d_2[i] for i in range(4)]
check_close('(b1+b2)(1) = 0', poly_eval(s2d, 1.0), 0.0)
# Independence: coefficient matrix has full row rank
basis_mat2d = np.array([b2d_1, b2d_2, b2d_3])
check_true('Three basis polynomials are independent', np.linalg.matrix_rank(basis_mat2d) == 3)
print('Basis: {t-1, t^2-1, t^3-1} (three polynomials vanishing at t=1)')
print('VERDICT: IS a subspace. dim = 3. (P3 has dim 4; one point-evaluation constraint reduces by 1.)')
header('Exercise 2(e): W = {x^2 + y^2 = z^2} (double cone)')
# On the cone: (1,0,1) has 1+0=1=1^2 OK; (0,1,1) has 0+1=1=1^2 OK
u2e = np.array([1.0, 0.0, 1.0])
v2e = np.array([0.0, 1.0, 1.0])
s2e = u2e + v2e # = (1, 1, 2)
lhs = s2e[0]**2 + s2e[1]**2 # = 1 + 1 = 2
rhs = s2e[2]**2 # = 4
print(f'u = {u2e}: x^2+y^2 = {u2e[0]**2+u2e[1]**2}, z^2 = {u2e[2]**2}')
print(f'v = {v2e}: x^2+y^2 = {v2e[0]**2+v2e[1]**2}, z^2 = {v2e[2]**2}')
print(f'u+v = {s2e}: x^2+y^2 = {lhs}, z^2 = {rhs}')
check_true('u+v NOT on cone', not np.isclose(lhs, rhs))
print('VERDICT: NOT a subspace. Closure under addition fails: (1,0,1)+(0,1,1)=(1,1,2),')
print('and 1^2+1^2=2 != 4=2^2. The cone is a nonlinear surface, not a flat subspace.')
print("Exercise 2 solution complete.")
Exercise 3: The Four Fundamental Subspaces **
For every matrix $A$ there are exactly four canonical subspaces. They come in two orthogonal pairs, and their dimensions satisfy the Rank-Nullity theorem.
Let
$$A = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 3 \\ 1 & 2 & 1 & 4 \end{pmatrix}.$$
(a) Row-reduce $A$ to RREF. Identify pivot columns and free columns.
(b) Find a basis for $\operatorname{null}(A)$. Verify via the Rank-Nullity theorem: $\operatorname{rank}(A) + \dim\operatorname{null}(A) = n = 4$.
(c) Find a basis for $\operatorname{col}(A)$. (Use pivot columns of the original $A$, not of the RREF.)
(d) Find a basis for $\operatorname{row}(A)$. (Use non-zero rows of the RREF.)
(e) Find a basis for $\operatorname{null}(A^T)$ (the left null space).
(f) Verify: $\dim\operatorname{col}(A) + \dim\operatorname{null}(A^T) = m$ and $\dim\operatorname{row}(A) + \dim\operatorname{null}(A) = n$.
(g) Verify the orthogonality relation $\operatorname{row}(A) \perp \operatorname{null}(A)$: check that every basis vector of $\operatorname{row}(A)$ is orthogonal to every basis vector of $\operatorname{null}(A)$.
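Before the hand computation, the dimensions of all four subspaces can be cross-checked with one SVD (a sketch using the exercise's matrix; `1e-10` is an assumed numerical-rank tolerance):

```python
import numpy as np

A = np.array([[1., 2., 0., 1.],
              [0., 0., 1., 3.],
              [1., 2., 1., 4.]])
m, n = A.shape

# One SVD gives all four dimensions: rank r -> col and row spaces have dim r;
# the null space and left null space fill the remaining n - r and m - r.
S = np.linalg.svd(A, compute_uv=False)
r = int(np.sum(S > 1e-10))

print({"col(A)": r, "row(A)": r, "null(A)": n - r, "null(A^T)": m - r})
```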
Code cell 11
# Your Solution
# Exercise 3 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 3.")
Code cell 12
# Solution
# Exercise 3 - reference solution
A3 = np.array([
[1.0, 2.0, 0.0, 1.0],
[0.0, 0.0, 1.0, 3.0],
[1.0, 2.0, 1.0, 4.0]
])
m3, n3 = A3.shape
# (a) RREF
R3, pivots3 = rref(A3)
header('Exercise 3(a): RREF')
print('RREF(A) =\n', R3)
print('Pivot columns:', pivots3)
print('Free columns:', [c for c in range(n3) if c not in pivots3])
rank3 = len(pivots3)
# (b) Null space: free variables are columns 1 and 3 (0-indexed)
# From RREF: x1 = -2*x2 - x4, x3 = -3*x4
# Free variable x2=1, x4=0: null vector n1 = (-2, 1, 0, 0)
# Free variable x2=0, x4=1: null vector n2 = (-1, 0, -3, 1)
n3_1 = np.array([-2.0, 1.0, 0.0, 0.0])
n3_2 = np.array([-1.0, 0.0, -3.0, 1.0])
null_basis3 = np.column_stack([n3_1, n3_2])
header('Exercise 3(b): Null space of A')
print('Null space basis (columns):\n', null_basis3)
check_close('A @ n1 = 0', A3 @ n3_1, np.zeros(m3))
check_close('A @ n2 = 0', A3 @ n3_2, np.zeros(m3))
nullity3 = null_basis3.shape[1]
check_true('Rank-Nullity: rank + nullity = 4', rank3 + nullity3 == n3)
# (c) Column space: pivot columns of ORIGINAL A
col_basis3 = A3[:, pivots3]
header('Exercise 3(c): Column space of A')
print('Column space basis (pivot columns of A):\n', col_basis3)
check_true('col(A) basis has rank = rank(A)', np.linalg.matrix_rank(col_basis3) == rank3)
# (d) Row space: non-zero rows of RREF
row_basis3 = R3[~np.all(np.isclose(R3, 0.0), axis=1)]
header('Exercise 3(d): Row space of A')
print('Row space basis (non-zero rows of RREF):\n', row_basis3)
check_true('row(A) has correct dimension', row_basis3.shape[0] == rank3)
# (e) Left null space: null space of A^T
# A^T is 4x3; rank = 2; left null dim = 3 - 2 = 1
# Solve A^T y = 0 via RREF of A^T
Rt, pivots_t = rref(A3.T)
# From RREF of A^T: y3 is free; y1 = -y3, y2 = -y3 (read from RREF)
# More directly via SVD:
left_null3, _ = nullspace_basis(A3.T)
header('Exercise 3(e): Left null space (null of A^T)')
print('Left null space basis:\n', left_null3)
check_close('A^T @ left_null = 0', A3.T @ left_null3, np.zeros((n3, left_null3.shape[1])))
left_nullity3 = left_null3.shape[1]
# (f) Dimension checks
header('Exercise 3(f): Dimension verification')
print(f'dim(col(A)) = {rank3}, dim(null(A^T)) = {left_nullity3}, sum = {rank3 + left_nullity3}')
print(f'dim(row(A)) = {rank3}, dim(null(A)) = {nullity3}, sum = {rank3 + nullity3}')
check_true('dim(col(A)) + dim(null(A^T)) = m = 3', rank3 + left_nullity3 == m3)
check_true('dim(row(A)) + dim(null(A)) = n = 4', rank3 + nullity3 == n3)
# (g) Orthogonality: row(A) perp null(A)
header('Exercise 3(g): Orthogonality row(A) perp null(A)')
inner_products = row_basis3 @ null_basis3
print('Row-space basis @ Null-space basis =\n', inner_products)
check_close('All inner products row_i . null_j = 0', inner_products, np.zeros_like(inner_products))
print('\nStrang\'s picture:')
print(f' R^4 = row(A) [dim {rank3}] ⊕ null(A) [dim {nullity3}] (orthogonal direct sum)')
print(f' R^3 = col(A) [dim {rank3}] ⊕ null(A^T) [dim {left_nullity3}] (orthogonal direct sum)')
print("Exercise 3 solution complete.")
Exercise 4: Span, Linear Independence, and Basis **
Span, independence, and basis are three complementary views of the same underlying geometry. A spanning set may be redundant; an independent set may not span; a basis is both, with no waste.
In $\mathbb{R}^4$, consider the vectors
$$v_1 = (1, 2, 0, 1), \quad v_2 = (0, 1, 1, 2), \quad v_3 = (1, 3, 1, 3), \quad v_4 = (2, 1, -1, -1).$$
(a) Determine whether $v_1, v_2, v_3, v_4$ are linearly independent. If not, which vectors are dependent?
(b) Find the dimension of $\operatorname{span}\{v_1, v_2, v_3, v_4\}$.
(c) Identify a maximal subset that forms a basis for $\operatorname{span}\{v_1, v_2, v_3, v_4\}$.
(d) Express each dependent vector as a linear combination of the basis vectors.
(e) Is $w = (3, 5, 1, 4)$ in $\operatorname{span}\{v_1, v_2, v_3, v_4\}$? If yes, find its coordinates in the basis from (c). If no, explain why not.
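The membership question in (e) reduces to a rank comparison. A minimal generic sketch (the helper `in_span` and the random data are ours, not the exercise's vectors, so nothing is spoiled):

```python
import numpy as np

rng = np.random.default_rng(7)

def in_span(B, w, tol=1e-10):
    """w lies in col(B) iff appending w leaves the rank unchanged."""
    return (np.linalg.matrix_rank(np.column_stack([B, w]), tol=tol)
            == np.linalg.matrix_rank(B, tol=tol))

B = rng.standard_normal((4, 2))          # two independent directions in R^4
inside  = B @ np.array([2.0, -1.0])      # a combination of the columns
outside = rng.standard_normal(4)         # almost surely off that plane

print(in_span(B, inside), in_span(B, outside))
```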
Code cell 14
# Your Solution
# Exercise 4 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 4.")
Code cell 15
# Solution
# Exercise 4 - reference solution
v1 = np.array([1.0, 2.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0, 2.0])
v3 = np.array([1.0, 3.0, 1.0, 3.0])
v4 = np.array([2.0, 1.0, -1.0, -1.0])
w = np.array([3.0, 5.0, 1.0, 4.0])
V4 = np.column_stack([v1, v2, v3, v4])
# (a) and (b): row reduce [v1 | v2 | v3 | v4]
R4, pivots4 = rref(V4)
rank4 = len(pivots4)
free4 = [c for c in range(4) if c not in pivots4]
header('Exercise 4(a)(b): Independence and dimension')
print('RREF([v1|v2|v3|v4]) =\n', R4)
print('Pivot columns (0-indexed):', pivots4)
print('Free columns (0-indexed):', free4)
check_true('Vectors are linearly DEPENDENT (rank < 4)', rank4 < 4)
print(f'dim(span{{v1,v2,v3,v4}}) = {rank4}')
# (c) Basis = pivot columns of original V4
basis4 = V4[:, pivots4]
header('Exercise 4(c): Basis')
print('Basis (pivot columns of V):')
for i, p in enumerate(pivots4):
print(f' b{i+1} = v{p+1} = {V4[:, p]}')
check_true('Basis has correct rank', np.linalg.matrix_rank(basis4) == rank4)
# (d) Express dependent vectors in terms of the pivot columns
# From RREF: column j is expressed in terms of pivot columns via the RREF entries
# For free column c: the c-th column of RREF gives the coefficients
header('Exercise 4(d): Dependent vectors as linear combinations')
for fc in free4:
# RREF col fc gives pivot_coords such that v_{fc} = sum(pivot_coords[i] * v_{pivot[i]})
pivot_coords = R4[range(rank4), fc] # coefficients in front of pivot columns
reconstruction = sum(pivot_coords[i] * V4[:, pivots4[i]] for i in range(rank4))
print(f'v{fc+1} = ', end='')
for i, p in enumerate(pivots4):
print(f'{pivot_coords[i]:+.4g}*v{p+1} ', end='')
print()
check_close(f'v{fc+1} reconstructed correctly', reconstruction, V4[:, fc])
# (e) Is w in span?
header('Exercise 4(e): Is w = (3,5,1,4) in the span?')
# Augment basis with w and check if rank increases
Vaug = np.column_stack([basis4, w])
rank_aug = np.linalg.matrix_rank(Vaug)
is_in_span = (rank_aug == rank4)
check_true('w is in span{v1,...,v4}', is_in_span)
if is_in_span:
# Solve basis4 @ coords = w using least squares (exact since w is in col(basis4))
coords4, _, _, _ = np.linalg.lstsq(basis4, w, rcond=None)
check_close('basis @ coords = w', basis4 @ coords4, w)
    print('Coordinates in the pivot-column basis:', coords4)
    print('w = ' + ' + '.join(f'{coords4[i]:.4g}*v{p+1}' for i, p in enumerate(pivots4)))
print("Exercise 4 solution complete.")
Exercise 5: Gram-Schmidt Orthogonalisation and Projections **
Gram-Schmidt converts any independent set into an orthonormal basis for the same span. The resulting orthonormal basis makes projections trivial and is the foundation of QR factorisation.
Start from $v_1 = (1, 1, 0)$, $v_2 = (1, 0, 1)$, $v_3 = (0, 1, 1)$.
(a) Verify that $v_1, v_2, v_3$ are linearly independent.
(b) Apply Gram-Schmidt to produce an orthonormal basis $q_1, q_2, q_3$.
(c) Verify: $\|q_i\| = 1$ for each $i$, and $\langle q_i, q_j \rangle = 0$ for $i \neq j$.
(d) Express $w = (1, 2, 3)$ in the orthonormal basis; verify by reconstruction: $w = \sum_i \langle w, q_i \rangle\, q_i$.
(e) Construct the orthogonal projection matrix $P$ onto $\operatorname{span}\{v_1, v_2\}$. Verify: $P^2 = P$ (idempotent) and $P^T = P$ (symmetric). Compute $Pw$ and the residual $r = w - Pw$; verify that $r \perp \operatorname{span}\{v_1, v_2\}$.
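A hand-rolled Gram-Schmidt can be cross-checked against `np.linalg.qr`, which orthogonalises the same columns (via Householder reflections rather than classical Gram-Schmidt, so individual columns may differ by sign while the spans agree). A sketch on this exercise's vectors:

```python
import numpy as np

V = np.column_stack([[1., 1., 0.], [1., 0., 1.], [0., 1., 1.]])

# QR factorisation: Q has orthonormal columns, R is upper triangular.
Q, R = np.linalg.qr(V)

print(np.allclose(Q.T @ Q, np.eye(3)))   # columns are orthonormal
print(np.allclose(Q @ R, V))             # V = QR reconstructs the input
print(np.allclose(R, np.triu(R)))        # R is upper triangular
```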
Code cell 17
# Your Solution
# Exercise 5 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 5.")
Code cell 18
# Solution
# Exercise 5 - reference solution
v5_1 = np.array([1.0, 1.0, 0.0])
v5_2 = np.array([1.0, 0.0, 1.0])
v5_3 = np.array([0.0, 1.0, 1.0])
w5 = np.array([1.0, 2.0, 3.0])
V5 = np.column_stack([v5_1, v5_2, v5_3])
# (a) Linear independence: non-zero determinant
det5 = np.linalg.det(V5)
header('Exercise 5(a): Linear independence')
print(f'det([v1|v2|v3]) = {det5:.6f}')
check_true('det != 0 => linearly independent', abs(det5) > 1e-10)
# (b) Gram-Schmidt
def gram_schmidt(vecs):
qs = []
for v in vecs:
u = v.copy()
for q in qs:
u = u - np.dot(u, q) * q # subtract projection onto previously found q
norm_u = np.linalg.norm(u)
if norm_u < 1e-12:
raise ValueError('Input vectors are linearly dependent.')
qs.append(u / norm_u)
return np.column_stack(qs)
Q5 = gram_schmidt([v5_1, v5_2, v5_3])
header('Exercise 5(b): Gram-Schmidt result')
print('Orthonormal basis Q (columns = q1, q2, q3):\n', Q5)
# (c) Orthonormality
header('Exercise 5(c): Orthonormality verification')
QtQ = Q5.T @ Q5
print('Q^T Q =\n', QtQ)
check_close('Q^T Q = I (orthonormal)', QtQ, np.eye(3))
for i in range(3):
check_close(f'||q{i+1}|| = 1', np.linalg.norm(Q5[:, i]), 1.0)
for i in range(3):
for j in range(i+1, 3):
check_close(f'<q{i+1}, q{j+1}> = 0', np.dot(Q5[:, i], Q5[:, j]), 0.0)
# (d) Coordinates of w in orthonormal basis
header('Exercise 5(d): Coordinates of w = (1,2,3) in orthonormal basis')
coords5 = Q5.T @ w5 # coords[i] = <w, q_i>
print('Coordinates (= <w, q_i>):', coords5)
w5_reconstructed = Q5 @ coords5
check_close('Reconstruction Q @ coords = w', w5_reconstructed, w5)
print(f'w = {coords5[0]:.4f}*q1 + {coords5[1]:.4f}*q2 + {coords5[2]:.4f}*q3')
# (e) Projection onto span{v1, v2} = span{q1, q2}
header('Exercise 5(e): Projection onto span{v1, v2}')
Q2 = Q5[:, :2] # first two orthonormal vectors
P5 = Q2 @ Q2.T # projection matrix onto span{q1, q2}
print('Projection matrix P =\n', P5)
check_close('P^2 = P (idempotent)', P5 @ P5, P5)
check_close('P^T = P (symmetric)', P5.T, P5)
Pw5 = P5 @ w5
res5 = w5 - Pw5
print('P @ w =', Pw5)
print('residual r = w - P@w =', res5)
# Residual must be orthogonal to span{v1, v2}
check_close('r perp v1', np.dot(res5, v5_1), 0.0)
check_close('r perp v2', np.dot(res5, v5_2), 0.0)
check_close('r perp q1', np.dot(res5, Q5[:, 0]), 0.0)
check_close('r perp q2', np.dot(res5, Q5[:, 1]), 0.0)
print('\nConnection to QR: Q5 IS the Q factor; V5 = Q5 @ (Q5^T @ V5) = Q5 @ R.')
R_factor = Q5.T @ V5
check_close('V5 = Q5 @ R', Q5 @ R_factor, V5)
check_true('R is upper triangular', np.allclose(R_factor, np.triu(R_factor)))
print("Exercise 5 solution complete.")
Exercise 6: Subspace Operations — Sum, Intersection, and Direct Sum ***
Subspaces can be combined. The sum is the smallest subspace containing both; the intersection is the largest subspace contained in both. A direct sum requires that their intersection is trivial.
Part A. In $\mathbb{R}^3$, let
$$W_1 = \operatorname{span}\{e_1, e_2\} \ (\text{the } xy\text{-plane}), \qquad W_2 = \operatorname{span}\{(1, 1, 0), (0, 0, 1)\}.$$
(a) Find a basis for $W_1 + W_2$. Is $W_1 + W_2 = \mathbb{R}^3$?
(b) Find $W_1 \cap W_2$. Give a basis or show it equals $\{0\}$.
(c) Verify the dimension formula: $\dim(W_1 + W_2) = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2)$.
(d) Is $\mathbb{R}^3 = W_1 \oplus W_2$? If not, find a subspace $W_3$ with $\mathbb{R}^3 = W_1 \oplus W_3$.
Part B. In $\mathbb{R}^4$, let
$$W_1 = \{x \in \mathbb{R}^4 : x_1 + x_2 = 0\}, \qquad W_2 = \{x \in \mathbb{R}^4 : x_3 + x_4 = 0\}.$$
(e) Find $\dim W_1$, $\dim W_2$, $\dim(W_1 \cap W_2)$, and $\dim(W_1 + W_2)$. Verify the dimension formula.
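Both operations can be computed mechanically from bases. A sketch (the helper name is ours; the printed pair previews Part A's dimensions, which the reference solution derives in full): the sum's dimension is the rank of the pooled generators, and the intersection corresponds to solutions of $B_1 x = B_2 y$, i.e. the null space of $[B_1 \mid -B_2]$.

```python
import numpy as np

def subspace_sum_and_intersection_dims(B1, B2, tol=1e-10):
    """Dims of W1+W2 and W1∩W2 from bases B1, B2 (independent columns assumed).

    Sum: rank of the pooled generators. Intersection: solutions of
    B1 @ x = B2 @ y, i.e. the null space of [B1 | -B2]; with independent
    columns its nullity equals dim(W1 ∩ W2).
    """
    dim_sum = int(np.linalg.matrix_rank(np.column_stack([B1, B2]), tol=tol))
    stacked = np.column_stack([B1, -B2])
    dim_inter = stacked.shape[1] - int(np.linalg.matrix_rank(stacked, tol=tol))
    return dim_sum, dim_inter

W1 = np.column_stack([[1., 0., 0.], [0., 1., 0.]])   # the xy-plane
W2 = np.column_stack([[1., 1., 0.], [0., 0., 1.]])
print(subspace_sum_and_intersection_dims(W1, W2))    # (3, 1)
```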
Code cell 20
# Your Solution
# Exercise 6 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 6.")
Code cell 21
# Solution
# Exercise 6 - reference solution
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
w6_a = np.array([1.0, 1.0, 0.0])
w6_b = np.array([0.0, 0.0, 1.0])
W1_basis = np.column_stack([e1, e2])
W2_basis = np.column_stack([w6_a, w6_b])
# (a) W1 + W2: pool all 4 generators; rank of the pool = dim(W1 + W2)
all_gens = np.column_stack([W1_basis, W2_basis]) # 3x4
R6, pivots6 = rref(all_gens)
sum_basis6 = all_gens[:, pivots6]
dim_sum6 = len(pivots6)
header('Exercise 6(a): W1 + W2')
print('Generators of W1+W2 (columns):\n', all_gens)
print('Pivot columns:', pivots6)
print('Basis for W1+W2 (pivot columns of generators):\n', sum_basis6)
check_true('W1 + W2 = R^3 (dim = 3)', dim_sum6 == 3)
# (b) W1 ∩ W2: solve alpha*e1 + beta*e2 = gamma*(1,1,0) + delta*(0,0,1)
# alpha = gamma, beta = gamma, 0 = delta => delta=0, gamma=any, alpha=beta=gamma
# Intersection = span{(1,1,0)} which is in W1 (it's a linear combo of e1,e2) AND in W2 (it's w6_a)
inter_vec6 = np.array([1.0, 1.0, 0.0]) # (1,1,0) = 1*e1 + 1*e2 = 1*w6_a + 0*w6_b
header('Exercise 6(b): W1 ∩ W2')
check_true('(1,1,0) in W1 (is linear combo of e1,e2)', np.allclose(1*e1 + 1*e2, inter_vec6))
check_true('(1,1,0) in W2 (is w6_a itself)', np.allclose(inter_vec6, w6_a))
inter_basis6 = inter_vec6.reshape(-1, 1)
dim_inter6 = 1
print(f'W1 ∩ W2 = span{{(1,1,0)}}, dim = {dim_inter6}')
# (c) Dimension formula
dim_W1 = W1_basis.shape[1] # = 2
dim_W2 = W2_basis.shape[1] # = 2
header('Exercise 6(c): Dimension formula')
print(f'dim(W1) = {dim_W1}, dim(W2) = {dim_W2}, dim(W1∩W2) = {dim_inter6}, dim(W1+W2) = {dim_sum6}')
check_true('dim formula: dim(W1+W2) = dim(W1)+dim(W2)-dim(W1∩W2)',
dim_sum6 == dim_W1 + dim_W2 - dim_inter6)
# (d) Direct sum? W1 ∩ W2 = span{(1,1,0)} ≠ {0}, so NOT a direct sum.
# Find W3 with R^3 = W1 ⊕ W3: take W3 = span{e3} = z-axis
W3_basis = e3.reshape(-1, 1)
header('Exercise 6(d): Direct sum?')
check_true('W1 ∩ W2 ≠ {0} => NOT a direct sum', dim_inter6 > 0)
print('W1 ⊕ W3 with W3 = span{e3}:')
W1_plus_W3 = np.column_stack([W1_basis, W3_basis])
check_true('W1 + W3 = R^3 (rank 3)', np.linalg.matrix_rank(W1_plus_W3) == 3)
# W1 ∩ W3: vectors in xy-plane AND on z-axis => must be zero
print('W1 ∩ W3 = {0} (xy-plane meets z-axis only at origin)')
print('=> R^3 = W1 ⊕ W3 = span{e1,e2} ⊕ span{e3} (standard decomposition)')
# Part B
header('Exercise 6(e): Part B in R^4')
A6_w1 = np.array([[1.0, 1.0, 0.0, 0.0]])
A6_w2 = np.array([[0.0, 0.0, 1.0, 1.0]])
# W1 = null([1,1,0,0]): x1+x2=0, 3 free vars (x2,x3,x4) => dim 3
# W2 = null([0,0,1,1]): x3+x4=0, 3 free vars (x1,x2,x4) => dim 3
_, rank_w1 = nullspace_basis(A6_w1)
_, rank_w2 = nullspace_basis(A6_w2)
dim_w1_B = 4 - rank_w1 # = 3
dim_w2_B = 4 - rank_w2 # = 3
# W1 ∩ W2 = null([1,1,0,0; 0,0,1,1]): two constraints on R^4 => dim = 4-2 = 2
A6_both = np.vstack([A6_w1, A6_w2])
_, rank_both = nullspace_basis(A6_both)
# rank_both = rank of the combined constraint matrix
rank_constraint = np.linalg.matrix_rank(A6_both)
dim_inter_B = 4 - rank_constraint # = 2
# W1 + W2: computed independently by pooling the two null-space bases,
# so the dimension formula below is a genuine check, not a definition
W1_basis_B, _ = nullspace_basis(A6_w1)
W2_basis_B, _ = nullspace_basis(A6_w2)
dim_sum_B = int(np.linalg.matrix_rank(np.column_stack([W1_basis_B, W2_basis_B])))
print(f'dim(W1) = {dim_w1_B}')
print(f'dim(W2) = {dim_w2_B}')
print(f'dim(W1∩W2) = {dim_inter_B}')
print(f'dim(W1+W2) = {dim_sum_B} (rank of pooled bases)')
check_true('Dimension formula holds', dim_sum_B == dim_w1_B + dim_w2_B - dim_inter_B)
check_true('W1 + W2 = R^4', dim_sum_B == 4)
print('Basis for W1∩W2: {(-1,1,0,0), (0,0,-1,1)}. Verify:')
inter_b1 = np.array([-1.0, 1.0, 0.0, 0.0]) # x1+x2=0 and x3+x4=0
inter_b2 = np.array([ 0.0, 0.0,-1.0, 1.0])
check_close('inter_b1 in W1', A6_w1 @ inter_b1, np.zeros(1))
check_close('inter_b1 in W2', A6_w2 @ inter_b1, np.zeros(1))
check_close('inter_b2 in W1', A6_w1 @ inter_b2, np.zeros(1))
check_close('inter_b2 in W2', A6_w2 @ inter_b2, np.zeros(1))
print("Exercise 6 solution complete.")
Exercise 7: Change of Basis and Coordinates ***
A change of basis is a relabelling of the coordinate system. The same vector looks different in different bases; the same linear map has different matrices in different bases. Mastering this is prerequisite to understanding PCA, attention heads, and representation geometry.
In $\mathbb{R}^2$, let
$$B = \{b_1, b_2\} = \left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}, \qquad C = \{c_1, c_2\} = \left\{ \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \end{pmatrix} \right\}.$$
(a) Verify that both $B$ and $C$ are bases for $\mathbb{R}^2$ (show that each is a linearly independent spanning set).
(b) Find the change-of-basis matrix $P_{B \to C}$ (its $j$-th column is the coordinate vector of $b_j$ in basis $C$).
(c) A vector $v$ has coordinates $[v]_B = (3, -1)$ in basis $B$. Find $[v]_C$.
(d) Compute $v$ explicitly (its standard-basis coordinates). Verify that your answer to (c) gives the correct representation.
(e) A linear map $T$ has matrix
$$M = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$$
in the standard basis. Find its matrix $M_B$ in basis $B$. Verify that $M$ and $M_B$ have the same eigenvalues (they are similar matrices).
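The mechanics can be rehearsed on random bases before tackling the specific pair above (a sketch with our own random data; `solve` is preferred over forming an explicit inverse):

```python
import numpy as np

rng = np.random.default_rng(3)
P_B = rng.standard_normal((2, 2))   # columns = basis B (almost surely invertible)
P_C = rng.standard_normal((2, 2))   # columns = basis C

# Change of basis: [v]_C = P_C^{-1} P_B [v]_B, computed via solve().
P_BtoC = np.linalg.solve(P_C, P_B)

v_B = np.array([3.0, -1.0])
v_std = P_B @ v_B                                  # standard coordinates
print(np.allclose(P_C @ (P_BtoC @ v_B), v_std))   # both routes name the same vector
```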
Code cell 23
# Your Solution
# Exercise 7 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 7.")
Code cell 24
# Solution
# Exercise 7 - reference solution
b1 = np.array([1.0, 2.0])
b2 = np.array([1.0, -1.0])
c1 = np.array([2.0, 1.0])
c2 = np.array([-1.0, 1.0])
P_B = np.column_stack([b1, b2])
P_C = np.column_stack([c1, c2])
# (a) Both bases: non-zero determinant <=> invertible <=> spans R^2 and is independent
det_B = np.linalg.det(P_B)
det_C = np.linalg.det(P_C)
header('Exercise 7(a): Verify bases')
print(f'det(P_B) = {det_B:.4f}')
print(f'det(P_C) = {det_C:.4f}')
check_true('B is a basis (det != 0)', abs(det_B) > 1e-10)
check_true('C is a basis (det != 0)', abs(det_C) > 1e-10)
# (b) P_{B->C}: j-th column = coords of b_j in basis C
# b_j (standard) = P_C @ [b_j]_C => [b_j]_C = P_C^{-1} @ b_j
# P_{B->C} = P_C^{-1} @ P_B
P_BtoC = np.linalg.inv(P_C) @ P_B
header('Exercise 7(b): Change-of-basis matrix P_{B->C}')
print('P_{B->C} =\n', P_BtoC)
# Verify: P_C @ P_{B->C} @ [v]_B = v (standard), so P_{B->C} correctly translates
# B-coordinates into C-coordinates. Equivalent matrix check: P_C @ P_{B->C} = P_B
check_close('P_C @ P_{B->C} = P_B (consistency check)', P_C @ P_BtoC, P_B)
# (c) [v]_C = P_{B->C} @ [v]_B
v_in_B = np.array([3.0, -1.0])
v_in_C = P_BtoC @ v_in_B
header('Exercise 7(c): Coordinates of v in basis C')
print(f'[v]_B = {v_in_B}')
print(f'[v]_C = P_{{B->C}} @ [v]_B = {v_in_C}')
# (d) v in standard coords: v = P_B @ [v]_B
v_std = P_B @ v_in_B
header('Exercise 7(d): v in standard coordinates')
print(f'v (standard) = P_B @ [v]_B = {v_std}')
# Verify: P_C @ [v]_C should also give v_std
check_close('P_C @ [v]_C = v_std', P_C @ v_in_C, v_std)
# Verify [v]_C directly: inverse of C applied to v_std
check_close('[v]_C = P_C^{-1} @ v_std', np.linalg.inv(P_C) @ v_std, v_in_C)
# (e) Matrix of T in basis B: similarity transformation
# M_B = P_B^{-1} @ M_std @ P_B
M_std = np.array([[2.0, 1.0], [0.0, 3.0]])
P_B_inv = np.linalg.inv(P_B)
M_B = P_B_inv @ M_std @ P_B
header('Exercise 7(e): T in basis B')
print('M_std =\n', M_std)
print('M_B = P_B^{-1} M_std P_B =\n', M_B)
eigvals_std = np.sort(np.linalg.eigvals(M_std).real)
eigvals_B = np.sort(np.linalg.eigvals(M_B).real)
print(f'Eigenvalues of M_std: {eigvals_std}')
print(f'Eigenvalues of M_B: {eigvals_B}')
check_close('Similar matrices have the same eigenvalues', eigvals_std, eigvals_B)
check_close('M_std and M_B have the same trace', np.trace(M_std), np.trace(M_B))
check_close('M_std and M_B have the same determinant', np.linalg.det(M_std), np.linalg.det(M_B))
print('\nKey insight: The linear map T is the same object regardless of basis.')
print('Only its matrix representation changes; eigenvalues and determinant are intrinsic.')
print("Exercise 7 solution complete.")
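Beyond the exercise, the invariance claim in (e) holds for any invertible change of basis, not just P_B. A minimal sketch (the random basis matrix P_rand is an illustrative assumption, not part of the exercise):

```python
# Sketch: eigenvalues are basis-invariant under ANY invertible change of basis.
import numpy as np

rng = np.random.default_rng(0)
M_std = np.array([[2.0, 1.0], [0.0, 3.0]])
P_rand = rng.normal(size=(2, 2))            # random basis (hypothetical)
assert abs(np.linalg.det(P_rand)) > 1e-8    # almost surely invertible

M_rand = np.linalg.inv(P_rand) @ M_std @ P_rand
ev_std = np.sort(np.linalg.eigvals(M_std).real)
ev_rand = np.sort(np.linalg.eigvals(M_rand).real)
print("eigenvalues (standard basis):", ev_std)
print("eigenvalues (random basis):  ", ev_rand)
```

Both prints show the same spectrum {2, 3}: the eigenvalues belong to the map T, not to any particular matrix representation.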
Exercise 8 (★★★): AI Application — Subspace Analysis of a Linear Layer
Every linear layer y = W x induces four fundamental subspaces. Understanding these subspaces tells you exactly what information a layer can and cannot transmit. LoRA exploits this structure by restricting weight updates to a low-dimensional subspace of R^{m x n}.
Let

    W = [ 3  1  2 ]
        [ 1  2  1 ]
        [ 2  1  3 ]

(a) Compute the SVD W = U Σ V^T. What is rank(W)? Report the singular values.
(b) Identify col(W) — the reachable output subspace. What is its dimension? If a downstream target is b = (1, 0, 0), is it reachable? What about b' = (1, -1, 0)? (Verify computationally.)
(c) Identify null(W) — the input directions the layer ignores. What is its dimension? If the null space is non-trivial, give an explicit null vector x and verify W x = 0. If the null space is trivial, explain why.
(d) LoRA adapter: let DeltaW = b a^T with b = (1, 0, 1) and a = (1, 1, 0). What is rank(DeltaW)? What is col(DeltaW)? Verify that col(DeltaW) = span{b} numerically.
(e) Updated weight W' = W + DeltaW. Without computing the full matrix, what is the upper bound on rank(W')? Under what condition would rank(W') = rank(W)? Under what condition would rank(W') = rank(W) + 1? Verify your answer by computing rank(W') directly.
Code cell 26
# Your Solution
# Exercise 8 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 8.")
Code cell 27
# Solution
# Exercise 8 - reference solution
W8 = np.array([
[3.0, 1.0, 2.0],
[1.0, 2.0, 1.0],
[2.0, 1.0, 3.0]
])
# (a) SVD and rank
U8, s8, Vt8 = np.linalg.svd(W8)
rank8 = np.linalg.matrix_rank(W8)
header('Exercise 8(a): SVD and rank')
print(f'rank(W) = {rank8}')
print(f'Singular values: {s8}')
print(f'U (left singular vectors):\n{U8}')
print(f'V^T (right singular vectors):\n{Vt8}')
check_close('SVD reconstruction: U @ diag(s) @ Vt = W', U8 @ np.diag(s8) @ Vt8, W8)
# (b) Column space
b_target = np.array([1.0, 0.0, 0.0])
b_target2 = np.array([1.0, -1.0, 0.0])
def in_col_space(A, b, tol=1e-8):
"""True if b is in the column space of A (check via rank augmentation)."""
r_A = np.linalg.matrix_rank(A)
r_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
return r_A == r_aug
header('Exercise 8(b): Column space and reachability')
print(f'col(W) has dimension {rank8} (= rank(W))')
print(f'Basis for col(W): left singular vectors u1,...,u_{rank8} of W')
col_basis8 = U8[:, :rank8]
print('col(W) basis (columns):\n', col_basis8)
b1_reachable = in_col_space(W8, b_target)
b2_reachable = in_col_space(W8, b_target2)
print(f'b = {b_target}: reachable = {b1_reachable}')
print(f'b\' = {b_target2}: reachable = {b2_reachable}')
check_true('b = (1,0,0) is in col(W)', b1_reachable)
check_true('b\' = (1,-1,0) is in col(W)', b2_reachable)
print('Since rank(W) = 3 = m, W has full rank, so col(W) = R^3: every b in R^3 is reachable.')
# (c) Null space
null8 = sla.null_space(W8)  # orthonormal basis for null(W); columns span the null space
nullity8 = null8.shape[1]
header('Exercise 8(c): Null space')
print(f'rank(W) = {rank8}, nullity = n - rank = {W8.shape[1] - rank8}')
print(f'null(W) has dimension {nullity8}')
if nullity8 == 0:
print('null(W) = {0}: W is injective (full column rank).')
print('Every distinct input produces a distinct output. No information is lost.')
print('Reason: rank(W) = n = 3, so rank-nullity gives nullity = n - rank = 0.')
else:
print('null(W) basis:\n', null8)
check_close('W @ null_vec = 0', W8 @ null8[:, 0], np.zeros(W8.shape[0]))
# (d) LoRA adapter
b8 = np.array([1.0, 0.0, 1.0])
a8 = np.array([1.0, 1.0, 0.0])
DeltaW8 = np.outer(b8, a8) # b @ a^T (outer product)
rank_DW8 = np.linalg.matrix_rank(DeltaW8)
header('Exercise 8(d): LoRA adapter DeltaW = b @ a^T')
print('b =', b8)
print('a =', a8)
print('DeltaW = b @ a^T =\n', DeltaW8)
print(f'rank(DeltaW) = {rank_DW8}')
check_true('DeltaW has rank 1 (outer product of nonzero vectors)', rank_DW8 == 1)
# col(DeltaW) = span{b}
# Any vector in col(DeltaW) is of the form DeltaW @ x = (a^T x) b, which is a scalar multiple of b
x_test = np.array([2.0, -1.0, 3.0])
Dw_x = DeltaW8 @ x_test
scalar = np.dot(a8, x_test)
check_close('DeltaW @ x = (a^T x) * b', Dw_x, scalar * b8)
print(f'col(DeltaW) = span{{b}} = span{{(1,0,1)}}; dimension = 1')
print(f'null(DeltaW) = null(a^T): any x with a^T x = 0; dimension = 2')
# (e) Rank of W' = W + DeltaW
Wprime8 = W8 + DeltaW8
rank_Wprime8 = np.linalg.matrix_rank(Wprime8)
header('Exercise 8(e): rank(W\' = W + DeltaW)')
print(f'rank(W) = {rank8}')
print(f'rank(DeltaW) = {rank_DW8}')
print(f'rank(W\') = {rank_Wprime8}')
print(f'Upper bound: rank(W + DeltaW) <= rank(W) + rank(DeltaW) = {rank8 + rank_DW8}')
check_true('rank(W\') <= rank(W) + rank(DeltaW)', rank_Wprime8 <= rank8 + rank_DW8)
print()
print('Interpretation:')
print(f' rank(W\') = rank(W) = {rank8}: the LoRA update does NOT increase rank.')
print(' (W already has full rank 3 = m = n; rank cannot exceed 3 in R^{3x3}.)')
print()
print('General rule for LoRA (when rank(W) < n):')
print(' rank(W\') = rank(W): if col(DeltaW) = span{b} is already inside col(W)')
print(' rank(W\') = rank(W) + 1: if b is NOT in col(W) (LoRA adds a genuinely new direction)')
print(' Here rank(W)=3=m, so col(W)=R^3 contains everything; the adapter cannot increase rank.')
print()
print('AI Takeaway:')
print(' col(W) = the reachable output subspace; rank-deficient W has dead output dimensions.')
print(' null(W) = the ignored input subspace; inputs in null(W) are invisible to this layer.')
print(' LoRA constrains updates to a rank-r subspace of R^{mxn}; the LoRA rank r is the')
print(' number of new directions the adapter can contribute to the column space of W.')
print("Exercise 8 solution complete.")
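The rank bound in (e) is easiest to see with an adapter of rank r > 1. A minimal sketch, assuming a hypothetical 6x6 layer W_toy and LoRA factors B_lora (m x r) and A_lora (r x n) — none of these come from the exercise:

```python
# Sketch: a rank-r LoRA update DeltaW = B @ A with r = 2 in a larger layer.
import numpy as np

rng = np.random.default_rng(1)
m = n = 6
r = 2
W_toy = rng.normal(size=(m, n))      # dense layer; full-rank with probability 1
B_lora = rng.normal(size=(m, r))     # "up" factor
A_lora = rng.normal(size=(r, n))     # "down" factor
DeltaW = B_lora @ A_lora             # the adapter never materializes more than rank r

print("rank(DeltaW) =", np.linalg.matrix_rank(DeltaW))
print("rank(W + DeltaW) <= rank(W) + r:",
      np.linalg.matrix_rank(W_toy + DeltaW)
      <= np.linalg.matrix_rank(W_toy) + r)
```

Whatever W is, col(DeltaW) is contained in col(B_lora), so the adapter can contribute at most r new directions to the column space.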
Exercise 9 (★★★): Tangent Subspace of the Probability Simplex
The probability simplex constraint sum_i p_i = 1 has tangent subspace

    T = { v in R^n : sum_i v_i = 0 }.

Find a basis for this tangent subspace and verify every basis vector sums to zero.
Code cell 29
# Your Solution
# Exercise 9 - learner workspace
# Build a basis for vectors whose coordinates sum to zero.
print("Learner workspace ready for Exercise 9.")
Code cell 30
# Solution
# Exercise 9 - simplex tangent subspace
header("Exercise 9: simplex tangent subspace")
n = 4
B = np.vstack([np.eye(n-1), -np.ones(n-1)])
print("basis columns:\n", B)
check_close("each basis vector sums to zero", np.ones(n) @ B, np.zeros(n-1), tol=1e-12)
check_true("dimension is n-1", la.matrix_rank(B) == n-1)
print("Takeaway: constrained probability updates live in a lower-dimensional linear subspace.")
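One way this subspace shows up in practice: an unconstrained gradient can be projected onto the sum-zero tangent by subtracting its mean, since the tangent subspace is the orthogonal complement of span{1}. A sketch with a hypothetical gradient g_raw (not part of the exercise):

```python
# Sketch: project an unconstrained gradient onto the sum-zero tangent subspace.
import numpy as np

g_raw = np.array([0.9, -0.2, 1.3, 0.4])   # arbitrary gradient in R^4 (hypothetical)
g_tan = g_raw - g_raw.mean()              # subtract the component along (1,1,1,1)

print("projected gradient:", g_tan)
print("sums to zero:", np.isclose(g_tan.sum(), 0.0))
# A step along g_tan preserves sum(p) = 1 for any p on the simplex.
```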
Exercise 10 (★★★): Affine Solution Set of an Underdetermined System
For A x = b with infinitely many solutions, write

    x(t) = x0 + t z,   where A x0 = b and A z = 0.

Find one particular solution, a null-space direction, and verify several affine solutions.
Code cell 32
# Your Solution
# Exercise 10 - learner workspace
# Find a particular solution plus nullspace direction.
print("Learner workspace ready for Exercise 10.")
Code cell 33
# Solution
# Exercise 10 - affine solution set
header("Exercise 10: affine solution set")
A = np.array([[1.0, 2.0, -1.0], [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
x0 = la.lstsq(A, b, rcond=None)[0]
U, S, Vt = la.svd(A)
z = Vt[-1]
print("particular solution:", x0)
print("null direction:", z)
check_close("A x0 = b", A @ x0, b, tol=1e-10)
check_close("A z = 0", A @ z, np.zeros(A.shape[0]), tol=1e-10)
for t in [-2.0, 0.0, 3.5]:
check_close(f"solution at t={t}", A @ (x0 + t*z), b, tol=1e-10)
print("Takeaway: underdetermined systems are affine spaces: one solution plus all null directions.")
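A side note on the particular solution: for underdetermined systems, np.linalg.lstsq returns the minimum-norm solution, which has no component along any null direction. A quick sketch reusing the same A and b:

```python
# Sketch: the lstsq particular solution is orthogonal to the null direction,
# which makes it the shortest point on the affine solution set.
import numpy as np

A = np.array([[1.0, 2.0, -1.0], [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
x0 = np.linalg.lstsq(A, b, rcond=None)[0]  # minimum-norm solution
z = np.linalg.svd(A)[2][-1]                # null direction (last right singular vector)

print("x0 . z =", x0 @ z)                  # ~0: no null-space component
print("||x0|| < ||x0 + z||:", np.linalg.norm(x0) < np.linalg.norm(x0 + z))
```

By Pythagoras, ||x0 + t z||^2 = ||x0||^2 + t^2 ||z||^2, so any other point on the line is strictly longer.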
What to Review After Finishing
Axioms and structure
- Can you identify which axiom fails in a non-example, and produce a concrete counterexample?
- Do you know why the three-condition test (contains zero, closure under +, closure under scalar multiplication) is sufficient to check the full eight axioms?
- Can you recognise affine subspaces — sets that look like subspaces but are offset from the origin?
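As a concrete instance of the first review item, here is a sketch of a non-example (the set S below is an illustrative choice, not from the exercises): the upper half-plane of R^2 contains zero and is closed under addition, but fails closure under scalar multiplication.

```python
# Sketch: the subspace test on a NON-example, S = {(x, y) in R^2 : y >= 0}.
import numpy as np

def in_S(u):
    """Membership test for the upper half-plane."""
    return u[1] >= 0

v = np.array([1.0, 2.0])
w = -1.0 * v                       # scale by a negative scalar
print("v in S:", in_S(v))          # True
print("-1 * v in S:", in_S(w))     # False -> S is not a subspace
```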
Span, independence, and basis
- Can you compute the dimension of a span via row reduction?
- Can you extract a basis from a spanning set by identifying pivot columns?
- Can you express dependent vectors as linear combinations of the basis and verify the reconstruction?
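A programmatic alternative to hand RREF for basis extraction is column-pivoted QR. Note that it selects a well-conditioned independent set of columns, not necessarily the same pivot columns RREF would choose. A sketch with a hypothetical spanning set (the third vector is a dependent combination of the first two):

```python
# Sketch: extract a basis from a spanning set via column-pivoted QR.
import numpy as np
import scipy.linalg as sla

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = 2.0 * v1 - v2                 # dependent on v1, v2
S = np.column_stack([v1, v2, v3])

Q, R, piv = sla.qr(S, pivoting=True)   # piv orders columns by decreasing importance
r = np.linalg.matrix_rank(S)
pivot_cols = sorted(piv[:r])
print("rank =", r)
print("independent columns of S:", pivot_cols)
basis = S[:, pivot_cols]
print("basis:\n", basis)
```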
Four fundamental subspaces
- Can you compute all four subspaces (col, null, row, left null) of a matrix from RREF?
- Can you verify the orthogonality relations row(A) ⊥ null(A) and col(A) ⊥ null(A^T)?
- Do you understand Strang's picture: A is a bijection from row(A) to col(A), and A kills null(A)?
Gram-Schmidt and projections
- Can you execute Gram-Schmidt step by step and verify orthonormality?
- Can you construct a projection matrix P = A (A^T A)^{-1} A^T and verify P^2 = P and P^T = P?
- Can you explain why the residual is orthogonal to the subspace?
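The three projection checks above can be bundled into a few lines. A sketch, assuming a hypothetical 2-dimensional subspace of R^3 given as col(A):

```python
# Sketch: projection matrix from an orthonormal basis, with the three checks:
# idempotence, symmetry, and residual orthogonality.
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # hypothetical subspace basis
Q, _ = np.linalg.qr(A)          # orthonormal basis for col(A)
P = Q @ Q.T                     # projection onto col(A)

print("P^2 = P:", np.allclose(P @ P, P))
print("P^T = P:", np.allclose(P.T, P))

v = np.array([1.0, 2.0, 4.0])
residual = v - P @ v
print("residual orthogonal to subspace:", np.allclose(A.T @ residual, 0.0))
```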
Subspace operations and basis change
- Can you compute dim(U + W) and dim(U ∩ W) using the dimension formula dim(U + W) = dim U + dim W - dim(U ∩ W)?
- Do you know when U + W is a direct sum U ⊕ W (i.e. U ∩ W = {0}), and when it fails?
- Can you find the change-of-basis matrix P_{B->C} and convert coordinates between bases?
- Can you find the matrix of a linear map in a new basis via the similarity transform M_B = P^{-1} M P?
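The dimension formula can be verified numerically: dim(U + W) is the rank of the stacked basis matrices, and dim(U ∩ W) is the nullity of [B_U, -B_W], since B_U a = B_W c exactly when [B_U, -B_W] @ [a; c] = 0 (assuming both basis matrices have independent columns). A sketch with hypothetical subspaces of R^5, constructed to share one direction:

```python
# Sketch: check dim(U + W) = dim U + dim W - dim(U ∩ W) numerically.
import numpy as np
import scipy.linalg as sla

rng = np.random.default_rng(7)
BU = rng.normal(size=(5, 3))   # basis for U (3-dim, almost surely)
BW = np.column_stack([BU[:, 0] + BU[:, 1],      # a direction shared with U
                      rng.normal(size=5)])      # a generic new direction

dim_U = np.linalg.matrix_rank(BU)
dim_W = np.linalg.matrix_rank(BW)
dim_sum = np.linalg.matrix_rank(np.column_stack([BU, BW]))
dim_cap = sla.null_space(np.column_stack([BU, -BW])).shape[1]

print(f"dim U = {dim_U}, dim W = {dim_W}, "
      f"dim(U+W) = {dim_sum}, dim(U∩W) = {dim_cap}")
print("formula holds:", dim_sum == dim_U + dim_W - dim_cap)
```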
AI connections
- Can you interpret col(W), null(W), row(W), and null(W^T) for a neural network weight matrix?
- Can you explain LoRA in terms of restricting the weight update to a rank-r subspace of R^{m x n}?
- Do you understand why the rank of a LoRA update B A is at most r and when it is strictly less?
References
- notes.md
- theory.ipynb
- MIT 18.06 Linear Algebra (Strang)
- Linear Algebra Done Right — Axler
- LoRA: Low-Rank Adaptation — Hu et al. (2021)
- Intrinsic Dimensionality of Fine-Tuning — Aghajanyan et al. (2020)
- Superposition Hypothesis — Elhage et al. (2022)
- A Mathematical Framework for Transformer Circuits — Elhage et al. (2021)
- DeepSeek-V2 Technical Report (MLA / KV subspace compression)