Exercises Notebook
Converted from exercises.ipynb for web reading.
Vector Spaces and Subspaces - Exercises
10 graded exercises covering the full linear algebra basics arc, from computation to ML-facing matrix workflows.
| Format | Description |
|---|---|
| Problem | Markdown cell with task description |
| Your Solution | Code cell for learner work |
| Solution | Reference solution with checks |
Difficulty: * straightforward -> ** moderate -> *** challenging.
Code cell 2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
try:
import seaborn as sns
sns.set_theme(style="whitegrid", palette="colorblind")
HAS_SNS = True
except ImportError:
plt.style.use("seaborn-v0_8-whitegrid")
HAS_SNS = False
mpl.rcParams.update({
"figure.figsize": (10, 6),
"figure.dpi": 120,
"font.size": 13,
"axes.titlesize": 15,
"axes.labelsize": 13,
"xtick.labelsize": 11,
"ytick.labelsize": 11,
"legend.fontsize": 11,
"legend.framealpha": 0.85,
"lines.linewidth": 2.0,
"axes.spines.top": False,
"axes.spines.right": False,
"savefig.bbox": "tight",
"savefig.dpi": 150,
})
np.random.seed(42)
print("Plot setup complete.")
Code cell 3
import numpy as np
import numpy.linalg as la
import scipy.linalg as sla
from scipy import stats
COLORS = {
"primary": "#0077BB",
"secondary": "#EE7733",
"tertiary": "#009988",
"error": "#CC3311",
"neutral": "#555555",
"highlight": "#EE3377",
}
HAS_MPL = True
np.set_printoptions(precision=8, suppress=True)
np.random.seed(42)
def header(title):
print("\n" + "=" * len(title))
print(title)
print("=" * len(title))
def check_true(name, cond):
ok = bool(cond)
print(f"{'PASS' if ok else 'FAIL'} - {name}")
return ok
def check_close(name, got, expected, tol=1e-8):
ok = np.allclose(got, expected, atol=tol, rtol=tol)
print(f"{'PASS' if ok else 'FAIL'} - {name}: got {got}, expected {expected}")
return ok
def check(name, got, expected, tol=1e-8):
return check_close(name, got, expected, tol=tol)
def softmax(z, axis=-1, tau=1.0):
z = np.asarray(z, dtype=float) / float(tau)
z = z - np.max(z, axis=axis, keepdims=True)
e = np.exp(z)
return e / np.sum(e, axis=axis, keepdims=True)
def cosine_similarity(a, b):
a = np.asarray(a, dtype=float); b = np.asarray(b, dtype=float)
return float(a @ b / (la.norm(a) * la.norm(b) + 1e-12))
def numerical_rank(A, tol=1e-10):
return int(np.sum(la.svd(A, compute_uv=False) > tol))
def orthonormal_basis(A, tol=1e-10):
Q, R = la.qr(A)
keep = np.abs(np.diag(R)) > tol
return Q[:, keep]
# Compatibility helpers used by the Chapter 02 theory and exercise cells.
def null_space(A, tol=1e-10):
A = np.asarray(A, dtype=float)
U, S, Vt = la.svd(A, full_matrices=True)
rank = int(np.sum(S > tol))
return Vt[rank:].T
svd_null_space = null_space
def gram_schmidt(vectors, tol=1e-10):
A = np.asarray(vectors, dtype=float)
if A.ndim == 1:
A = A.reshape(1, -1)
basis = []
for v in A:
w = v.astype(float).copy()
for q in basis:
w = w - np.dot(w, q) * q
norm = la.norm(w)
if norm > tol:
basis.append(w / norm)
return np.array(basis)
def projection_matrix_from_columns(A, tol=1e-10):
Q = orthonormal_basis(np.asarray(A, dtype=float), tol=tol)
return Q @ Q.T
def random_unit_vectors(n, d):
X = np.random.randn(n, d)
return X / np.maximum(la.norm(X, axis=1, keepdims=True), 1e-12)
def pairwise_distances(X):
X = np.asarray(X, dtype=float)
diff = X[:, None, :] - X[None, :, :]
return la.norm(diff, axis=-1)
def normalize(x, axis=None, tol=1e-12):
x = np.asarray(x, dtype=float)
norm = la.norm(x, axis=axis, keepdims=True)
return x / np.maximum(norm, tol)
def frobenius_inner(A, B):
return float(np.sum(np.asarray(A, dtype=float) * np.asarray(B, dtype=float)))
def outer_sum_product(A, B):
A = np.asarray(A, dtype=float)
B = np.asarray(B, dtype=float)
return sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))
def softmax_rows(X):
return softmax(X, axis=1)
def col_space(A, tol=1e-10):
return orthonormal_basis(np.asarray(A, dtype=float), tol=tol)
def row_space(A, tol=1e-10):
return orthonormal_basis(np.asarray(A, dtype=float).T, tol=tol).T
def rref(A, tol=1e-10):
R = np.array(A, dtype=float, copy=True)
m, n = R.shape
pivots = []
row = 0
for col in range(n):
pivot = row + int(np.argmax(np.abs(R[row:, col]))) if row < m else row
if row >= m or abs(R[pivot, col]) <= tol:
continue
if pivot != row:
R[[row, pivot]] = R[[pivot, row]]
R[row] = R[row] / R[row, col]
for r in range(m):
if r != row:
R[r] = R[r] - R[r, col] * R[row]
pivots.append(col)
row += 1
if row == m:
break
R[np.abs(R) < tol] = 0.0
return R, pivots
def nullspace_basis(A, tol=1e-10):
A = np.asarray(A, dtype=float)
U, S, Vt = la.svd(A, full_matrices=True)
rank = int(np.sum(S > tol))
return Vt[rank:].T, rank
print("Chapter helper setup complete.")
Exercise 1: Verifying Vector Space Axioms *
The eight axioms are the complete definition of a vector space. A single axiom failure disqualifies the entire structure. Working through non-examples is equally important — it builds the instinct for which axiom is the weakest link in a given candidate.
For each of the five candidates below, determine whether it is a vector space with the given operations. If it is a vector space, identify the zero vector and one additive inverse. If it is not, identify the specific axiom(s) that fail and supply a concrete counterexample.
(a) $\mathbb{R}^2$ with standard addition but modified scalar multiplication $\alpha \cdot (x, y) = (\alpha x, y)$ (scalar multiplication only scales the first coordinate).
(b) $\mathbb{R}_{>0}$ (positive reals) with operations $u \oplus v = uv$ and $\alpha \cdot u = u^\alpha$.
(c) $V = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 1\}$ with the standard operations inherited from $\mathbb{R}^3$.
(d) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 0\}$ with standard function addition and scalar multiplication.
(e) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 1\}$ with standard operations.
Approach: for each case, work through the three-condition test (zero vector, closure under addition, closure under scalar multiplication) and check the eight axioms systematically. Use concrete vectors and scalars to confirm each axiom or to construct a counterexample.
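A randomized spot-check can complement the hand derivations. The sketch below (helper name `axiom8_holds` is ours, not part of the notebook's setup cell) tests axiom 8 on random data; it previews why candidate (a) fails, which the reference solution also shows. A failure is a proof of non-vector-space-ness, while passing all trials is only evidence.

```python
import numpy as np

rng = np.random.default_rng(0)

def axiom8_holds(scale, n_trials=100, tol=1e-9):
    """Spot-check axiom 8, (alpha+beta).v == alpha.v + beta.v, on random data."""
    for _ in range(n_trials):
        v = rng.standard_normal(2)
        alpha, beta = rng.standard_normal(2)
        lhs = scale(alpha + beta, v)
        rhs = scale(alpha, v) + scale(beta, v)
        if not np.allclose(lhs, rhs, atol=tol):
            return False
    return True

standard   = lambda a, v: a * v                       # usual scalar multiplication
first_only = lambda a, v: np.array([a * v[0], v[1]])  # candidate (a)'s rule

print(axiom8_holds(standard))     # R^2 with the usual rule passes
print(axiom8_holds(first_only))   # scaling only one coordinate breaks axiom 8
```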
Code cell 5
# Your Solution
# Exercise 1 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 1.")
Code cell 6
# Solution
# Exercise 1 - reference solution
header('Exercise 1(a): Modified scalar multiplication')
# Axiom 7: alpha*(u+v) = alpha*u + alpha*v?
# alpha*(u1+v1, u2+v2) = (alpha*(u1+v1), u2+v2)
# alpha*u + alpha*v = (alpha*u1, u2) + (alpha*v1, v2) = (alpha*(u1+v1), u2+v2) -- OK here
# Axiom 8: (alpha+beta)*v = alpha*v + beta*v?
# (alpha+beta)*(u1,u2) = ((alpha+beta)*u1, u2)
# alpha*(u1,u2) + beta*(u1,u2) = (alpha*u1, u2) + (beta*u1, u2) = ((alpha+beta)*u1, u2+u2) = ((alpha+beta)*u1, 2*u2)
# FAIL when u2 != 0: u2 != 2*u2
u1, u2 = 3.0, 5.0
alpha, beta = 2.0, 3.0
lhs = ((alpha + beta) * u1, u2) # (alpha+beta) scalar mult
rhs = (alpha * u1 + beta * u1, u2 + u2) # alpha*v + beta*v
print(f'(alpha+beta)*(u1,u2) = {lhs}')
print(f'alpha*(u1,u2) + beta*(u1,u2) = {rhs}')
check_true('Axiom 8 (vector distributivity) fails', lhs[1] != rhs[1])
print('VERDICT: NOT a vector space. Axiom 8 fails because the second coordinate is unaffected')
print('by scalar multiplication, so scaling and adding disagree.')
header('Exercise 1(b): (R_>0, multiplication, exponentiation)')
# "Zero" vector: e = 1 (since u*1 = u for all u > 0)
# Additive inverse: u^{-1} = 1/u (since u*(1/u) = 1 = e)
# Scalar mult identity: 1 . u = u^1 = u (Axiom 5 OK)
# Axiom 8: (alpha+beta) . u = u^{alpha+beta}; alpha.u (+) beta.u = u^alpha * u^beta = u^{alpha+beta} OK
# Axiom 7: alpha.(u (+) v) = (uv)^alpha; alpha.u (+) alpha.v = u^alpha * v^alpha = (uv)^alpha OK
# Axiom 6: alpha.(beta.u) = alpha.(u^beta) = (u^beta)^alpha = u^{alpha*beta}; (alpha*beta).u = u^{alpha*beta} OK
u, v, al, be = 4.0, 9.0, 2.0, 3.0
e = 1.0 # zero vector
check_close('Additive identity: u (+) 1 = u', u * e, u)
check_close('Additive inverse: u (+) u^-1 = 1', u * (1.0 / u), e)
check_close('Axiom 8: (al+be).u = al.u (+) be.u', u**(al + be), (u**al) * (u**be))
check_close('Axiom 7: al.(u (+) v) = al.u (+) al.v', (u * v)**al, (u**al) * (v**al))
check_close('Axiom 6: al.(be.u) = (al*be).u', (u**be)**al, u**(al * be))
check_close('Commutativity: u (+) v = v (+) u', u * v, v * u)
check_true('Zero vector is 1 (not 0)', e == 1.0)
print('VERDICT: IS a vector space. Every axiom holds; the structure is isomorphic to (R, +, *) via log/exp.')
header('Exercise 1(c): {x+y+z=1} with standard operations')
p = np.array([1.0, 0.0, 0.0]) # p[0]+p[1]+p[2] = 1
q = np.array([0.0, 1.0, 0.0]) # q[0]+q[1]+q[2] = 1
zero = np.array([0.0, 0.0, 0.0])
check_true('Zero vector in V? (0+0+0=1?)', np.isclose(zero.sum(), 1.0))
print(f'p + q = {p + q}, sum = {(p+q).sum()} (should be 1 to stay in V, is 2)')
check_true('p+q NOT in V (sum=2 != 1)', not np.isclose((p + q).sum(), 1.0))
print('VERDICT: NOT a vector space. Does not contain the zero vector (0+0+0=0 != 1).')
print('Also not closed under addition (sum would be 2). It is an affine subspace.')
header('Exercise 1(d): {f: f(0)=0} with standard function ops')
# Represent functions by their values on a grid
x = np.linspace(-2, 2, 9)
f = np.sin(x) # f(0) = sin(0) = 0 OK
g = x**3 - x # g(0) = 0 OK
alpha = 3.7
check_close('f(0) = 0', f[4], 0.0) # x[4] = 0
check_close('g(0) = 0', g[4], 0.0)
check_close('(f+g)(0) = 0', (f + g)[4], 0.0)
check_close('(alpha*f)(0) = 0', (alpha * f)[4], 0.0)
check_close('zero function in V: 0(0)=0', 0.0, 0.0)
print('VERDICT: IS a vector space (subspace of all functions R->R).')
print('All three conditions hold: zero function, closure under addition, closure under scaling.')
header('Exercise 1(e): {f: f(0)=1} with standard function ops')
f = 1 + 0 * x # f(0) = 1
g = np.cos(x) # g(0) = 1
zero_fn = 0 * x # zero function: 0(0) = 0
check_true('Zero function NOT in V (0(0)=0 != 1)', not np.isclose(zero_fn[4], 1.0))
print(f'(f+g)(0) = {(f+g)[4]} (should be 1 to stay in V, is 2)')
check_true('f+g NOT in V ((f+g)(0)=2 != 1)', not np.isclose((f + g)[4], 1.0))
print('VERDICT: NOT a vector space. Does not contain the zero function. Not closed under addition.')
print('This is an affine subspace (a coset of {f: f(0)=0}) — offset by the constant 1 at 0.')
print("Exercise 1 solution complete.")
Exercise 2: Subspace Verification — Proof and Counterexample **
Knowing the three-condition test is not enough; you must also know when the test succeeds and why one condition can fail while others hold. For each subset below, determine whether it is a subspace of the given vector space. If it is a subspace: find a basis and state the dimension. If it is not: identify which condition fails and produce a concrete counterexample.
(a) $W = \{(x, y, z) \in \mathbb{R}^3 : 2x - y + 3z = 0\} \subseteq \mathbb{R}^3$.
(b) $W = \{(x, y) \in \mathbb{R}^2 : xy = 0\}$ (the union of the two coordinate axes) $\subseteq \mathbb{R}^2$.
(c) $W = \{A \in \mathbb{R}^{2 \times 2} : \operatorname{tr}(A) = 0\}$ (traceless matrices) $\subseteq \mathbb{R}^{2 \times 2}$.
(d) $W = \{p \in P_3 : p(1) = 0\}$ (degree-$\leq 3$ polynomials that vanish at $1$) $\subseteq P_3$.
(e) $W = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = z^2\}$ (the double cone) $\subseteq \mathbb{R}^3$.
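The three-condition test itself can be mechanised. A minimal sketch (the helper names are ours; the verdicts it prints preview candidates (a) and (b), which the reference solution confirms): a single random failure disproves subspace-ness, whereas passing every trial is only evidence.

```python
import numpy as np

rng = np.random.default_rng(1)

def spot_check_subspace(contains, sample, n_trials=200):
    """Randomized three-condition test.

    contains(v) -> bool decides membership; sample() draws a random member.
    One failure disproves subspace-ness; passing every trial is evidence only.
    """
    if not contains(np.zeros_like(sample())):
        return False
    for _ in range(n_trials):
        u, v = sample(), sample()
        alpha = rng.standard_normal()
        if not (contains(u + v) and contains(alpha * u)):
            return False
    return True

# Candidate (a): the plane 2x - y + 3z = 0; sample members by solving for x.
plane = lambda v: np.isclose(2 * v[0] - v[1] + 3 * v[2], 0.0)
def sample_plane():
    y, z = rng.standard_normal(2)
    return np.array([(y - 3 * z) / 2, y, z])

# Candidate (b): the union of the axes, xy = 0; sample one axis at random.
axes = lambda v: np.isclose(v[0] * v[1], 0.0)
def sample_axes():
    t = rng.standard_normal()
    return np.array([t, 0.0]) if rng.random() < 0.5 else np.array([0.0, t])

print(spot_check_subspace(plane, sample_plane))   # the plane passes
print(spot_check_subspace(axes, sample_axes))     # closure under + fails
```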
Code cell 8
# Your Solution
# Exercise 2 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 2.")
Code cell 9
# Solution
# Exercise 2 - reference solution
header('Exercise 2(a): W = {2x - y + 3z = 0}')
# This is the null space of A = [2, -1, 3]. Always a subspace.
A2a = np.array([[2.0, -1.0, 3.0]])
R2a, pivots2a = rref(A2a)
# 3 variables, 1 pivot => 2 free variables => dim = 2
# Free vars: y=s (col1), z=t (col2); then x = (y - 3z)/2 = (s - 3t)/2
# Basis vectors:
b1 = np.array([0.5, 1.0, 0.0]) # s=1, t=0: x=1/2
b2 = np.array([-1.5, 0.0, 1.0]) # s=0, t=1: x=-3/2
check_close('b1 in W (2x-y+3z=0)', 2*b1[0] - b1[1] + 3*b1[2], 0.0)
check_close('b2 in W (2x-y+3z=0)', 2*b2[0] - b2[1] + 3*b2[2], 0.0)
check_close('b1+b2 in W', 2*(b1+b2)[0] - (b1+b2)[1] + 3*(b1+b2)[2], 0.0)
check_close('zero in W', 2*0 - 0 + 3*0, 0.0)
print(f'Basis: {b1}, {b2}')
print('VERDICT: IS a subspace (null space of [2,-1,3]). dim = 2.')
header('Exercise 2(b): W = {xy = 0} (coordinate axes)')
u2b = np.array([1.0, 0.0]) # on x-axis: xy = 0
v2b = np.array([0.0, 1.0]) # on y-axis: xy = 0
s2b = u2b + v2b # sum = (1,1)
print(f'u = {u2b}, u[0]*u[1] = {u2b[0]*u2b[1]}')
print(f'v = {v2b}, v[0]*v[1] = {v2b[0]*v2b[1]}')
print(f'u+v = {s2b}, (u+v)[0]*(u+v)[1] = {s2b[0]*s2b[1]}')
check_true('u+v NOT in W (product != 0)', not np.isclose(s2b[0] * s2b[1], 0.0))
print('VERDICT: NOT a subspace. Closure under addition fails: (1,0)+(0,1)=(1,1), and 1*1=1≠0.')
print('The union of two subspaces is almost never a subspace.')
header('Exercise 2(c): W = {tr(A) = 0} in R^{2x2}')
# Basis: three linearly independent traceless matrices
E12 = np.array([[0.0, 1.0], [0.0, 0.0]]) # tr=0
E21 = np.array([[0.0, 0.0], [1.0, 0.0]]) # tr=0
D = np.array([[1.0, 0.0], [0.0, -1.0]]) # tr=0
for name2c, M2c in [('E12', E12), ('E21', E21), ('D', D)]:
check_close(f'tr({name2c}) = 0', np.trace(M2c), 0.0)
check_close('tr(E12 + E21) = 0', np.trace(E12 + E21), 0.0)
check_close('tr(3.7*D) = 0', np.trace(3.7 * D), 0.0)
# Linear independence: form matrix of flattened basis vectors
basis_matrix = np.vstack([E12.ravel(), E21.ravel(), D.ravel()])
check_true('Three basis matrices are linearly independent', np.linalg.matrix_rank(basis_matrix) == 3)
print('Basis: {E12, E21, D} (three traceless 2x2 matrices)')
print('VERDICT: IS a subspace. dim = 3. (Full R^{2x2} has dim 4; one trace constraint reduces by 1.)')
header('Exercise 2(d): W = {p in P3 : p(1) = 0}')
# p(1) = a0 + a1 + a2 + a3 = 0 => a0 = -a1 - a2 - a3
# Free variables: a1, a2, a3 => dim = 3
# Basis: set one free variable to 1, others to 0:
# a1=1: p(t) = -1 + t = t - 1 => coefficients [-1, 1, 0, 0]; p(1) = -1+1 = 0 OK
# a2=1: p(t) = -1 + t^2 = t^2 - 1 => coefficients [-1, 0, 1, 0]; p(1) = -1+1 = 0 OK
# a3=1: p(t) = -1 + t^3 = t^3 - 1 => coefficients [-1, 0, 0, 1]; p(1) = -1+1 = 0 OK
def poly_eval(coeffs, t):
"""Evaluate polynomial with coefficients [a0,a1,...] at t."""
return sum(c * t**i for i, c in enumerate(coeffs))
b2d_1 = [-1.0, 1.0, 0.0, 0.0] # t - 1
b2d_2 = [-1.0, 0.0, 1.0, 0.0] # t^2 - 1
b2d_3 = [-1.0, 0.0, 0.0, 1.0] # t^3 - 1
for name2d, b2d in [('t-1', b2d_1), ('t^2-1', b2d_2), ('t^3-1', b2d_3)]:
check_close(f'p(1) = 0 for {name2d}', poly_eval(b2d, 1.0), 0.0)
# Closure: sum of two basis elements
s2d = [b2d_1[i] + b2d_2[i] for i in range(4)]
check_close('(b1+b2)(1) = 0', poly_eval(s2d, 1.0), 0.0)
# Independence: coefficient matrix has full row rank
basis_mat2d = np.array([b2d_1, b2d_2, b2d_3])
check_true('Three basis polynomials are independent', np.linalg.matrix_rank(basis_mat2d) == 3)
print('Basis: {t-1, t^2-1, t^3-1} (three polynomials vanishing at t=1)')
print('VERDICT: IS a subspace. dim = 3. (P3 has dim 4; one point-evaluation constraint reduces by 1.)')
header('Exercise 2(e): W = {x^2 + y^2 = z^2} (double cone)')
# On the cone: (1,0,1) has 1+0=1=1^2 OK; (0,1,1) has 0+1=1=1^2 OK
u2e = np.array([1.0, 0.0, 1.0])
v2e = np.array([0.0, 1.0, 1.0])
s2e = u2e + v2e # = (1, 1, 2)
lhs = s2e[0]**2 + s2e[1]**2 # = 1 + 1 = 2
rhs = s2e[2]**2 # = 4
print(f'u = {u2e}: x^2+y^2 = {u2e[0]**2+u2e[1]**2}, z^2 = {u2e[2]**2}')
print(f'v = {v2e}: x^2+y^2 = {v2e[0]**2+v2e[1]**2}, z^2 = {v2e[2]**2}')
print(f'u+v = {s2e}: x^2+y^2 = {lhs}, z^2 = {rhs}')
check_true('u+v NOT on cone', not np.isclose(lhs, rhs))
print('VERDICT: NOT a subspace. Closure under addition fails: (1,0,1)+(0,1,1)=(1,1,2),')
print('and 1^2+1^2=2 != 4=2^2. The cone is a nonlinear surface, not a flat subspace.')
print("Exercise 2 solution complete.")
Exercise 3: The Four Fundamental Subspaces **
For every matrix $A$ there are exactly four canonical subspaces. They come in two orthogonal pairs, and their dimensions satisfy the Rank-Nullity theorem.
Let
$$A = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 3 \\ 1 & 2 & 1 & 4 \end{pmatrix}.$$
(a) Row-reduce $A$ to RREF. Identify pivot columns and free columns.
(b) Find a basis for $\operatorname{null}(A)$. Verify via the Rank-Nullity theorem: $\operatorname{rank}(A) + \dim\operatorname{null}(A) = n = 4$.
(c) Find a basis for $\operatorname{col}(A)$. (Use pivot columns of the original $A$, not of the RREF.)
(d) Find a basis for $\operatorname{row}(A)$. (Use non-zero rows of the RREF.)
(e) Find a basis for $\operatorname{null}(A^T)$ (the left null space).
(f) Verify: $\dim\operatorname{col}(A) + \dim\operatorname{null}(A^T) = m$ and $\dim\operatorname{row}(A) + \dim\operatorname{null}(A) = n$.
(g) Verify the orthogonality relation $\operatorname{row}(A) \perp \operatorname{null}(A)$: check that every basis vector of $\operatorname{row}(A)$ is orthogonal to every basis vector of $\operatorname{null}(A)$.
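Before the hand computation, the dimensions of all four subspaces can be cross-checked with one SVD (a sketch using the exercise's matrix; `1e-10` is an assumed numerical-rank tolerance):

```python
import numpy as np

A = np.array([[1., 2., 0., 1.],
              [0., 0., 1., 3.],
              [1., 2., 1., 4.]])
m, n = A.shape

# One SVD gives all four dimensions: rank r -> col and row spaces have dim r;
# the null space and left null space fill the remaining n - r and m - r.
S = np.linalg.svd(A, compute_uv=False)
r = int(np.sum(S > 1e-10))

print({"col(A)": r, "row(A)": r, "null(A)": n - r, "null(A^T)": m - r})
```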
Code cell 11
# Your Solution
# Exercise 3 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 3.")
Code cell 12
# Solution
# Exercise 3 - reference solution
A3 = np.array([
[1.0, 2.0, 0.0, 1.0],
[0.0, 0.0, 1.0, 3.0],
[1.0, 2.0, 1.0, 4.0]
])
m3, n3 = A3.shape
# (a) RREF
R3, pivots3 = rref(A3)
header('Exercise 3(a): RREF')
print('RREF(A) =\n', R3)
print('Pivot columns:', pivots3)
print('Free columns:', [c for c in range(n3) if c not in pivots3])
rank3 = len(pivots3)
# (b) Null space: free variables are columns 1 and 3 (0-indexed)
# From RREF: x1 = -2*x2 - x4, x3 = -3*x4
# Free variable x2=1, x4=0: null vector n1 = (-2, 1, 0, 0)
# Free variable x2=0, x4=1: null vector n2 = (-1, 0, -3, 1)
n3_1 = np.array([-2.0, 1.0, 0.0, 0.0])
n3_2 = np.array([-1.0, 0.0, -3.0, 1.0])
null_basis3 = np.column_stack([n3_1, n3_2])
header('Exercise 3(b): Null space of A')
print('Null space basis (columns):\n', null_basis3)
check_close('A @ n1 = 0', A3 @ n3_1, np.zeros(m3))
check_close('A @ n2 = 0', A3 @ n3_2, np.zeros(m3))
nullity3 = null_basis3.shape[1]
check_true('Rank-Nullity: rank + nullity = 4', rank3 + nullity3 == n3)
# (c) Column space: pivot columns of ORIGINAL A
col_basis3 = A3[:, pivots3]
header('Exercise 3(c): Column space of A')
print('Column space basis (pivot columns of A):\n', col_basis3)
check_true('col(A) basis has rank = rank(A)', np.linalg.matrix_rank(col_basis3) == rank3)
# (d) Row space: non-zero rows of RREF
row_basis3 = R3[~np.all(np.isclose(R3, 0.0), axis=1)]
header('Exercise 3(d): Row space of A')
print('Row space basis (non-zero rows of RREF):\n', row_basis3)
check_true('row(A) has correct dimension', row_basis3.shape[0] == rank3)
# (e) Left null space: null space of A^T
# A^T is 4x3; rank = 2; left null dim = 3 - 2 = 1
# Solve A^T y = 0 via RREF of A^T
Rt, pivots_t = rref(A3.T)
# From RREF of A^T: y3 is free; y1 = -y3, y2 = -y3 (read from RREF)
# More directly via SVD:
left_null3, _ = nullspace_basis(A3.T)
header('Exercise 3(e): Left null space (null of A^T)')
print('Left null space basis:\n', left_null3)
check_close('A^T @ left_null = 0', A3.T @ left_null3, np.zeros((n3, left_null3.shape[1])))
left_nullity3 = left_null3.shape[1]
# (f) Dimension checks
header('Exercise 3(f): Dimension verification')
print(f'dim(col(A)) = {rank3}, dim(null(A^T)) = {left_nullity3}, sum = {rank3 + left_nullity3}')
print(f'dim(row(A)) = {rank3}, dim(null(A)) = {nullity3}, sum = {rank3 + nullity3}')
check_true('dim(col(A)) + dim(null(A^T)) = m = 3', rank3 + left_nullity3 == m3)
check_true('dim(row(A)) + dim(null(A)) = n = 4', rank3 + nullity3 == n3)
# (g) Orthogonality: row(A) perp null(A)
header('Exercise 3(g): Orthogonality row(A) perp null(A)')
inner_products = row_basis3 @ null_basis3
print('Row-space basis @ Null-space basis =\n', inner_products)
check_close('All inner products row_i . null_j = 0', inner_products, np.zeros_like(inner_products))
print('\nStrang\'s picture:')
print(f' R^4 = row(A) [dim {rank3}] ⊕ null(A) [dim {nullity3}] (orthogonal direct sum)')
print(f' R^3 = col(A) [dim {rank3}] ⊕ null(A^T) [dim {left_nullity3}] (orthogonal direct sum)')
print("Exercise 3 solution complete.")
Exercise 4: Span, Linear Independence, and Basis **
Span, independence, and basis are three complementary views of the same underlying geometry. A spanning set may be redundant; an independent set may not span; a basis is both, with no waste.
In $\mathbb{R}^4$, consider the vectors
$$v_1 = (1, 2, 0, 1), \quad v_2 = (0, 1, 1, 2), \quad v_3 = (1, 3, 1, 3), \quad v_4 = (2, 1, -1, -1).$$
(a) Determine whether $v_1, v_2, v_3, v_4$ are linearly independent. If not, which vectors are dependent?
(b) Find the dimension of $\operatorname{span}\{v_1, v_2, v_3, v_4\}$.
(c) Identify a maximal subset that forms a basis for $\operatorname{span}\{v_1, v_2, v_3, v_4\}$.
(d) Express each dependent vector as a linear combination of the basis vectors.
(e) Is $w = (3, 5, 1, 4)$ in $\operatorname{span}\{v_1, v_2, v_3, v_4\}$? If yes, find its coordinates in the basis from (c). If no, explain why not.
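The membership question in (e) reduces to a rank comparison. A minimal generic sketch (the helper `in_span` and the random data are ours, not the exercise's vectors, so nothing is spoiled):

```python
import numpy as np

rng = np.random.default_rng(7)

def in_span(B, w, tol=1e-10):
    """w lies in col(B) iff appending w leaves the rank unchanged."""
    return (np.linalg.matrix_rank(np.column_stack([B, w]), tol=tol)
            == np.linalg.matrix_rank(B, tol=tol))

B = rng.standard_normal((4, 2))          # two independent directions in R^4
inside  = B @ np.array([2.0, -1.0])      # a combination of the columns
outside = rng.standard_normal(4)         # almost surely off that plane

print(in_span(B, inside), in_span(B, outside))
```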
Code cell 14
# Your Solution
# Exercise 4 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 4.")
Code cell 15
# Solution
# Exercise 4 - reference solution
v1 = np.array([1.0, 2.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0, 2.0])
v3 = np.array([1.0, 3.0, 1.0, 3.0])
v4 = np.array([2.0, 1.0, -1.0, -1.0])
w = np.array([3.0, 5.0, 1.0, 4.0])
V4 = np.column_stack([v1, v2, v3, v4])
# (a) and (b): row reduce [v1 | v2 | v3 | v4]
R4, pivots4 = rref(V4)
rank4 = len(pivots4)
free4 = [c for c in range(4) if c not in pivots4]
header('Exercise 4(a)(b): Independence and dimension')
print('RREF([v1|v2|v3|v4]) =\n', R4)
print('Pivot columns (0-indexed):', pivots4)
print('Free columns (0-indexed):', free4)
check_true('Vectors are linearly DEPENDENT (rank < 4)', rank4 < 4)
print(f'dim(span{{v1,v2,v3,v4}}) = {rank4}')
# (c) Basis = pivot columns of original V4
basis4 = V4[:, pivots4]
header('Exercise 4(c): Basis')
print('Basis (pivot columns of V):')
for i, p in enumerate(pivots4):
print(f' b{i+1} = v{p+1} = {V4[:, p]}')
check_true('Basis has correct rank', np.linalg.matrix_rank(basis4) == rank4)
# (d) Express dependent vectors in terms of the pivot columns
# From RREF: column j is expressed in terms of pivot columns via the RREF entries
# For free column c: the c-th column of RREF gives the coefficients
header('Exercise 4(d): Dependent vectors as linear combinations')
for fc in free4:
# RREF col fc gives pivot_coords such that v_{fc} = sum(pivot_coords[i] * v_{pivot[i]})
pivot_coords = R4[range(rank4), fc] # coefficients in front of pivot columns
reconstruction = sum(pivot_coords[i] * V4[:, pivots4[i]] for i in range(rank4))
print(f'v{fc+1} = ', end='')
for i, p in enumerate(pivots4):
print(f'{pivot_coords[i]:+.4g}*v{p+1} ', end='')
print()
check_close(f'v{fc+1} reconstructed correctly', reconstruction, V4[:, fc])
# (e) Is w in span?
header('Exercise 4(e): Is w = (3,5,1,4) in the span?')
# Augment basis with w and check if rank increases
Vaug = np.column_stack([basis4, w])
rank_aug = np.linalg.matrix_rank(Vaug)
is_in_span = (rank_aug == rank4)
check_true('w is in span{v1,...,v4}', is_in_span)
if is_in_span:
# Solve basis4 @ coords = w using least squares (exact since w is in col(basis4))
coords4, _, _, _ = np.linalg.lstsq(basis4, w, rcond=None)
check_close('basis @ coords = w', basis4 @ coords4, w)
    print('Coordinates in the pivot-column basis:', coords4)
    print('w = ' + ' + '.join(f'{coords4[i]:.4g}*v{p+1}' for i, p in enumerate(pivots4)))
print("Exercise 4 solution complete.")
Exercise 5: Gram-Schmidt Orthogonalisation and Projections **
Gram-Schmidt converts any independent set into an orthonormal basis for the same span. The resulting orthonormal basis makes projections trivial and is the foundation of QR factorisation.
Start from $v_1 = (1, 1, 0)$, $v_2 = (1, 0, 1)$, $v_3 = (0, 1, 1)$.
(a) Verify that $v_1, v_2, v_3$ are linearly independent.
(b) Apply Gram-Schmidt to produce an orthonormal basis $q_1, q_2, q_3$.
(c) Verify: $\|q_i\| = 1$ for each $i$, and $\langle q_i, q_j \rangle = 0$ for $i \neq j$.
(d) Express $w = (1, 2, 3)$ in the orthonormal basis; verify by reconstruction: $w = \sum_i \langle w, q_i \rangle\, q_i$.
(e) Construct the orthogonal projection matrix $P$ onto $\operatorname{span}\{v_1, v_2\}$. Verify: $P^2 = P$ (idempotent) and $P^T = P$ (symmetric). Compute $Pw$ and the residual $r = w - Pw$; verify that $r \perp \operatorname{span}\{v_1, v_2\}$.
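A hand-rolled Gram-Schmidt can be cross-checked against `np.linalg.qr`, which orthogonalises the same columns (via Householder reflections rather than classical Gram-Schmidt, so individual columns may differ by sign while the spans agree). A sketch on this exercise's vectors:

```python
import numpy as np

V = np.column_stack([[1., 1., 0.], [1., 0., 1.], [0., 1., 1.]])

# QR factorisation: Q has orthonormal columns, R is upper triangular.
Q, R = np.linalg.qr(V)

print(np.allclose(Q.T @ Q, np.eye(3)))   # columns are orthonormal
print(np.allclose(Q @ R, V))             # V = QR reconstructs the input
print(np.allclose(R, np.triu(R)))        # R is upper triangular
```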
Code cell 17
# Your Solution
# Exercise 5 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 5.")
Code cell 18
# Solution
# Exercise 5 - reference solution
v5_1 = np.array([1.0, 1.0, 0.0])
v5_2 = np.array([1.0, 0.0, 1.0])
v5_3 = np.array([0.0, 1.0, 1.0])
w5 = np.array([1.0, 2.0, 3.0])
V5 = np.column_stack([v5_1, v5_2, v5_3])
# (a) Linear independence: non-zero determinant
det5 = np.linalg.det(V5)
header('Exercise 5(a): Linear independence')
print(f'det([v1|v2|v3]) = {det5:.6f}')
check_true('det != 0 => linearly independent', abs(det5) > 1e-10)
# (b) Gram-Schmidt
def gram_schmidt(vecs):
qs = []
for v in vecs:
u = v.copy()
for q in qs:
u = u - np.dot(u, q) * q # subtract projection onto previously found q
norm_u = np.linalg.norm(u)
if norm_u < 1e-12:
raise ValueError('Input vectors are linearly dependent.')
qs.append(u / norm_u)
return np.column_stack(qs)
Q5 = gram_schmidt([v5_1, v5_2, v5_3])
header('Exercise 5(b): Gram-Schmidt result')
print('Orthonormal basis Q (columns = q1, q2, q3):\n', Q5)
# (c) Orthonormality
header('Exercise 5(c): Orthonormality verification')
QtQ = Q5.T @ Q5
print('Q^T Q =\n', QtQ)
check_close('Q^T Q = I (orthonormal)', QtQ, np.eye(3))
for i in range(3):
check_close(f'||q{i+1}|| = 1', np.linalg.norm(Q5[:, i]), 1.0)
for i in range(3):
for j in range(i+1, 3):
check_close(f'<q{i+1}, q{j+1}> = 0', np.dot(Q5[:, i], Q5[:, j]), 0.0)
# (d) Coordinates of w in orthonormal basis
header('Exercise 5(d): Coordinates of w = (1,2,3) in orthonormal basis')
coords5 = Q5.T @ w5 # coords[i] = <w, q_i>
print('Coordinates (= <w, q_i>):', coords5)
w5_reconstructed = Q5 @ coords5
check_close('Reconstruction Q @ coords = w', w5_reconstructed, w5)
print(f'w = {coords5[0]:.4f}*q1 + {coords5[1]:.4f}*q2 + {coords5[2]:.4f}*q3')
# (e) Projection onto span{v1, v2} = span{q1, q2}
header('Exercise 5(e): Projection onto span{v1, v2}')
Q2 = Q5[:, :2] # first two orthonormal vectors
P5 = Q2 @ Q2.T # projection matrix onto span{q1, q2}
print('Projection matrix P =\n', P5)
check_close('P^2 = P (idempotent)', P5 @ P5, P5)
check_close('P^T = P (symmetric)', P5.T, P5)
Pw5 = P5 @ w5
res5 = w5 - Pw5
print('P @ w =', Pw5)
print('residual r = w - P@w =', res5)
# Residual must be orthogonal to span{v1, v2}
check_close('r perp v1', np.dot(res5, v5_1), 0.0)
check_close('r perp v2', np.dot(res5, v5_2), 0.0)
check_close('r perp q1', np.dot(res5, Q5[:, 0]), 0.0)
check_close('r perp q2', np.dot(res5, Q5[:, 1]), 0.0)
print('\nConnection to QR: Q5 IS the Q factor; V5 = Q5 @ (Q5^T @ V5) = Q5 @ R.')
R_factor = Q5.T @ V5
check_close('V5 = Q5 @ R', Q5 @ R_factor, V5)
check_true('R is upper triangular', np.allclose(R_factor, np.triu(R_factor)))
print("Exercise 5 solution complete.")
Exercise 6: Subspace Operations — Sum, Intersection, and Direct Sum ***
Subspaces can be combined. The sum is the smallest subspace containing both; the intersection is the largest subspace contained in both. A direct sum requires that their intersection is trivial.
Part A. In $\mathbb{R}^3$, let
$$W_1 = \operatorname{span}\{e_1, e_2\} \ (\text{the } xy\text{-plane}), \qquad W_2 = \operatorname{span}\{(1, 1, 0), (0, 0, 1)\}.$$
(a) Find a basis for $W_1 + W_2$. Is $W_1 + W_2 = \mathbb{R}^3$?
(b) Find $W_1 \cap W_2$. Give a basis or show it equals $\{0\}$.
(c) Verify the dimension formula: $\dim(W_1 + W_2) = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2)$.
(d) Is $\mathbb{R}^3 = W_1 \oplus W_2$? If not, find a subspace $W_3$ with $\mathbb{R}^3 = W_1 \oplus W_3$.
Part B. In $\mathbb{R}^4$, let
$$W_1 = \{x \in \mathbb{R}^4 : x_1 + x_2 = 0\}, \qquad W_2 = \{x \in \mathbb{R}^4 : x_3 + x_4 = 0\}.$$
(e) Find $\dim W_1$, $\dim W_2$, $\dim(W_1 \cap W_2)$, and $\dim(W_1 + W_2)$. Verify the dimension formula.
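Both operations can be computed mechanically from bases. A sketch (the helper name is ours; the printed pair previews Part A's dimensions, which the reference solution derives in full): the sum's dimension is the rank of the pooled generators, and the intersection corresponds to solutions of $B_1 x = B_2 y$, i.e. the null space of $[B_1 \mid -B_2]$.

```python
import numpy as np

def subspace_sum_and_intersection_dims(B1, B2, tol=1e-10):
    """Dims of W1+W2 and W1∩W2 from bases B1, B2 (independent columns assumed).

    Sum: rank of the pooled generators. Intersection: solutions of
    B1 @ x = B2 @ y, i.e. the null space of [B1 | -B2]; with independent
    columns its nullity equals dim(W1 ∩ W2).
    """
    dim_sum = int(np.linalg.matrix_rank(np.column_stack([B1, B2]), tol=tol))
    stacked = np.column_stack([B1, -B2])
    dim_inter = stacked.shape[1] - int(np.linalg.matrix_rank(stacked, tol=tol))
    return dim_sum, dim_inter

W1 = np.column_stack([[1., 0., 0.], [0., 1., 0.]])   # the xy-plane
W2 = np.column_stack([[1., 1., 0.], [0., 0., 1.]])
print(subspace_sum_and_intersection_dims(W1, W2))    # (3, 1)
```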
Code cell 20
# Your Solution
# Exercise 6 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 6.")
Code cell 21
# Solution
# Exercise 6 - reference solution
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
w6_a = np.array([1.0, 1.0, 0.0])
w6_b = np.array([0.0, 0.0, 1.0])
W1_basis = np.column_stack([e1, e2])
W2_basis = np.column_stack([w6_a, w6_b])
# (a) W1 + W2: pool all 4 generators; rank of the pool = dim(W1 + W2)
all_gens = np.column_stack([W1_basis, W2_basis]) # 3x4
R6, pivots6 = rref(all_gens)
sum_basis6 = all_gens[:, pivots6]
dim_sum6 = len(pivots6)
header('Exercise 6(a): W1 + W2')
print('Generators of W1+W2 (columns):\n', all_gens)
print('Pivot columns:', pivots6)
print('Basis for W1+W2 (pivot columns of generators):\n', sum_basis6)
check_true('W1 + W2 = R^3 (dim = 3)', dim_sum6 == 3)
# (b) W1 ∩ W2: solve alpha*e1 + beta*e2 = gamma*(1,1,0) + delta*(0,0,1)
# alpha = gamma, beta = gamma, 0 = delta => delta=0, gamma=any, alpha=beta=gamma
# Intersection = span{(1,1,0)} which is in W1 (it's a linear combo of e1,e2) AND in W2 (it's w6_a)
inter_vec6 = np.array([1.0, 1.0, 0.0]) # (1,1,0) = 1*e1 + 1*e2 = 1*w6_a + 0*w6_b
header('Exercise 6(b): W1 ∩ W2')
check_true('(1,1,0) in W1 (is linear combo of e1,e2)', np.allclose(1*e1 + 1*e2, inter_vec6))
check_true('(1,1,0) in W2 (is w6_a itself)', np.allclose(inter_vec6, w6_a))
inter_basis6 = inter_vec6.reshape(-1, 1)
dim_inter6 = 1
print(f'W1 ∩ W2 = span{{(1,1,0)}}, dim = {dim_inter6}')
# (c) Dimension formula
dim_W1 = W1_basis.shape[1] # = 2
dim_W2 = W2_basis.shape[1] # = 2
header('Exercise 6(c): Dimension formula')
print(f'dim(W1) = {dim_W1}, dim(W2) = {dim_W2}, dim(W1∩W2) = {dim_inter6}, dim(W1+W2) = {dim_sum6}')
check_true('dim formula: dim(W1+W2) = dim(W1)+dim(W2)-dim(W1∩W2)',
dim_sum6 == dim_W1 + dim_W2 - dim_inter6)
# (d) Direct sum? W1 ∩ W2 = span{(1,1,0)} ≠ {0}, so NOT a direct sum.
# Find W3 with R^3 = W1 ⊕ W3: take W3 = span{e3} = z-axis
W3_basis = e3.reshape(-1, 1)
header('Exercise 6(d): Direct sum?')
check_true('W1 ∩ W2 ≠ {0} => NOT a direct sum', dim_inter6 > 0)
print('W1 ⊕ W3 with W3 = span{e3}:')
W1_plus_W3 = np.column_stack([W1_basis, W3_basis])
check_true('W1 + W3 = R^3 (rank 3)', np.linalg.matrix_rank(W1_plus_W3) == 3)
# W1 ∩ W3: vectors in xy-plane AND on z-axis => must be zero
print('W1 ∩ W3 = {0} (xy-plane meets z-axis only at origin)')
print('=> R^3 = W1 ⊕ W3 = span{e1,e2} ⊕ span{e3} (standard decomposition)')
# Part B
header('Exercise 6(e): Part B in R^4')
A6_w1 = np.array([[1.0, 1.0, 0.0, 0.0]])
A6_w2 = np.array([[0.0, 0.0, 1.0, 1.0]])
# W1 = null([1,1,0,0]): x1+x2=0, 3 free vars (x2,x3,x4) => dim 3
# W2 = null([0,0,1,1]): x3+x4=0, 3 free vars (x1,x2,x4) => dim 3
_, rank_w1 = nullspace_basis(A6_w1)
_, rank_w2 = nullspace_basis(A6_w2)
dim_w1_B = 4 - rank_w1 # = 3
dim_w2_B = 4 - rank_w2 # = 3
# W1 ∩ W2 = null([1,1,0,0; 0,0,1,1]): two constraints on R^4 => dim = 4-2 = 2
A6_both = np.vstack([A6_w1, A6_w2])
_, rank_both = nullspace_basis(A6_both)
# rank_both = rank of the combined constraint matrix
rank_constraint = np.linalg.matrix_rank(A6_both)
dim_inter_B = 4 - rank_constraint # = 2
# W1 + W2: computed independently by pooling the two null-space bases,
# so the dimension formula below is a genuine check, not a definition
W1_basis_B, _ = nullspace_basis(A6_w1)
W2_basis_B, _ = nullspace_basis(A6_w2)
dim_sum_B = int(np.linalg.matrix_rank(np.column_stack([W1_basis_B, W2_basis_B])))
print(f'dim(W1) = {dim_w1_B}')
print(f'dim(W2) = {dim_w2_B}')
print(f'dim(W1∩W2) = {dim_inter_B}')
print(f'dim(W1+W2) = {dim_sum_B} (rank of pooled bases)')
check_true('Dimension formula holds', dim_sum_B == dim_w1_B + dim_w2_B - dim_inter_B)
check_true('W1 + W2 = R^4', dim_sum_B == 4)
print('Basis for W1∩W2: {(-1,1,0,0), (0,0,-1,1)}. Verify:')
inter_b1 = np.array([-1.0, 1.0, 0.0, 0.0]) # x1+x2=0 and x3+x4=0
inter_b2 = np.array([ 0.0, 0.0,-1.0, 1.0])
check_close('inter_b1 in W1', A6_w1 @ inter_b1, np.zeros(1))
check_close('inter_b1 in W2', A6_w2 @ inter_b1, np.zeros(1))
check_close('inter_b2 in W1', A6_w1 @ inter_b2, np.zeros(1))
check_close('inter_b2 in W2', A6_w2 @ inter_b2, np.zeros(1))
print("Exercise 6 solution complete.")
Exercise 7: Change of Basis and Coordinates ***
A change of basis is a relabelling of the coordinate system. The same vector looks different in different bases; the same linear map has different matrices in different bases. Mastering this is prerequisite to understanding PCA, attention heads, and representation geometry.
In $\mathbb{R}^2$, let
$$B = \{b_1, b_2\} = \left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}, \qquad C = \{c_1, c_2\} = \left\{ \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \end{pmatrix} \right\}.$$
(a) Verify that both $B$ and $C$ are bases for $\mathbb{R}^2$ (show that each is a linearly independent spanning set).
(b) Find the change-of-basis matrix $P_{B \to C}$ (its $j$-th column is the coordinate vector of $b_j$ in basis $C$).
(c) A vector $v$ has coordinates $[v]_B = (3, -1)$ in basis $B$. Find $[v]_C$.
(d) Compute $v$ explicitly (its standard-basis coordinates). Verify that your answer to (c) gives the correct representation.
(e) A linear map $T$ has matrix
$$M = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$$
in the standard basis. Find its matrix $M_B$ in basis $B$. Verify that $M$ and $M_B$ have the same eigenvalues (they are similar matrices).
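The mechanics can be rehearsed on random bases before tackling the specific pair above (a sketch with our own random data; `solve` is preferred over forming an explicit inverse):

```python
import numpy as np

rng = np.random.default_rng(3)
P_B = rng.standard_normal((2, 2))   # columns = basis B (almost surely invertible)
P_C = rng.standard_normal((2, 2))   # columns = basis C

# Change of basis: [v]_C = P_C^{-1} P_B [v]_B, computed via solve().
P_BtoC = np.linalg.solve(P_C, P_B)

v_B = np.array([3.0, -1.0])
v_std = P_B @ v_B                                  # standard coordinates
print(np.allclose(P_C @ (P_BtoC @ v_B), v_std))   # both routes name the same vector
```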
Code cell 23
# Your Solution
# Exercise 7 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 7.")
Code cell 24
# Solution
# Exercise 7 - reference solution
b1 = np.array([1.0, 2.0])
b2 = np.array([1.0, -1.0])
c1 = np.array([2.0, 1.0])
c2 = np.array([-1.0, 1.0])
P_B = np.column_stack([b1, b2])
P_C = np.column_stack([c1, c2])
# (a) Both bases: non-zero determinant <=> invertible <=> spans R^2 and is independent
det_B = np.linalg.det(P_B)
det_C = np.linalg.det(P_C)
header('Exercise 7(a): Verify bases')
print(f'det(P_B) = {det_B:.4f}')
print(f'det(P_C) = {det_C:.4f}')
check_true('B is a basis (det != 0)', abs(det_B) > 1e-10)
check_true('C is a basis (det != 0)', abs(det_C) > 1e-10)
# (b) P_{B->C}: j-th column = coords of b_j in basis C
# b_j (standard) = P_C @ [b_j]_C => [b_j]_C = P_C^{-1} @ b_j
# P_{B->C} = P_C^{-1} @ P_B
P_BtoC = np.linalg.inv(P_C) @ P_B
header('Exercise 7(b): Change-of-basis matrix P_{B->C}')
print('P_{B->C} =\n', P_BtoC)
# Verify: P_C @ P_{B->C} @ [v]_B = v (standard), so P_{B->C} correctly translates
# B-coordinates into C-coordinates. Equivalent matrix check: P_C @ P_{B->C} = P_B
check_close('P_C @ P_{B->C} = P_B (consistency check)', P_C @ P_BtoC, P_B)
# (c) [v]_C = P_{B->C} @ [v]_B
v_in_B = np.array([3.0, -1.0])
v_in_C = P_BtoC @ v_in_B
header('Exercise 7(c): Coordinates of v in basis C')
print(f'[v]_B = {v_in_B}')
print(f'[v]_C = P_{{B->C}} @ [v]_B = {v_in_C}')
# (d) v in standard coords: v = P_B @ [v]_B
v_std = P_B @ v_in_B
header('Exercise 7(d): v in standard coordinates')
print(f'v (standard) = P_B @ [v]_B = {v_std}')
# Verify: P_C @ [v]_C should also give v_std
check_close('P_C @ [v]_C = v_std', P_C @ v_in_C, v_std)
# Verify [v]_C directly: inverse of C applied to v_std
check_close('[v]_C = P_C^{-1} @ v_std', np.linalg.inv(P_C) @ v_std, v_in_C)
# (e) Matrix of T in basis B: similarity transformation
# M_B = P_B^{-1} @ M_std @ P_B
M_std = np.array([[2.0, 1.0], [0.0, 3.0]])
P_B_inv = np.linalg.inv(P_B)
M_B = P_B_inv @ M_std @ P_B
header('Exercise 7(e): T in basis B')
print('M_std =\n', M_std)
print('M_B = P_B^{-1} M_std P_B =\n', M_B)
eigvals_std = np.sort(np.linalg.eigvals(M_std).real)
eigvals_B = np.sort(np.linalg.eigvals(M_B).real)
print(f'Eigenvalues of M_std: {eigvals_std}')
print(f'Eigenvalues of M_B: {eigvals_B}')
check_close('Similar matrices have the same eigenvalues', eigvals_std, eigvals_B)
check_close('M_std and M_B have the same trace', np.trace(M_std), np.trace(M_B))
check_close('M_std and M_B have the same determinant', np.linalg.det(M_std), np.linalg.det(M_B))
print('\nKey insight: The linear map T is the same object regardless of basis.')
print('Only its matrix representation changes; eigenvalues and determinant are intrinsic.')
print("Exercise 7 solution complete.")
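Beyond the exercise, the invariance claim in (e) holds for any invertible change of basis, not just P_B. A minimal sketch (the random basis matrix P_rand is an illustrative assumption, not part of the exercise):

```python
# Sketch: eigenvalues are basis-invariant under ANY invertible change of basis.
import numpy as np

rng = np.random.default_rng(0)
M_std = np.array([[2.0, 1.0], [0.0, 3.0]])
P_rand = rng.normal(size=(2, 2))            # random basis (hypothetical)
assert abs(np.linalg.det(P_rand)) > 1e-8    # almost surely invertible

M_rand = np.linalg.inv(P_rand) @ M_std @ P_rand
ev_std = np.sort(np.linalg.eigvals(M_std).real)
ev_rand = np.sort(np.linalg.eigvals(M_rand).real)
print("eigenvalues (standard basis):", ev_std)
print("eigenvalues (random basis):  ", ev_rand)
```

Both prints show the same spectrum {2, 3}: the eigenvalues belong to the map T, not to any particular matrix representation.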
Exercise 8 (★★★): AI Application — Subspace Analysis of a Linear Layer
Every linear layer y = W x induces four fundamental subspaces. Understanding these subspaces tells you exactly what information a layer can and cannot transmit. LoRA exploits this structure by restricting weight updates to a low-dimensional subspace of R^{m x n}.
Let

    W = [ 3  1  2 ]
        [ 1  2  1 ]
        [ 2  1  3 ]

(a) Compute the SVD W = U Σ V^T. What is rank(W)? Report the singular values.
(b) Identify col(W) — the reachable output subspace. What is its dimension? If a downstream target is b = (1, 0, 0), is it reachable? What about b' = (1, -1, 0)? (Verify computationally.)
(c) Identify null(W) — the input directions the layer ignores. What is its dimension? If the null space is non-trivial, give an explicit null vector x and verify W x = 0. If the null space is trivial, explain why.
(d) LoRA adapter: let DeltaW = b a^T with b = (1, 0, 1) and a = (1, 1, 0). What is rank(DeltaW)? What is col(DeltaW)? Verify that col(DeltaW) = span{b} numerically.
(e) Updated weight W' = W + DeltaW. Without computing the full matrix, what is the upper bound on rank(W')? Under what condition would rank(W') = rank(W)? Under what condition would rank(W') = rank(W) + 1? Verify your answer by computing rank(W') directly.
Code cell 26
# Your Solution
# Exercise 8 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 8.")
Code cell 27
# Solution
# Exercise 8 - reference solution
W8 = np.array([
[3.0, 1.0, 2.0],
[1.0, 2.0, 1.0],
[2.0, 1.0, 3.0]
])
# (a) SVD and rank
U8, s8, Vt8 = np.linalg.svd(W8)
rank8 = np.linalg.matrix_rank(W8)
header('Exercise 8(a): SVD and rank')
print(f'rank(W) = {rank8}')
print(f'Singular values: {s8}')
print(f'U (left singular vectors):\n{U8}')
print(f'V^T (right singular vectors):\n{Vt8}')
check_close('SVD reconstruction: U @ diag(s) @ Vt = W', U8 @ np.diag(s8) @ Vt8, W8)
# (b) Column space
b_target = np.array([1.0, 0.0, 0.0])
b_target2 = np.array([1.0, -1.0, 0.0])
def in_col_space(A, b, tol=1e-8):
"""True if b is in the column space of A (check via rank augmentation)."""
r_A = np.linalg.matrix_rank(A)
r_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
return r_A == r_aug
header('Exercise 8(b): Column space and reachability')
print(f'col(W) has dimension {rank8} (= rank(W))')
print(f'Basis for col(W): left singular vectors u1,...,u_{rank8} of W')
col_basis8 = U8[:, :rank8]
print('col(W) basis (columns):\n', col_basis8)
b1_reachable = in_col_space(W8, b_target)
b2_reachable = in_col_space(W8, b_target2)
print(f'b = {b_target}: reachable = {b1_reachable}')
print(f'b\' = {b_target2}: reachable = {b2_reachable}')
check_true('b = (1,0,0) is in col(W)', b1_reachable)
check_true('b\' = (1,-1,0) is in col(W)', b2_reachable)
print('Since rank(W) = 3 = m, W has full rank, so col(W) = R^3: every b in R^3 is reachable.')
# (c) Null space
null8 = sla.null_space(W8)  # orthonormal basis for null(W); columns span the null space
nullity8 = null8.shape[1]
header('Exercise 8(c): Null space')
print(f'rank(W) = {rank8}, nullity = n - rank = {W8.shape[1] - rank8}')
print(f'null(W) has dimension {nullity8}')
if nullity8 == 0:
print('null(W) = {0}: W is injective (full column rank).')
print('Every distinct input produces a distinct output. No information is lost.')
print('Reason: rank(W) = n = 3, so rank-nullity gives nullity = n - rank = 0.')
else:
print('null(W) basis:\n', null8)
check_close('W @ null_vec = 0', W8 @ null8[:, 0], np.zeros(W8.shape[0]))
# (d) LoRA adapter
b8 = np.array([1.0, 0.0, 1.0])
a8 = np.array([1.0, 1.0, 0.0])
DeltaW8 = np.outer(b8, a8) # b @ a^T (outer product)
rank_DW8 = np.linalg.matrix_rank(DeltaW8)
header('Exercise 8(d): LoRA adapter DeltaW = b @ a^T')
print('b =', b8)
print('a =', a8)
print('DeltaW = b @ a^T =\n', DeltaW8)
print(f'rank(DeltaW) = {rank_DW8}')
check_true('DeltaW has rank 1 (outer product of nonzero vectors)', rank_DW8 == 1)
# col(DeltaW) = span{b}
# Any vector in col(DeltaW) is of the form DeltaW @ x = (a^T x) b, which is a scalar multiple of b
x_test = np.array([2.0, -1.0, 3.0])
Dw_x = DeltaW8 @ x_test
scalar = np.dot(a8, x_test)
check_close('DeltaW @ x = (a^T x) * b', Dw_x, scalar * b8)
print(f'col(DeltaW) = span{{b}} = span{{(1,0,1)}}; dimension = 1')
print(f'null(DeltaW) = null(a^T): any x with a^T x = 0; dimension = 2')
# (e) Rank of W' = W + DeltaW
Wprime8 = W8 + DeltaW8
rank_Wprime8 = np.linalg.matrix_rank(Wprime8)
header('Exercise 8(e): rank(W\' = W + DeltaW)')
print(f'rank(W) = {rank8}')
print(f'rank(DeltaW) = {rank_DW8}')
print(f'rank(W\') = {rank_Wprime8}')
print(f'Upper bound: rank(W + DeltaW) <= rank(W) + rank(DeltaW) = {rank8 + rank_DW8}')
check_true('rank(W\') <= rank(W) + rank(DeltaW)', rank_Wprime8 <= rank8 + rank_DW8)
print()
print('Interpretation:')
print(f' rank(W\') = rank(W) = {rank8}: the LoRA update does NOT increase rank.')
print(' (W already has full rank 3 = m = n; rank cannot exceed 3 in R^{3x3}.)')
print()
print('General rule for LoRA (when rank(W) < n):')
print(' rank(W\') = rank(W): if col(DeltaW) = span{b} is already inside col(W)')
print(' rank(W\') = rank(W) + 1: if b is NOT in col(W) (LoRA adds a genuinely new direction)')
print(' Here rank(W)=3=m, so col(W)=R^3 contains everything; the adapter cannot increase rank.')
print()
print('AI Takeaway:')
print(' col(W) = the reachable output subspace; rank-deficient W has dead output dimensions.')
print(' null(W) = the ignored input subspace; inputs in null(W) are invisible to this layer.')
print(' LoRA constrains updates to a rank-r subspace of R^{mxn}; the LoRA rank r is the')
print(' number of new directions the adapter can contribute to the column space of W.')
print("Exercise 8 solution complete.")
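The rank bound in (e) is easiest to see with an adapter of rank r > 1. A minimal sketch, assuming a hypothetical 6x6 layer W_toy and LoRA factors B_lora (m x r) and A_lora (r x n) — none of these come from the exercise:

```python
# Sketch: a rank-r LoRA update DeltaW = B @ A with r = 2 in a larger layer.
import numpy as np

rng = np.random.default_rng(1)
m = n = 6
r = 2
W_toy = rng.normal(size=(m, n))      # dense layer; full-rank with probability 1
B_lora = rng.normal(size=(m, r))     # "up" factor
A_lora = rng.normal(size=(r, n))     # "down" factor
DeltaW = B_lora @ A_lora             # the adapter never materializes more than rank r

print("rank(DeltaW) =", np.linalg.matrix_rank(DeltaW))
print("rank(W + DeltaW) <= rank(W) + r:",
      np.linalg.matrix_rank(W_toy + DeltaW)
      <= np.linalg.matrix_rank(W_toy) + r)
```

Whatever W is, col(DeltaW) is contained in col(B_lora), so the adapter can contribute at most r new directions to the column space.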
Exercise 9 (★★★): Tangent Subspace of the Probability Simplex
The probability simplex constraint sum_i p_i = 1 has tangent subspace

    T = { v in R^n : sum_i v_i = 0 }.

Find a basis for this tangent subspace and verify every basis vector sums to zero.
Code cell 29
# Your Solution
# Exercise 9 - learner workspace
# Build a basis for vectors whose coordinates sum to zero.
print("Learner workspace ready for Exercise 9.")
Code cell 30
# Solution
# Exercise 9 - simplex tangent subspace
header("Exercise 9: simplex tangent subspace")
n = 4
B = np.vstack([np.eye(n-1), -np.ones(n-1)])
print("basis columns:\n", B)
check_close("each basis vector sums to zero", np.ones(n) @ B, np.zeros(n-1), tol=1e-12)
check_true("dimension is n-1", la.matrix_rank(B) == n-1)
print("Takeaway: constrained probability updates live in a lower-dimensional linear subspace.")
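One way this subspace shows up in practice: an unconstrained gradient can be projected onto the sum-zero tangent by subtracting its mean, since the tangent subspace is the orthogonal complement of span{1}. A sketch with a hypothetical gradient g_raw (not part of the exercise):

```python
# Sketch: project an unconstrained gradient onto the sum-zero tangent subspace.
import numpy as np

g_raw = np.array([0.9, -0.2, 1.3, 0.4])   # arbitrary gradient in R^4 (hypothetical)
g_tan = g_raw - g_raw.mean()              # subtract the component along (1,1,1,1)

print("projected gradient:", g_tan)
print("sums to zero:", np.isclose(g_tan.sum(), 0.0))
# A step along g_tan preserves sum(p) = 1 for any p on the simplex.
```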
Exercise 10 (★★★): Affine Solution Set of an Underdetermined System
For A x = b with infinitely many solutions, write

    x(t) = x0 + t z,   where A x0 = b and A z = 0.

Find one particular solution, a null-space direction, and verify several affine solutions.
Code cell 32
# Your Solution
# Exercise 10 - learner workspace
# Find a particular solution plus nullspace direction.
print("Learner workspace ready for Exercise 10.")
Code cell 33
# Solution
# Exercise 10 - affine solution set
header("Exercise 10: affine solution set")
A = np.array([[1.0, 2.0, -1.0], [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
x0 = la.lstsq(A, b, rcond=None)[0]
U, S, Vt = la.svd(A)
z = Vt[-1]
print("particular solution:", x0)
print("null direction:", z)
check_close("A x0 = b", A @ x0, b, tol=1e-10)
check_close("A z = 0", A @ z, np.zeros(A.shape[0]), tol=1e-10)
for t in [-2.0, 0.0, 3.5]:
check_close(f"solution at t={t}", A @ (x0 + t*z), b, tol=1e-10)
print("Takeaway: underdetermined systems are affine spaces: one solution plus all null directions.")
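A side note on the particular solution: for underdetermined systems, np.linalg.lstsq returns the minimum-norm solution, which has no component along any null direction. A quick sketch reusing the same A and b:

```python
# Sketch: the lstsq particular solution is orthogonal to the null direction,
# which makes it the shortest point on the affine solution set.
import numpy as np

A = np.array([[1.0, 2.0, -1.0], [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
x0 = np.linalg.lstsq(A, b, rcond=None)[0]  # minimum-norm solution
z = np.linalg.svd(A)[2][-1]                # null direction (last right singular vector)

print("x0 . z =", x0 @ z)                  # ~0: no null-space component
print("||x0|| < ||x0 + z||:", np.linalg.norm(x0) < np.linalg.norm(x0 + z))
```

By Pythagoras, ||x0 + t z||^2 = ||x0||^2 + t^2 ||z||^2, so any other point on the line is strictly longer.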
What to Review After Finishing
Axioms and structure
- Can you identify which axiom fails in a non-example, and produce a concrete counterexample?
- Do you know why the three-condition test (contains zero, closure under +, closure under scalar multiplication) is sufficient to check the full eight axioms?
- Can you recognise affine subspaces — sets that look like subspaces but are offset from the origin?
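As a concrete instance of the first review item, here is a sketch of a non-example (the set S below is an illustrative choice, not from the exercises): the upper half-plane of R^2 contains zero and is closed under addition, but fails closure under scalar multiplication.

```python
# Sketch: the subspace test on a NON-example, S = {(x, y) in R^2 : y >= 0}.
import numpy as np

def in_S(u):
    """Membership test for the upper half-plane."""
    return u[1] >= 0

v = np.array([1.0, 2.0])
w = -1.0 * v                       # scale by a negative scalar
print("v in S:", in_S(v))          # True
print("-1 * v in S:", in_S(w))     # False -> S is not a subspace
```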
Span, independence, and basis
- Can you compute the dimension of a span via row reduction?
- Can you extract a basis from a spanning set by identifying pivot columns?
- Can you express dependent vectors as linear combinations of the basis and verify the reconstruction?
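A programmatic alternative to hand RREF for basis extraction is column-pivoted QR. Note that it selects a well-conditioned independent set of columns, not necessarily the same pivot columns RREF would choose. A sketch with a hypothetical spanning set (the third vector is a dependent combination of the first two):

```python
# Sketch: extract a basis from a spanning set via column-pivoted QR.
import numpy as np
import scipy.linalg as sla

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = 2.0 * v1 - v2                 # dependent on v1, v2
S = np.column_stack([v1, v2, v3])

Q, R, piv = sla.qr(S, pivoting=True)   # piv orders columns by decreasing importance
r = np.linalg.matrix_rank(S)
pivot_cols = sorted(piv[:r])
print("rank =", r)
print("independent columns of S:", pivot_cols)
basis = S[:, pivot_cols]
print("basis:\n", basis)
```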
Four fundamental subspaces
- Can you compute all four subspaces (col, null, row, left null) of a matrix from RREF?
- Can you verify the orthogonality relations row(A) ⊥ null(A) and col(A) ⊥ null(A^T)?
- Do you understand Strang's picture: A is a bijection from row(A) to col(A), and A kills null(A)?
Gram-Schmidt and projections
- Can you execute Gram-Schmidt step by step and verify orthonormality?
- Can you construct a projection matrix P = A (A^T A)^{-1} A^T and verify P^2 = P and P^T = P?
- Can you explain why the residual is orthogonal to the subspace?
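The three projection checks above can be bundled into a few lines. A sketch, assuming a hypothetical 2-dimensional subspace of R^3 given as col(A):

```python
# Sketch: projection matrix from an orthonormal basis, with the three checks:
# idempotence, symmetry, and residual orthogonality.
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # hypothetical subspace basis
Q, _ = np.linalg.qr(A)          # orthonormal basis for col(A)
P = Q @ Q.T                     # projection onto col(A)

print("P^2 = P:", np.allclose(P @ P, P))
print("P^T = P:", np.allclose(P.T, P))

v = np.array([1.0, 2.0, 4.0])
residual = v - P @ v
print("residual orthogonal to subspace:", np.allclose(A.T @ residual, 0.0))
```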
Subspace operations and basis change
- Can you compute dim(U + W) and dim(U ∩ W) using the dimension formula dim(U + W) = dim U + dim W - dim(U ∩ W)?
- Do you know when U + W is a direct sum U ⊕ W (i.e. U ∩ W = {0}), and when it fails?
- Can you find the change-of-basis matrix P_{B->C} and convert coordinates between bases?
- Can you find the matrix of a linear map in a new basis via the similarity transform M_B = P^{-1} M P?
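The dimension formula can be verified numerically: dim(U + W) is the rank of the stacked basis matrices, and dim(U ∩ W) is the nullity of [B_U, -B_W], since B_U a = B_W c exactly when [B_U, -B_W] @ [a; c] = 0 (assuming both basis matrices have independent columns). A sketch with hypothetical subspaces of R^5, constructed to share one direction:

```python
# Sketch: check dim(U + W) = dim U + dim W - dim(U ∩ W) numerically.
import numpy as np
import scipy.linalg as sla

rng = np.random.default_rng(7)
BU = rng.normal(size=(5, 3))   # basis for U (3-dim, almost surely)
BW = np.column_stack([BU[:, 0] + BU[:, 1],      # a direction shared with U
                      rng.normal(size=5)])      # a generic new direction

dim_U = np.linalg.matrix_rank(BU)
dim_W = np.linalg.matrix_rank(BW)
dim_sum = np.linalg.matrix_rank(np.column_stack([BU, BW]))
dim_cap = sla.null_space(np.column_stack([BU, -BW])).shape[1]

print(f"dim U = {dim_U}, dim W = {dim_W}, "
      f"dim(U+W) = {dim_sum}, dim(U∩W) = {dim_cap}")
print("formula holds:", dim_sum == dim_U + dim_W - dim_cap)
```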
AI connections
- Can you interpret col(W), null(W), row(W), and null(W^T) for a neural network weight matrix?
- Can you explain LoRA in terms of restricting the weight update to a rank-r subspace of R^{m x n}?
- Do you understand why the rank of a LoRA update B A is at most r and when it is strictly less?
References
- notes.md
- theory.ipynb
- MIT 18.06 Linear Algebra (Strang)
- Linear Algebra Done Right — Axler
- LoRA: Low-Rank Adaptation — Hu et al. (2021)
- Intrinsic Dimensionality of Fine-Tuning — Aghajanyan et al. (2020)
- Superposition Hypothesis — Elhage et al. (2022)
- A Mathematical Framework for Transformer Circuits — Elhage et al. (2021)
- DeepSeek-V2 Technical Report (MLA / KV subspace compression)