Calculus Fundamentals (Math for LLMs)

Exercises Notebook

Converted from exercises.ipynb for web reading.

Limits and Continuity - Exercises

10 graded exercises covering the full section arc, from core calculus mechanics to ML-facing applications.

Format        | Description
Problem       | Markdown cell with task description
Your Solution | Code cell for learner work
Solution      | Reference solution with checks

Difficulty: straightforward -> moderate -> challenging.

Code cell 2

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False

mpl.rcParams.update({
    "figure.figsize":    (10, 6),
    "figure.dpi":         120,
    "font.size":           13,
    "axes.titlesize":      15,
    "axes.labelsize":      13,
    "xtick.labelsize":     11,
    "ytick.labelsize":     11,
    "legend.fontsize":     11,
    "legend.framealpha":   0.85,
    "lines.linewidth":      2.0,
    "axes.spines.top":     False,
    "axes.spines.right":   False,
    "savefig.bbox":       "tight",
    "savefig.dpi":         150,
})
np.random.seed(42)
print("Plot setup complete.")

Code cell 3

import numpy as np
import numpy.linalg as la
from scipy import integrate, special, stats
from math import factorial
import matplotlib.patches as patches

COLORS = {
    "primary": "#0077BB",
    "secondary": "#EE7733",
    "tertiary": "#009988",
    "error": "#CC3311",
    "neutral": "#555555",
    "highlight": "#EE3377",
}
HAS_MPL = True
np.set_printoptions(precision=8, suppress=True)
np.random.seed(42)

def header(title):
    print("\n" + "=" * len(title))
    print(title)
    print("=" * len(title))

def check_true(name, cond):
    ok = bool(cond)
    print(f"{'PASS' if ok else 'FAIL'} - {name}")
    return ok

def check_close(name, got, expected, tol=1e-8):
    ok = np.allclose(got, expected, atol=tol, rtol=tol)
    print(f"{'PASS' if ok else 'FAIL'} - {name}: got {got}, expected {expected}")
    return ok



def centered_diff(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

def forward_diff(f, x, h=1e-6):
    return (f(x + h) - f(x)) / h

def backward_diff(f, x, h=1e-6):
    return (f(x) - f(x - h)) / h



def grad_check(f, x, analytic_grad, h=1e-6):
    x = np.asarray(x, dtype=float)
    analytic_grad = np.asarray(analytic_grad, dtype=float)
    numeric_grad = np.zeros_like(x, dtype=float)
    for idx in np.ndindex(x.shape):
        x_plus = x.copy(); x_minus = x.copy()
        x_plus[idx] += h; x_minus[idx] -= h
        numeric_grad[idx] = (f(x_plus) - f(x_minus)) / (2 * h)
    denom = la.norm(analytic_grad) + la.norm(numeric_grad) + 1e-12
    return la.norm(analytic_grad - numeric_grad) / denom



def check(name, got, expected, tol=1e-8):
    return check_close(name, got, expected, tol=tol)

print("Chapter helper setup complete.")
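The finite-difference helpers above are used throughout the exercises. As a quick sanity check, here is a standalone sketch (redefining a minimal centered difference so the snippet runs on its own; the names here are illustrative, not part of the chapter helpers) on a derivative we know exactly:

```python
import numpy as np

# Minimal standalone version mirroring centered_diff above.
def centered(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

# d/dx sin(x) = cos(x), so the derivative at x = 0 is exactly 1.
approx = centered(np.sin, 0.0)
print(f"centered diff of sin at 0: {approx:.12f}")
```

The result should agree with 1 to roughly 1e-10, the balance point between the O(h^2) truncation error and floating-point rounding.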

Exercise 1 (★): Basic Limit Computation

Compute the following limits analytically and verify numerically.

(a) $\displaystyle\lim_{x \to 3} \frac{x^2 - 9}{x - 3}$

(b) $\displaystyle\lim_{x \to 0} \frac{\sin(5x)}{3x}$

(c) $\displaystyle\lim_{x \to \infty} \frac{3x^2 + 2x - 1}{x^2 + 5}$

Hint for (a): factor the numerator. Hint for (b): use $\lim_{u\to 0}\sin(u)/u = 1$. Hint for (c): divide numerator and denominator by $x^2$.

Code cell 5

# Your Solution
# Exercise 1 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 1.")

Code cell 6

# Solution
# Exercise 1 - reference solution

import numpy as np

# (a) lim_{x->3} (x^2-9)/(x-3) = lim (x-3)(x+3)/(x-3) = lim (x+3) = 6
limit_a = 6.0

# Numerical verification
f_a = lambda x: (x**2 - 9) / (x - 3)
h_vals = [1e-1, 1e-3, 1e-6, 1e-9]
a_numerical = np.mean([(f_a(3+h) + f_a(3-h))/2 for h in h_vals])

# (b) lim_{x->0} sin(5x)/(3x) = (5/3)*lim sin(u)/u = 5/3
limit_b = 5/3
f_b = lambda x: np.sin(5*x)/(3*x) if abs(x) > 1e-15 else 5/3
b_numerical = np.mean([(f_b(h) + f_b(-h))/2 for h in [1e-2, 1e-4, 1e-6]])

# (c) lim_{x->inf} (3x^2+2x-1)/(x^2+5)
# = lim (3 + 2/x - 1/x^2)/(1 + 5/x^2) = 3/1 = 3
limit_c = 3.0
f_c = lambda x: (3*x**2 + 2*x - 1)/(x**2 + 5)
c_numerical = f_c(1e8)

header('Exercise 1: Basic Limit Computation')
print(f'(a) lim (x^2-9)/(x-3) as x->3 = {limit_a}')
check_close('(a) analytic = 6', a_numerical, limit_a, tol=1e-6)

print(f'(b) lim sin(5x)/(3x) as x->0 = {limit_b:.6f}')
check_close('(b) analytic = 5/3', b_numerical, limit_b, tol=1e-6)

print(f'(c) lim (3x^2+2x-1)/(x^2+5) as x->inf = {limit_c}')
check_close('(c) analytic = 3', c_numerical, limit_c, tol=1e-4)

print('\nTakeaway: Three core techniques — factoring, fundamental sin limit, leading coefficient ratio.')

print("Exercise 1 solution complete.")
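If SymPy is available (it is not used elsewhere in this notebook, so treat this as an optional cross-check), the same three limits can also be confirmed symbolically:

```python
import sympy as sp

x = sp.symbols("x")
lim_a = sp.limit((x**2 - 9) / (x - 3), x, 3)                     # factoring: 6
lim_b = sp.limit(sp.sin(5 * x) / (3 * x), x, 0)                  # sine limit: 5/3
lim_c = sp.limit((3 * x**2 + 2 * x - 1) / (x**2 + 5), x, sp.oo)  # leading ratio: 3
print(lim_a, lim_b, lim_c)  # 6 5/3 3
```

Symbolic limits avoid the floating-point cancellation issues that affect the numerical checks at very small h.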

Exercise 2 (★): One-Sided Limits

Analyze the one-sided limits of the following functions and determine whether the two-sided limit exists.

(a) $f(x) = \dfrac{\lvert x - 2 \rvert}{x - 2}$ at $x = 2$

(b) $g(x) = \begin{cases} x^2 + 1 & x < 1 \\ 3x - 1 & x \geq 1 \end{cases}$ at $x = 1$

For each: compute left and right limits, state whether the two-sided limit exists, and classify any discontinuity.

Code cell 8

# Your Solution
# Exercise 2 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 2.")

Code cell 9

# Solution
# Exercise 2 - reference solution

import numpy as np

# (a) f(x) = |x-2|/(x-2)
# For x < 2: |x-2| = -(x-2), so f(x) = -1  => left limit = -1
# For x > 2: |x-2| = (x-2), so f(x) = +1   => right limit = +1
left_a = -1.0
right_a = +1.0

# Numerical
f = lambda x: np.abs(x - 2) / (x - 2)
h = 1e-8
left_a_num = f(2 - h)
right_a_num = f(2 + h)

# (b) Piecewise: g(x) = x^2+1 for x<1, 3x-1 for x>=1
# Left limit (x->1^-): 1^2+1 = 2
# Right limit (x->1^+): 3(1)-1 = 2
# g(1) = 3(1)-1 = 2
left_b = 2.0
right_b = 2.0

g = lambda x: x**2 + 1 if x < 1 else 3*x - 1
left_b_num = g(1 - h)
right_b_num = g(1 + h)

header('Exercise 2: One-Sided Limits')
print('(a) |x-2|/(x-2) at x=2:')
check_close('Left limit = -1', left_a_num, left_a)
check_close('Right limit = +1', right_a_num, right_a)
check_true('Two-sided limit DNE (left ≠ right)', left_a != right_a)
print('  Discontinuity type: JUMP (left=-1, right=+1)')

print()
print('(b) Piecewise function at x=1:')
check_close('Left limit = 2', left_b_num, left_b, tol=1e-6)
check_close('Right limit = 2', right_b_num, right_b, tol=1e-6)
check_true('Two-sided limit exists (left = right = 2)', abs(left_b - right_b) < 1e-12)
check_close('g(1) = 2 (continuous!)', g(1), 2.0)
print('  g is continuous at x=1 — both one-sided limits equal g(1)')

print('\nTakeaway: Two-sided limit exists iff both one-sided limits agree.')

print("Exercise 2 solution complete.")

Exercise 3 (★): L'Hôpital's Rule

Apply L'Hôpital's Rule to resolve the following indeterminate forms. Identify the form before applying the rule.

(a) $\displaystyle\lim_{x \to 0} \frac{e^x - 1 - x}{x^2}$   [form $0/0$]

(b) $\displaystyle\lim_{x \to \infty} \frac{\ln x}{x}$   [form $\infty/\infty$]

(c) $\displaystyle\lim_{x \to 0^+} x \ln x$   [form $0 \cdot (-\infty)$, convert first]

For (c): rewrite as $\ln x / (1/x)$ to get an $\infty/\infty$ form.

Code cell 11

# Your Solution
# Exercise 3 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 3.")

Code cell 12

# Solution
# Exercise 3 - reference solution

import numpy as np

# (a) lim_{x->0} (e^x - 1 - x)/x^2
# First application (0/0): (e^x - 1)/(2x) [still 0/0]
# Second application: e^x/2 -> 1/2
limit_a = 0.5

# Verify numerically via Taylor: e^x = 1 + x + x^2/2 + ...
# (e^x-1-x)/x^2 = (x^2/2 + x^3/6 + ...)/x^2 = 1/2 + x/6 + ... -> 1/2
f_a = lambda x: (np.expm1(x) - x) / x**2  # use expm1 for stability
a_vals = [f_a(h) for h in [1e-1, 1e-2, 1e-4, 1e-6]]

# (b) lim_{x->inf} ln(x)/x
# L'Hopital: (1/x)/1 = 1/x -> 0
limit_b = 0.0
b_vals = [np.log(x)/x for x in [1e2, 1e4, 1e6, 1e8]]

# (c) lim_{x->0+} x*ln(x)
# Rewrite: ln(x)/(1/x), form -inf/+inf
# L'Hopital: (1/x)/(-1/x^2) = -x -> 0
limit_c = 0.0
c_vals = [x * np.log(x) for x in [1e-1, 1e-2, 1e-4, 1e-8]]

header("Exercise 3: L'Hopital's Rule")
print('(a) lim (e^x-1-x)/x^2 as x->0:')
print(f'  Numerical: {a_vals}')
check_close('Converges to 1/2', a_vals[-1], limit_a, tol=1e-5)

print('(b) lim ln(x)/x as x->inf:')
print(f'  Numerical: {b_vals}')
check_close('Converges to 0', b_vals[-1], limit_b, tol=1e-5)

print('(c) lim x*ln(x) as x->0+:')
print(f'  Numerical: {c_vals}')
check_close('Converges to 0', c_vals[-1], limit_c, tol=1e-5)

print("\nTakeaway: L'Hopital requires 0/0 or inf/inf form. Convert 0*(-inf) by rewriting.")

print("Exercise 3 solution complete.")

Exercise 4 (★): Continuity Analysis

For each function, determine all points of discontinuity and classify each as removable, jump, or essential (infinite). Where applicable, fix removable discontinuities.

(a) $f(x) = \dfrac{x^2 - 1}{x - 1}$

(b) $g(x) = \dfrac{1}{x^2 - 4}$

(c) $h(x) = \begin{cases} x + 1 & x < 0 \\ 0 & x = 0 \\ x^2 - 1 & x > 0 \end{cases}$

Code cell 14

# Your Solution
# Exercise 4 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 4.")

Code cell 15

# Solution
# Exercise 4 - reference solution

import numpy as np

h_step = 1e-9

# (a) f(x) = (x^2-1)/(x-1) = (x-1)(x+1)/(x-1)
# Undefined at x=1; limit = lim (x+1) = 2
# Removable discontinuity: fix by setting f(1) = 2
f_a = lambda x: (x**2 - 1) / (x - 1)
lim_at_1 = (f_a(1 + h_step) + f_a(1 - h_step)) / 2

# (b) g(x) = 1/(x^2-4) = 1/((x-2)(x+2))
# Undefined at x=2 and x=-2
# As x->2: denominator->0, numerator->1 => g->+/-inf (essential)
# As x->-2: same
g = lambda x: 1 / (x**2 - 4)
g_right_2 = g(2 + h_step)
g_left_2 = g(2 - h_step)

# (c) h(x): at x=0
# Left limit (x->0^-): x+1 -> 0+1 = 1
# Right limit (x->0^+): x^2-1 -> 0-1 = -1
# h(0) = 0 (defined)
# Left != Right => jump discontinuity
left_c = 1.0
right_c = -1.0
h_at_0 = 0.0

header('Exercise 4: Continuity Analysis')

print('(a) f(x) = (x^2-1)/(x-1):')
check_close('Limit at x=1 = 2 (removable)', lim_at_1, 2.0, tol=1e-6)
print('  Type: REMOVABLE. Fix: set f(1) = 2')

print()
print('(b) g(x) = 1/(x^2-4):')
check_true('g(2+) > 0 and large (positive side)', g_right_2 > 1e6)
check_true('g(2-) < 0 and large (negative side)', g_left_2 < -1e6)
print('  Type at x=±2: ESSENTIAL (infinite). Vertical asymptotes.')

print()
print('(c) Piecewise h(x) at x=0:')
check_close('Left limit = 1', left_c, 1.0)
check_close('Right limit = -1', right_c, -1.0)
check_true('Left ≠ Right => JUMP discontinuity', abs(left_c - right_c) > 1)
print(f'  h(0) = {h_at_0} (defined but neither limit equals h(0))')

print('\nTakeaway: Continuity requires all three: f(a) defined, limit exists, they agree.')

print("Exercise 4 solution complete.")

Exercise 5 (★★): Squeeze Theorem and IVT

(a) Prove using the Squeeze Theorem that $\displaystyle\lim_{x \to 0} x^2 \cos(1/x) = 0$.

Verify numerically and plot the squeeze.

(b) Use the IVT to show that $f(x) = x^3 + 2x - 5$ has a root in $[1, 2]$. Find the root numerically using bisection (tolerance $10^{-8}$).

Code cell 17

# Your Solution
# Exercise 5 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 5.")

Code cell 18

# Solution
# Exercise 5 - reference solution

import numpy as np
import matplotlib.pyplot as plt

COLORS = {'primary': '#0077BB', 'secondary': '#EE7733', 'tertiary': '#009988'}

# (a) Squeeze Theorem
# -1 <= cos(1/x) <= 1 for all x != 0
# => -x^2 <= x^2*cos(1/x) <= x^2
# Both -x^2 and x^2 -> 0 as x -> 0
# By Squeeze: x^2*cos(1/x) -> 0

x_vals = np.linspace(-0.5, 0.5, 5000)
x_vals = x_vals[np.abs(x_vals) > 1e-10]

f_squeeze = x_vals**2 * np.cos(1/x_vals)
upper = x_vals**2
lower = -x_vals**2

# Verify squeeze
check_true('Lower bound holds: -x^2 <= x^2*cos(1/x)', np.all(lower <= f_squeeze + 1e-15))
check_true('Upper bound holds: x^2*cos(1/x) <= x^2', np.all(f_squeeze <= upper + 1e-15))

x_small = np.array([1e-1, 1e-2, 1e-4, 1e-6, 1e-8])
f_small = x_small**2 * np.cos(1/x_small)
print(f'x^2*cos(1/x) at small x: {f_small}')
check_close('lim_{x->0} x^2*cos(1/x) = 0', f_small[-1], 0.0, tol=1e-14)

if HAS_MPL:
    fig, ax = plt.subplots(figsize=(10, 5))
    x_plot = np.linspace(-0.4, 0.4, 2000)
    x_plot = x_plot[np.abs(x_plot) > 1e-10]
    ax.fill_between(x_plot, -x_plot**2, x_plot**2, alpha=0.2, color=COLORS['secondary'], label='Squeeze bounds')
    ax.plot(x_plot, x_plot**2*np.cos(1/x_plot), color=COLORS['primary'], lw=1.5, label=r'$x^2\cos(1/x)$')
    ax.plot(x_plot, x_plot**2, color=COLORS['tertiary'], lw=1.5, ls='--', label=r'$x^2$')
    ax.plot(x_plot, -x_plot**2, color=COLORS['tertiary'], lw=1.5, ls='--', label=r'$-x^2$')
    ax.set_xlabel('x'); ax.set_ylabel('f(x)')
    ax.set_title(r'Squeeze: $-x^2 \leq x^2\cos(1/x) \leq x^2$, all $\to 0$')
    ax.legend(); ax.set_ylim(-0.2, 0.2)
    fig.tight_layout(); plt.show()

# (b) IVT and bisection
def f_b(x): return x**3 + 2*x - 5

print(f'f(1) = {f_b(1)}, f(2) = {f_b(2)}')
check_true('IVT applies: f(1) < 0 < f(2)', f_b(1) < 0 < f_b(2))

a, b = 1.0, 2.0
for _ in range(60):
    m = (a + b) / 2
    if f_b(a) * f_b(m) < 0:
        b = m
    else:
        a = m
    if b - a < 1e-12: break
root = (a + b) / 2

check_close('Root of x^3+2x-5 in [1,2]', f_b(root), 0.0, tol=1e-8)
print(f'Root found: {root:.10f}')
print('\nTakeaway: Squeeze proves limits by bounding; IVT proves root existence by sign change.')

print("Exercise 5 solution complete.")
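SciPy (already imported in the chapter setup) provides a production bracketing root finder. As a cross-check of the hand-rolled bisection, a sketch using scipy.optimize.brentq, which requires exactly the IVT-style sign change:

```python
import numpy as np
from scipy.optimize import brentq

f = lambda x: x**3 + 2*x - 5

# brentq needs f(a) and f(b) to have opposite signs -- the same
# bracketing condition the IVT guarantees on [1, 2].
root = brentq(f, 1.0, 2.0, xtol=1e-12)
print(f"brentq root: {root:.10f}, f(root) = {f(root):.2e}")
```

Brent's method combines bisection with interpolation, so it converges far faster than plain bisection while keeping the same bracketing guarantee.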

Exercise 6 (★★): ε-δ Proof

Give a formal ε-δ proof that $\lim_{x \to 2} (3x - 1) = 5$.

(a) Find an explicit expression for $\delta$ in terms of $\varepsilon$.

(b) Verify: for $\varepsilon = 0.3$, compute $\delta$ and confirm that $\lvert x - 2 \rvert < \delta \implies \lvert (3x-1) - 5 \rvert < 0.3$.

(c) Generalize: prove $\lim_{x \to a} (mx + b) = ma + b$ for any $m, a, b \in \mathbb{R}$. What is the explicit $\delta(\varepsilon)$?

Code cell 20

# Your Solution
# Exercise 6 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 6.")

Code cell 21

# Solution
# Exercise 6 - reference solution

import numpy as np

# (a) |(3x-1) - 5| = |3(x-2)| = 3|x-2|
# Need 3|x-2| < eps => |x-2| < eps/3
# So delta(eps) = eps/3

def delta_for_epsilon(eps):
    return eps / 3

def general_delta(eps, m):
    if abs(m) < 1e-15:
        return float('inf')  # constant function: any delta works
    return eps / abs(m)

header('Exercise 6: epsilon-delta Proof')

# (a) Show delta formula
print('(a) Proof:')
print('  |(3x-1) - 5| = |3x - 6| = 3|x - 2|')
print('  For 3|x-2| < eps: choose delta = eps/3')
print('  Then |x-2| < delta => 3|x-2| < 3*(eps/3) = eps. QED.')

# (b) Verify numerically
eps = 0.3
delta = delta_for_epsilon(eps)
x_test = np.linspace(2 - delta + 1e-12, 2 + delta - 1e-12, 10000)
f_vals = 3*x_test - 1
errors = np.abs(f_vals - 5)

print(f'\n(b) eps={eps}, delta=eps/3={delta:.6f}')
check_close('delta formula', delta, eps/3)
check_true(f'All |f(x)-5| < eps={eps}', np.all(errors < eps))
print(f'Max error in delta-window: {errors.max():.6f} < {eps}')

# (c) General linear limit
print('\n(c) General: lim_{x->a}(mx+b) = ma+b')
print('  |(mx+b) - (ma+b)| = |m||x-a|')
print('  Choose delta = eps/|m| (for m != 0)')
for m_val in [1, 2, 5, 10]:
    d = general_delta(0.1, m_val)
    check_close(f'delta(eps=0.1, m={m_val}) = 0.1/{m_val}', d, 0.1/m_val)

print('\nTakeaway: epsilon-delta proofs follow a pattern — bound |f(x)-L| by C|x-a|, then set delta=eps/C.')

print("Exercise 6 solution complete.")

Exercise 7 (★★★): Gradient as a Limit

The derivative is defined as a limit of difference quotients:

$$f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}$$

(a) Implement finite_diff_1(f, a, h) (one-sided) and finite_diff_c(f, a, h) (centered). Compare their errors on $f(x) = \sin(x)$ at $a = \pi/4$ for $h \in [10^{-14}, 10^{-1}]$.

(b) Show that centered differences have error $O(h^2)$ while one-sided differences have error $O(h)$. Verify by fitting $\log(\text{error}) = \alpha \log(h) + C$ and checking $\alpha \approx 1$ vs $2$.

(c) Implement grad_check(f, theta, grad_fn, h=1e-5) that computes the relative error between an analytic gradient and the centered finite difference. Test on $f(\theta) = \|W\theta\|^2 / 2$ with $W = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$.

Code cell 23

# Your Solution
# Exercise 7 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 7.")

Code cell 24

# Solution
# Exercise 7 - reference solution

import numpy as np
import matplotlib.pyplot as plt

COLORS = {'primary': '#0077BB', 'secondary': '#EE7733', 'tertiary': '#009988'}

def finite_diff_1(f, a, h):
    return (f(a + h) - f(a)) / h

def finite_diff_c(f, a, h):
    return (f(a + h) - f(a - h)) / (2 * h)

def grad_check(f, theta, grad_fn, h=1e-5):
    grad_an = grad_fn(theta)
    grad_fd = np.zeros_like(theta)
    for i in range(len(theta)):
        tp = theta.copy(); tp[i] += h
        tm = theta.copy(); tm[i] -= h
        grad_fd[i] = (f(tp) - f(tm)) / (2*h)
    rel_err = np.linalg.norm(grad_an - grad_fd) / (np.linalg.norm(grad_an) + np.linalg.norm(grad_fd) + 1e-12)
    return rel_err, grad_an, grad_fd

f = np.sin
a = np.pi / 4
true_deriv = np.cos(a)

header('Exercise 7: Gradient as a Limit')

# (a) Error comparison
h_vals = np.logspace(-1, -14, 50)
err_1 = np.abs([finite_diff_1(f, a, h) - true_deriv for h in h_vals])
err_c = np.abs([finite_diff_c(f, a, h) - true_deriv for h in h_vals])

opt_idx_1 = np.argmin(err_1)
opt_idx_c = np.argmin(err_c)
print(f'Best one-sided: h={h_vals[opt_idx_1]:.2e}, error={err_1[opt_idx_1]:.2e}')
print(f'Best centered:  h={h_vals[opt_idx_c]:.2e}, error={err_c[opt_idx_c]:.2e}')
check_true('Centered is more accurate than one-sided', err_c[opt_idx_c] < err_1[opt_idx_1])

# (b) Convergence rate
# Use intermediate h where FP errors haven't dominated
mask = h_vals > 1e-8
h_fit = h_vals[mask]
e1_fit = err_1[mask]; ec_fit = err_c[mask]

alpha_1 = np.polyfit(np.log10(h_fit), np.log10(np.maximum(e1_fit, 1e-16)), 1)[0]
alpha_c = np.polyfit(np.log10(h_fit), np.log10(np.maximum(ec_fit, 1e-16)), 1)[0]
print(f'\nConvergence rate (one-sided): {alpha_1:.2f} (expected ~1.0)')
print(f'Convergence rate (centered):  {alpha_c:.2f} (expected ~2.0)')
check_true('One-sided rate ~1', abs(alpha_1 - 1.0) < 0.3)
check_true('Centered rate ~2', abs(alpha_c - 2.0) < 0.4)

if HAS_MPL:
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.loglog(h_vals, err_1, color=COLORS['primary'], lw=2, label=r'One-sided: $O(h)$')
    ax.loglog(h_vals, err_c, color=COLORS['secondary'], lw=2, label=r'Centered: $O(h^2)$')
    h_ref = np.logspace(-1, -8, 20)
    ax.loglog(h_ref, 0.3*h_ref, color=COLORS['primary'], ls=':', lw=1)
    ax.loglog(h_ref, 0.05*h_ref**2, color=COLORS['secondary'], ls=':', lw=1)
    ax.set_xlabel('h'); ax.set_ylabel('|error|')
    ax.set_title('Finite Difference Accuracy: Centered vs One-Sided')
    ax.legend(); fig.tight_layout(); plt.show()

# (c) Gradient check on matrix loss
W = np.array([[1., 2.], [3., 4.]])
f_loss = lambda t: 0.5 * np.dot(W @ t, W @ t)
grad_fn = lambda t: W.T @ (W @ t)  # analytic gradient: W^T W t

theta = np.array([1.5, -0.7])
rel_err, grad_an, grad_fd = grad_check(f_loss, theta, grad_fn)
print(f'\n(c) Gradient check:')
print(f'  Analytic: {grad_an}')
print(f'  FD:       {grad_fd}')
print(f'  Relative error: {rel_err:.2e}')
check_true('Gradient check passed (rel_err < 1e-5)', rel_err < 1e-5)

print('\nTakeaway: Centered FD approximates gradient with O(h^2) error; use h~1e-5 for gradient checking.')

print("Exercise 7 solution complete.")
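SciPy ships a comparable utility, scipy.optimize.check_grad. It uses one-sided differences by default, so its error is larger than the centered grad_check above; a sketch on the same quadratic loss:

```python
import numpy as np
from scipy.optimize import check_grad

W = np.array([[1., 2.], [3., 4.]])
f = lambda t: 0.5 * np.dot(W @ t, W @ t)   # f(theta) = ||W theta||^2 / 2
grad = lambda t: W.T @ (W @ t)             # analytic gradient: W^T W theta

# check_grad returns the 2-norm of (analytic - finite-difference) gradient.
err = check_grad(f, grad, np.array([1.5, -0.7]))
print(f"check_grad error: {err:.2e}")
```

Note check_grad reports an absolute norm, not the relative error used above, so its threshold must be chosen with the gradient's scale in mind.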

Exercise 8 (★★★): Cross-Entropy Limit and Entropy Continuity

The Shannon entropy of a Bernoulli($p$) distribution is:

$$H(p) = -p \log p - (1-p) \log(1-p)$$

with the convention $0 \log 0 = 0$ (justified by the limit $\lim_{p \to 0^+} p \log p = 0$).

(a) Prove analytically that $\lim_{p \to 0^+} p \ln p = 0$ using L'Hôpital's Rule.

(b) Implement $H(p)$ using the stable formula (using log1p where appropriate) and verify it is continuous at $p = 0$ and $p = 1$ to full floating-point precision.

(c) The cross-entropy between the true distribution $(p, 1-p)$ and the prediction $(q, 1-q)$ is:

$$CE(p, q) = -p \log q - (1-p) \log(1-q)$$

Show that $\lim_{q \to 0^+} CE(1, q) = +\infty$ (infinite loss when predicting 0 for a certain event). Verify numerically and explain the ML implication.

Code cell 26

# Your Solution
# Exercise 8 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 8.")

Code cell 27

# Solution
# Exercise 8 - reference solution

import numpy as np
import matplotlib.pyplot as plt

COLORS = {'primary': '#0077BB', 'secondary': '#EE7733', 'tertiary': '#009988', 'error': '#CC3311'}

# (a) lim_{p->0+} p*ln(p)
# Rewrite: p*ln(p) = ln(p) / (1/p)  -- form -inf / +inf
# L'Hopital: (1/p) / (-1/p^2) = -p -> 0 as p->0+
# Therefore lim p*ln(p) = 0

p_vals = np.array([1e-1, 1e-2, 1e-4, 1e-8, 1e-12, 1e-15])
xlnx_vals = p_vals * np.log(p_vals)

header('Exercise 8: Cross-Entropy Limit')
print('(a) p*ln(p) -> 0 as p->0+')
print("Rewrite: p*ln(p) = ln(p)/(1/p). L'Hopital: (1/p)/(-1/p^2) = -p -> 0")
for p, v in zip(p_vals, xlnx_vals):
    print(f'  p={p:.0e}: p*ln(p) = {v:.6e}')
check_close('lim p*ln(p) = 0 at p=1e-15', xlnx_vals[-1], 0.0, tol=1e-12)

# (b) Stable binary entropy
def entropy(p):
    p = float(p)
    if p <= 0 or p >= 1:
        return 0.0
    # For p near 0: -p*ln(p) uses xlnx
    # For p near 1: -(1-p)*ln(1-p) uses log1p
    term1 = -p * np.log(p)  # stable for p not too small
    term2 = -(1-p) * np.log1p(-p)  # stable for p near 1
    return term1 + term2

print()
p_arr = np.array([0.0, 1e-10, 0.1, 0.3, 0.5, 0.7, 0.9, 1-1e-10, 1.0])
H_vals = np.array([entropy(p) for p in p_arr])

print('(b) Binary entropy H(p):')
for p, h in zip(p_arr, H_vals):
    print(f'  H({p:.1e}) = {h:.8f}')

check_close('H(0) = 0 (convention)', entropy(0.0), 0.0)
check_close('H(1) = 0 (convention)', entropy(1.0), 0.0)
check_close('H(0.5) = ln(2) (maximum)', entropy(0.5), np.log(2), tol=1e-10)
check_true('H is continuous: H(1e-10) close to H(0)', abs(entropy(1e-10) - entropy(0.0)) < 1e-8)

# (c) Cross-entropy limit
def ce(p, q, eps=1e-300):
    q = max(q, eps)  # numerical floor
    q1 = max(1-q, eps)
    if p == 1:
        return -np.log(q)
    elif p == 0:
        return -np.log(q1)
    return -p*np.log(q) - (1-p)*np.log(q1)

q_small = np.array([1e-1, 1e-2, 1e-4, 1e-8, 1e-15])
ce_vals = [ce(1.0, q) for q in q_small]

print()
print('(c) CE(p=1, q) as q->0+:')
for q, cv in zip(q_small, ce_vals):
    print(f'  CE(1, {q:.0e}) = {cv:.4f} = -ln({q:.0e}) = {-np.log(q):.4f}')

check_true('CE(1, q) -> inf as q->0+ (grows without bound)', ce_vals[-1] > ce_vals[0])
check_true('CE(1, q) = -ln(q) diverges', ce_vals[-1] > 30)

if HAS_MPL:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    p_plot = np.linspace(0, 1, 1000)
    H_plot = np.array([entropy(p) for p in p_plot])
    axes[0].plot(p_plot, H_plot, color=COLORS['primary'], lw=2.5)
    axes[0].set_xlabel('p'); axes[0].set_ylabel('H(p)')
    axes[0].set_title('Binary Entropy H(p) = $-p\\ln p - (1-p)\\ln(1-p)$')
    axes[0].annotate('H(0)=0', (0, 0), fontsize=11, ha='left')
    axes[0].annotate('H(0.5)=ln2', (0.5, np.log(2)+0.02), fontsize=11, ha='center')

    q_plot = np.linspace(1e-3, 0.999, 1000)
    ce_plot = [-np.log(q) for q in q_plot]
    axes[1].plot(q_plot, ce_plot, color=COLORS['error'], lw=2.5)
    axes[1].set_xlabel('q (predicted probability)'); axes[1].set_ylabel('CE(p=1, q)')
    axes[1].set_title('CE(p=1,q) = -ln(q): diverges as q->0')
    axes[1].set_ylim(0, 10)

    fig.tight_layout(); plt.show()

print("\nTakeaway: p*ln(p) -> 0 by L'Hopital; entropy is continuous everywhere;")
print("CE(1,q) -> inf as q -> 0 -- penalizes confidently wrong predictions infinitely.")

print("Exercise 8 solution complete.")
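scipy.special.xlogy computes x*log(y) with the 0 log 0 = 0 convention built in, which gives an even more compact stable entropy; a sketch cross-checking the implementation above (the name entropy_xlogy is mine):

```python
import numpy as np
from scipy.special import xlogy

def entropy_xlogy(p):
    # xlogy(0, 0) returns 0, so p = 0 and p = 1 need no special-casing.
    return float(-xlogy(p, p) - xlogy(1 - p, 1 - p))

print(entropy_xlogy(0.0), entropy_xlogy(1.0), entropy_xlogy(0.5))
```

The endpoints come out exactly 0 and the maximum at p = 0.5 is ln 2, matching the hand-rolled entropy above.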

Exercise 9 (★★★): Temperature Limits of Softmax

Let $z \in \mathbb{R}^K$ and define the temperature-scaled softmax

$$p_i(T) = \frac{\exp(z_i/T)}{\sum_j \exp(z_j/T)}, \qquad T > 0.$$

  1. Show that as $T \to 0^+$ the distribution concentrates on the largest logit when the maximum is unique.
  2. Show that as $T \to \infty$ the distribution approaches the uniform distribution.
  3. Implement a stable temperature softmax and verify both limits numerically.

Code cell 29

# Your Solution
# Exercise 9 - learner workspace
# Implement the stable temperature softmax and test the two limits here.
print("Learner workspace ready for Exercise 9.")

Code cell 30

# Solution
# Exercise 9 - temperature softmax limits
header("Exercise 9: temperature softmax limits")

logits = np.array([1.0, 2.5, -0.5, 0.0])

def softmax_temperature(z, T):
    z = np.asarray(z, dtype=float) / T
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

low_T = softmax_temperature(logits, 0.02)
high_T = softmax_temperature(logits, 1e6)
argmax_dist = np.eye(len(logits))[np.argmax(logits)]
uniform = np.ones_like(logits) / len(logits)

print("low temperature:", low_T)
print("high temperature:", high_T)
check_close("T -> 0 concentrates on argmax", low_T, argmax_dist, tol=1e-8)
check_close("T -> infinity approaches uniform", high_T, uniform, tol=1e-6)
print("Takeaway: generation temperature is a continuity/limit control on categorical uncertainty.")
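scipy.special.softmax computes the same quantity stably, so the custom implementation can be cross-checked against it at an intermediate temperature:

```python
import numpy as np
from scipy.special import softmax

logits = np.array([1.0, 2.5, -0.5, 0.0])
T = 0.5

# Manual stable softmax at temperature T, as in the exercise solution.
z = logits / T
manual = np.exp(z - np.max(z))
manual /= manual.sum()

print(np.allclose(softmax(logits / T), manual))  # True
```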

Exercise 10 (★★★): Huber Loss Continuity and Smoothness

The Huber loss with threshold $\delta > 0$ is

$$L_\delta(r) = \begin{cases} \frac{1}{2}r^2, & |r| \le \delta, \\ \delta\left(|r| - \frac{1}{2}\delta\right), & |r| > \delta. \end{cases}$$

  1. Prove the two branches meet continuously at $r = \pm \delta$.
  2. Prove the first derivative also matches at $r = \pm \delta$.
  3. Verify this numerically for $\delta = 1.5$.

Code cell 32

# Your Solution
# Exercise 10 - learner workspace
# Check the values and slopes of both branches at +/- delta.
print("Learner workspace ready for Exercise 10.")

Code cell 33

# Solution
# Exercise 10 - Huber continuity
header("Exercise 10: Huber loss continuity")

def huber(r, delta=1.5):
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= delta, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta))

def huber_grad(r, delta=1.5):
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

delta = 1.5
for point in [-delta, delta]:
    left_val = huber(point - 1e-8, delta)
    right_val = huber(point + 1e-8, delta)
    left_grad = huber_grad(point - 1e-8, delta)
    right_grad = huber_grad(point + 1e-8, delta)
    print(f"r={point:+.1f}: values {left_val:.10f}, {right_val:.10f}; slopes {left_grad:.10f}, {right_grad:.10f}")
    check_close(f"continuous at {point:+.1f}", left_val, right_val, tol=1e-6)
    check_close(f"C1 at {point:+.1f}", left_grad, right_grad, tol=1e-6)
print("Takeaway: Huber loss is robust to outliers while keeping a continuous gradient for optimization.")
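SciPy also exposes this loss as scipy.special.huber(delta, r). The loss is even in r, so evaluating SciPy's version at |r| sidesteps any question of how it treats negative residuals; a sketch cross-checking the piecewise formula above:

```python
import numpy as np
from scipy.special import huber as sp_huber

delta = 1.5
r = np.linspace(-4, 4, 81)
manual = np.where(np.abs(r) <= delta,
                  0.5 * r**2,
                  delta * (np.abs(r) - 0.5 * delta))

# Huber is an even function, so pass |r| to SciPy's implementation.
print(np.allclose(sp_huber(delta, np.abs(r)), manual))  # True
```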

What to Review After Finishing

  • Exercise 1: Can you compute all three limits without looking at the solution? Practice until the algebraic manipulation feels automatic.
  • Exercise 2: Do you understand why left ≠ right implies the two-sided limit doesn't exist?
  • Exercise 3: Can you identify the indeterminate form first, then apply L'Hôpital correctly?
  • Exercise 4: Do you know all three discontinuity types and when each applies?
  • Exercise 5: Can you state the Squeeze Theorem and apply it? Do you see how the IVT justifies bisection?
  • Exercise 6: Can you write a complete ε-δ proof from scratch?
  • Exercise 7: Do you understand why centered differences are more accurate? Can you implement gradient checking?
  • Exercise 8: Do you see why 0·log 0 = 0 is the right convention for entropy?
  • Exercise 9: Can you derive both temperature limits of softmax (argmax as T → 0⁺, uniform as T → ∞)?
  • Exercise 10: Can you show that the Huber loss and its first derivative are continuous at r = ±δ?
