
Positional Encodings: Part 10: Exercises to References

10. Exercises

  1. (*) Compute a sinusoidal position row (see the sinusoidal sketch after this list).

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  2. (*) Show token plus position addition.

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  3. (*) Build a relative distance matrix (see the distance and ALiBi sketch after this list).

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  4. (**) Create a relative attention bias matrix.

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  5. (**) Apply a RoPE rotation and check norm preservation (see the RoPE sketch after this list).

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  6. (**) Verify a RoPE relative-offset dot-product identity in two dimensions.

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  7. (**) Build an ALiBi matrix (see the distance and ALiBi sketch after this list).

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  8. (***) Compute learned position table parameters (see the parameter-count sketch after this list).

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  9. (***) Check decode position ids for a KV cache (see the KV-cache sketch after this list).

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
  10. (***) Design a long-context position diagnostic.

    • (a) State the scheme.
    • (b) Compute the numeric example.
    • (c) Explain the LLM consequence.
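
For exercises 1 and 2, a minimal sketch assuming the classic sinusoidal scheme (even dimensions get sin, odd dimensions get cos, with frequencies set by 10000^(2i/d_model)); the width 8, position 3, and flat token embedding are illustrative choices, not values taken from the lesson.

```python
import math

def sinusoidal_row(pos: int, d_model: int) -> list[float]:
    """One absolute-position row: even dimensions get sin, odd get cos,
    with wavelengths growing as 10000^(2i / d_model)."""
    row = []
    for i in range(d_model // 2):
        angle = pos / (10000 ** (2 * i / d_model))
        row.append(math.sin(angle))  # dimension 2i
        row.append(math.cos(angle))  # dimension 2i + 1
    return row

pos_row = sinusoidal_row(pos=3, d_model=8)                 # exercise 1 style
token_vec = [0.1] * 8                                      # hypothetical token embedding
combined = [t + p for t, p in zip(token_vec, pos_row)]     # exercise 2 style addition
print([round(x, 4) for x in combined])
```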
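
For exercises 3, 4, and 7, a sketch of a signed distance matrix and an ALiBi-style additive attention bias; the sequence length 4 and the slope 0.5 are illustrative, and real ALiBi assigns each head a slope drawn from a geometric sequence.

```python
def distance_matrix(seq_len: int) -> list[list[int]]:
    """Signed relative distances: entry [q][k] is query index minus key index."""
    return [[q - k for k in range(seq_len)] for q in range(seq_len)]

def alibi_bias(seq_len: int, slope: float) -> list[list[float]]:
    """Causal additive attention bias: each visible key is penalized by
    slope * distance to the query; future keys are masked with -inf."""
    return [
        [float("-inf") if k > q else -slope * (q - k) for k in range(seq_len)]
        for q in range(seq_len)
    ]

print(distance_matrix(4))            # exercise 3 style
for row in alibi_bias(4, 0.5):       # exercises 4 and 7 style
    print(row)
```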
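
For exercises 5 and 6, a two-dimensional RoPE sketch: rotating a vector by a position-dependent angle leaves its norm unchanged, and the query-key dot product of two rotated vectors depends only on the position offset. The base angle 0.5 and the example vectors are arbitrary.

```python
import math

def rotate(vec: tuple[float, float], pos: int, theta: float) -> tuple[float, float]:
    """Rotate a 2-d vector by pos * theta: the single-frequency case of RoPE."""
    angle = pos * theta
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    return (x * c - y * s, x * s + y * c)

def dot(a: tuple[float, float], b: tuple[float, float]) -> float:
    return a[0] * b[0] + a[1] * b[1]

q, k, theta = (1.0, 2.0), (0.5, -1.0), 0.5

# Exercise 5: rotation preserves the vector norm.
print(math.hypot(*q), math.hypot(*rotate(q, pos=3, theta=theta)))

# Exercise 6: the query-key dot product depends only on the offset m - n.
for m, n in [(3, 1), (7, 5), (10, 8)]:   # every pair has offset 2
    print(round(dot(rotate(q, m, theta), rotate(k, n, theta)), 6))
```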
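
For exercise 8, the arithmetic is a single multiplication: a learned absolute position table stores one trainable row per position, so the parameter count is max_positions times d_model. The sizes below are illustrative, not taken from the lesson.

```python
max_positions, d_model = 2048, 512        # illustrative sizes
table_params = max_positions * d_model    # one learned row of width d_model per position
print(table_params)                       # 1048576 parameters
```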
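
For exercise 9, a sketch of the position-id bookkeeping around a KV cache: prefill assigns ids 0 through L-1 to the prompt, and each decode step assigns the next id, equal to the current cache length. The 5-token prompt and three decode steps are illustrative.

```python
def prefill_position_ids(prompt_len: int) -> list[int]:
    """Position ids assigned to the prompt tokens during the prefill pass."""
    return list(range(prompt_len))

def decode_position_id(kv_cache_len: int) -> int:
    """Position id for the single new token at a decode step: it must continue
    from the current cache length rather than restart at zero."""
    return kv_cache_len

position_ids = prefill_position_ids(5)        # ids 0..4 for a 5-token prompt
for _ in range(3):                            # three decode steps
    position_ids.append(decode_position_id(len(position_ids)))
print(position_ids)                           # [0, 1, 2, 3, 4, 5, 6, 7]
```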

11. Why This Matters for AI

Concept                  | AI impact
Sinusoidal encodings     | Provide fixed absolute order features without learned position rows.
Learned position tables  | Work well in range but tie the model to the trained maximum position.
Relative biases          | Let attention reason about pairwise distances.
RoPE                     | Supports relative-offset behavior through rotations and is common in modern decoder LLMs.
ALiBi                    | Adds simple distance penalties that extrapolate without a learned position table.
Position ids             | Matter for KV-cache decoding and long-context serving correctness.
Long-context diagnostics | Expose lost-in-the-middle, recency bias, and extrapolation failures.
Mask interaction         | Ensures order signals do not override causal or padding visibility.

12. Conceptual Bridge

The backward bridge is attention. Attention computes content-based interactions, but position mechanisms determine whether those interactions know sequence order and distance.

The forward bridge is language-model probability. In next-token prediction, position determines which prefix states are visible, which position ids newly generated tokens receive, and whether the model can use long contexts reliably.

+-------------+      +-----------------+      +--------------------+
| attention   | ---> | position signal | ---> | ordered next-token |
| content mix |      | absolute/rel    |      | prediction         |
+-------------+      +-----------------+      +--------------------+

The practical habit is to test length behavior, not only maximum accepted length. Position encodings can be mathematically valid and still behave poorly outside the training regime.
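
As one concrete form of that habit, the sketch below probes length behavior by planting a known fact at several depths inside contexts of several lengths and recording whether it is still retrieved. The answer_fn callable, the filler text, and the passcode fact are hypothetical stand-ins, not part of any specific library or of this lesson.

```python
from typing import Callable

def length_depth_probe(
    answer_fn: Callable[[str], str],   # hypothetical: prompt in, answer out
    lengths: list[int],
    depths: list[float],
    fact: str = "The passcode is 4721.",
    question: str = "What is the passcode?",
    expected: str = "4721",
) -> dict[tuple[int, float], bool]:
    """Plant a fact at several relative depths of several context lengths
    and check whether it is still retrieved (a lost-in-the-middle probe)."""
    results = {}
    for length in lengths:
        filler = [f"Filler sentence number {i}." for i in range(length)]
        for depth in depths:
            idx = int(depth * length)
            context = " ".join(filler[:idx] + [fact] + filler[idx:])
            prompt = f"{context}\n\n{question}"
            results[(length, depth)] = expected in answer_fn(prompt)
    return results

# Usage sketch with a hypothetical model call:
# report = length_depth_probe(my_model_generate,
#                             lengths=[50, 200, 800],
#                             depths=[0.0, 0.5, 1.0])
```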

References
