patryk.perduta

Writing

Notes on software craft, autonomous systems, and whatever language I am currently learning the hard way.

latest

BlueDot Technical AI Safety Puzzle #1 - Submission

A walkthrough of how a frozen MiniLM head represents the country feature nonlinearly at layer L: a dartboard of radius and angle, the ReLU that deletes the linear copy, and a weirder locker-code representation trained on top.

2026-06-15 / interpretability / 2 min
Archive (2 entries)
  • 2026 Jun 15 / interpretability

    BlueDot Technical AI Safety Puzzle #1 - Submission

    A walkthrough of how a frozen MiniLM head represents the country feature nonlinearly at layer L: a dartboard of radius and angle, the ReLU that deletes the linear copy, and a weirder locker-code representation trained on top.

    2 min
  • 2026 Jun 15 / notes

    Hello, world!

    A first field note, a small signal from a new corner of the web, and enough lorem ipsum to test the typesetting.

    2 min