When I work with a colleague on a feature that spans several days, we
keep a shared document. Not formal documentation: a working record. What we
decided, why, what we rejected, what questions remain open. If either of us
is absent for a day, the other picks up where we left off. Neither of us
relies on memory alone. The document is our external memory — it persists
what individual recall cannot.
With AI coding assistants, the conversation is still largely the record.
Some tools now offer persistent memory features (Claude’s project memory,
Cursor’s rules files, Copilot’s workspace indexing) but these operate at
the project level, not the feature level. They remember that the project
uses Fastify, not that yesterday’s session rejected a
RetryQueue abstraction for specific reasons. For feature-level
decisions, every constraint, every piece of reasoning still lives in the
chat history and nowhere else. This creates a dynamic I have come to
recognize as a vicious cycle: developers keep conversations running far
longer than they should, not because long sessions are productive, but
because closing the session means losing everything. The context lives
nowhere else. There is no external record. And so the conversation stretches
on, growing unwieldy, while the AI’s ability to recall earlier decisions
quietly degrades. The longer I hold on, the less reliable the thing I am
holding on to becomes.
Here is a test I find
revealing: could I close this conversation right now and start a new one
without anxiety? If that question creates discomfort, if I feel I would
lose something important — my context is trapped inside a medium that was
never designed to preserve it.
## Why Context Erodes
The degradation is not random. It follows from how large language models
process context.
Every model has a finite context window: a hard
limit on how many tokens it can attend to at once. Current models offer
windows ranging from hundreds of thousands to over a million tokens. These
numbers sound generous, but a productive development session generates
context quickly: code snippets, design discussions, decision rationale, file
contents. The window fills faster than most developers expect.
Research confirms what practitioners experience intuitively. A 2023 study
from Stanford and Berkeley (“Lost
in the Middle” by Liu et al.) demonstrated that language models perform significantly
worse on information placed in the middle of long contexts compared to the
beginning or end. The effect is substantial: recall accuracy drops
measurably for content that is neither recent nor at the very start of the
conversation. This is not a quirk of a particular model; it is a property of
the attention mechanism itself. Recent tokens and system-level instructions
receive disproportionate weight. Everything in between competes for a
shrinking share of the model’s focus.
The study establishes that things fade by position. What it does not
address — and what I have observed repeatedly in practice — is what
fades first. In my experience, the reasoning behind decisions
degrades faster than the decisions themselves. The AI might remember “we
are using PostgreSQL” but forget why PostgreSQL was chosen over
MongoDB: the need for JSONB support, the team’s operational expertise,
the multi-tenancy requirements that ruled out document stores. This is a
subtle but expensive failure mode: the AI continues to follow the stated
decision while making suggestions that violate its intent. It proposes a
schema structure that would work well in a document store but fights
against PostgreSQL’s relational strengths. Technically compliant with the
stated choice, but architecturally misguided.
The solution is the same one developers apply instinctively to their own
cognition: externalize what matters. Persist it outside the medium that
forgets.
Some tools attempt to manage this problem automatically, compacting or
summarizing earlier conversation history as the context window fills. But
this introduces a different concern: the compaction is a black box. The
developer has no visibility into what was preserved verbatim, what was
summarized, and what was silently dropped. The algorithm optimizes for
general coherence, not for the specific nuances that matter to a
particular design decision. And the reasoning behind decisions, being
verbose, explanatory, and contextual, is precisely the kind of content
most vulnerable to automated compression. The what survives; the
why does not. Trusting an opaque process to preserve what matters
is not a strategy; it is a hope.
This is the missing piece in the
alignment techniques I have described elsewhere — sharing curated project
context with AI (what I call Knowledge
Priming) and structuring design conversations in sequential levels (Design-First collaboration)
both build a shared mental model between human and AI. But that alignment
is, by default, as transient as the conversation that created it. The
shared mental model we invest in building erodes as the session lengthens
— and vanishes entirely when the session ends.
Context anchoring is the practice of making that alignment durable.
## External Memory
The solution is to treat decision context as external state: a living
document that exists outside the conversation, captures decisions as they
happen, and serves as the authoritative reference for both human and AI
across sessions.
This is not the same as the priming document from that earlier work.
The distinction matters:
A priming document captures project-level context: the tech stack,
architecture patterns, naming conventions, code examples. It is relatively
stable, updated quarterly, or when significant architectural changes
occur. It is shared across all features and all sessions. It tells the AI
“here is how this project works.”
A feature document captures feature-level context: the specific
decisions made during development, the constraints that shaped them, what
was considered and rejected, what remains open, and the current state of
progress. It evolves rapidly, potentially every session. It tells the AI
“here is where we are on this specific piece of work, and how we got
here.”
Together, they form two layers of the same context strategy. When
starting a new session, both are loaded: the project context as the stable
foundation, the feature context as the record of where things stand. The
priming document provides the vocabulary. The feature document provides
the history.
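Loading the two layers at the start of a session can be mechanical. Below is a minimal sketch in TypeScript, assuming the priming and feature documents live beside the code as `PRIMING.md` and `FEATURE.md` (both filenames are assumptions for illustration, not a convention from this article):

```typescript
import { existsSync, readFileSync } from "node:fs";

// Build the opening message for a fresh AI session from the two context
// layers: the stable project context and the evolving feature record.
// PRIMING.md and FEATURE.md are assumed filenames, not a fixed convention.
function buildSessionPreamble(
  primingPath: string = "PRIMING.md",
  featurePath: string = "FEATURE.md",
): string {
  const parts: string[] = [];
  if (existsSync(primingPath)) {
    parts.push("## Project context\n" + readFileSync(primingPath, "utf8"));
  }
  if (existsSync(featurePath)) {
    parts.push("## Feature state\n" + readFileSync(featurePath, "utf8"));
  }
  parts.push(
    "Treat the decisions above as settled unless I explicitly reopen them.",
  );
  return parts.join("\n\n");
}
```

Pasting (or piping) the result as the first message gives the new session its vocabulary and its history in one step.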
A natural objection is that modern AI tools (Cursor with its file
references, Copilot with workspace indexing) can already read codebases
directly. If the AI can see the code, why maintain a separate
document?
Because code captures outcomes, not reasoning. A codebase that uses
BullMQ directly for retry handling tells the reader nothing about whether
a RetryQueue abstraction was proposed, debated, and deliberately
rejected — or whether the direct approach was simply the first thing
generated and never questioned. The rejected alternative is invisible in
the code. The constraint that drove the decision is invisible. The open
question that remains is invisible.
There is a practical byproduct worth noting. A feature document of fifty lines
carries decision context that hundreds or thousands of lines of implementation
code cannot express at all, and it does so at a fraction of the token cost. Less context
in the window means the model’s attention holds up better; the degradation that long contexts
produce simply has less to degrade. Token efficiency is not the reason to maintain a feature
document (reasoning preservation is) but it is a compounding benefit whose cost
implications at scale deserve separate examination.
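The arithmetic behind that claim is easy to sanity-check with the rough heuristic of about four characters per token. This is an approximation (real tokenizers vary by model), and the line lengths below are illustrative assumptions, not measurements:

```typescript
// Rough token estimate using the common ~4 characters-per-token heuristic.
// Real tokenizers vary by model; this is only an order-of-magnitude check.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// A 50-line feature document at ~60 characters per line:
const featureDoc = "x".repeat(50 * 60);
// ~3,000 lines of implementation at ~40 characters per line:
const implementation = "x".repeat(3000 * 40);

const docTokens = estimateTokens(featureDoc); // 750
const codeTokens = estimateTokens(implementation); // 30000
// The feature document costs roughly 1/40th of the code's token budget,
// while carrying the reasoning the code cannot express.
```

Even if the per-token heuristic is off by a factor of two, the ratio between the two is what matters, and it is large.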
This is exactly the gap that Michael Nygard identified when he proposed
Architecture Decision Records (ADRs) in 2011. Code shows what was built.
It does not show what was rejected, what constraints shaped the choice,
what trade-offs were accepted, or what remains unresolved. ADRs exist
because experienced engineers recognized that the reasoning behind code is
at least as valuable as the code itself — and far more fragile.
The feature document fills this same gap for AI collaboration. It is,
in essence, a living ADR, one that evolves in real time as decisions are
made, rather than being written after the fact.
For teams already using ADRs, the feature document is an ADR in
progress. When the feature ships, significant decisions graduate to formal
ADRs. For teams not yet using ADRs, this is a natural entry point:
lighter-weight, more iterative, and immediately practical.
There is one more dimension that purely individual tools miss:
coordination across the team. When multiple developers work on the same
feature (each with their own AI sessions) the feature document becomes
the shared record. Developer A’s design decisions, made with AI in one
session, are available to Developer B’s AI session started independently.
Without the document, Developer B’s AI might re-propose the very
abstractions Developer A already rejected. The shared mental model is not
just shared between one human and one AI; it is shared across the team,
across sessions, across time.
The feature doc survives what the context window cannot.
## What This Looks Like in Practice
The notification service from that earlier Design-First work provides a
useful illustration.
After the design conversation (capabilities confirmed, components
debated, contracts agreed) I had a set of decisions worth preserving.
BullMQ used directly for retries, no wrapper abstraction. Functional
services, no classes. Email-only for v1. SendGrid for delivery. These
decisions, and crucially the reasoning behind each, went into a feature
document alongside the current constraints, open questions, and the state
of implementation.
The document was brief, under fifty lines. Not a formal template, but
a working record: decisions with their reasoning, current constraints the
AI must respect, open questions that remain unresolved, and a simple
checklist of what was done versus what remained. Enough to capture the
essential state without becoming documentation for its own sake.
```markdown
# Feature: Notification Service v1

## Decisions
| Decision | Reason | Rejected Alternative |
|---|---|---|
| BullMQ directly, no wrapper | Native retry with backoff is sufficient | RetryQueue abstraction (unnecessary indirection) |
| Functional services | Match codebase convention | Class-based (rejected: convention) |
| SendGrid for delivery | Deliverability + team experience | SES (cheaper, less reliable), Mailgun (no team exp) |

## Constraints
- Email-only for v1 (no SMS/push)
- All queries include tenantId (multi-tenant)
- Must use existing auth middleware

## Open Questions
- [ ] Rate limiting strategy (awaiting product input)

## State
- [x] Design approved (all 5 levels)
- [x] NotificationHandler + TemplateRenderer implemented
- [ ] DeliveryTracker (next session)
```
The value became clear at the start of the third session. Rather than
reconstructing forty-five minutes of prior conversation (re-explaining
the tech stack, re-establishing the design decisions, re-stating the
constraints) I shared the feature document. The AI had full alignment in
thirty seconds. Not because it remembered the previous sessions, but
because the decisions had been externalized into a form it could read
fresh. Every new session became a warm start rather than a cold one. The
shared mental model did not need to be rebuilt; it was loaded.
In practice, the updates happened at natural pause points: at the end
of a design level, when a significant decision was made, or when an open
question was resolved. Sometimes I wrote the update myself. Sometimes I
asked the AI to summarize the decision and its reasoning, then edited that
summary into the document. The effort was minimal: a few lines after each
significant moment, not a documentation exercise.
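Those few lines can even be scripted. Here is a sketch of a helper that appends a decision row to a `## Decisions` table like the one in the example document; the `FEATURE.md` filename and the table layout are assumptions for illustration, not a fixed format:

```typescript
import { readFileSync, writeFileSync } from "node:fs";

// Append one decision, its reasoning, and the rejected alternative as a
// new row at the end of the feature document's "## Decisions" table.
// Filename and section layout are assumptions, not a fixed convention.
function recordDecision(
  decision: string,
  reason: string,
  rejected: string,
  featurePath: string = "FEATURE.md",
): void {
  const lines = readFileSync(featurePath, "utf8").split("\n");
  const section = lines.indexOf("## Decisions");
  if (section === -1) {
    throw new Error(`no ## Decisions section in ${featurePath}`);
  }
  // Walk past the header, separator, and existing rows to the table's end.
  let i = section + 1;
  while (i < lines.length && lines[i].startsWith("|")) i++;
  lines.splice(i, 0, `| ${decision} | ${reason} | ${rejected} |`);
  writeFileSync(featurePath, lines.join("\n"));
}
```

Whether the row is written by hand, by a helper like this, or by asking the AI to draft it, the point is the same: the update happens at the moment of decision, while the reasoning is still fresh.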
There was a secondary benefit I had not anticipated. The discipline of
updating the document sharpened my own thinking. Writing down why we
chose direct BullMQ integration over a wrapper forced me to articulate the
reasoning clearly — and occasionally revealed that my reasoning was weaker
than I thought. The document was not just external memory for the AI. It
was a forcing function for clarity in my own decision-making.
Over three sessions, the document evolved: new decisions accumulated,
open questions were resolved, the implementation state progressed. A
colleague joining the feature — or a new AI session — could read this
document and have the full context of days of work in minutes. No
repetition. No re-explanation. The document carried the shared
understanding forward.
## Calibration
Context anchoring is not needed everywhere. It is specifically valuable
when a feature spans multiple sessions — when the risk of losing context
is real and the cost of re-establishing it is high.
| Scenario | Anchoring Needed? | Why |
|---|---|---|
| Quick question, single utility | No | Conversation is short enough that decay is irrelevant |
| Single-session feature (under an hour) | Lightweight — capture key decisions if revisiting is possible | A few bullet points of decisions and state, enough to restart |
| Multi-day feature spanning sessions | Yes — full feature document | The cost of lost context is hours, not minutes |
| Feature with multiple developers | Yes — shared document | Coordinates decisions across independent AI sessions |
For a quick debugging question or a one-off utility, the overhead of
maintaining a document is not justified. For a feature that takes an
afternoon, a lightweight capture may be worthwhile if there is any chance
of revisiting. For work that stretches across days, full context anchoring
pays for itself many times over. This is where the vicious cycle of
clinging to long conversations is most likely to emerge, and where
externalizing decisions breaks the cycle most effectively.
The litmus test returns here: if I can close my chat session and start
a new one without anxiety, without feeling I have lost something that
cannot be recovered — my context is properly anchored. If I feel the need
to keep a session alive, that discomfort is the signal. It means decisions
exist only in the conversation, and the conversation is the wrong place
for them to live permanently.
## Conclusion
This is, at its core, a shift from chat-driven development to
document-driven development. The conversation remains the medium for
making decisions, but the document becomes the record. Conversations are
disposable by design — they are where thinking happens, not where
conclusions are stored. The document persists.
The shared mental model between human and AI does not have to be
transient. It can be documented, durable, and shareable. Together with the
techniques that precede it — sharing curated project context before a
session begins, structuring design conversations in sequential levels
before any code is written — context anchoring completes a progression:
static context, dynamic alignment, persistent decisions. Each layer builds
on the last.
And the simplest test of whether it is working is also the most
practical: close the session. Start fresh. If that feels effortless — if
starting over costs thirty seconds of document-sharing rather than thirty
minutes of re-explanation — the context is where it belongs. Outside the
conversation, in a form that both human and AI can read, anytime.


