Q: What is the Context layer in an AI system?

The Context layer is the layer of an AI system that sits between Retrieval and Inference, where retrieval results are turned into meaning the model can reason with. It runs the five-step process — curate, synthesize, consolidate, prioritize, store — and holds its own durable storage so Inference reaches consolidated meaning first and only round-trips to raw Data via Retrieval when the Context store is insufficient. Most AI systems skip this layer entirely.

Q: What is memory-as-files?

Memory-as-files is the convergent storage primitive at the Context layer of an AI system — durable, addressable, human-inspectable artifacts (think AGENTS.md, CLAUDE.md, archival memory) that hold synthesized, prioritized, consolidated meaning rather than raw data. Anthropic, Paper Compute, LangChain, Cloudflare, and Letta all landed on this pattern within a single seven-day window in April 2026. Files are inspectable, diffable, governable, and exportable across vendor boundaries. Vectors are none of those things.

Data Is Not Context.

Q: What are the four layers of an AI system?

An AI system has four layers: (1) Data — storage at every time scale, from raw inputs to persistent memory; (2) Retrieval — reach: queries, tool calls, and API hits; (3) Context — where retrieval results become meaning the model can reason with, with its own durable storage; and (4) Inference — generation, where the agent operates as a principal rather than a tool. Most systems route Retrieval straight into Inference and skip the Context layer entirely.

Q: What are the five steps of context architecture?

The five steps are: (1) Curation — selectively picking from Data via Retrieval based on session and user intent; (2) Synthesis — extracting insights across sources; (3) Consolidation — finding cross-cutting patterns over time; (4) Prioritization — ranking by goal-awareness for the decision at hand; and (5) Intelligent Storage — storing consolidated insights at the Context layer itself with priority-aware indexing. All five live at the Context layer, between Retrieval and Inference.

Q: What does it mean for agents to be principals?

Agents as principals reframes agents from tools-that-consume-context to autonomous decision-makers operating on context. At the Inference layer, the agent is no longer a tool — it is an identity-bearing actor with its own role, authority scope, and audit trail. The architectural question shifts from "did we synthesize good context?" to "did we synthesize context this specific agent will use to decide well, given who it is and what it's allowed to do?"

Q: Why does context architecture matter more than larger context windows?

The bottleneck is not how much context you can fit — it's how well that context has been selected and compressed for the decision at hand. Expert decision-makers process less information than novices, but they process the right things.

Q: Who coined context architecture?

Riché Zamor coined the term 'context architecture' based on two decades of building AI products at companies including Suzy, Grandstage, Helm Labs, and IBM.

An AI system has four layers — Data, Retrieval, Context, Inference. The five-step process lives at the Context layer, the one most systems skip entirely.

Last updated · May 7, 2026

This thesis is evolving as the market around context rapidly evolves. This is a snapshot of my thinking as of this date.

The Problem

Why do most AI systems confuse data with context?

They chunk documents. Embed them. Store them in a vector database. Retrieve the top-k results at query time. Ship the output.

This is not context. This is data retrieval with a similarity score.

The result is predictable: hallucinations, irrelevant responses, context windows stuffed with noise, and products that feel impressive in a demo but collapse under real-world conditions. Research has consistently shown that LLM performance degrades non-uniformly as you add more context, even on simple tasks.[1] Information positioned in the middle of the context window sees 20%+ accuracy drops.[2]

[1] Lost in the Middle: How Language Models Use Long Contexts. Liu et al. (2023) demonstrated that LLM performance degrades significantly when relevant information is placed in the middle of long contexts, even on simple retrieval tasks. arxiv.org/abs/2307.03172

[2] Same study, “U-shaped” attention curve. The same research found a U-shaped performance curve: models attend most to the beginning and end of context, with 20%+ accuracy degradation for information in the middle positions. arxiv.org/abs/2307.03172

More context is not better context. And most of what's being retrieved was never actually context to begin with.

The retrieval debate also misses a deeper structural problem. An AI system has four layers — Data, Retrieval, Context, Inference. Most teams route Retrieval straight into Inference and skip the Context layer entirely. That's where the five-step process lives, and that's the gap.

The AI System Stack

What are the four layers of an AI system?

An AI system has four layers — Data, Retrieval, Context, Inference. Data stores. Retrieval reaches. Context generates meaning. Inference decides. The five-step process lives entirely at the Context layer — the one most teams skip when they wire Retrieval straight into Inference.

Layer 01 · Stores

Data

Storage at every time scale — raw inputs, telemetry, archives, and persistent memory alike. The substrate the system reaches into.

Layer 02 · Reaches

Retrieval

The reach layer. Queries, tool calls, API hits, vector lookups — how the system gets to the Data when it needs it. Optimizing retrieval alone doesn’t produce meaning; it just delivers raw chunks faster.

Layer 03 · Generates meaning

Context

Where retrieval results become meaning the model can reason with. The full five-step process — curate, synthesize, consolidate, prioritize, store — lives here, with its own durable storage. The convergent primitive is memory-as-files: durable, addressable, human-inspectable artifacts that are diffable, governable, and exportable across vendor boundaries.

All five steps live here

Layer 04 · Decides

Inference

Generation. Where the agent stops being a tool and becomes a principal — an identity-bearing actor with its own role, authority scope, and audit trail. The architectural question shifts from “did we synthesize good context?” to “did we synthesize context this specific agent will use to decide well, given who it is and what it’s allowed to do?”

Agents as principals
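One way to picture an agent as a principal is as a record that carries its own role, authority scope, and audit trail, and that filters context through that scope before deciding. The sketch below is illustrative only: `AgentPrincipal`, `visible_context`, the scope names, and the sample items are all hypothetical, and a real system would back the audit trail with durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPrincipal:
    """Hypothetical identity-bearing agent: role, authority scope, audit trail."""
    name: str
    role: str
    scopes: frozenset[str]
    audit_log: list[str] = field(default_factory=list)

    def visible_context(self, items: list[tuple[str, str]]) -> list[str]:
        """Keep only items whose required scope this agent holds; log every check."""
        allowed = []
        for scope, text in items:
            permitted = scope in self.scopes
            self.audit_log.append(f"{self.name} {'read' if permitted else 'denied'} {scope}")
            if permitted:
                allowed.append(text)
        return allowed


# Made-up example data: (required scope, context text)
agent = AgentPrincipal("billing-bot", "billing", frozenset({"invoices"}))
context = agent.visible_context([
    ("invoices", "Invoice 17 is overdue"),
    ("hr", "Salary bands"),
])
# context keeps only the invoice item; audit_log records one read and one denial
```

The point of the sketch is that the same pool of synthesized context yields different inputs for different agents, and every access decision leaves a trace.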

Most AI systems skip the Context layer entirely. They retrieve and shove: chunks come out of a vector store and go straight into the prompt, with no synthesis, no consolidation, no goal-aware prioritization, and no consolidated store to hit first. That's the entire pipeline. The result: context windows stuffed with unprocessed retrievals, mediocre outputs, and products that feel impressive in a demo but collapse under real-world conditions.
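The alternative, a Context layer that Inference hits first, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `ContextLayer`, `context_for`, and `fake_retrieval` are hypothetical names, the store is an in-memory dict rather than durable storage, and the synthesis step is a placeholder that only deduplicates and joins chunks.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ContextLayer:
    """Hypothetical Context layer: a store that is checked before raw Retrieval."""
    retrieve: Callable[[str], list[str]]                 # stand-in for the Retrieval layer
    store: dict[str, str] = field(default_factory=dict)  # consolidated meaning, keyed by topic

    def context_for(self, topic: str) -> tuple[str, str]:
        """Return (source, context). Try the Context store first, then fall back."""
        if topic in self.store:
            return "context-store", self.store[topic]
        chunks = self.retrieve(topic)
        # Placeholder synthesis: drop duplicate chunks (order preserved) and join them.
        summary = " | ".join(dict.fromkeys(chunks))
        self.store[topic] = summary
        return "retrieval", summary


def fake_retrieval(topic: str) -> list[str]:
    """Made-up retrieval results, including one duplicate chunk."""
    return [f"{topic}: chunk A", f"{topic}: chunk B", f"{topic}: chunk A"]


layer = ContextLayer(retrieve=fake_retrieval)
first = layer.context_for("churn")   # round-trips to Retrieval, then stores the result
second = layer.context_for("churn")  # served from the Context store
```

The first call reaches Data through Retrieval; the second never leaves the Context layer. A retrieve-and-shove pipeline is what remains if `store` and the synthesis step are deleted.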

The Five Steps · At the Context Layer

What are the five steps of context architecture?

All five live at the Context layer — between Retrieval and Inference. Curation reaches into Data via Retrieval based on session and user needs. Synthesis combines what comes back. Consolidation runs as a background loop that compounds meaning over time. Prioritization ranks by goal-awareness for the decision at hand. Intelligent storage holds consolidated meaning at the Context layer itself, so Inference reaches it first.

Curation

Selectively picking from the Data layer based on assumed session and user needs. Context-layer work that reaches into Data via Retrieval. Intent-driven selection, not passive intake filtering — the system decides what to pull, in what form, and how fresh it must be.
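As a rough sketch, intent-driven curation can be expressed as a filter over candidate sources that checks both topical fit and freshness. Everything named here (`Source`, `curate`, the topic tags, the sample sources, and the 30-day freshness limit) is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    name: str
    topics: frozenset[str]
    updated: date  # last time this source was refreshed


def curate(sources, intent_topics, today, max_age_days):
    """Select sources that match the session intent and are fresh enough."""
    return [
        s.name
        for s in sources
        if s.topics & intent_topics and (today - s.updated).days <= max_age_days
    ]


# Made-up sources for illustration.
sources = [
    Source("crm", frozenset({"accounts", "renewals"}), date(2026, 5, 1)),
    Source("wiki", frozenset({"policies"}), date(2026, 5, 5)),
    Source("tickets", frozenset({"renewals", "support"}), date(2025, 11, 1)),
]
picked = curate(sources, frozenset({"renewals"}), date(2026, 5, 7), max_age_days=30)
# picked == ["crm"]: wiki is off-intent, tickets is on-intent but stale
```

The selection happens before anything is pulled, which is the difference between curation and filtering whatever retrieval happens to return.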

Synthesis

Classifying inputs, extracting insights, combining information across sources, and producing understanding that no single document contained. The active processing step that turns raw retrieval results into meaning the model can reason with.

Consolidation

The periodic, background process of replaying accumulated knowledge to find cross-cutting patterns, merge redundant information, prune stale facts, and extract higher-order insights. Context-layer operation against the Data layer over time. Sleep, for AI systems.
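A toy version of one consolidation pass might look like the following. It is only a sketch: `Fact`, `consolidate`, the 90-day staleness cutoff, and the sample facts are assumptions made for illustration, and it covers merging and pruning but not the extraction of higher-order insights.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Fact:
    key: str        # what the fact is about
    value: str
    seen: date      # last time the fact was observed
    count: int = 1  # how many observations support it


def consolidate(facts, today, max_age_days=90):
    """Merge facts that share a key (newest value wins), then prune stale ones."""
    merged = {}
    for fact in sorted(facts, key=lambda f: f.seen):
        if fact.key in merged:
            prev = merged[fact.key]
            merged[fact.key] = Fact(fact.key, fact.value, fact.seen, prev.count + fact.count)
        else:
            merged[fact.key] = fact
    cutoff = today - timedelta(days=max_age_days)
    return [f for f in merged.values() if f.seen >= cutoff]


# Made-up observations for illustration.
facts = [
    Fact("plan", "starter", date(2026, 1, 10)),
    Fact("plan", "enterprise", date(2026, 4, 20)),
    Fact("region", "EMEA", date(2025, 6, 1)),
]
result = consolidate(facts, today=date(2026, 5, 7))
# One fact survives: plan=enterprise, supported by 2 observations; region is pruned as stale
```

In practice this would run on a schedule in the background rather than on the request path.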

Prioritization

Ranking information by goal-awareness for the decision at hand. Compression without goal-awareness is just making data smaller. Prioritization makes it actionable. Expert decision-makers process less information than novices — and the right things.
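To make the contrast with plain compression concrete, here is a small sketch of goal-aware ranking under a fixed budget. The tag-overlap score, the word-count budget, and the names `Insight` and `prioritize` are simplifying assumptions. A production system would likely use a richer relevance signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    text: str
    tags: frozenset[str]


def prioritize(insights, goal_tags, budget_words):
    """Rank by overlap with the goal's tags, then keep what fits in the word budget."""
    ranked = sorted(insights, key=lambda i: len(i.tags & goal_tags), reverse=True)
    kept, used = [], 0
    for insight in ranked:
        if not insight.tags & goal_tags:
            continue  # irrelevant to this decision: leave it out entirely
        cost = len(insight.text.split())
        if used + cost <= budget_words:
            kept.append(insight)
            used += cost
    return kept


# Made-up insights for illustration.
insights = [
    Insight("Renewal rate fell in Q3", frozenset({"retention", "revenue"})),
    Insight("Office moved to a new floor", frozenset({"facilities"})),
    Insight("Support tickets doubled after the pricing change",
            frozenset({"retention", "pricing", "support"})),
    Insight("Competitor launched a cheaper tier", frozenset({"pricing"})),
]
goal = frozenset({"retention", "pricing"})
selected = [i.text for i in prioritize(insights, goal, budget_words=12)]
```

With a 12-word budget, the two insights most relevant to the goal are kept, the off-goal item is dropped regardless of budget, and a relevant but lower-ranked item is left out because it no longer fits.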

Intelligent Storage

Storing consolidated insights at the Context layer itself with priority-aware indexing — so Inference reaches the Context store first for consolidated meaning and only round-trips to raw Data via Retrieval when needed. The convergent primitive is memory-as-files: durable, addressable, human-inspectable artifacts that are diffable, governable, and exportable across vendor boundaries. Vectors are none of those things.
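A minimal sketch of memory-as-files with priority-aware reads, using only the Python standard library. The one-line `priority:` header, the `.md` naming, and the helper names `write_memory` and `read_by_priority` are assumptions made for this example. They are not the format used by any of the vendors named above.

```python
import tempfile
from pathlib import Path


def write_memory(root: Path, name: str, priority: int, body: str) -> Path:
    """Persist one consolidated insight as a plain-text file with a priority header."""
    path = root / f"{name}.md"
    path.write_text(f"priority: {priority}\n\n{body}\n", encoding="utf-8")
    return path


def read_by_priority(root: Path) -> list[tuple[int, str, str]]:
    """Load every memory file and return (priority, name, body), highest priority first."""
    entries = []
    for path in root.glob("*.md"):
        header, _, body = path.read_text(encoding="utf-8").partition("\n\n")
        priority = int(header.split(":", 1)[1])
        entries.append((priority, path.stem, body.strip()))
    return sorted(entries, key=lambda e: (-e[0], e[1]))


# Made-up insights, written to a temporary directory for illustration.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_memory(root, "pricing", 2, "Churn rises after the second price increase.")
    write_memory(root, "onboarding", 5, "Users who finish setup in one session retain best.")
    memories = read_by_priority(root)
```

Because each memory is an ordinary file, it can be opened in an editor, diffed in version control, reviewed like any other change, and copied to another system without an export step.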

Less Context, Better Decisions

Why does context architecture matter more than larger context windows?

The bottleneck is not how much context you can fit. It's how well that context has been selected and compressed for the decision at hand.

Expert decision-makers don't process more information than novices. They process less, and they process the right things.[3]

[3] Sources of Power: How People Make Decisions. Gary Klein's research on naturalistic decision making showed that experts use pattern recognition, not exhaustive analysis. They recognize the situation and act on the first viable option. MIT Press

Research on ecological rationality showed that simple heuristics using minimal cues match or outperform complex statistical models under real-world uncertainty.[4] Fireground commanders used explicit option comparison less than 5% of the time; they recognized the pattern and acted.[5]

[4] Simple Heuristics That Make Us Smart. Gigerenzer, Todd & the ABC Research Group (1999) demonstrated that fast-and-frugal heuristics using minimal information can match or exceed the accuracy of complex statistical models in uncertain environments. Oxford University Press

[5] Recognition-Primed Decision Model. Klein (1989) found that experienced firefighters used recognition-primed decision making in 80%+ of cases, generating a single course of action through pattern matching rather than comparing options. doi.org

The question is not “how do we fit more in?” It's “how do we build systems that know precisely what to leave out?”

65% · Enterprise AI failures from context drift [6]

30–60% · Effective vs. advertised context window [7]

20%+ · Accuracy drop in mid-window information [2]

Context Architecture Is Product Strategy

How you architect context determines your product's quality, defensibility, and unit economics.

Context architecture is the practice of designing the informational environment that surrounds AI systems: shaping what they know, how they retrieve it, and how that knowledge is structured for human decision-making. This is not a plumbing decision. It's the most consequential product strategy decision in any AI system. Companies like Glean have built multi-billion dollar valuations on context layers, not model capability.[8] Yet no formal framework exists for measuring context quality pre-inference, modeling context ROI, or defining cost-per-decision metrics.

[8] Glean valuation: $4.6B (2024). Glean, an enterprise AI search and knowledge platform built on context architecture, reached a $4.6B valuation in its Series E, demonstrating that context infrastructure is a venture-scale opportunity. glean.com

The companies that figure this out will own the next era of AI products. The ones that don't will keep swapping models every quarter and wondering why their outputs haven't improved.

What I've Built

Who coined context architecture?

Riché Zamor coined the term “context architecture” based on two decades of building AI products that turn raw data into decision-ready context. He's not making this argument from the sidelines.

Suzy

Led the transformation from consumer survey platform to Decision Engine — an enterprise product that synthesizes fragmented marketing intelligence into decisions 350+ brands can act on. The platform’s three capabilities map directly to the five-step framework. Shipped in six weeks.

Grandstage

Built a context system for market intelligence that fused 10,000+ data sources into synthesized, goal-ranked context. Scaled to 90 B2B companies in 3 months at $0 CAC — because a hierarchical relevance model lifted retention from 50% to 80% by getting the context right.

Helm Labs

Generated $3.25M in pipeline before the product launched by selling the vision of how data about 200 million Americans should be curated, prioritized, and presented to decision-makers. The pipeline was built on context architecture as a value proposition.

IBM

Millions in revenue came from personalized context systems — e-nurture streams, onboarding flows, and recommendation engines that answered one question: what does this specific customer need to see, right now, to make a decision?

Go Deeper

Want to explore this further?

I write about context architecture, AI product strategy, and the lessons from building these systems. If you're working on this problem, I'd like to hear from you.
