Context Loading

January 30, 2026

How you give agents access to information matters more than you think. The choice between passive context (always available) and active retrieval (fetched on demand) can mean the difference between 100% and 53% accuracy.

The Core Tradeoff

Passive context: Embed knowledge directly in the prompt or system context. The agent doesn't need to decide to look it up — it's already there.

Active retrieval: Give the agent tools to fetch information when needed. Skills, RAG, MCP servers, function calls.

The intuition says active retrieval should win. Why waste tokens on information the agent might not need? Let it pull what's relevant.

But Vercel's research found the opposite.

LLMs Are Lazy

In Vercel's evals, agents with access to skills invoked them only 44% of the time. More than half the time, the agent had access to documentation that would help — and chose not to use it.

This isn't a bug. It's a known limitation of current models. Agents don't reliably use available tools, even when those tools would help.

The fix was counterintuitive: compress the docs and embed them directly. An 8KB compressed index in AGENTS.md achieved 100% accuracy, while skills maxed out at 79%.

The What Matters More Than The How

As Vercel put it: "Context that's always available beats context that requires a decision to access."

This applies broadly:

The cost is tokens. Passive context uses tokens every turn, whether needed or not. But if the information is frequently relevant, the reliability gain outweighs the cost.

When Active Retrieval Makes Sense

Active retrieval isn't dead. It's still the right choice when:

MCP servers, function calls, and tools are still essential for agents that interact with the world. The insight is about knowledge, not capabilities.

Memory: The Third Dimension

Context loading isn't just prompt vs retrieval. There's also memory:

As one researcher noted:

Most agents shine in-session, then restart from zero. No persistent identity = no true long-horizon growth.

The question becomes: if the agent restarts, is it still the same agent?

Practical Guidelines

Embed directly when:

Use retrieval when:

Hybrid approach:


Sources