
Pi Context Management Research

Last Updated: 2026-03-19

Source: badlogic/pi-mono

Research focus: How Pi assembles and manages context within a conversation.


Architecture Overview

Pi is a monorepo with three layers relevant to context management:

Layer          Package           Role
LLM call       pi-ai             Unified multi-provider API, streaming, overflow detection
Agent runtime  pi-agent-core     Agent loop, message types, tool execution, context transform hooks
Coding agent   pi-coding-agent   System prompt construction, compaction, session management, custom message types

Context Assembly Flow

Every LLM call goes through streamAssistantResponse in agent-loop.ts:238-259:

AgentMessage[]                    ← Full conversation history (includes custom types)
    ▼ transformContext()          ← Optional hook: prune, inject external context
    │                               (agent-core provides interface, coding-agent does NOT use it)
    ▼ convertToLlm()             ← Key transform: AgentMessage[] → Message[]
    │                               - bashExecution → user message
    │                               - compactionSummary → user message (wrapped in <summary> tags)
    │                               - branchSummary → user message
    │                               - custom → user message
    │                               - excludeFromContext messages filtered out
    │                               - user/assistant/toolResult passed through
    ▼ Context { systemPrompt, messages, tools }
    ▼ streamSimple() → Send to LLM

Key insight: Context is a single array that accumulates indefinitely. Every user message, assistant response, tool call, and tool result is appended. The LLM sees the full history on every call until compaction fires.
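The convertToLlm() transform above can be sketched as follows. This is a minimal reimplementation for illustration; the actual type and field names in pi-agent-core differ.

```typescript
// Sketch of the convertToLlm() boundary transform described above.
// Type and field names are simplified assumptions, not the real pi-agent-core API.
type LlmMessage = {
  role: "user" | "assistant" | "toolResult";
  content: string;
};
type AgentMessage =
  | (LlmMessage & { excludeFromContext?: boolean })
  | { role: "bashExecution"; command: string; output: string; excludeFromContext?: boolean }
  | { role: "compactionSummary"; summary: string; excludeFromContext?: boolean }
  | { role: "branchSummary"; summary: string; excludeFromContext?: boolean }
  | { role: "custom"; content: string; excludeFromContext?: boolean };

function convertToLlm(history: AgentMessage[]): LlmMessage[] {
  const out: LlmMessage[] = [];
  for (const msg of history) {
    if (msg.excludeFromContext) continue; // filtered out entirely
    switch (msg.role) {
      case "bashExecution": // rendered as a user message
        out.push({ role: "user", content: `$ ${msg.command}\n${msg.output}` });
        break;
      case "compactionSummary": // wrapped in <summary> tags
        out.push({ role: "user", content: `<summary>\n${msg.summary}\n</summary>` });
        break;
      case "branchSummary":
        out.push({ role: "user", content: msg.summary });
        break;
      case "custom":
        out.push({ role: "user", content: msg.content });
        break;
      default: // user/assistant/toolResult pass through unchanged
        out.push(msg);
    }
  }
  return out;
}
```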

System Prompt Construction

system-prompt.ts builds the system prompt by concatenating sections in order:

  1. Role definition — "You are an expert coding assistant operating inside pi..."
  2. Available tools — Dynamically generated based on enabled tools (read/bash/edit/write/grep/find/ls)
  3. Guidelines — Dynamically derived from tool combinations (e.g., has edit → "read before edit", has grep → "prefer grep over bash")
  4. Pi documentation — File path references to docs/examples
  5. appendSystemPrompt — Custom user-provided text
  6. Project Context — AGENTS.md and other context files, each as ## {path}\n\n{content}
  7. Skills — Available skills list (only if read tool is available)
  8. Date + working directory — Always appended last

If customPrompt is provided, it replaces sections 1-4 entirely but still appends 5-8.

The default system prompt (without project context or skills) is roughly 300 words.
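The ordered concatenation can be sketched like this. Section contents are abbreviated and the option names are assumptions; only the ordering and the customPrompt replacement rule come from the notes above.

```typescript
// Sketch of the ordered-section system prompt assembly described above.
// Option and section names are illustrative, not the real system-prompt.ts API.
interface PromptOptions {
  customPrompt?: string;
  appendSystemPrompt?: string;
  projectContext?: { path: string; content: string }[];
  skills?: string[];
  readToolEnabled?: boolean;
}

function buildSystemPrompt(opts: PromptOptions): string {
  // customPrompt replaces sections 1-4 entirely...
  const core = opts.customPrompt ?? [
    "You are an expert coding assistant operating inside pi...", // 1. role
    "Available tools: ...",                                      // 2. tools
    "Guidelines: ...",                                           // 3. guidelines
    "Pi documentation: ...",                                     // 4. docs
  ].join("\n\n");

  const sections = [core];
  // ...but sections 5-8 are still appended.
  if (opts.appendSystemPrompt) sections.push(opts.appendSystemPrompt);   // 5
  for (const f of opts.projectContext ?? [])
    sections.push(`## ${f.path}\n\n${f.content}`);                       // 6
  if (opts.readToolEnabled && opts.skills?.length)
    sections.push(`Skills: ${opts.skills.join(", ")}`);                  // 7. needs read tool
  sections.push(
    `Date: ${new Date().toISOString().slice(0, 10)}\nCwd: ${process.cwd()}`, // 8. always last
  );
  return sections.join("\n\n");
}
```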

Message Type System

Pi uses a two-level message type system:

LLM-level (pi-ai):

  • UserMessage: { role: "user", content, timestamp }
  • AssistantMessage: { role: "assistant", content, usage, stopReason, ... }
  • ToolResultMessage: { role: "toolResult", toolCallId, content, isError, ... }

Agent-level (pi-coding-agent extends via declaration merging):

  • BashExecutionMessage: { role: "bashExecution", command, output, exitCode, ... }
  • CustomMessage: { role: "custom", customType, content, display, ... }
  • CompactionSummaryMessage: { role: "compactionSummary", summary, tokensBefore, ... }
  • BranchSummaryMessage: { role: "branchSummary", summary, fromId, ... }

The convertToLlm() function converts agent-level messages to LLM-level messages at the call boundary. This separation allows the agent to carry UI-specific and bookkeeping messages without polluting the LLM context.
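Declaration merging here refers to the standard TypeScript mechanism where two interface declarations with the same name combine into one. A minimal sketch of the pattern (interface and field names are illustrative, not the exact pi-agent-core declarations):

```typescript
// Base package declares a registry of message types keyed by role:
interface AgentMessageTypes {
  user: { role: "user"; content: string };
  assistant: { role: "assistant"; content: string };
}

// A downstream package (here, the coding agent) merges new roles into the
// SAME interface via TypeScript declaration merging:
interface AgentMessageTypes {
  bashExecution: { role: "bashExecution"; command: string; output: string };
  compactionSummary: { role: "compactionSummary"; summary: string };
}

// The union automatically picks up the merged entries:
type AgentMessage = AgentMessageTypes[keyof AgentMessageTypes];

// This now type-checks even though "bashExecution" was declared separately:
const msg: AgentMessage = { role: "bashExecution", command: "ls", output: "src" };
```

In the real codebase the two declarations live in different packages, so the coding agent can add roles without the core runtime knowing about them.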

Compaction (Context Compression)

Trigger Conditions

Two cases, checked after each agent_end event and before each prompt submission:

  1. Threshold: contextTokens > contextWindow - reserveTokens (default reserve: 16384 tokens)
  2. Overflow: LLM returns context overflow error (detected via regex patterns matching 15+ provider error formats in overflow.ts)
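The two triggers can be sketched as follows. The threshold arithmetic and default reserve come from the notes above; the two regexes are illustrative stand-ins for the 15+ provider patterns in overflow.ts.

```typescript
// Trigger 1: token threshold against the model's context window.
const DEFAULT_RESERVE_TOKENS = 16_384;

function shouldCompact(
  contextTokens: number,
  contextWindow: number,
  reserveTokens = DEFAULT_RESERVE_TOKENS,
): boolean {
  return contextTokens > contextWindow - reserveTokens;
}

// Trigger 2: match the provider's error message against known overflow
// formats. These two patterns are examples only; overflow.ts covers 15+.
const OVERFLOW_PATTERNS = [
  /context.{0,20}length.{0,20}exceeded/i,
  /maximum context/i,
];
const isContextOverflow = (err: string) =>
  OVERFLOW_PATTERNS.some((re) => re.test(err));
```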

Algorithm

  1. Find cut point: Walk backwards from newest message, accumulate estimated token counts until reaching keepRecentTokens (default: 20000). Cut at that point.
  2. Cut rules: Valid cut points are user/assistant/custom/bashExecution messages. Never cut at toolResult (must stay paired with its toolCall).
  3. Generate summary: Use LLM to create structured summary of discarded messages with format: Goal / Constraints & Preferences / Progress (Done/In Progress/Blocked) / Key Decisions / Next Steps / Critical Context.
  4. Incremental update: If previous compaction exists, use UPDATE_SUMMARIZATION_PROMPT to merge new information into existing summary rather than regenerating from scratch.
  5. Split turn handling: If cut point falls mid-turn, generate a separate turn prefix summary to provide context for the retained suffix.
  6. File tracking: Append read/modified file lists to summary for continuity.
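Steps 1-2 (the backwards walk and the cut rules) can be sketched like this. The message shape and token estimator are simplified; defaults come from the notes above.

```typescript
// Sketch of the cut-point search: walk backwards from the newest message,
// accumulating estimated tokens, until keepRecentTokens is reached, then cut
// at the first valid message type. Shapes here are simplified assumptions.
type Msg = { role: string; text: string };
const estimateTokens = (m: Msg) => Math.ceil(m.text.length / 4); // chars/4 heuristic

// toolResult is deliberately absent: a result must stay paired with its call.
const VALID_CUT_ROLES = new Set(["user", "assistant", "custom", "bashExecution"]);

// Returns the index of the first message to KEEP; everything before it is
// summarized away. Returning 0 means the history fits and nothing is cut.
function findCutPoint(history: Msg[], keepRecentTokens = 20_000): number {
  let kept = 0;
  for (let i = history.length - 1; i > 0; i--) {
    kept += estimateTokens(history[i]);
    if (kept >= keepRecentTokens && VALID_CUT_ROLES.has(history[i].role)) {
      return i;
    }
  }
  return 0;
}
```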

Token Estimation

Token counts are estimated with a chars/4 heuristic (conservative; it tends to overestimate). Images are estimated at a flat 1200 tokens (4800 chars).

When available, uses actual usage data from the last assistant message for more accurate context size tracking.
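The heuristic is small enough to show inline (message parts are modeled minimally for illustration):

```typescript
// chars/4 estimation with the flat 1200-token rate for images.
type Part = { type: "text"; text: string } | { type: "image" };
const IMAGE_TOKENS = 1200; // i.e. an image is treated as 4800 chars

const estimateTokens = (parts: Part[]) =>
  parts.reduce(
    (sum, p) =>
      sum + (p.type === "image" ? IMAGE_TOKENS : Math.ceil(p.text.length / 4)),
    0,
  );
```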

Context Lifecycle

[sys prompt] + [msg1] + [msg2] + ... + [msgN]       ← Accumulate
                                                       ↓ Trigger threshold
[sys prompt] + [compaction summary] + [recent msgs]  ← After compaction
                     ↓ Continue accumulating
[sys prompt] + [compaction summary] + [recent] + [msgN+1] + ...
                                                       ↓ Trigger again
[sys prompt] + [updated summary] + [recent msgs]     ← Incremental update

Compaction summaries are injected as user messages wrapped in <summary> tags.

Extension Hook

Extensions can intercept compaction via session_before_compact event and provide their own CompactionResult, completely replacing the default summarization logic.
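A hypothetical sketch of that interception, under heavy assumptions: the event name comes from the notes above, but the handler registry, payload, and CompactionResult shape are invented here for illustration and are not the real pi-coding-agent API.

```typescript
// Hypothetical session_before_compact dispatch. Everything except the event's
// purpose (replace default summarization) is an assumption.
type CompactionResult = { summary: string; firstKeptIndex: number };
type BeforeCompactHandler = (history: unknown[]) => CompactionResult | undefined;

const handlers: BeforeCompactHandler[] = [];
const onBeforeCompact = (h: BeforeCompactHandler) => handlers.push(h);

// An extension registers its own compaction, replacing default summarization:
onBeforeCompact((history) => ({
  summary: "Custom summary built without an LLM call",
  firstKeptIndex: Math.max(0, history.length - 10), // e.g. keep the last 10 messages
}));

// The session consults handlers first; only if none return a result does the
// built-in LLM summarization run.
function compact(history: unknown[], defaultCompact: BeforeCompactHandler): CompactionResult {
  for (const h of handlers) {
    const result = h(history);
    if (result) return result;
  }
  return defaultCompact(history) as CompactionResult;
}
```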

Subagent Model

Pi's agent-core has no built-in subagent concept. Subagents are implemented as an extension example (examples/extensions/subagent/).

Implementation

Subagents are spawned as separate OS processes: spawn("pi", ["--mode", "json", "--no-session", ...]).

  • Process-level isolation: Each subagent has its own context window
  • --no-session: No persistence, disposable
  • One-way information flow: Only the subagent's final text output returns to the main context. Subagent cannot see main context.
  • Three modes: Single, Parallel (max 8 tasks, 4 concurrent), Chain (sequential with {previous} placeholder)
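The spawn call above can be wrapped roughly like this. The CLI flags come from the notes; how the prompt is passed, and the overridable binary name (handy for testing), are assumptions.

```typescript
// Sketch of a process-isolated subagent call. Only the final stdout text
// flows back to the caller; the child never sees the parent's context.
import { spawn } from "node:child_process";

function runSubagent(prompt: string, bin = "pi"): Promise<string> {
  return new Promise((resolve, reject) => {
    // A fresh process means a fresh context window; --no-session means
    // nothing is persisted, so the subagent is fully disposable.
    const child = spawn(bin, ["--mode", "json", "--no-session", prompt]);
    let out = "";
    child.stdout.on("data", (chunk) => (out += chunk));
    child.on("error", reject);
    child.on("close", (code) =>
      code === 0 ? resolve(out) : reject(new Error(`${bin} exited with code ${code}`)),
    );
  });
}
```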

Agent Definitions

Agents are markdown files with YAML frontmatter specifying name, description, tools, and model. Stored in ~/.pi/agent/agents/ (user-level) or .pi/agents/ (project-level).
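A minimal agent definition might look like this. The frontmatter keys come from the notes above; the values, the comma-separated tools format, and the model identifier are illustrative placeholders.

```markdown
---
name: reviewer
description: Reviews a diff for correctness and style issues
tools: read, grep, bash
model: provider/model-id
---

You are a code reviewer. Examine the diff you are given and report concrete
problems with file and line references.
```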

Design Philosophy

  1. Minimalism: ~300 word system prompt, 4 core tools (read/write/edit/bash), no built-in subagent
  2. Full context trust: No token budgeting or selective inclusion — send everything until physical limit
  3. Late intervention: Compaction only fires when approaching the context window limit, not proactively
  4. Extension over built-in: Features like subagents, custom compaction, and context transforms are extension points, not core features
  5. Two-level message abstraction: Agent-level messages carry rich metadata; conversion to LLM format happens only at the call boundary

Comparison Notes

Aspect            Pi                                            OpenClaw (ContextEngine)
Context strategy  Accumulate all, compact when full             Assemble per-call with token budget
Token budgeting   None (full context until limit)               Per-section budget allocation via assemble()
Compaction        LLM-generated structured summary              Pluggable via ContextEngine slot
Subagent          Extension (process isolation)                 (TBD - needs research)
System prompt     ~300 words, dynamic tool/guideline sections   Complex multi-section construction