
Context Management

Context compaction and pluggable context providers for RAG


A3S Code provides two context management features: automatic context compaction when conversations grow long, and pluggable context providers for retrieval-augmented generation (RAG).

Context Compaction

When context usage exceeds the threshold (by default 80% of the model's context window), the agent automatically summarizes the conversation to stay within limits.
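For instance, with a 128,000-token window (an example figure, not a fixed value) and the default 0.8 threshold, compaction triggers once usage passes 102,400 tokens. A minimal sketch of the check:

```rust
// Threshold check: compact once used tokens exceed threshold * window.
// The 128k window below is just an example figure.
fn should_compact(used_tokens: usize, context_window: usize, threshold: f64) -> bool {
    (used_tokens as f64) > threshold * (context_window as f64)
}

fn main() {
    // 0.8 * 128_000 = 102_400 tokens.
    assert!(!should_compact(100_000, 128_000, 0.8));
    assert!(should_compact(102_401, 128_000, 0.8));
}
```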

How It Works

1. Keep the first 2 messages (system context)
2. Keep the last 20 messages (recent context)
3. Summarize the middle messages via an LLM call
4. Insert the summary as a synthetic message

This preserves important context while reducing token usage.
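The steps above can be sketched as a standalone function. The `Message` type and the `summarize` stub below are simplified stand-ins for illustration, not the actual a3s_code_core types:

```rust
// Sketch of the compaction strategy: keep the first 2 and last 20 messages,
// and collapse everything in between into one synthetic summary message.
// `Message` and `summarize` are simplified stand-ins, not the real types.

#[derive(Clone, Debug)]
struct Message {
    role: String,
    content: String,
}

/// Stand-in for the LLM summarization call.
fn summarize(middle: &[Message]) -> String {
    format!("[summary of {} earlier messages]", middle.len())
}

fn compact(messages: Vec<Message>) -> Vec<Message> {
    const KEEP_FIRST: usize = 2;
    const KEEP_LAST: usize = 20;

    // Nothing to summarize if there is no "middle".
    if messages.len() <= KEEP_FIRST + KEEP_LAST {
        return messages;
    }

    let first = &messages[..KEEP_FIRST];
    let middle = &messages[KEEP_FIRST..messages.len() - KEEP_LAST];
    let last = &messages[messages.len() - KEEP_LAST..];

    let summary = Message {
        role: "system".to_string(),
        content: summarize(middle),
    };

    let mut out = first.to_vec();
    out.push(summary);
    out.extend_from_slice(last);
    out
}

fn main() {
    let messages: Vec<Message> = (0..50)
        .map(|i| Message { role: "user".into(), content: format!("msg {i}") })
        .collect();
    let compacted = compact(messages);
    // 2 kept + 1 summary + 20 kept = 23 messages remain.
    assert_eq!(compacted.len(), 23);
    assert_eq!(compacted[22].content, "msg 49");
}
```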

Configuration

use a3s_code_core::SessionOptions;

let session = agent.session("/project", Some(
    SessionOptions::new()
        .with_context_threshold(0.8)  // Compact at 80% (default)
))?;

Events

The agent emits events during compaction; see Sessions for the full event reference.
Context Providers

Context providers inject additional information into the LLM's system prompt before each generation. This enables RAG (retrieval-augmented generation), memory recall, and integration with external knowledge bases.

Overview

The agent queries registered context providers before each LLM call. Providers return ContextItem entries that are formatted as XML blocks and prepended to the system prompt:

System Prompt
├── Base instructions
├── Context blocks:          ← injected by context providers
│   ├── [resource] API docs for auth module
│   ├── [memory] User prefers TypeScript
│   └── [resource] Related code snippets
└── Tool definitions

Events are emitted during resolution: ContextResolving (with provider names) and ContextResolved (with total items and token count).

ContextProvider Trait

#[async_trait]
pub trait ContextProvider: Send + Sync {
    /// Query this provider for relevant context
    async fn query(&self, query: ContextQuery) -> Result<ContextResult>;

    /// Optional: extract and store context after a turn completes
    async fn on_turn_complete(&self, _messages: &[Message]) -> Result<()> {
        Ok(()) // default no-op
    }
}

The query() method receives a ContextQuery and returns matching ContextItem entries. The optional on_turn_complete() hook allows providers to extract and store information from conversation turns (e.g., memory extraction).
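The extraction step that on_turn_complete performs can be a pure function over the turn's messages. As a hedged illustration, the sketch below uses an invented "remember:" heuristic and a simplified Message shape, not the core types or any real extraction logic:

```rust
// Illustrative memory-extraction helper that an on_turn_complete
// implementation might call. The "remember:" prefix heuristic and the
// simplified Message shape are invented for this sketch.

struct Message {
    role: String,
    content: String,
}

/// Pull out lines the user explicitly asked the agent to remember.
fn extract_memories(messages: &[Message]) -> Vec<String> {
    messages
        .iter()
        .filter(|m| m.role == "user")
        .flat_map(|m| m.content.lines())
        .filter_map(|line| line.strip_prefix("remember: ").map(str::to_string))
        .collect()
}

fn main() {
    let turn = vec![
        Message {
            role: "user".into(),
            content: "remember: I prefer TypeScript\nAlso fix the login bug".into(),
        },
        Message { role: "assistant".into(), content: "Done.".into() },
    ];
    let memories = extract_memories(&turn);
    assert_eq!(memories, vec!["I prefer TypeScript".to_string()]);
}
```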

ContextQuery

A ContextQuery carries the query string plus optional filters: context_types, depth, max_results, max_tokens, and session_id, each settable via a builder method.
Builder methods make construction ergonomic:

let query = ContextQuery::new("authentication flow")
    .with_types(vec![ContextType::Resource])
    .with_depth(ContextDepth::Overview)
    .with_max_results(5)
    .with_max_tokens(4000)
    .with_session_id("session-123");

ContextType

The variants include Resource and Memory, matching the [resource] and [memory] labels shown in the system-prompt layout above.
ContextDepth

The variants include Abstract (summaries), Overview, and Full (detailed content).
ContextItem

Each item carries an id, a context_type, the content itself, a token_count, a relevance score (0.0–1.0), an optional source, and a metadata map, as shown in the custom provider example below.
Items are formatted as XML for injection into the system prompt via to_xml().
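The exact XML produced by to_xml is not shown here; as a hedged illustration, a plausible rendering might look like the sketch below. The tag name, attributes, and the simplified ContextItem shape are all assumptions:

```rust
// Hedged sketch of how a ContextItem might render as an XML block.
// The tag name and attributes are invented for illustration; the real
// format is whatever ContextItem::to_xml emits.

struct ContextItem {
    context_type: &'static str,
    source: Option<String>,
    content: String,
}

fn to_xml(item: &ContextItem) -> String {
    let source = item.source.as_deref().unwrap_or("unknown");
    format!(
        "<context type=\"{}\" source=\"{}\">\n{}\n</context>",
        item.context_type, source, item.content
    )
}

fn main() {
    let item = ContextItem {
        context_type: "memory",
        source: None,
        content: "User prefers TypeScript".into(),
    };
    let xml = to_xml(&item);
    assert!(xml.starts_with("<context type=\"memory\""));
    assert!(xml.contains("User prefers TypeScript"));
    assert!(xml.ends_with("</context>"));
}
```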

Built-in Provider: MemoryContextProvider

Bridges the Memory system to the context provider interface. Performs semantic search over memory items and converts them to ContextItem entries with relevance scores.

use a3s_code_core::context::MemoryContextProvider;
use a3s_code_core::memory::MemoryManager;

let memory = Arc::new(MemoryManager::new());
let provider = Arc::new(MemoryContextProvider::new(memory));

let session = agent.session("/project", Some(
    SessionOptions::new().with_context_provider(provider)
))?;

Custom Context Provider Example

use a3s_code_core::context::{ContextProvider, ContextQuery, ContextResult, ContextItem, ContextType};
use async_trait::async_trait;
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Arc;

struct DocsContextProvider {
    docs_path: PathBuf,
}

#[async_trait]
impl ContextProvider for DocsContextProvider {
    async fn query(&self, query: ContextQuery) -> Result<ContextResult> {
        // Search documentation files
        let matches = search_docs(&self.docs_path, &query.query)?;

        let items: Vec<ContextItem> = matches.into_iter()
            .map(|doc| {
                // Count tokens before `doc.content` is moved into the item.
                let token_count = doc.content.split_whitespace().count();
                ContextItem {
                    id: doc.path.to_string_lossy().to_string(),
                    context_type: ContextType::Resource,
                    content: doc.content,
                    token_count,
                    relevance: doc.score,
                    source: Some(format!("docs:{}", doc.path.display())),
                    metadata: HashMap::new(),
                }
            })
            .collect();

        Ok(ContextResult { items })
    }
}

// Register with session
let provider = Arc::new(DocsContextProvider {
    docs_path: PathBuf::from("./docs"),
});

let session = agent.session("/project", Some(
    SessionOptions::new().with_context_provider(provider)
))?;

Multiple Providers

Register multiple context providers to combine different sources:

let memory_provider = Arc::new(MemoryContextProvider::new(memory));
let docs_provider = Arc::new(DocsContextProvider { docs_path });
let code_provider = Arc::new(CodeContextProvider { repo_path });

let session = agent.session("/project", Some(
    SessionOptions::new()
        .with_context_provider(memory_provider)
        .with_context_provider(docs_provider)
        .with_context_provider(code_provider)
))?;

All providers are queried in parallel, and results are merged before injection into the system prompt.
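The merge step can be pictured with a small sketch. The simplified Item shape and the sort-then-truncate budget policy below are assumptions for illustration, not the engine's actual merge logic:

```rust
// Illustrative merge of results from several providers: flatten, sort by
// relevance (highest first), and keep items until a token budget runs out.
// The Item shape and budget policy are invented for this sketch.

#[derive(Clone)]
struct Item {
    relevance: f64,
    token_count: usize,
}

fn merge(results: Vec<Vec<Item>>, max_tokens: usize) -> Vec<Item> {
    let mut all: Vec<Item> = results.into_iter().flatten().collect();
    // Highest relevance first.
    all.sort_by(|a, b| b.relevance.partial_cmp(&a.relevance).unwrap());
    // Keep items while they still fit in the budget.
    let mut used = 0;
    all.into_iter()
        .take_while(|item| {
            used += item.token_count;
            used <= max_tokens
        })
        .collect()
}

fn main() {
    let memory = vec![Item { relevance: 0.9, token_count: 300 }];
    let docs = vec![
        Item { relevance: 0.7, token_count: 500 },
        Item { relevance: 0.95, token_count: 400 },
    ];
    let merged = merge(vec![memory, docs], 800);
    // 400 + 300 tokens fit; the 500-token item would exceed the budget.
    assert_eq!(merged.len(), 2);
    assert!((merged[0].relevance - 0.95).abs() < 1e-9);
}
```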

Best Practices

- Set token budgets — Use max_tokens to prevent context overflow
- Filter by type — Use context_types to retrieve only relevant context
- Adjust depth — Use Abstract for summaries, Full for detailed content
- Implement relevance scoring — Return items sorted by relevance (0.0–1.0)
- Use on_turn_complete — Extract and store information from conversations
- Cache results — Implement caching in your provider to reduce latency
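The caching practice can be sketched as a memoizing wrapper around the lookup. A real provider would wrap its async query() the same way; this sketch uses a synchronous closure and an invented CachedSearch type to stay self-contained:

```rust
// Illustration of the "cache results" practice: memoize lookups by query
// string so repeated queries skip the expensive search. `CachedSearch` is
// invented for this sketch, not an a3s_code_core type.

use std::collections::HashMap;

struct CachedSearch<F: Fn(&str) -> Vec<String>> {
    inner: F,
    cache: HashMap<String, Vec<String>>,
    hits: usize,
}

impl<F: Fn(&str) -> Vec<String>> CachedSearch<F> {
    fn new(inner: F) -> Self {
        Self { inner, cache: HashMap::new(), hits: 0 }
    }

    fn query(&mut self, q: &str) -> Vec<String> {
        // Serve from cache when possible.
        if let Some(cached) = self.cache.get(q) {
            self.hits += 1;
            return cached.clone();
        }
        // Otherwise run the underlying search and remember the result.
        let result = (self.inner)(q);
        self.cache.insert(q.to_string(), result.clone());
        result
    }
}

fn main() {
    let mut search = CachedSearch::new(|q: &str| vec![format!("doc for {q}")]);
    let first = search.query("auth");
    let second = search.query("auth");
    assert_eq!(first, second);
    // Only the second call hit the cache.
    assert_eq!(search.hits, 1);
}
```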

Events

Context-related events emitted during streaming:

| Event | Payload |
| --- | --- |
| ContextResolving | provider names |
| ContextResolved | total items and token count |
See Sessions for full event reference.

API Reference

SessionOptions

The options used in this guide are with_context_threshold (the compaction trigger as a fraction of the context window, default 0.8) and with_context_provider (registers a provider; call once per provider).
VectorContextConfig (Rust)
