Ripgrep Context Provider
Fast, indexless code search using ripgrep-style pattern matching
A3S Code includes a RipgrepContextProvider that searches workspace files in real time using fast regex-based pattern matching. Compared with vector-based RAG systems, this approach offers:
- No pre-indexing required — Works directly on raw source files
- Real-time updates — Reflects code changes immediately without re-indexing
- Fast search — Uses ripgrep-style pattern matching with relevance scoring
- Low overhead — No embedding API calls or vector storage needed
The ripgrep context provider is the default for code search. It requires no configuration and works out of the box.
How It Works
Core Architecture
The ripgrep context provider uses an indexless real-time search architecture, fundamentally different from vector RAG:
Traditional Vector RAG:
Source files → Chunking → Embedding API → Vector store → Similarity search → Results
↑_______________Re-index on code change_______________↑
Ripgrep approach:
Source files → Regex match → Relevance scoring → Results
↑_________Real-time search (no index)_________↑
Search Pipeline
The provider uses a five-step search pipeline:
- Pattern Extraction: Extracts keywords from the user's query
- File Walking: Traverses the workspace, respecting .gitignore
- Regex Matching: Searches file contents using fast regex
- Relevance Scoring: Ranks results by match count relative to file length (line count)
- Context Extraction: Returns matching lines with surrounding context
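The pattern-extraction and regex-matching steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, the minimum keyword length, and the helper names `extract_keywords` and `build_pattern` are hypothetical and are not taken from the provider's source.

```python
import re

# Hypothetical stop-word list; the provider's real filtering rules are not documented here.
STOP_WORDS = {"a", "an", "the", "how", "does", "do", "is", "are",
              "in", "of", "to", "what", "where", "work", "works"}

def extract_keywords(query: str) -> list[str]:
    """Split a natural-language query into unique lowercase keywords."""
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", query.lower())
    keywords: list[str] = []
    for word in words:
        # Drop stop words, very short tokens, and duplicates (order preserved).
        if word not in STOP_WORDS and len(word) > 2 and word not in keywords:
            keywords.append(word)
    return keywords

def build_pattern(keywords: list[str], case_insensitive: bool = True) -> re.Pattern:
    """Combine keywords into one alternation regex, escaping metacharacters."""
    if not keywords:
        raise ValueError("no searchable keywords in query")
    flags = re.IGNORECASE if case_insensitive else 0
    return re.compile("|".join(re.escape(k) for k in keywords), flags)

keywords = extract_keywords("How does authentication work?")
pattern = build_pattern(keywords)

lines = ["fn login() {", "    // Authentication happens here", "}"]
# Zero-based indices of the lines that contain at least one keyword.
matches = [i for i, line in enumerate(lines) if pattern.search(line)]
```

Escaping each keyword before joining keeps characters such as `.` or `(` in a query from being interpreted as regex syntax.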
Relevance Scoring Algorithm
// relevance = match_count / sqrt(total_lines)
let relevance = (matches.len() as f32) / (lines.len() as f32).sqrt();
This formula ensures:
- Files with more matches rank higher
- Smaller, focused files rank higher than large files with incidental matches
- Square root prevents over-penalizing large files
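A short worked example makes these properties concrete. The Python function below mirrors the formula shown above; the file sizes and match counts are made-up inputs chosen for easy arithmetic, and the zero-line guard is an assumption added to keep the sketch safe.

```python
import math

def relevance(match_count: int, total_lines: int) -> float:
    """relevance = match_count / sqrt(total_lines), as in the formula above."""
    if total_lines == 0:
        return 0.0  # assumption: treat an empty file as irrelevant
    return match_count / math.sqrt(total_lines)

# A small, focused file: 5 matches in 100 lines -> 5 / 10
focused = relevance(5, 100)

# A large file with more matches overall: 10 matches in 10,000 lines -> 10 / 100
sprawling = relevance(10, 10_000)

# Same match count, 100x more lines: the score drops 10x, not 100x,
# because the penalty grows with the square root of the line count.
penalty_ratio = relevance(5, 100) / relevance(5, 10_000)
```

Here the focused file scores 0.5 and outranks the sprawling file at 0.1, even though the latter has twice as many matches.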
Performance Optimizations
- Parallel search: Uses tokio::task::spawn_blocking to run in a background thread pool
- Early filtering: Checks file size and include/exclude patterns before reading file contents
- Smart skipping: Automatically skips binary files, empty files, and excluded directories
- Incremental results: Supports max_results and max_tokens limits to avoid over-searching
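The early-filtering and smart-skipping ideas can be illustrated with a small Python sketch. It assumes a NUL-byte sniff as the binary check and uses a hypothetical size limit; the provider's actual heuristics and defaults may differ.

```python
import os
import tempfile

# Hypothetical limit for illustration; the provider's actual default is not stated here.
MAX_FILE_SIZE = 1024 * 1024

def should_search(path: str, max_file_size: int = MAX_FILE_SIZE) -> bool:
    """Return True if a file is worth opening for a regex search."""
    size = os.path.getsize(path)  # metadata lookup only, no content read
    if size == 0 or size > max_file_size:
        return False
    # Sniff the first few KB: a NUL byte is a common heuristic for binary data.
    with open(path, "rb") as f:
        head = f.read(8192)
    return b"\0" not in head

# Demonstration on throwaway files with a deliberately tiny 1 KB limit.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "main.rs": b"fn main() {}\n",
        "empty.rs": b"",
        "image.png": b"\x89PNG\r\n\x1a\n\0\0\0\rIHDR",
        "huge.json": b"x" * 2048,
    }
    decisions = {}
    for name, data in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(data)
        decisions[name] = should_search(path, max_file_size=1024)
```

Only `main.rs` passes: the other three are rejected for being empty, binary, or over the size limit, and two of those rejections happen without reading any file contents.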
Quick Start
FileSystem Context (Simple Keyword Matching)
The simplest context provider — injects relevant file contents based on keyword matching.
TypeScript:
const session = agent.session('/my-project', {
  permissive: true,
  fsContext: '/my-project',
});
const result = await session.send('How does authentication work?');
console.log(result.text);

Python:
session = agent.session("/my-project",
    permissive=True,
    fs_context="/my-project",
)
result = session.send("How does authentication work?")
print(result.text)

Ripgrep Context Provider (Fast Pattern Matching)
Rust:
use a3s_code_core::context::{RipgrepContextProvider, RipgrepContextConfig};
use std::sync::Arc;

let config = RipgrepContextConfig::new("/my-project")
    .with_case_insensitive(true)
    .with_context_lines(3)
    .with_include_patterns(vec![
        "**/*.rs".to_string(),
        "**/*.md".to_string(),
    ]);
let provider = RipgrepContextProvider::new(config);

// Register with session
let session = agent.session("/my-project")
    .with_context_provider(Arc::new(provider))
    .build()?;

TypeScript:
import { Agent, RipgrepContextProvider } from '@a3s-lab/code';

const provider = new RipgrepContextProvider({
  rootPath: '/my-project',
  includePatterns: ['**/*.ts', '**/*.tsx', '**/*.md'],
  excludePatterns: ['**/node_modules/**', '**/dist/**'],
  caseInsensitive: true,
  contextLines: 3,
});
const session = agent.session('/my-project', {
  contextProviders: [provider],
});

Python:
from a3s_code import Agent, RipgrepContextProvider

provider = RipgrepContextProvider(
    root_path="/my-project",
    include_patterns=["**/*.py", "**/*.md"],
    exclude_patterns=["**/venv/**", "**/__pycache__/**"],
    case_insensitive=True,
    context_lines=3,
)
session = agent.session("/my-project",
    context_providers=[provider],
)

Configuration
Default Include Patterns
vec![
    "**/*.rs",   // Rust
    "**/*.py",   // Python
    "**/*.ts",   // TypeScript
    "**/*.tsx",  // TypeScript React
    "**/*.js",   // JavaScript
    "**/*.jsx",  // JavaScript React
    "**/*.go",   // Go
    "**/*.java", // Java
    "**/*.c",    // C
    "**/*.cpp",  // C++
    "**/*.h",    // C/C++ headers
    "**/*.hpp",  // C++ headers
    "**/*.md",   // Markdown
    "**/*.toml", // TOML config
    "**/*.yaml", // YAML config
    "**/*.yml",  // YAML config
    "**/*.json", // JSON
]
Default Exclude Patterns
vec![
    "**/target/**",       // Rust build output
    "**/node_modules/**", // Node.js dependencies
    "**/.git/**",         // Git metadata
    "**/dist/**",         // Build output
    "**/build/**",        // Build output
    "**/*.lock",          // Lock files
    "**/vendor/**",       // Third-party dependencies
    "**/__pycache__/**",  // Python cache
]
API Reference
RipgrepContextConfig
Examples
Example 1: Search for Authentication Code
use a3s_code_core::context::{
    ContextProvider, ContextQuery, RipgrepContextConfig, RipgrepContextProvider,
};

let provider = RipgrepContextProvider::new(
    RipgrepContextConfig::new("/my-project")
);
let query = ContextQuery::new("authentication")
    .with_max_results(5)
    .with_max_tokens(2000);
let result = provider.query(&query).await?;
for item in result.items {
    println!("File: {}", item.title);
    println!("Relevance: {:.2}", item.relevance.unwrap_or(0.0));
    println!("Matches:\n{}\n", item.content);
}
Example 2: Rust-Only Search
let config = RipgrepContextConfig::new("/my-project")
    .with_include_patterns(vec!["**/*.rs".to_string()])
    .with_exclude_patterns(vec![
        "**/target/**".to_string(),
        "**/tests/**".to_string(),
    ]);
let provider = RipgrepContextProvider::new(config);
Example 3: Large Codebase Optimization
let config = RipgrepContextConfig::new("/large-project")
    .with_max_file_size(512 * 1024) // Limit to 512 KB
    .with_context_lines(1)          // Reduce context
    .with_exclude_patterns(vec![
        "**/vendor/**".to_string(),
        "**/third_party/**".to_string(),
        "**/*.min.js".to_string(),  // Skip minified files
    ]);
Example 4: Multi-Depth Queries
use a3s_code_core::context::ContextDepth;

// Abstract: only file path and match count
let query = ContextQuery::new("database")
    .with_depth(ContextDepth::Abstract);

// Overview: first 3 matches per file
let query = ContextQuery::new("database")
    .with_depth(ContextDepth::Overview);

// Full: all matches with complete context
let query = ContextQuery::new("database")
    .with_depth(ContextDepth::Full);
Comparison: Ripgrep vs Vector RAG
| Feature | Ripgrep Provider | Vector RAG (Removed) |
|---|---|---|
| Indexing | None required | Pre-indexing required |
| Real-time updates | ✅ Instant | ❌ Requires re-indexing |
| Setup complexity | ✅ Zero config | ❌ Embedding API + vector DB |
| Search speed | ✅ Fast (< 1s) | ⚠️ Depends on index size |
| Semantic search | ❌ Keyword-based | ✅ Meaning-based |
| Cost | ✅ Free | ❌ Embedding API costs |
| Exact match | ✅ Perfect | ⚠️ May miss exact terms |
| Fuzzy match | ❌ Requires regex | ✅ Handles synonyms |
Best Practices
1. Adjust Include Patterns
Limit search to relevant file types for better performance:
let config = RipgrepContextConfig::new("/my-project")
    .with_include_patterns(vec![
        "**/*.rs".to_string(),
        "**/*.toml".to_string(),
        "**/README.md".to_string(),
    ]);
2. Use Exclude Patterns
Skip build artifacts and dependencies:
let config = RipgrepContextConfig::new("/my-project")
    .with_exclude_patterns(vec![
        "**/target/**".to_string(),
        "**/node_modules/**".to_string(),
        "**/.git/**".to_string(),
        "**/dist/**".to_string(),
        "**/*.min.js".to_string(),
        "**/*.map".to_string(),
    ]);
3. Set Context Lines
Balance detail vs token usage:
.with_context_lines(1) // Minimal: saves tokens
.with_context_lines(2) // Standard: balanced
.with_context_lines(5) // Rich: more detail
4. Monitor File Size Limits
Prevent searching large generated files:
.with_max_file_size(512 * 1024) // 512 KB
5. Combine with Memory
Pair ripgrep with MemoryContextProvider for conversation history:
let session = agent.session("/my-project")
    .with_context_provider(Arc::new(ripgrep_provider))
    .with_context_provider(Arc::new(memory_provider))
    .build()?;
Performance Benchmarks
On a typical mid-size Rust project (~50k lines):
| Operation | Time | Notes |
|---|---|---|
| Single keyword search | ~100ms | Searching "authentication" |
| Multi-keyword search | ~150ms | Searching "user login session" |
| Large codebase (500k lines) | ~800ms | With proper exclude patterns |
| Vector RAG indexing | ~30s | Initial index time (comparison) |
| Vector RAG query | ~200ms | Query time (comparison) |
Performance depends on codebase size, file count, and disk speed. SSD and proper exclude patterns give the best results.
Troubleshooting
Search is too slow
- Add more exclude patterns to skip irrelevant directories
- Reduce max_file_size to skip large files
- Use more specific include patterns
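To see which files a given set of include and exclude patterns would leave in the search set, a rough Python sketch such as the one below can help. The glob translation here is a simplified assumption (`**/` matches any number of directories, `*` stays within one path segment); the provider's real glob engine may treat edge cases differently, and the helper names are hypothetical.

```python
import re

def glob_to_regex(glob: str) -> re.Pattern:
    """Translate a simplified glob into a regex (assumed semantics, see lead-in)."""
    parts = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            parts.append("(?:.*/)?")   # zero or more leading directories
            i += 3
        elif glob.startswith("**", i):
            parts.append(".*")         # anything, including slashes
            i += 2
        elif glob[i] == "*":
            parts.append("[^/]*")      # anything within one path segment
            i += 1
        elif glob[i] == "?":
            parts.append("[^/]")       # one character within a segment
            i += 1
        else:
            parts.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(parts))

def is_searchable(path: str, include: list[str], exclude: list[str]) -> bool:
    """A path is searched if some include glob matches and no exclude glob does."""
    included = any(glob_to_regex(g).fullmatch(path) for g in include)
    excluded = any(glob_to_regex(g).fullmatch(path) for g in exclude)
    return included and not excluded

files = ["src/main.rs", "target/debug/build.rs", "web/app.min.js", "README.md"]
searched = [
    f for f in files
    if is_searchable(f, include=["**/*.rs", "**/*.md"], exclude=["**/target/**"])
]
```

In this example only `src/main.rs` and `README.md` remain, so the build output under `target/` and the minified bundle never reach the regex stage.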
Expected results not found
- Check whether the files are matched by an exclude pattern
- Verify the file size is within the max_file_size limit
- Try case_insensitive: true
- Check the spelling of the search term (matching is keyword-based, so misspellings and synonyms will not match)
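The effect of case sensitivity and keyword-based matching is easy to reproduce in isolation. The Python snippet below is a stand-alone illustration, not the provider's code; `line_matches` is a hypothetical helper.

```python
import re

def line_matches(term: str, line: str, case_insensitive: bool = False) -> bool:
    """Return True if the (escaped) term occurs in the line."""
    flags = re.IGNORECASE if case_insensitive else 0
    return re.search(re.escape(term), line, flags) is not None

line = "pub struct AuthService { /* ... */ }"

exact = line_matches("authservice", line)                           # False: case differs
relaxed = line_matches("authservice", line, case_insensitive=True)  # True
synonym = line_matches("login", line, case_insensitive=True)        # False: no shared text
```

If a query returns nothing, trying the identifier as it is spelled in the code, or enabling case-insensitive matching, is usually the quickest check.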
Token usage too high
- Reduce context_lines
- Lower max_results
- Use ContextDepth::Abstract or Overview
- Set a stricter max_tokens limit
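To get a feel for how context_lines drives output size, the following Python sketch extracts line windows around each match and merges overlapping ones. It is an approximation under stated assumptions: the merging behavior and the four-characters-per-token estimate are illustrative guesses, not the provider's documented logic.

```python
def context_ranges(total_lines: int, match_indices: list[int],
                   context_lines: int) -> list[tuple[int, int]]:
    """Return merged (start, end) line ranges around matches; end is exclusive."""
    ranges: list[tuple[int, int]] = []
    for idx in sorted(match_indices):
        start = max(0, idx - context_lines)
        end = min(total_lines, idx + context_lines + 1)
        if ranges and start <= ranges[-1][1]:
            # Overlaps or touches the previous window: extend it instead.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges

def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about four characters per token."""
    return (len(text) + 3) // 4

lines = [f"line {i}" for i in range(20)]
matches = [5, 12]  # zero-based indices of matching lines

minimal = context_ranges(len(lines), matches, context_lines=1)  # [(4, 7), (11, 14)]
rich = context_ranges(len(lines), matches, context_lines=5)     # [(0, 18)]

def render(ranges: list[tuple[int, int]]) -> str:
    return "\n".join("\n".join(lines[s:e]) for s, e in ranges)

minimal_tokens = estimate_tokens(render(minimal))
rich_tokens = estimate_tokens(render(rich))
```

With two matches in a 20-line file, one line of context returns 6 lines while five lines of context returns 18, which is why lowering context_lines is often the first thing to try when token usage is too high.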
Related
- Context Providers — Context system overview
- Memory — Persistent memory storage
- Quick Start — Basic usage examples