Ripgrep Context Provider

Fast, indexless code search using ripgrep-style pattern matching

A3S Code includes a RipgrepContextProvider that searches workspace files in real time using fast regex-based pattern matching. Unlike vector-based RAG systems, this approach has the following properties:

  • No pre-indexing required: works directly on raw source files
  • Real-time updates: reflects code changes immediately without re-indexing
  • Fast search: uses ripgrep-style pattern matching with relevance scoring
  • Low overhead: needs no embedding API calls or vector storage

The ripgrep context provider is the default for code search. It requires no configuration and works out of the box.

How It Works

Core Architecture

The ripgrep context provider uses an indexless real-time search architecture, fundamentally different from vector RAG:

Traditional Vector RAG:
Source files → Chunking → Embedding API → Vector store → Similarity search → Results
              ↑_______________Re-index on code change_______________↑

Ripgrep approach:
Source files → Regex match → Relevance scoring → Results
              ↑_________Real-time search (no index)_________↑

Search Pipeline

The provider uses a five-step search pipeline:

  1. Pattern Extraction: extracts keywords from the user's query
  2. File Walking: traverses the workspace, respecting .gitignore
  3. Regex Matching: searches file contents using fast regex
  4. Relevance Scoring: ranks results by match count relative to file length in lines
  5. Context Extraction: returns matching lines with surrounding context
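
Steps 1 and 3 can be illustrated with a minimal Python sketch that turns a natural-language query into one regex. The stop-word list, the tokenization rule, and the names `extract_keywords` and `build_pattern` are all assumptions made for this example; the provider's actual extraction logic is not described on this page.

```python
import re

# Hypothetical stop-word list; the provider's real list is not documented here.
STOP_WORDS = {"how", "does", "do", "the", "a", "an", "is", "in", "of", "to", "work"}

def extract_keywords(query: str) -> list[str]:
    """Split a query into lowercase identifier-like tokens and drop stop words."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", query.lower())
    seen: set[str] = set()
    keywords: list[str] = []
    for tok in tokens:
        if tok not in STOP_WORDS and tok not in seen:
            seen.add(tok)
            keywords.append(tok)
    return keywords

def build_pattern(keywords: list[str], case_insensitive: bool = True) -> re.Pattern:
    """Combine keywords into one alternation; re.escape keeps each keyword literal."""
    if not keywords:
        raise ValueError("query contains no searchable keywords")
    alternation = "|".join(re.escape(k) for k in keywords)
    flags = re.IGNORECASE if case_insensitive else 0
    return re.compile(alternation, flags)

keywords = extract_keywords("How does authentication work?")
pattern = build_pattern(keywords)
print(keywords)
```

Compiling all keywords into a single alternation means each line is scanned once regardless of how many keywords the query contains.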

Relevance Scoring Algorithm

// relevance = match_count / sqrt(total_lines)
let relevance = (matches.len() as f32) / (lines.len() as f32).sqrt();

This formula ensures:

  • Files with more matches rank higher
  • Smaller, focused files rank higher than large files with incidental matches
  • Square root prevents over-penalizing large files
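
To make the effect of the square root concrete, here is a small worked example in Python. The file names and counts are invented for illustration, and the zero-line guard is an assumption of this sketch rather than documented provider behavior.

```python
import math

def relevance(match_count: int, total_lines: int) -> float:
    """relevance = match_count / sqrt(total_lines), mirroring the formula above."""
    if total_lines <= 0:
        return 0.0  # assumption: empty files score zero instead of dividing by zero
    return match_count / math.sqrt(total_lines)

# Hypothetical files: (path, match_count, total_lines)
files = [
    ("src/generated.rs", 4, 10_000),
    ("src/main.rs", 6, 400),
    ("src/auth.rs", 4, 100),
]

ranked = sorted(files, key=lambda f: relevance(f[1], f[2]), reverse=True)
for path, matches, lines in ranked:
    print(f"{path}: {relevance(matches, lines):.2f}")
# src/auth.rs: 0.40
# src/main.rs: 0.30
# src/generated.rs: 0.04
```

With the same four matches, the 10,000-line file scores 10 times lower than the 100-line file. Dividing by the raw line count instead would make it 100 times lower, which is the over-penalization the square root avoids.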

Performance Optimizations

  1. Parallel search: uses tokio::task::spawn_blocking to run the search on a background thread pool, off the async executor
  2. Early filtering: checks file size and include/exclude patterns before reading file contents
  3. Smart skipping: automatically skips binary files, empty files, and excluded directories
  4. Incremental results: supports max_results and max_tokens limits to avoid over-searching
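
The early-filtering and smart-skipping ideas can be sketched in Python as a cheap check that runs before a file is fully read. The NUL-byte heuristic for detecting binary files, the 8 KiB sniff size, and the function names are assumptions of this sketch; the page does not say how the provider detects binary files.

```python
import os

MAX_FILE_SIZE = 512 * 1024  # bytes; the example limit used elsewhere on this page

def looks_binary(path: str, sniff_bytes: int = 8192) -> bool:
    """Heuristic (assumption): a NUL byte in the first chunk marks the file as binary."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(sniff_bytes)

def should_search(path: str, max_file_size: int = MAX_FILE_SIZE) -> bool:
    """Decide from metadata and a small prefix whether the file is worth reading."""
    size = os.path.getsize(path)
    if size == 0 or size > max_file_size:
        return False  # empty or oversized: skip without reading the contents
    return not looks_binary(path)
```

The size check uses file metadata only, so oversized files cost a single stat call and their contents are never read.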

Quick Start

FileSystem Context (Simple Keyword Matching)

The simplest context provider: it injects relevant file contents based on keyword matching.

TypeScript:

const session = agent.session('/my-project', {
  permissive: true,
  fsContext: '/my-project',
});
const result = await session.send('How does authentication work?');
console.log(result.text);

Python:

session = agent.session("/my-project",
    permissive=True,
    fs_context="/my-project",
)
result = session.send("How does authentication work?")
print(result.text)

Ripgrep Context Provider (Fast Pattern Matching)

Rust:

use a3s_code_core::context::{RipgrepContextProvider, RipgrepContextConfig};
use std::sync::Arc;

let config = RipgrepContextConfig::new("/my-project")
    .with_case_insensitive(true)
    .with_context_lines(3)
    .with_include_patterns(vec![
        "**/*.rs".to_string(),
        "**/*.md".to_string()
    ]);

let provider = RipgrepContextProvider::new(config);

// Register with session
let session = agent.session("/my-project")
    .with_context_provider(Arc::new(provider))
    .build()?;

TypeScript:

import { Agent, RipgrepContextProvider } from '@a3s-lab/code';

const provider = new RipgrepContextProvider({
  rootPath: '/my-project',
  includePatterns: ['**/*.ts', '**/*.tsx', '**/*.md'],
  excludePatterns: ['**/node_modules/**', '**/dist/**'],
  caseInsensitive: true,
  contextLines: 3,
});

const session = agent.session('/my-project', {
  contextProviders: [provider],
});

Python:

from a3s_code import Agent, RipgrepContextProvider

provider = RipgrepContextProvider(
    root_path="/my-project",
    include_patterns=["**/*.py", "**/*.md"],
    exclude_patterns=["**/venv/**", "**/__pycache__/**"],
    case_insensitive=True,
    context_lines=3,
)

session = agent.session("/my-project",
    context_providers=[provider],
)

Configuration

Default Include Patterns

vec![
    "**/*.rs",      // Rust
    "**/*.py",      // Python
    "**/*.ts",      // TypeScript
    "**/*.tsx",     // TypeScript React
    "**/*.js",      // JavaScript
    "**/*.jsx",     // JavaScript React
    "**/*.go",      // Go
    "**/*.java",    // Java
    "**/*.c",       // C
    "**/*.cpp",     // C++
    "**/*.h",       // C/C++ headers
    "**/*.hpp",     // C++ headers
    "**/*.md",      // Markdown
    "**/*.toml",    // TOML config
    "**/*.yaml",    // YAML config
    "**/*.yml",     // YAML config
    "**/*.json",    // JSON
]

Default Exclude Patterns

vec![
    "**/target/**",         // Rust build output
    "**/node_modules/**",   // Node.js dependencies
    "**/.git/**",           // Git metadata
    "**/dist/**",           // Build output
    "**/build/**",          // Build output
    "**/*.lock",            // Lock files
    "**/vendor/**",         // Third-party dependencies
    "**/__pycache__/**",    // Python cache
]
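
As a rough illustration of how include and exclude patterns combine, the following Python sketch approximates the glob check with the standard `fnmatch` module. It only handles patterns that start with `**/`, like the defaults listed above; the provider's real glob engine may follow different rules, and `matches_any` and `is_searchable` are hypothetical names.

```python
from fnmatch import fnmatchcase

def matches_any(rel_path: str, patterns: list[str]) -> bool:
    """True if the workspace-relative path matches at least one glob pattern."""
    # In fnmatch, "*" also matches "/", so "**/" acts as "any directory prefix".
    # Prefixing "/" lets "**/x" patterns match files at the workspace root too.
    candidate = "/" + rel_path.replace("\\", "/")
    return any(fnmatchcase(candidate, p) for p in patterns)

def is_searchable(rel_path: str, include: list[str], exclude: list[str]) -> bool:
    """A file is searched when it matches an include pattern and no exclude pattern."""
    return matches_any(rel_path, include) and not matches_any(rel_path, exclude)

include = ["**/*.rs", "**/*.md"]
exclude = ["**/target/**", "**/*.lock"]
print(is_searchable("src/main.rs", include, exclude))
print(is_searchable("target/debug/build.rs", include, exclude))
```

In this sketch exclusion always wins over inclusion, so a `.rs` file under `target/` is skipped even though it matches `**/*.rs`.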

API Reference

RipgrepContextConfig

Examples

Example 1: Search for Authentication Code

use a3s_code_core::context::{
    ContextProvider, ContextQuery, RipgrepContextConfig, RipgrepContextProvider,
};

let provider = RipgrepContextProvider::new(
    RipgrepContextConfig::new("/my-project")
);

let query = ContextQuery::new("authentication")
    .with_max_results(5)
    .with_max_tokens(2000);

let result = provider.query(&query).await?;

for item in result.items {
    println!("File: {}", item.title);
    println!("Relevance: {:.2}", item.relevance.unwrap_or(0.0));
    println!("Matches:\n{}\n", item.content);
}

Example 2: Custom Include and Exclude Patterns

let config = RipgrepContextConfig::new("/my-project")
    .with_include_patterns(vec!["**/*.rs".to_string()])
    .with_exclude_patterns(vec![
        "**/target/**".to_string(),
        "**/tests/**".to_string(),
    ]);

let provider = RipgrepContextProvider::new(config);

Example 3: Large Codebase Optimization

let config = RipgrepContextConfig::new("/large-project")
    .with_max_file_size(512 * 1024)  // Limit to 512KB
    .with_context_lines(1)            // Reduce context
    .with_exclude_patterns(vec![
        "**/vendor/**".to_string(),
        "**/third_party/**".to_string(),
        "**/*.min.js".to_string(),    // Skip minified files
    ]);

Example 4: Multi-Depth Queries

use a3s_code_core::context::ContextDepth;

// Abstract — only file path and match count
let query = ContextQuery::new("database")
    .with_depth(ContextDepth::Abstract);

// Overview — first 3 matches per file
let query = ContextQuery::new("database")
    .with_depth(ContextDepth::Overview);

// Full — all matches with complete context
let query = ContextQuery::new("database")
    .with_depth(ContextDepth::Full);
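
To show how the three depths trade detail for size, here is a minimal Python sketch that renders one file's matches at each depth. The `Depth` enum is a stand-in for ContextDepth, and the output layout is invented for illustration; only the per-depth behavior (path and count, first 3 matches, all matches) comes from the comments above.

```python
from enum import Enum

class Depth(Enum):
    """Hypothetical Python mirror of ContextDepth."""
    ABSTRACT = "abstract"
    OVERVIEW = "overview"
    FULL = "full"

def render(path: str, matches: list[str], depth: Depth) -> str:
    """Format one file's matches according to the requested depth."""
    if depth is Depth.ABSTRACT:
        return f"{path} ({len(matches)} matches)"  # path and match count only
    shown = matches[:3] if depth is Depth.OVERVIEW else matches
    return "\n".join([path, *shown])

matches = [f"{n}: db.query(...)" for n in (10, 42, 77, 130, 201)]
for depth in Depth:
    print(render("src/db.rs", matches, depth))
    print("---")
```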

Comparison: Ripgrep vs Vector RAG

| Feature | Ripgrep Provider | Vector RAG (Removed) |
|---|---|---|
| Indexing | None required | Pre-indexing required |
| Real-time updates | ✅ Instant | ❌ Requires re-indexing |
| Setup complexity | ✅ Zero config | ❌ Embedding API + vector DB |
| Search speed | ✅ Fast (< 1s) | ⚠️ Depends on index size |
| Semantic search | ❌ Keyword-based | ✅ Meaning-based |
| Cost | ✅ Free | ❌ Embedding API costs |
| Exact match | ✅ Perfect | ⚠️ May miss exact terms |
| Fuzzy match | ❌ Requires regex | ✅ Handles synonyms |

Best Practices

1. Adjust Include Patterns

Limit search to relevant file types for better performance:

let config = RipgrepContextConfig::new("/my-project")
    .with_include_patterns(vec![
        "**/*.rs".to_string(),
        "**/*.toml".to_string(),
        "**/README.md".to_string(),
    ]);

2. Use Exclude Patterns

Skip build artifacts and dependencies:

let config = RipgrepContextConfig::new("/my-project")
    .with_exclude_patterns(vec![
        "**/target/**".to_string(),
        "**/node_modules/**".to_string(),
        "**/.git/**".to_string(),
        "**/dist/**".to_string(),
        "**/*.min.js".to_string(),
        "**/*.map".to_string(),
    ]);

3. Set Context Lines

Balance detail vs token usage:

.with_context_lines(1)  // Minimal — saves tokens
.with_context_lines(2)  // Standard — balanced
.with_context_lines(5)  // Rich — more detail
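
The token cost of each setting can be estimated with a short Python sketch that computes the line ranges returned around each match. Merging overlapping or adjacent windows is an assumption of this sketch; the page does not say whether the provider merges them.

```python
import re

def extract_context(lines: list[str], pattern: re.Pattern, context_lines: int) -> list[tuple[int, int]]:
    """Return merged (start, end) line ranges, 0-based and inclusive, around each match."""
    ranges: list[tuple[int, int]] = []
    for i, line in enumerate(lines):
        if not pattern.search(line):
            continue
        start = max(0, i - context_lines)
        end = min(len(lines) - 1, i + context_lines)
        if ranges and start <= ranges[-1][1] + 1:
            # Overlapping or adjacent windows are merged into one range.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges

# A 20-line file with matches on lines 5, 7 and 15 (0-based).
lines = [f"line {i}" for i in range(20)]
lines[5], lines[7], lines[15] = "fn login()", "login(user)", "// login flow"
pattern = re.compile("login")

for n in (1, 5):
    ranges = extract_context(lines, pattern, n)
    total = sum(end - start + 1 for start, end in ranges)
    print(n, ranges, total)
# 1 [(4, 8), (14, 16)] 8
# 5 [(0, 19)] 20
```

In this example, going from 1 to 5 context lines raises the returned text from 8 lines to the entire 20-line file, which is why a small value suits token-constrained sessions.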

4. Monitor File Size Limits

Prevent searching large generated files:

.with_max_file_size(512 * 1024)  // 512KB

5. Combine with Memory

Pair ripgrep with MemoryContextProvider for conversation history:

let session = agent.session("/my-project")
    .with_context_provider(Arc::new(ripgrep_provider))
    .with_context_provider(Arc::new(memory_provider))
    .build()?;

Performance Benchmarks

On a typical mid-size Rust project (~50k lines):

| Operation | Time | Notes |
|---|---|---|
| Single keyword search | ~100ms | Searching "authentication" |
| Multi-keyword search | ~150ms | Searching "user login session" |
| Large codebase (500k lines) | ~800ms | With proper exclude patterns |
| Vector RAG indexing | ~30s | Initial index time (comparison) |
| Vector RAG query | ~200ms | Query time (comparison) |

Performance depends on codebase size, file count, and disk speed. SSD and proper exclude patterns give the best results.

Troubleshooting

Search is too slow

  1. Add more exclude patterns to skip irrelevant directories
  2. Reduce max_file_size to skip large files
  3. Use more specific include patterns

Expected results not found

  1. Check if files are matched by exclude patterns
  2. Verify file size is within max_file_size limit
  3. Try case_insensitive: true
  4. Check search term spelling (matching is keyword-based, so misspellings and synonyms are not found)

Token usage too high

  1. Reduce context_lines
  2. Lower max_results
  3. Use ContextDepth::Abstract or Overview
  4. Set stricter max_tokens limit
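
The interaction between max_results and max_tokens can be pictured with a minimal Python sketch. The 4-characters-per-token estimate, the stop-at-first-overflow policy, and the `apply_limits` name are assumptions of this sketch; the dictionary keys mirror the title, relevance, and content fields printed in Example 1.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)

def apply_limits(items: list[dict], max_results: int, max_tokens: int) -> list[dict]:
    """Keep the highest-relevance items until either limit would be exceeded."""
    kept: list[dict] = []
    used = 0
    for item in sorted(items, key=lambda it: it["relevance"], reverse=True):
        if len(kept) >= max_results:
            break
        cost = estimate_tokens(item["content"])
        if used + cost > max_tokens:
            break  # stop at the first item that does not fit the budget
        kept.append(item)
        used += cost
    return kept

items = [
    {"title": "src/auth.rs", "relevance": 0.9, "content": "x" * 400},     # ~100 tokens
    {"title": "src/session.rs", "relevance": 0.5, "content": "x" * 800},  # ~200 tokens
    {"title": "README.md", "relevance": 0.1, "content": "x" * 40},        # ~10 tokens
]
print([it["title"] for it in apply_limits(items, max_results=5, max_tokens=250)])
# ['src/auth.rs']
```

Whichever limit is hit first ends the selection, so lowering either one reduces the amount of context injected into the prompt.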
