Headless Browser
Chrome DevTools Protocol integration for JavaScript-rendering engines
Headless Browser
Engines like Google, Baidu, and Bing China require JavaScript rendering. A3S Search provides a BrowserPool that manages a shared headless browser instance (Lightpanda or Chrome) with tab concurrency control.
Browser Backends
A3S Search supports two headless browser backends:
- Lightpanda (default) — Lightweight browser (59MB, <100ms startup), Linux/macOS only
- Chrome (fallback) — Chrome for Testing (200MB, 1-2s startup), all platforms
The SDK automatically selects Lightpanda on supported platforms and falls back to Chrome on Windows or if Lightpanda is unavailable.
Feature Gate
Headless browser support is enabled by default in SDKs, opt-in in Rust library:
# Rust library (opt-in for smaller binary)
[dependencies]
a3s-search = "0.9" # HTTP engines only
# With Chrome backend
[dependencies]
a3s-search = { version = "0.9", features = ["chromium"] }
# With Lightpanda backend (Linux/macOS only)
[dependencies]
a3s-search = { version = "0.9", features = ["lightpanda"] }
# With both backends
[dependencies]
a3s-search = { version = "0.9", features = ["all-headless"] }Python/Node.js SDKs: Both Chrome and Lightpanda backends are enabled by default.
BrowserPool
BrowserPool manages a single browser process (Lightpanda or Chrome) with a tab semaphore for concurrency control:
use a3s_search::browser::{BrowserPool, BrowserPoolConfig, BrowserBackend};
use std::sync::Arc;
let pool = Arc::new(BrowserPool::new(BrowserPoolConfig {
backend: BrowserBackend::Lightpanda, // or BrowserBackend::Chrome
max_tabs: 4,
headless: true,
chrome_path: None,
lightpanda_path: None,
proxy_url: None,
launch_args: vec![],
}));Configuration
Prop
Type
Lifecycle
The browser is lazily initialized on the first acquire_browser() call:
let browser = pool.acquire_browser().await?;
pool.shutdown().await;BrowserFetcher
BrowserFetcher implements the PageFetcher trait using BrowserPool:
use a3s_search::browser::BrowserFetcher;
use a3s_search::WaitStrategy;
let fetcher = Arc::new(
BrowserFetcher::new(pool.clone())
.with_wait(WaitStrategy::Selector {
css: "div.g".into(),
timeout_ms: 5000,
})
.with_user_agent("Mozilla/5.0 ...")
);Wait Strategies
Control when a page is considered "loaded":
Prop
Type
Each built-in headless engine uses an appropriate strategy:
Prop
Type
Browser Auto-Detection
When chrome_path or lightpanda_path is None, A3S Search automatically detects or downloads the browser:
Lightpanda (Default)
LIGHTPANDAenvironment variable- System PATH (
lightpandacommand) - Cache:
~/.a3s/lightpanda/<tag>/lightpanda - Auto-download from GitHub releases (59MB)
Supported platforms: macOS (arm64, x64), Linux (x64, arm64). First run downloads ~59MB, startup <100ms.
Chrome (Fallback)
CHROMEenvironment variable- System PATH (
google-chrome,chromium,chrome, etc.) - Well-known install paths (macOS
/Applications/..., Linux/usr/bin/..., WindowsC:\Program Files\...) - Cache:
~/.a3s/chromium/<version>/ - Auto-download Chrome for Testing from Google CDN (200MB)
Supported platforms: macOS (arm64, x64), Linux (x64), Windows (x64, x86). First run downloads ~150MB, startup 1-2s.
Using Headless Engines
use a3s_search::{Search, SearchQuery};
use a3s_search::browser::{BrowserPool, BrowserPoolConfig, BrowserFetcher};
use a3s_search::engines::{Google, Baidu, BingChina, DuckDuckGo};
use a3s_search::WaitStrategy;
use std::sync::Arc;
let pool = Arc::new(BrowserPool::new(BrowserPoolConfig {
max_tabs: 4,
headless: true,
chrome_path: None,
proxy_url: None,
launch_args: vec![],
}));
let google_fetcher = Arc::new(
BrowserFetcher::new(pool.clone())
.with_wait(WaitStrategy::Selector {
css: "div.g".into(),
timeout_ms: 5000,
})
);
let baidu_fetcher = Arc::new(
BrowserFetcher::new(pool.clone())
.with_wait(WaitStrategy::Selector {
css: "div.c-container".into(),
timeout_ms: 5000,
})
);
let mut search = Search::new();
search.add_engine(DuckDuckGo::new());
search.add_engine(Google::new(google_fetcher));
search.add_engine(Baidu::new(baidu_fetcher));
let results = search.search(SearchQuery::new("rust programming")).await?;
pool.shutdown().await;PageFetcher Trait
All fetchers implement this trait:
#[async_trait]
pub trait PageFetcher: Send + Sync {
async fn fetch(&self, url: &str) -> Result<String>;
}Prop
Type