A3S Docs
A3S Search

Headless Browser

Chrome DevTools Protocol integration for JavaScript-rendering engines

Headless Browser

Engines like Google, Baidu, and Bing China require JavaScript rendering. A3S Search provides a BrowserPool that manages a shared headless browser instance (Lightpanda or Chrome) with tab concurrency control.

Browser Backends

A3S Search supports two headless browser backends:

  • Lightpanda (default) — Lightweight browser (59MB, <100ms startup), Linux/macOS only
  • Chrome (fallback) — Chrome for Testing (200MB, 1-2s startup), all platforms

The SDK automatically selects Lightpanda on supported platforms and falls back to Chrome on Windows or if Lightpanda is unavailable.

Feature Gate

Headless browser support is enabled by default in SDKs, opt-in in Rust library:

# Rust library (opt-in for smaller binary)
[dependencies]
a3s-search = "0.9"  # HTTP engines only

# With Chrome backend
[dependencies]
a3s-search = { version = "0.9", features = ["chromium"] }

# With Lightpanda backend (Linux/macOS only)
[dependencies]
a3s-search = { version = "0.9", features = ["lightpanda"] }

# With both backends
[dependencies]
a3s-search = { version = "0.9", features = ["all-headless"] }

Python/Node.js SDKs: Both Chrome and Lightpanda backends are enabled by default.

BrowserPool

BrowserPool manages a single browser process (Lightpanda or Chrome) with a tab semaphore for concurrency control:

use a3s_search::browser::{BrowserPool, BrowserPoolConfig, BrowserBackend};
use std::sync::Arc;

let pool = Arc::new(BrowserPool::new(BrowserPoolConfig {
    backend: BrowserBackend::Lightpanda,  // or BrowserBackend::Chrome
    max_tabs: 4,
    headless: true,
    chrome_path: None,
    lightpanda_path: None,
    proxy_url: None,
    launch_args: vec![],
}));

Configuration

Prop

Type

Lifecycle

The browser is lazily initialized on the first acquire_browser() call:

let browser = pool.acquire_browser().await?;
pool.shutdown().await;

BrowserFetcher

BrowserFetcher implements the PageFetcher trait using BrowserPool:

use a3s_search::browser::BrowserFetcher;
use a3s_search::WaitStrategy;

let fetcher = Arc::new(
    BrowserFetcher::new(pool.clone())
        .with_wait(WaitStrategy::Selector {
            css: "div.g".into(),
            timeout_ms: 5000,
        })
        .with_user_agent("Mozilla/5.0 ...")
);

Wait Strategies

Control when a page is considered "loaded":

Prop

Type

Each built-in headless engine uses an appropriate strategy:

Prop

Type

Browser Auto-Detection

When chrome_path or lightpanda_path is None, A3S Search automatically detects or downloads the browser:

Lightpanda (Default)

  1. LIGHTPANDA environment variable
  2. System PATH (lightpanda command)
  3. Cache: ~/.a3s/lightpanda/<tag>/lightpanda
  4. Auto-download from GitHub releases (59MB)

Supported platforms: macOS (arm64, x64), Linux (x64, arm64). First run downloads ~59MB, startup <100ms.

Chrome (Fallback)

  1. CHROME environment variable
  2. System PATH (google-chrome, chromium, chrome, etc.)
  3. Well-known install paths (macOS /Applications/..., Linux /usr/bin/..., Windows C:\Program Files\...)
  4. Cache: ~/.a3s/chromium/<version>/
  5. Auto-download Chrome for Testing from Google CDN (200MB)

Supported platforms: macOS (arm64, x64), Linux (x64), Windows (x64, x86). First run downloads ~150MB, startup 1-2s.

Using Headless Engines

use a3s_search::{Search, SearchQuery};
use a3s_search::browser::{BrowserPool, BrowserPoolConfig, BrowserFetcher};
use a3s_search::engines::{Google, Baidu, BingChina, DuckDuckGo};
use a3s_search::WaitStrategy;
use std::sync::Arc;

let pool = Arc::new(BrowserPool::new(BrowserPoolConfig {
    max_tabs: 4,
    headless: true,
    chrome_path: None,
    proxy_url: None,
    launch_args: vec![],
}));

let google_fetcher = Arc::new(
    BrowserFetcher::new(pool.clone())
        .with_wait(WaitStrategy::Selector {
            css: "div.g".into(),
            timeout_ms: 5000,
        })
);

let baidu_fetcher = Arc::new(
    BrowserFetcher::new(pool.clone())
        .with_wait(WaitStrategy::Selector {
            css: "div.c-container".into(),
            timeout_ms: 5000,
        })
);

let mut search = Search::new();
search.add_engine(DuckDuckGo::new());
search.add_engine(Google::new(google_fetcher));
search.add_engine(Baidu::new(baidu_fetcher));

let results = search.search(SearchQuery::new("rust programming")).await?;
pool.shutdown().await;

PageFetcher Trait

All fetchers implement this trait:

#[async_trait]
pub trait PageFetcher: Send + Sync {
    async fn fetch(&self, url: &str) -> Result<String>;
}

Prop

Type

On this page