Headless Browser

Chrome DevTools Protocol integration for search engines that require JavaScript rendering

Engines like Google, Baidu, and Bing China require JavaScript rendering. A3S Search provides a BrowserPool that manages a shared headless Chrome instance with tab concurrency control.

Feature Gate

Headless browser support is enabled by default via the headless Cargo feature:

# Enabled by default (9 engines)
[dependencies]
a3s-search = "0.8"

# Disable headless for smaller binary (6 engines)
[dependencies]
a3s-search = { version = "0.8", default-features = false }

BrowserPool

BrowserPool manages a single Chrome process with a tab semaphore for concurrency control:

use a3s_search::browser::{BrowserPool, BrowserPoolConfig};
use std::sync::Arc;

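let pool = Arc::new(BrowserPool::new(BrowserPoolConfig {
    max_tabs: 4,         // upper bound on concurrently open tabs
    headless: true,      // run Chrome without a visible window
    chrome_path: None,   // None: auto-detect Chrome
    proxy_url: None,     // None: no proxy
    launch_args: vec![], // no extra launch arguments
}));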
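
To make the idea of a tab semaphore concrete, here is a minimal std-only sketch of a counting semaphore that caps how many permits are out at once. It is not A3S Search's implementation: TabSemaphore and TabPermit are hypothetical names, and the real pool is async, whereas this sketch blocks threads.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical counting semaphore: at most `max_tabs` permits exist at once.
pub struct TabSemaphore {
    available: Mutex<usize>,
    freed: Condvar,
}

/// Hypothetical permit; dropping it hands the tab slot back.
pub struct TabPermit {
    sem: Arc<TabSemaphore>,
}

impl TabSemaphore {
    pub fn new(max_tabs: usize) -> Arc<Self> {
        Arc::new(Self {
            available: Mutex::new(max_tabs),
            freed: Condvar::new(),
        })
    }

    /// Blocks the calling thread until a tab slot is free, then takes it.
    pub fn acquire(self: &Arc<Self>) -> TabPermit {
        let mut available = self.available.lock().unwrap();
        while *available == 0 {
            available = self.freed.wait(available).unwrap();
        }
        *available -= 1;
        TabPermit { sem: Arc::clone(self) }
    }

    /// Takes a slot only if one is free right now.
    pub fn try_acquire(self: &Arc<Self>) -> Option<TabPermit> {
        let mut available = self.available.lock().unwrap();
        if *available == 0 {
            return None;
        }
        *available -= 1;
        Some(TabPermit { sem: Arc::clone(self) })
    }
}

impl Drop for TabPermit {
    fn drop(&mut self) {
        // Return the slot, then wake one waiter (if any).
        *self.sem.available.lock().unwrap() += 1;
        self.sem.freed.notify_one();
    }
}
```

With max_tabs set to 2, a third try_acquire() returns None until one of the first two permits is dropped. Tying the slot to a value's lifetime means a tab slot cannot leak on an early return.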

Configuration

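BrowserPoolConfig takes the fields shown in the example above: max_tabs, headless, chrome_path, proxy_url, and launch_args.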

Lifecycle

The browser is lazily initialized on the first acquire_browser() call:

// The first call launches Chrome.
let browser = pool.acquire_browser().await?;
// Shut the browser down when you are done with the pool.
pool.shutdown().await;
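
The lazy-initialization pattern described here can be sketched with std::sync::OnceLock. This is an illustration under stated assumptions, not the library's code: LazyPool and FakeBrowser are hypothetical stand-ins, and the method is synchronous and infallible, unlike the real async acquire_browser().

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a launched Chrome process.
pub struct FakeBrowser {
    pub id: usize,
}

/// Hypothetical pool that launches its browser on first use.
pub struct LazyPool {
    browser: OnceLock<FakeBrowser>,
    launches: AtomicUsize,
}

impl LazyPool {
    /// Creating the pool launches nothing.
    pub fn new() -> Self {
        Self {
            browser: OnceLock::new(),
            launches: AtomicUsize::new(0),
        }
    }

    /// Launches on the first call; later calls reuse the same instance.
    pub fn acquire_browser(&self) -> &FakeBrowser {
        self.browser.get_or_init(|| {
            let id = self.launches.fetch_add(1, Ordering::SeqCst) + 1;
            FakeBrowser { id }
        })
    }

    /// Number of times a browser was actually launched.
    pub fn launch_count(&self) -> usize {
        self.launches.load(Ordering::SeqCst)
    }
}
```

Constructing the pool is cheap, and the launch cost is paid only by the first caller that actually needs a browser.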

BrowserFetcher

BrowserFetcher implements the PageFetcher trait using BrowserPool:

use a3s_search::browser::BrowserFetcher;
use a3s_search::WaitStrategy;

let fetcher = Arc::new(
    BrowserFetcher::new(pool.clone())
        .with_wait(WaitStrategy::Selector {
            css: "div.g".into(),
            timeout_ms: 5000,
        })
        .with_user_agent("Mozilla/5.0 ...")
);

Wait Strategies

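A wait strategy controls when a page is considered "loaded". The examples on this page use WaitStrategy::Selector, which takes a CSS selector (css) and a timeout in milliseconds (timeout_ms).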
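
One common way to implement a selector wait is to poll a readiness check until it succeeds or a deadline passes. The sketch below shows that pattern with the standard library only. The wait_until helper and its poll_ms parameter are hypothetical, and A3S Search may use a different mechanism internally, such as browser events.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `is_ready` until it returns true or
/// `timeout_ms` elapses. Returns whether the condition was met in time.
pub fn wait_until(mut is_ready: impl FnMut() -> bool, timeout_ms: u64, poll_ms: u64) -> bool {
    let deadline = Instant::now() + Duration::from_millis(timeout_ms);
    loop {
        // Check first, so an already-ready page returns without sleeping.
        if is_ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(Duration::from_millis(poll_ms));
    }
}
```

In a real fetcher the closure would ask the page whether an element matching the selector exists. A timeout bounds the wait, so a page that never renders the expected element cannot hold a tab forever.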

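Each built-in headless engine uses an appropriate strategy. The example under Using Headless Engines configures the selector div.g for Google and div.c-container for Baidu, each with a 5000 ms timeout.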

Chrome Auto-Detection

When chrome_path is None, A3S Search looks for Chrome in this order:

  1. CHROME environment variable
  2. System PATH (google-chrome, chromium, chrome, etc.)
  3. Well-known install paths (macOS /Applications/..., Linux /usr/bin/..., Windows C:\Program Files\...)
  4. Auto-download of Chrome for Testing from the Google CDN, cached in ~/.a3s/chromium/

Supported platforms: macOS (arm64, x64), Linux (x64), and Windows (x64, x86). If Chrome has to be downloaded, the first run may fetch about 150 MB.
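
The lookup order above amounts to a first-match-wins chain. The sketch below models only that decision, with every probe result passed in so the logic stays pure and easy to test. ChromeSource, Probe, and resolve_chrome are hypothetical names for illustration, not part of the A3S Search API.

```rust
use std::path::PathBuf;

/// Outcome of the lookup (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
pub enum ChromeSource {
    Configured(PathBuf),
    EnvVar(PathBuf),
    SystemPath(PathBuf),
    WellKnown(PathBuf),
    Download,
}

/// Probe results gathered by the caller (hypothetical type).
pub struct Probe {
    /// Explicit chrome_path from the pool configuration, if any.
    pub chrome_path: Option<PathBuf>,
    /// Value of the CHROME environment variable, if set.
    pub env_chrome: Option<PathBuf>,
    /// First matching binary found on the system PATH.
    pub on_path: Option<PathBuf>,
    /// First well-known install path that exists.
    pub well_known: Option<PathBuf>,
}

/// Returns the first source that yields a binary, else asks for a download.
pub fn resolve_chrome(probe: Probe) -> ChromeSource {
    if let Some(path) = probe.chrome_path {
        return ChromeSource::Configured(path);
    }
    if let Some(path) = probe.env_chrome {
        return ChromeSource::EnvVar(path);
    }
    if let Some(path) = probe.on_path {
        return ChromeSource::SystemPath(path);
    }
    if let Some(path) = probe.well_known {
        return ChromeSource::WellKnown(path);
    }
    ChromeSource::Download
}
```

Because earlier sources win, setting CHROME overrides a Chrome found on PATH, and a download is requested only when every local lookup comes back empty.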

Using Headless Engines

use a3s_search::{Search, SearchQuery};
use a3s_search::browser::{BrowserPool, BrowserPoolConfig, BrowserFetcher};
use a3s_search::engines::{Google, Baidu, DuckDuckGo};
use a3s_search::WaitStrategy;
use std::sync::Arc;

// One shared pool: a single Chrome process, at most 4 tabs at a time.
let pool = Arc::new(BrowserPool::new(BrowserPoolConfig {
    max_tabs: 4,
    headless: true,
    chrome_path: None,
    proxy_url: None,
    launch_args: vec![],
}));

// Google: wait for the div.g selector, up to 5 seconds.
let google_fetcher = Arc::new(
    BrowserFetcher::new(pool.clone())
        .with_wait(WaitStrategy::Selector {
            css: "div.g".into(),
            timeout_ms: 5000,
        })
);

// Baidu: wait for the div.c-container selector, up to 5 seconds.
let baidu_fetcher = Arc::new(
    BrowserFetcher::new(pool.clone())
        .with_wait(WaitStrategy::Selector {
            css: "div.c-container".into(),
            timeout_ms: 5000,
        })
);

// DuckDuckGo takes no fetcher; Google and Baidu render through the shared pool.
let mut search = Search::new();
search.add_engine(DuckDuckGo::new());
search.add_engine(Google::new(google_fetcher));
search.add_engine(Baidu::new(baidu_fetcher));

// Run the query, then shut the browser down.
let results = search.search(SearchQuery::new("rust programming")).await?;
pool.shutdown().await;

PageFetcher Trait

All fetchers implement this trait:

#[async_trait]
pub trait PageFetcher: Send + Sync {
    async fn fetch(&self, url: &str) -> Result<String>;
}
