Build a DeepResearch Agent

Build a production-grade research agent with A3S Code — parallel web search across multiple machines, streaming synthesis, and structured report generation.

In this tutorial you'll build a DeepResearch agent that goes beyond a simple search-and-summarize loop. It uses A3S Code's core capabilities to run real parallel research:

  1. Decomposes a question into focused sub-queries
  2. Dispatches each search to external workers (separate machines/processes) via the Lane queue
  3. Streams synthesis in real time as results arrive
  4. Produces a structured markdown report with citations

Install the SDK first: pip install a3s-code. Configure ~/.a3s/config.hcl with your LLM provider key.


Walkthrough

01

Project structure

Three files: agent.hcl for the agent + queue + search config, main.py for the coordinator, and worker.py for the remote task executor.

02

Agent config (HCL)

agent.hcl wires up the model, enables the Lane queue with query_max_concurrency = 8 for parallel searches, and configures the built-in multi-engine search (Google, Bing, DuckDuckGo).
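A sketch of what that file might look like. Only the pieces named above (model, Lane queue, query_max_concurrency = 8, the three engines) come from this tutorial; the block and attribute names are illustrative assumptions, so check your A3S Code version's config reference for the exact schema.

```hcl
# Illustrative agent.hcl sketch — attribute names beyond query_max_concurrency
# and the engine list are assumptions, not the verified A3S schema.

agent {
  model = "your-provider/your-model"   # set to your configured LLM
}

queue {
  query_max_concurrency = 8            # cap on parallel searches via the Lane queue
}

search {
  engines = ["google", "bing", "duckduckgo"]
}
```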

03

Create agent and session

Agent.create() loads the config. SessionQueueConfig mirrors the HCL queue settings in code. set_lane_handler("execute", mode="external") routes all bash/write tool calls to external workers instead of running locally.

04

Decompose the question

Send a planning prompt to the agent. It returns a JSON array of sub-queries. We parse these before starting the parallel phase — if parsing fails we fall back to the original question.
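The parse-with-fallback step can be sketched as a small pure-Python helper. The JSON-array reply format and the fallback behavior come from the step above; the fence-stripping is an extra defensive touch, since models sometimes wrap JSON in a markdown code fence.

```python
import json

def parse_subqueries(reply: str, original_question: str) -> list[str]:
    """Parse the planner's reply as a JSON array of sub-query strings.

    Falls back to the original question if the reply is not a valid
    JSON array of strings.
    """
    text = reply.strip()
    # Strip a markdown code fence if the model added one around the JSON
    if text.startswith("```"):
        text = text.strip("`")
        text = text.split("\n", 1)[1] if "\n" in text else text
    try:
        queries = json.loads(text)
        if isinstance(queries, list) and all(isinstance(q, str) for q in queries):
            return queries
    except json.JSONDecodeError:
        pass
    # Parsing failed — research the original question as a single query
    return [original_question]
```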

05

Parallel search across workers

Use asyncio.to_thread + asyncio.gather to run all sub-queries concurrently. Each task calls session.tool("web_search", ...) directly — no LLM round-trip. The Lane queue caps concurrency at 8.
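The fan-out pattern looks roughly like this. Since the blocking `session.tool(...)` call belongs to the A3S SDK, a stub session stands in here so the sketch is self-contained; in main.py you would pass the real session. `gather` preserves input order, so results line up with their sub-queries.

```python
import asyncio

class StubSession:
    """Stand-in for the A3S session — replace with the real one in main.py."""
    def tool(self, name, query):
        # Placeholder for the blocking session.tool("web_search", query=...)
        return {"query": query, "results": [f"result for {query}"]}

async def run_searches(session, subqueries):
    # Each blocking tool call runs in its own thread; the Lane queue
    # (query_max_concurrency = 8) enforces the actual parallelism cap.
    tasks = [
        asyncio.to_thread(session.tool, "web_search", query=q)
        for q in subqueries
    ]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_searches(StubSession(), ["query a", "query b"]))
```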

06

External task handler

A background thread polls session.pending_external_tasks(). When the agent emits a bash/write task, the poller picks it up, ships it to a remote worker, then calls session.complete_external_task() to resume the agent. In production, replace worker.execute() with gRPC / SSH / message queue.
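The poller loop described above can be sketched as follows. The session and worker here are stand-in stubs; in the real coordinator, `pending_external_tasks()` and `complete_external_task()` are the A3S SDK calls, and the `worker_execute` callable is where you would plug in gRPC / SSH / a message queue. The task dict shape (`id`, `cmd`) is an assumption for illustration.

```python
import threading
import time

class StubSession:
    """Stand-in for the A3S session's external-task API."""
    def __init__(self, tasks):
        self._pending = list(tasks)
        self.completed = []

    def pending_external_tasks(self):
        pending, self._pending = self._pending, []
        return pending

    def complete_external_task(self, task_id, output):
        self.completed.append((task_id, output))

def poll_external_tasks(session, worker_execute, stop_event, interval=0.05):
    # Background loop: pick up emitted tasks, ship them out, resume the agent
    while not stop_event.is_set():
        for task in session.pending_external_tasks():
            output = worker_execute(task)                       # remote execution
            session.complete_external_task(task["id"], output)  # resume the agent
        time.sleep(interval)

stop = threading.Event()
session = StubSession([{"id": "t1", "cmd": "echo hi"}])
poller = threading.Thread(
    target=poll_external_tasks,
    args=(session, lambda task: f"ran {task['cmd']}", stop),
    daemon=True,
)
poller.start()
time.sleep(0.2)
stop.set()
poller.join()
```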

07

Stream the synthesis

Pass all search results back to the agent, then iterate over session.stream(): text_delta events print live output as it arrives, tool_start/tool_end events show tool activity, and the end event carries final token usage.
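The consuming loop can be sketched like this. The event names (text_delta, tool_start, tool_end, end) come from the step above; the exact event shape is an assumption, and `stub_stream` stands in for the real `session.stream()` so the sketch runs on its own.

```python
def stub_stream():
    """Stand-in for session.stream() — yields events in assumed dict form."""
    yield {"type": "tool_start", "name": "web_search"}
    yield {"type": "tool_end", "name": "web_search"}
    yield {"type": "text_delta", "text": "Confidential computing "}
    yield {"type": "text_delta", "text": "protects data in use."}
    yield {"type": "end", "usage": {"total_tokens": 1234}}

def consume(stream):
    report, usage = [], None
    for event in stream:
        if event["type"] == "text_delta":
            print(event["text"], end="", flush=True)   # live synthesis output
            report.append(event["text"])
        elif event["type"] in ("tool_start", "tool_end"):
            print(f"\n[{event['type']}] {event['name']}")
        elif event["type"] == "end":
            usage = event["usage"]                     # final token usage
    return "".join(report), usage

report, usage = consume(stub_stream())
```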

08

Full coordinator

The complete main.py. Run it:

python main.py "What are the latest advances in confidential computing?"

Project structure:

deep-research/
├── agent.hcl    # agent + queue + search config
├── main.py      # coordinator: plan → parallel search → synthesize
└── worker.py    # remote worker: executes ExternalTasks

Going Further

Switch workers dynamically

Change the lane handler at runtime — internal for local, external for remote, hybrid to do both simultaneously:

# Switch to hybrid: execute locally AND notify external systems
session.set_lane_handler("execute", mode="hybrid", timeout_ms=30_000)

Monitor queue pressure

Use queue stats to auto-scale workers when the queue backs up:

if session.has_queue():
    stats = session.queue_stats()
    print(f"pending: {stats['total_pending']}")
    print(f"active: {stats['total_active']}")
    print(f"failed: {stats['total_failed']}")
    print(f"dlq size: {stats['dlq_size']}")

Add planning and goal tracking

Enable the built-in planner so the agent decomposes the research task autonomously before searching:

session = agent.session(
    ".",
    queue_config=qc,
    planning=True,
    goal_tracking=True,
    permissive=True,
)