# Build a DeepResearch Agent
Build a production-grade research agent with A3S Code — parallel web search across multiple machines, streaming synthesis, and structured report generation.
In this tutorial you'll build a DeepResearch agent that goes beyond a simple search-and-summarize loop. It uses A3S Code's core capabilities to run real parallel research:
- Decomposes a question into focused sub-queries
- Dispatches each search to external workers (separate machines/processes) via the Lane queue
- Streams synthesis in real time as results arrive
- Produces a structured markdown report with citations
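Before diving into the walkthrough, the four phases above can be sketched as a plain-Python pipeline. Every function here is a stub standing in for the real SDK calls the rest of the tutorial fills in; none of these names come from A3S Code itself:

```python
# Hypothetical end-to-end shape of the pipeline: decompose → search → synthesize.
# Each phase is a stub; the walkthrough replaces them with real SDK calls.
def decompose(question: str) -> list[str]:
    # Stands in for the LLM planning step that emits sub-queries.
    return [f"{question}: background", f"{question}: recent work"]

def search(subquery: str) -> dict:
    # Stands in for a web_search tool call on a worker.
    return {"query": subquery, "snippets": [f"snippet for {subquery}"]}

def synthesize(question: str, results: list[dict]) -> str:
    # Stands in for the streaming synthesis; produces a markdown report
    # with one citation entry per search result.
    cites = "\n".join(f"- [{i + 1}] {r['query']}" for i, r in enumerate(results))
    return f"# Report: {question}\n\n...synthesis...\n\n## Sources\n{cites}"

question = "latest advances in confidential computing"
report = synthesize(question, [search(q) for q in decompose(question)])
```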
Install the SDK first: `pip install a3s-code`. Configure `~/.a3s/config.hcl` with your LLM provider key.
## Walkthrough

### Project structure

Three files: `agent.hcl` for the agent + queue + search config, `main.py` for the coordinator, and `worker.py` for the remote task executor.
### Agent config (HCL)

`agent.hcl` wires up the model, enables the Lane queue with `query_max_concurrency = 8` for parallel searches, and configures the built-in multi-engine search (Google, Bing, DuckDuckGo).
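The exact HCL schema isn't reproduced here, so the following is only a rough sketch of what `agent.hcl` might look like; every block and attribute name other than `query_max_concurrency` is an assumption, not the documented schema:

```hcl
# Hypothetical shape of agent.hcl — block and attribute names other than
# query_max_concurrency are assumptions.
agent {
  model = "your-model-id"
}

queue {
  query_max_concurrency = 8   # cap on parallel searches (from the walkthrough)
}

search {
  engines = ["google", "bing", "duckduckgo"]
}
```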
### Create agent and session

`Agent.create()` loads the config. `SessionQueueConfig` mirrors the HCL queue settings in code. `set_lane_handler("execute", mode="external")` routes all bash/write tool calls to external workers instead of running locally.
### Decompose the question

Send a planning prompt to the agent. It returns a JSON array of sub-queries. We parse these before starting the parallel phase; if parsing fails, we fall back to the original question.
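The parsing-with-fallback step is plain Python. A minimal sketch (the helper name `parse_subqueries` is ours, not part of the SDK):

```python
import json

def parse_subqueries(raw: str, original_question: str) -> list[str]:
    """Parse the planner's reply as a JSON array of sub-queries.

    Falls back to the original question when the reply is not valid JSON,
    or is JSON but not a list of strings.
    """
    try:
        parsed = json.loads(raw)
        if isinstance(parsed, list) and all(isinstance(q, str) for q in parsed):
            return parsed
    except json.JSONDecodeError:
        pass
    return [original_question]
```

Keeping the fallback inside one helper means the coordinator never has to special-case a malformed planning reply: it always receives a non-empty list of queries.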
### Parallel search across workers

Use `asyncio.to_thread` + `asyncio.gather` to run all sub-queries concurrently. Each task calls `session.tool("web_search", ...)` directly, with no LLM round-trip. The Lane queue caps concurrency at 8.
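The fan-out pattern looks like this. The stub `web_search` function stands in for the blocking `session.tool("web_search", ...)` call, and a semaphore mimics the concurrency cap that the Lane queue enforces in the real agent:

```python
import asyncio

def web_search(query: str) -> dict:
    # Placeholder for session.tool("web_search", ...) — a blocking SDK call,
    # which is why each one is pushed to a worker thread below.
    return {"query": query, "results": [f"result for {query}"]}

async def run_searches(subqueries: list[str], max_concurrency: int = 8) -> list[dict]:
    # The Lane queue enforces the cap in the real agent; a semaphore
    # reproduces that behavior in this self-contained sketch.
    sem = asyncio.Semaphore(max_concurrency)

    async def one(query: str) -> dict:
        async with sem:
            return await asyncio.to_thread(web_search, query)

    # gather preserves input order, so results line up with subqueries.
    return await asyncio.gather(*(one(q) for q in subqueries))

results = asyncio.run(run_searches(["query A", "query B", "query C"]))
```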
### External task handler

A background thread polls `session.pending_external_tasks()`. When the agent emits a bash/write task, the poller picks it up, ships it to a remote worker, then calls `session.complete_external_task()` to resume the agent. In production, replace `worker.execute()` with gRPC, SSH, or a message queue.
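The polling loop itself is a generic pattern. In this sketch a stub session replaces the SDK object, so the only assumptions are the two method names the walkthrough already mentions (`pending_external_tasks`, `complete_external_task`) and the task dict shape, which is hypothetical:

```python
import queue
import threading
import time

class StubSession:
    """Stands in for the A3S session; only the two polling methods
    from the walkthrough are mimicked. Task shape is hypothetical."""
    def __init__(self, tasks):
        self._pending = queue.Queue()
        for t in tasks:
            self._pending.put(t)
        self.completed = []

    def pending_external_tasks(self):
        drained = []
        while True:
            try:
                drained.append(self._pending.get_nowait())
            except queue.Empty:
                return drained

    def complete_external_task(self, task_id, output):
        self.completed.append((task_id, output))

def worker_execute(task):
    # In production this ships the task over gRPC / SSH / a message queue.
    return f"output of {task['cmd']}"

def poll_loop(session, stop):
    while not stop.is_set():
        for task in session.pending_external_tasks():
            session.complete_external_task(task["id"], worker_execute(task))
        time.sleep(0.05)

session = StubSession([{"id": 1, "cmd": "ls"}, {"id": 2, "cmd": "cat notes.md"}])
stop = threading.Event()
poller = threading.Thread(target=poll_loop, args=(session, stop), daemon=True)
poller.start()
time.sleep(0.3)   # let the poller drain the queue
stop.set()
poller.join()
```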
### Stream the synthesis

Pass all search results back to the agent. Iterate `session.stream()`: `text_delta` events print live output, `tool_start`/`tool_end` show tool activity, and `end` carries final token usage.
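A dispatch loop over those event kinds might look as follows. The event list here is a stub for `session.stream()`, and the dict field names (`type`, `text`, `usage`, ...) are assumptions about the event shape, not the documented SDK schema:

```python
# Stubbed event stream standing in for `session.stream(prompt)`.
# Field names are assumed, not taken from the SDK docs.
events = [
    {"type": "tool_start", "name": "web_search"},
    {"type": "tool_end", "name": "web_search"},
    {"type": "text_delta", "text": "Confidential computing "},
    {"type": "text_delta", "text": "protects data in use."},
    {"type": "end", "usage": {"input_tokens": 1200, "output_tokens": 300}},
]

report_chunks = []
usage = None
for event in events:  # in the real loop: `for event in session.stream(prompt):`
    if event["type"] == "text_delta":
        report_chunks.append(event["text"])   # print live as it arrives
    elif event["type"] in ("tool_start", "tool_end"):
        pass                                  # surface tool activity (log/spinner)
    elif event["type"] == "end":
        usage = event["usage"]                # final token usage

report = "".join(report_chunks)
```

Accumulating the deltas while printing them gives you both the live display and the final markdown report for the structured output step.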
### Full coordinator

The complete `main.py`. Run it:

```shell
python main.py "What are the latest advances in confidential computing?"
```

The project layout:

```
deep-research/
├── agent.hcl   # agent + queue + search config
├── main.py     # coordinator: plan → parallel search → synthesize
└── worker.py   # remote worker: executes ExternalTasks
```
## Going Further

### Switch workers dynamically

Change the lane handler at runtime: `internal` runs locally, `external` routes to remote workers, and `hybrid` does both simultaneously:

```python
# Switch to hybrid: execute locally AND notify external systems
session.set_lane_handler("execute", mode="hybrid", timeout_ms=30_000)
```
### Monitor queue pressure

Use queue stats to auto-scale workers when the queue backs up:

```python
if session.has_queue():
    stats = session.queue_stats()
    print(f"pending: {stats['total_pending']}")
    print(f"active: {stats['total_active']}")
    print(f"failed: {stats['total_failed']}")
    print(f"dlq size: {stats['dlq_size']}")
```
### Add planning and goal tracking

Enable the built-in planner so the agent decomposes the research task autonomously before searching:

```python
session = agent.session(
    ".",
    queue_config=qc,
    planning=True,
    goal_tracking=True,
    permissive=True,
)
```