Providers & Configuration

Complete HCL configuration reference — LLM providers, model fields, OpenAI-compatible APIs, and local models via Ollama / GPUStack / A3S Power / OneAPI.

All configuration lives in a single .hcl file. This page is the complete reference for every field, plus ready-to-paste configs for common LLM providers and local inference servers.

A3S Code uses HCL (HashiCorp Configuration Language) as its primary config format. The env() function reads environment variables at parse time — use it to keep secrets out of config files.


Config File Location

Agent::new("agent.hcl") resolves the path relative to the current working directory. Any filename and any path (relative or absolute) work.

Convention            Use case
./agent.hcl           Per-project config, can be checked into the repo
~/.a3s/config.hcl     User-level default shared across projects
/etc/a3s/config.hcl   System-wide config for multi-user servers

Minimal Config

The only required top-level field is default_model. For cloud providers, api_key at the provider level is also required.

default_model = "anthropic/claude-sonnet-4-20250514"

providers {
  name    = "anthropic"
  api_key = env("ANTHROPIC_API_KEY")

  models {
    id        = "claude-sonnet-4-20250514"
    name      = "Claude Sonnet 4"
    tool_call = true
  }
}

For fully local providers (e.g., Ollama), no api_key is needed — only a base_url.
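
For example, a local provider entry can be as small as this sketch (assuming an Ollama server on its default port; the full Ollama section below shows a complete version):

default_model = "local/llama3.2"

providers {
  name     = "local"                      # arbitrary label
  base_url = "http://localhost:11434/v1"  # Ollama's default endpoint; no api_key

  models {
    id        = "llama3.2"
    name      = "Llama 3.2"
    tool_call = true
  }
}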


Top-Level Fields

Prop              Type
default_model     string (required)
max_tool_rounds   number (default: 50)
thinking_budget   number (optional)
skill_dirs        list of string
agent_dirs        list of string
storage_backend   string ("memory" | "file" | "custom")
sessions_dir      string
storage_url       string
providers         block, repeatable
queue             block, optional
search            block, optional

Provider Fields

Prop       Type
name       string (required)
api_key    string (required for cloud providers)
base_url   string (optional)
models     block, repeatable

Model Fields

Prop           Type
id             string (required)
name           string
family         string
tool_call      bool
reasoning      bool
temperature    bool
attachment     bool
release_date   string ("YYYY-MM-DD")
api_key        string (per-model override)
base_url       string (per-model override)
modalities     block
cost           block
limit          block

modalities block

Declares what content types the model can receive and produce:

# Multimodal model (text + vision)
modalities {
  input  = ["text", "image", "pdf"]
  output = ["text"]
}

# Text-only model
modalities {
  input  = ["text"]
  output = ["text"]
}

cost block

Pricing in USD per million tokens, consumed by the Telemetry module for cost tracking:

cost {
  input       = 3.0    # per 1M input tokens
  output      = 15.0   # per 1M output tokens
  cache_read  = 0.3    # per 1M prompt-cache read tokens (Anthropic)
  cache_write = 3.75   # per 1M prompt-cache write tokens (Anthropic)
}

limit block

Caps on the model's context window and per-response output, in tokens:

limit {
  context = 200000  # context window in tokens
  output  = 64000   # max output tokens per response
}

Multiple Providers

Define as many providers as needed. Switch between them per session via SessionOptions:

default_model = "anthropic/claude-sonnet-4-20250514"

providers {
  name    = "anthropic"
  api_key = env("ANTHROPIC_API_KEY")

  models {
    id        = "claude-sonnet-4-20250514"
    name      = "Claude Sonnet 4"
    tool_call = true
  }

  models {
    id        = "claude-opus-4-5-20251101"
    name      = "Claude Opus 4.5"
    reasoning = true
    tool_call = true
  }
}

providers {
  name    = "openai"
  api_key = env("OPENAI_API_KEY")

  models {
    id        = "gpt-4o"
    name      = "GPT-4o"
    tool_call = true
  }
}

Select at runtime:

TypeScript:

const session = agent.session('.', { model: 'openai/gpt-4o' });

Python:

session = agent.session(".", model="openai/gpt-4o")

Per-Model API Key & Base URL

Provider-level api_key and base_url are defaults; model-level values override them. This enables:

  • API key rotation — different keys per model
  • Proxy routing — send specific models through a gateway
  • Regional endpoints — lower latency or data-residency compliance
  • Self-hosted endpoints — point individual models to local servers
providers {
  name     = "anthropic"
  api_key  = env("ANTHROPIC_API_KEY")       # default for all models
  base_url = "https://api.anthropic.com"    # default base URL

  models {
    id        = "claude-sonnet-4-20250514"
    name      = "Claude Sonnet 4"
    tool_call = true
    # inherits provider api_key and base_url
  }

  models {
    id        = "claude-sonnet-4-20250514"
    name      = "Claude Sonnet 4 (EU)"
    api_key   = env("ANTHROPIC_EU_KEY")           # override
    base_url  = "https://eu.api.anthropic.com"    # override
    tool_call = true
  }
}

OpenAI-Compatible Providers

Any service implementing the OpenAI Chat Completions API (POST /v1/chat/completions) works as a provider. Set base_url to point to the endpoint. The name field is arbitrary.

This covers: Ollama, GPUStack, A3S Power, OneAPI, Together AI, Groq, Fireworks, LM Studio, vLLM, llama.cpp, and any other OpenAI-compatible backend.
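
As a sketch, any such service plugs in with the same shape (the host, key variable, and model ID below are placeholders to adapt to your service):

providers {
  name     = "my-gateway"                # arbitrary label, referenced as "my-gateway/<model-id>"
  api_key  = env("MY_GATEWAY_API_KEY")   # omit for servers that don't require a key
  base_url = "https://your-host/v1"      # endpoint serving POST /v1/chat/completions

  models {
    id        = "your-model-id"          # must match the ID the server expects
    name      = "Your Model"
    tool_call = true                     # set according to the model's capabilities
  }
}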


Local LLM Providers

Run inference locally for privacy, cost control, offline use, or to serve fine-tuned models. All four options below expose an OpenAI-compatible endpoint and follow the same config pattern.

Ollama

Ollama runs open-source models locally and serves them at http://localhost:11434/v1.

Setup:

# macOS
brew install ollama
ollama serve

# Pull models
ollama pull llama3.2
ollama pull qwen2.5-coder:7b
ollama pull deepseek-r1:8b

Config:

default_model = "ollama/llama3.2"

providers {
  name     = "ollama"
  base_url = "http://localhost:11434/v1"  # no api_key needed

  models {
    id          = "llama3.2"
    name        = "Llama 3.2 (Ollama)"
    family      = "llama"
    tool_call   = true
    temperature = true

    modalities {
      input  = ["text"]
      output = ["text"]
    }

    limit {
      context = 128000
      output  = 4096
    }
  }

  models {
    id          = "qwen2.5-coder:7b"
    name        = "Qwen 2.5 Coder 7B (Ollama)"
    family      = "qwen"
    tool_call   = true
    temperature = true
    limit { context = 32000  output = 4096 }
  }

  models {
    id          = "deepseek-r1:8b"
    name        = "DeepSeek-R1 8B (Ollama)"
    family      = "deepseek"
    tool_call   = false
    reasoning   = true
    temperature = false
    limit { context = 64000  output = 8192 }
  }
}

Ollama does not require an API key. If the client library requires a non-empty value, use any placeholder string (e.g., api_key = "ollama").


GPUStack

GPUStack manages GPU clusters and serves models via an OpenAI-compatible API. API keys use the format gpustack_<token>. Model IDs follow the convention model--<org>--<model-name> — find the exact value in your GPUStack dashboard.

Single-node config:

default_model = "gpustack/model--zhipuai--glm-4.7"

providers {
  name     = "gpustack"
  api_key  = env("GPUSTACK_API_KEY")
  base_url = "http://your-gpustack-host/v1"  # shared for all models

  models {
    id          = "model--zhipuai--glm-4.7"
    name        = "GLM-4.7 (GPUStack)"
    family      = "glm"
    tool_call   = true
    temperature = true
    limit { context = 128000  output = 4096 }
  }

  models {
    id          = "model--meta--llama-3.2-3b-instruct"
    name        = "Llama 3.2 3B Instruct (GPUStack)"
    family      = "llama"
    tool_call   = true
    temperature = true
    limit { context = 128000  output = 4096 }
  }
}

Multi-node config — override base_url per model to route to different nodes:

providers {
  name    = "gpustack"
  api_key = env("GPUSTACK_API_KEY")

  models {
    id       = "model--zhipuai--glm-4.7"
    name     = "GLM-4.7 (node-1)"
    base_url = "http://node-1.gpustack.internal/v1"
    tool_call   = true
    temperature = true
    limit { context = 128000  output = 4096 }
  }

  models {
    id       = "model--meta--llama-3.2-3b-instruct"
    name     = "Llama 3.2 3B (node-2)"
    base_url = "http://node-2.gpustack.internal/v1"
    tool_call   = true
    temperature = true
    limit { context = 128000  output = 4096 }
  }
}

A3S Power

A3S Power provides managed inference backed by A3S infrastructure, with an OpenAI-compatible API.

default_model = "power/qwen2.5-72b"

providers {
  name     = "power"
  api_key  = env("A3S_POWER_API_KEY")
  base_url = "http://your-a3s-power-host/v1"

  models {
    id          = "qwen2.5-72b"
    name        = "Qwen 2.5 72B (A3S Power)"
    family      = "qwen"
    tool_call   = true
    temperature = true
    reasoning   = false

    modalities {
      input  = ["text"]
      output = ["text"]
    }

    limit {
      context = 128000
      output  = 8192
    }
  }

  models {
    id          = "deepseek-r1:70b"
    name        = "DeepSeek-R1 70B (A3S Power)"
    family      = "deepseek"
    tool_call   = true
    reasoning   = true
    temperature = false
    limit { context = 64000  output = 8192 }
  }
}

OneAPI

OneAPI is an open-source API aggregator that proxies multiple upstream providers (Anthropic, OpenAI, Azure, Gemini, etc.) behind a single endpoint and key.

default_model = "oneapi/claude-sonnet-4-20250514"

providers {
  name     = "oneapi"
  api_key  = env("ONEAPI_TOKEN")
  base_url = "http://your-oneapi-host/v1"

  models {
    id          = "claude-sonnet-4-20250514"
    name        = "Claude Sonnet 4 (via OneAPI)"
    tool_call   = true
    temperature = true
    limit { context = 200000  output = 8192 }
  }

  models {
    id          = "gpt-4o"
    name        = "GPT-4o (via OneAPI)"
    tool_call   = true
    temperature = true
    limit { context = 128000  output = 4096 }
  }

  models {
    id          = "deepseek-chat"
    name        = "DeepSeek Chat (via OneAPI)"
    tool_call   = true
    temperature = true
    limit { context = 64000  output = 4096 }
  }
}

The id field must match the channel model name configured in your OneAPI instance, not necessarily the upstream provider's model ID. Verify the exact value in your OneAPI admin panel.


Full Configuration Reference

# ─── Global ──────────────────────────────────────────────────────────────────
default_model   = "anthropic/claude-sonnet-4-20250514"
max_tool_rounds = 20        # default: 50
thinking_budget = 4096      # reasoning token budget (optional)

# ─── Extensions ──────────────────────────────────────────────────────────────
skill_dirs = ["./skills"]   # directories with *.md skill files
agent_dirs = ["./agents"]   # directories with agent definition files

# ─── Storage ─────────────────────────────────────────────────────────────────
storage_backend = "file"                    # "memory" | "file" | "custom"
sessions_dir    = "~/.a3s/sessions"         # path for "file" backend
storage_url     = "redis://localhost:6379"  # URL for "custom" backend

# ─── Cloud Provider ──────────────────────────────────────────────────────────
providers {
  name     = "anthropic"
  api_key  = env("ANTHROPIC_API_KEY")
  base_url = "https://api.anthropic.com"    # optional override

  models {
    id           = "claude-sonnet-4-20250514"
    name         = "Claude Sonnet 4"
    family       = "claude-sonnet"
    tool_call    = true
    reasoning    = false
    temperature  = true
    attachment   = true
    release_date = "2025-05-14"

    modalities {
      input  = ["text", "image", "pdf"]
      output = ["text"]
    }

    cost {
      input       = 3.0
      output      = 15.0
      cache_read  = 0.3
      cache_write = 3.75
    }

    limit {
      context = 200000
      output  = 64000
    }
  }
}

# ─── OpenAI-compatible cloud ─────────────────────────────────────────────────
providers {
  name    = "openai"
  api_key = env("OPENAI_API_KEY")

  models {
    id          = "gpt-4o"
    name        = "GPT-4o"
    tool_call   = true
    temperature = true
    attachment  = true

    modalities {
      input  = ["text", "image"]
      output = ["text"]
    }

    cost {
      input  = 2.5
      output = 10.0
    }

    limit {
      context = 128000
      output  = 16384
    }
  }

  # Per-model API key + base_url override
  models {
    id          = "gpt-4o"
    name        = "GPT-4o (via proxy)"
    api_key     = env("PROXY_API_KEY")
    base_url    = "https://proxy.example.com/v1"
    tool_call   = true
    temperature = true
    limit { context = 128000  output = 4096 }
  }
}

# ─── Local provider (Ollama) ─────────────────────────────────────────────────
providers {
  name     = "ollama"
  base_url = "http://localhost:11434/v1"

  models {
    id          = "llama3.2"
    name        = "Llama 3.2 (Ollama)"
    tool_call   = true
    temperature = true
    limit { context = 128000  output = 4096 }
  }
}

# ─── Queue (optional) ────────────────────────────────────────────────────────
queue {
  query_max_concurrency    = 5    # default: 5
  execute_max_concurrency  = 2    # default: 2
  generate_max_concurrency = 1    # default: 1
  enable_metrics           = true
  enable_dlq               = true

  retry_policy {
    strategy         = "exponential"  # "fixed" | "exponential"
    max_retries      = 3
    initial_delay_ms = 100
  }
}

# ─── Search (optional) ───────────────────────────────────────────────────────
search {
  timeout = 30

  health {
    max_failures    = 3
    suspend_seconds = 60
  }

  engine {
    ddg   { enabled = true  weight = 1.5 }
    wiki  { enabled = true  weight = 1.2 }
    brave { enabled = true  weight = 1.0  timeout = 20 }
  }
}

The env() Function

Use env("VAR") anywhere a string value is expected. The variable is resolved once at parse time — if unset, Agent::new() returns an error immediately at startup.

providers {
  name    = "anthropic"
  api_key = env("ANTHROPIC_API_KEY")
}
bash / zsh:

export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GPUSTACK_API_KEY="gpustack_..."

PowerShell:

$env:ANTHROPIC_API_KEY = "sk-ant-..."
$env:OPENAI_API_KEY = "sk-..."
$env:GPUSTACK_API_KEY = "gpustack_..."

cmd:

set ANTHROPIC_API_KEY=sk-ant-...
set OPENAI_API_KEY=sk-...
set GPUSTACK_API_KEY=gpustack_...
