Structured Output

JSON Schema constrained generation via GBNF grammar

A3S Power supports JSON Schema constrained generation. The schema is converted to a GBNF grammar that constrains the model's output token-by-token.

JSON Mode

Force the model to output valid JSON:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "messages": [{"role": "user", "content": "List 3 programming languages with their year of creation"}],
    "response_format": {"type": "json_object"}
  }'
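
JSON mode as described here promises valid JSON, but the keys and nesting are still up to the model. The sketch below shows one way to pull the reply out of the response and parse it. It assumes the server returns an OpenAI-style chat completion body, which the /v1/chat/completions path suggests but this page does not state; the sample response and the parse_json_content helper are hypothetical.

```python
import json


def parse_json_content(completion: dict):
    """Extract the assistant message text and parse it as JSON."""
    content = completion["choices"][0]["message"]["content"]
    return json.loads(content)


# Hypothetical response body, shaped like an OpenAI-style chat completion.
sample_completion = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": '{"languages": [{"name": "C", "year": 1972}]}',
            }
        }
    ]
}

data = parse_json_content(sample_completion)
print(data["languages"][0]["name"])
```

Because the shape is not fixed in JSON mode, code that reads specific keys should be prepared for them to be missing; the JSON Schema option removes that uncertainty.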

JSON Schema

Constrain output to a specific schema:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "messages": [{"role": "user", "content": "Extract the person info from: John Doe, age 30, engineer"}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "integer"},
          "occupation": {"type": "string"}
        },
        "required": ["name", "age", "occupation"]
      }
    }
  }'

Response is guaranteed to match the schema:

{"name": "John Doe", "age": 30, "occupation": "engineer"}
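
Even with constrained generation, a defensive check on the client can be useful, for example if a reply were cut off by a token limit. The following is a minimal sketch of such a check using only the standard library. It covers just the keywords used in the person schema above (type, properties, required) and is not a full JSON Schema validator; the matches helper is a hypothetical name.

```python
import json

PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "occupation": {"type": "string"},
    },
    "required": ["name", "age", "occupation"],
}

# Maps JSON Schema type names to Python types (only the types used above).
_TYPES = {"object": dict, "string": str, "integer": int}


def matches(value, schema) -> bool:
    """Return True if value satisfies the supported subset of schema."""
    kind = schema["type"]
    if not isinstance(value, _TYPES[kind]):
        return False
    # bool is a subclass of int in Python, so reject it explicitly for integers.
    if kind == "integer" and isinstance(value, bool):
        return False
    if kind == "object":
        if any(key not in value for key in schema.get("required", [])):
            return False
        return all(
            matches(value[key], sub)
            for key, sub in schema.get("properties", {}).items()
            if key in value
        )
    return True


reply = json.loads('{"name": "John Doe", "age": 30, "occupation": "engineer"}')
print(matches(reply, PERSON_SCHEMA))  # True
```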

Python SDK

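from openai import OpenAI
from pydantic import BaseModel

# Point the OpenAI client at the local A3S Power server; the API key is a placeholder.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

# The Pydantic model defines the schema the output must follow.
class Person(BaseModel):
    name: str
    age: int
    occupation: str

# parse() passes the model's schema as response_format and returns the reply
# already validated into a Person instance.
response = client.beta.chat.completions.parse(
    model="llama3.2:3b",
    messages=[{"role": "user", "content": "Extract: Jane Smith, 25, data scientist"}],
    response_format=Person
)
person = response.choices[0].message.parsed
print(person.name, person.age, person.occupation)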

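Complex Schemas

Schemas can nest objects inside arrays and attach constraints such as pattern, minimum, and maximum to individual properties: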

{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "type": "object",
      "properties": {
        "colors": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": {"type": "string"},
              "hex": {"type": "string", "pattern": "^#[0-9A-Fa-f]{6}$"},
              "rgb": {
                "type": "object",
                "properties": {
                  "r": {"type": "integer", "minimum": 0, "maximum": 255},
                  "g": {"type": "integer", "minimum": 0, "maximum": 255},
                  "b": {"type": "integer", "minimum": 0, "maximum": 255}
                }
              }
            },
            "required": ["name", "hex"]
          }
        }
      },
      "required": ["colors"]
    }
  }
}
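
The fragment above is only the response_format part of a request. As a rough sketch, the full request can be assembled in Python with the standard library, mirroring the curl examples. The URL and model name are taken from those examples; the build_request helper, the prompt, and the shortened schema are illustrative assumptions. The request is built but not sent, so the block runs without a server.

```python
import json
import urllib.request

# Taken from the curl examples above; adjust for your deployment.
BASE_URL = "http://localhost:11434/v1/chat/completions"


def build_request(prompt: str, schema: dict, model: str = "llama3.2:3b"):
    """Wrap a JSON Schema in a chat completion request (hypothetical helper)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_schema", "json_schema": schema},
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Shortened version of the color schema, for illustration only.
color_schema = {
    "type": "object",
    "properties": {
        "colors": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "hex": {"type": "string", "pattern": "^#[0-9A-Fa-f]{6}$"},
                },
                "required": ["name", "hex"],
            },
        }
    },
    "required": ["colors"],
}

request = build_request("Give me three warm colors", color_schema)
# To send it against a running server: urllib.request.urlopen(request)
```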

How It Works

The JSON Schema is converted to a GBNF grammar (GGML BNF, the grammar format introduced by llama.cpp). During generation, any token that cannot continue a string accepted by the grammar is excluded before sampling, so only valid continuations can be chosen. The output therefore conforms to the schema without post-processing or retries.
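
The masking idea can be illustrated with a toy decoder. This is a sketch of the general technique, not A3S Power's implementation: the "grammar" is just a list of two allowed strings, each character counts as one token, and fake_model is a hypothetical stand-in that ranks tokens in a fixed order without knowing the grammar.

```python
import json
import string

# Toy stand-in for a grammar: the output must be exactly one of these strings.
ALLOWED_OUTPUTS = ['{"ok": true}', '{"ok": false}']

# Toy vocabulary: one "token" per character.
VOCAB = string.ascii_lowercase + string.punctuation + " "


def allowed_next(prefix):
    """Characters that keep prefix a valid prefix of some allowed output."""
    return {
        s[len(prefix)]
        for s in ALLOWED_OUTPUTS
        if s.startswith(prefix) and len(s) > len(prefix)
    }


def fake_model(prefix):
    """Hypothetical model: ranks every token in a fixed order, ignoring the grammar."""
    return list(VOCAB)


def constrained_decode(rank):
    out = ""
    while True:
        allowed = allowed_next(out)
        if not allowed:  # nothing can follow: the output is complete
            return out
        # Take the model's highest-ranked token that the grammar permits.
        out += next(tok for tok in rank(out) if tok in allowed)


result = constrained_decode(fake_model)
print(result)  # {"ok": false}
```

The model never gets the chance to emit a token outside the grammar, which is why no retry loop is needed. A real implementation applies the same filter to the model's token probabilities rather than to a fixed ranking.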
