Structured Output#

Furiosa-LLM supports structured output generation, enabling you to constrain the model’s output to follow specific formats, schemas, or patterns. This feature is essential for applications that require consistent, parsable responses, such as JSON data extraction or text generation in a fixed format.

Note

Furiosa-LLM uses llguidance as the default backend for guided decoding. This backend enforces constraints efficiently during token generation. Additional backends such as XGrammar will be supported soon, providing more grammar format options and improved performance.

Supported Methods via OpenAI API#

Users can generate structured outputs using both OpenAI’s Completions API and Chat Completions API. Furiosa-LLM supports four main types of structured output constraints:

  • Choices: Forces the output to be one of predefined options

  • Regular Expressions: Generates text matching a specific regular expression pattern

  • JSON Schema: Produces JSON output following a predefined schema

  • Context-free Grammar: Generates text following a context-free grammar specification
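Each constraint is passed as a field in the request body. As a sketch of what the `extra_body` payloads look like for the four methods (field names follow the examples in the sections below; adapt the values to your own schema, pattern, or grammar):

```python
# Sketch of the extra_body payloads for each guided decoding method.
# Values here are illustrative placeholders.
guided_choice_body = {"guided_choice": ["positive", "negative", "neutral"]}
guided_regex_body = {"guided_regex": r"[a-z0-9.]{1,20}@\w{6,10}\.com\n"}
guided_json_body = {
    "guided_json": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    }
}
guided_grammar_body = {"guided_grammar": 'root ::= "yes" | "no"'}

for body in (guided_choice_body, guided_regex_body,
             guided_json_body, guided_grammar_body):
    print(body)
```

Exactly one of these fields should be set per request.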

Choices#

The guided_choice parameter constrains the model output to be one of a predefined set of choices. This is particularly useful for classification tasks or when you need the model to select from specific options.

Usage Example:

from openai import OpenAI

base_url = "http://localhost:8000/v1" # Replace this with your base URL
api_key = "EMPTY"
client = OpenAI(api_key=api_key, base_url=base_url)

# Sample review to classify
review = "This movie was absolutely fantastic!"

response = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[{"role": "user", "content": f"Classify sentiment: '{review}'"}],
    extra_body={"guided_choice": ["positive", "negative", "neutral"]},
    temperature=0.0,
)

print(response.choices[0].message.content)

Use Cases:

  • Sentiment analysis

  • Category classification

  • Multiple choice questions

  • Binary decisions (yes/no, true/false)

Regular Expressions#

The guided_regex parameter ensures the generated text matches a specific regular expression pattern. This is useful for generating structured text like email addresses, phone numbers, or custom formats.

Usage Example:

from openai import OpenAI

base_url = "http://localhost:8000/v1" # Replace this with your base URL
api_key = "EMPTY"
client = OpenAI(api_key=api_key, base_url=base_url)

# Email address pattern
email_pattern = r"[a-z0-9.]{1,20}@\w{6,10}\.com\n"

response = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[{"role": "user", "content": "Generate an email address for Ada Lovelace, who works in Analytical. End in .com and new line. Example result: 'ada.lovelace@analytical.com\n'"}],
    extra_body={"guided_regex": email_pattern},
    temperature=0.0,
)

print(response.choices[0].message.content)
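Because the constraint is enforced during decoding, the returned text is guaranteed to match the pattern. You can still verify this client-side with Python’s re module; a minimal sketch using a sample string in place of a real response:

```python
import re

# Same pattern passed as guided_regex in the request above
email_pattern = r"[a-z0-9.]{1,20}@\w{6,10}\.com\n"

# Sample output standing in for response.choices[0].message.content
generated = "ada.lovelace@analytical.com\n"

# fullmatch requires the entire string to match the pattern
assert re.fullmatch(email_pattern, generated) is not None
print("output matches the guided_regex pattern")
```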

Use Cases:

  • Email address generation

  • Phone number formatting

  • ID or code generation

  • Custom text patterns

  • URL or file path generation

JSON Schema#

The guided_json parameter is the most commonly used structured output method. It ensures the generated output is valid JSON that conforms to a specified schema, which is ideal for extracting structured data or producing consistent API responses. The same constraint can also be expressed through the OpenAI-standard response_format parameter with type "json_schema", as the example below shows.

Usage Example:


import enum
import json
from openai import OpenAI
import pydantic

base_url = "http://localhost:8000/v1" # Replace this with your base URL
api_key = "EMPTY"
client = OpenAI(api_key=api_key, base_url=base_url)

class Color(str, enum.Enum):
    BLUE = "blue"
    GREEN = "green"
    PURPLE = "purple"

class FruitDescription(pydantic.BaseModel):
    name: str
    color: Color
    taste: str

response = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[{"role": "user", "content": "Generate a JSON with the name, color, and taste of your favorite fruit."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "fruit-description",
            "schema": FruitDescription.model_json_schema(),
        },
    },
    temperature=0.0,
)

result = json.loads(response.choices[0].message.content)
print(json.dumps(result, indent=2))
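Since the schema is enforced during generation, the returned JSON can be loaded straight back into the Pydantic model. A minimal local sketch of that round trip, using a sample payload in place of a real response:

```python
import enum
import json
import pydantic

class Color(str, enum.Enum):
    BLUE = "blue"
    GREEN = "green"
    PURPLE = "purple"

class FruitDescription(pydantic.BaseModel):
    name: str
    color: Color
    taste: str

# Sample payload standing in for response.choices[0].message.content
content = '{"name": "blueberry", "color": "blue", "taste": "sweet"}'

# model_validate raises pydantic.ValidationError if the JSON
# does not conform to the schema
fruit = FruitDescription.model_validate(json.loads(content))
print(fruit.color.value)  # blue
```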

Use Cases:

  • Data extraction from unstructured text

  • API response formatting

  • Database record creation

  • Configuration file generation

Context-free Grammar#

The guided_grammar parameter allows you to define a context-free grammar specification that the generated text must follow. This provides the most flexible control over output structure.

Grammar Format Support:

The llguidance backend currently supports two grammar formats:

  • EBNF (Extended Backus-Naur Form): Standard grammar notation with ::= syntax

  • Lark: the grammar syntax used by the Lark parsing library for Python

Usage Example:

from openai import OpenAI

base_url = "http://localhost:8000/v1" # Replace this with your base URL
api_key = "EMPTY"
client = OpenAI(api_key=api_key, base_url=base_url)


# Grammar for SQL SELECT statements
sql_grammar = """
root ::= select_statement

select_statement ::= "SELECT " column " FROM " table " WHERE " condition

column ::= "username" | "email"

table ::= "users"

condition ::= "id = " number

number ::= "1" | "2" | "3"
"""

response = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[{"role": "user", "content": "Generate an SQL query to show the 'username' from the 'users' table."}],
    extra_body={"guided_grammar": sql_grammar},
    temperature=0.0,
    max_tokens=100
)

print(response.choices[0].message.content)

The example above uses the EBNF format with ::= notation. You can also write the grammar in Lark format, which uses a different rule syntax and may be preferable for more complex grammars.
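For comparison, here is a sketch of the same SQL grammar in Lark syntax, assuming the backend accepts standard Lark rule notation with a start rule; it is passed via guided_grammar in the same way:

```python
# The same SQL grammar expressed in Lark syntax: rules are written
# as `name: expansion`, and the entry point is named `start`.
sql_grammar_lark = r"""
start: select_statement

select_statement: "SELECT " column " FROM " table " WHERE " condition

column: "username" | "email"

table: "users"

condition: "id = " number

number: "1" | "2" | "3"
"""

# Passed the same way: extra_body={"guided_grammar": sql_grammar_lark}
print(sql_grammar_lark.strip().splitlines()[0])
```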

Use Cases:

  • SQL query generation

  • Code generation with specific syntax

  • Mathematical expressions

  • Custom domain-specific languages

  • Complex structured formats