File Structure
A SirenSpec workflow is a YAML file with a top-level mapping. Three fields are required; the rest are optional.
version
Required. The schema version string. Currently only "0.1" is supported.
agents
Required. A mapping of agent IDs to agent definitions. Each agent wraps an LLM with a system prompt.
| Field | Required | Type | Description |
|---|---|---|---|
| model | Yes | string | Provider URI in provider:model format. |
| system | Yes | string | System prompt sent to the model. |
| guardrails | No | list of strings or {name, config} objects | Agent-level guardrail override. Replaces the workflow-level list. Configurable guardrails (schema, cost_cap, pii) use the {name, config} form; see Guardrails. |
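
As a sketch of how these fields fit together, the fragment below defines two agents, one of which overrides the workflow-level guardrails. The agent IDs, model URIs, and prompt text are hypothetical placeholders; only the field names come from the table above.

```yaml
agents:
  triage:                          # hypothetical agent ID
    model: "openai:gpt-4o-mini"    # assumed provider:model value, for illustration only
    system: "Classify the user's request as billing, technical, or other."
  writer:                          # hypothetical agent ID
    model: "openai:gpt-4o-mini"    # assumed provider:model value
    system: "Write a short, polite reply to the user."
    guardrails:                    # replaces the workflow-level list for this agent
      - injection
      - name: cost_cap             # configurable guardrails use the {name, config} form
        config:
          max_tokens: 2000
```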
nodes
Required. A mapping of node IDs to node definitions. Six node types are supported:
Agent node (default) — binds an LLM agent to an output path:
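A minimal sketch of a complete workflow with a single agent node might look like the following. The IDs, model URI, and prompt are illustrative assumptions, not values taken from the SirenSpec distribution.

```yaml
version: "0.1"

agents:
  triage:                          # hypothetical agent ID
    model: "openai:gpt-4o-mini"    # assumed provider:model value
    system: "Classify the user's request as billing, technical, or other."

nodes:
  classify:                        # hypothetical node ID; no type field, so this is an agent node
    agent: triage                  # must match a key in the agents map
    writes: working.intent         # dot-notation context path for the response
    streaming: false               # opt this node out of token streaming
```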
Agent node fields
| Field | Required | Type | Description |
|---|---|---|---|
| agent | Yes | string | Agent ID from the agents map. |
| writes | Yes | string | Dot-notation context path where the agent’s response is stored. |
| streaming | No | boolean | When true (default), use token streaming if the provider supports it. Set false to opt this node out; --no-stream on the CLI is the global override. Guardrails always apply to the fully assembled response, not per-chunk. |
| retry | No | object | Per-node retry policy. Overrides defaults.retry. See Retry Policies. |
| on_failure | No | object | Failure action when retries are exhausted. Overrides defaults.on_failure. See Retry Policies. |
| max_tokens_per_call | No | integer | Optional ceiling on completion tokens for a single call. Forwarded to the provider so the LLM truncates its own response. |
Human node fields
A human node pauses execution to collect a response from a human operator. The node consumes no LLM tokens. The rendered prompt is shown via the configured input source (stdin by default), and the response is written to the workflow context just like an agent’s output. Downstream when: conditions can gate on the response.
| Field | Required | Type | Description |
|---|---|---|---|
| type | Yes | "human" | Discriminator that marks this as a human node. |
| writes | Yes | string | Dot-notation context path where the response is stored. |
| prompt | No | string | Template string shown to the operator. Supports {{ expr }} interpolation. |
| timeout | No | number | Wall-clock seconds before on_timeout fires. Omitted means wait indefinitely. |
| on_timeout | No | "abort" \| "skip" \| "use_default" | Action when the timeout fires. Default "abort" raises HumanInputError. "skip" writes the empty string. "use_default" writes default_output. |
| default_output | Required when on_timeout is "use_default" | string | Fallback response used on timeout. |
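
The fields above could be combined into an approval gate along these lines. This is a sketch under assumptions: the node IDs are hypothetical, the send node is not defined in this fragment, and it assumes the prompt template can read context paths such as output.draft.

```yaml
nodes:
  approve:                         # hypothetical node ID
    type: human
    prompt: "Proposed reply: {{ output.draft }} Approve? (yes/no)"  # assumes context paths resolve in templates
    writes: working.approval
    timeout: 300                   # seconds before on_timeout fires
    on_timeout: use_default
    default_output: "no"           # required because on_timeout is use_default

edges:
  - from: approve
    to: send                       # hypothetical downstream node, defined elsewhere
    when: 'working.approval == "yes"'
```

With this configuration an unattended run falls back to "no" after five minutes, so the conditional edge is not traversed.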
Factory node fields
A factory node supports three mutually exclusive execution modes. Mode 1: Agent + for_each, with one agent call per item in a runtime list (native list, JSON array, or fenced ```json block).
| Field | Required | Type | Description |
|---|---|---|---|
| agent | One of agent or swrm | string | Named agent from the workflow’s top-level agents map. Mutually exclusive with swrm. |
| swrm | One of agent or swrm | object | Inline swrm spec (with agents, optional synthesis, optional concurrency). Executed per item. Mutually exclusive with agent. |
| for_each | One of for_each or swarm_size | string | Template expression resolved to a list at runtime. Accepts a native Python list, a plain JSON array string, or a fenced ```json block, so no manual unwrapping of upstream tool or agent output is needed. |
| swarm_size | One of for_each or swarm_size | int or string | Static count or template expression for parallel agent instances. Only valid with agent mode. |
| inputs | No | object | Template strings for each input. Supports {{ item }}, {{ index }}, and {{ total }}. Each resolved value is also exposed to the spawned agent’s prompt as {{ inputs.<key> }}, in addition to being joined into the user message. |
| concurrency | No | integer | Max parallel worker instances. Default: 1. |
| timeout_per_instance | No | integer | Per-instance timeout in seconds. Default: 60. |
| on_failure | No | "abort" \| "continue" | Failure policy: abort (raise) or continue (skip). Default: abort. |
| writes | Yes | string | Dot-notation context path where the outputs list is stored. |
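
A sketch of the agent + swarm_size combination follows. The `type: factory` discriminator is an assumption (the table above does not list a type field), and the node and agent IDs are hypothetical.

```yaml
nodes:
  brainstorm:                      # hypothetical node ID
    type: factory                  # ASSUMED discriminator, by analogy with type: human
    agent: ideator                 # hypothetical agent ID from the agents map
    swarm_size: 3                  # static count of parallel agent instances
    inputs:
      slot: "Instance {{ index }} of {{ total }}"   # index is zero-based
    concurrency: 3                 # default is 1, so raise it to run instances in parallel
    timeout_per_instance: 30       # seconds; default is 60
    on_failure: continue           # skip failed instances instead of aborting
    writes: working.ideas          # receives the list of outputs
```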
Loop Variables
Inside inputs: templates, agent prompts, and swrm agent/synthesis prompts (when using swrm mode), three special variables are available:
| Variable | Availability | Type | Description |
|---|---|---|---|
| {{ item }} | for_each mode only | any | Current list element. |
| {{ index }} | All modes | int | Zero-based position in the list. |
| {{ total }} | All modes | int | Total number of items in the list. |
inputs: is additionally available in the spawned agent’s system prompt as {{ inputs.<key> }} once resolved. For example, inputs: { task: "{{ item }}" } makes {{ inputs.task }} reference the current item inside the agent’s prompt.
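The for_each mode and the inputs mapping could be wired together roughly as shown here. The `type: factory` discriminator, the IDs, the model URI, and the working.tasks path are all assumptions made for illustration.

```yaml
agents:
  worker:                          # hypothetical agent ID
    model: "openai:gpt-4o-mini"    # assumed provider:model value
    system: "Complete this task: {{ inputs.task }}"   # resolved input exposed to the prompt

nodes:
  fan_out:                         # hypothetical node ID
    type: factory                  # ASSUMED discriminator; not confirmed by the field table
    agent: worker
    for_each: "{{ working.tasks }}"   # assumes an upstream node wrote a list to working.tasks
    inputs:
      task: "{{ item }}"              # current list element
      position: "{{ index }} of {{ total }}"
    writes: working.results
```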
Swrm Within Factory
The swrm field in factory nodes accepts:
| Field | Required | Type | Description |
|---|---|---|---|
| agents | Yes | list of agents | One or more agents to run in parallel for each factory item. |
| synthesis | No | synthesis object | Optional synthesis step run after all agents complete per item. |
| concurrency | No | integer | Max agents to run concurrently per item. Default: all agents. |
Swrm agent and synthesis prompts can reference the loop variables ({{ item }}, {{ index }}, {{ total }}).
See the following pages for detailed field reference and examples:
- Tool Nodes — HTTP and Python adapters
- Swrm — Parallel agents and synthesis
- Workflow Nodes — Sub-workflow composition
Context paths
The writes field uses dot-notation to specify where output is written in the workflow context:
| Path prefix | Description |
|---|---|
| output.* | Final output, included in the trace’s output field. |
| working.* | Intermediate state, readable by downstream nodes but not included in the final output. |
- output.reply: final response available in the JSON trace.
- working.intent: intermediate classification readable by the next node.
- working.triage.intent: nested intermediate state.
edges
Optional. A list of directed edges connecting nodes. If omitted, all nodes are treated as roots and execute in definition order.
| Field | Required | Type | Description |
|---|---|---|---|
| from | Yes | string | Source node ID. |
| to | Yes | string | Destination node ID. |
| when | No | string | Python expression evaluated after the source node completes. |
when expressions
The when field enables conditional branching. The expression is evaluated after the source node writes its output to the context.
Available names in when expressions:
| Name | Type | Description |
|---|---|---|
| working | object | The current working context (dot-access). |
| output | object | The current output context (dot-access). |
| _budget | object | Budget state from cost_cap guardrail (if active). Fields: total_tokens (int), estimated_usd (float or None). |
| true / false / null | literals | YAML boolean/null literals. |
| len / bool / str / int / float / abs / min / max | builtins | A small set of safe built-ins, e.g. len(working.items) > 0. |
__builtins__ is otherwise cleared, so arbitrary code execution is not possible — only the names above are in scope.
If a when expression raises an error (missing key, syntax error, type mismatch), it is treated as false and the edge is not traversed. Edges without when are always traversed.
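To illustrate conditional branching, the sketch below routes from one node to two alternatives and then joins them. All node IDs and the working.intent path are hypothetical.

```yaml
edges:
  - from: classify
    to: refund_flow
    when: 'working.intent == "refund"'    # single-quoted so the inner double quotes survive YAML parsing
  - from: classify
    to: general_reply
    when: 'working.intent != "refund"'
  - from: refund_flow
    to: finalize                          # no when: always traversed
  - from: general_reply
    to: finalize
```

If classify never writes working.intent, both conditional expressions raise and are treated as false, so neither branch runs. A fallback edge without when is one way to guard against that.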
input
Optional. A static input message for the first node. Can be overridden at runtime with the --input CLI flag.
| Field | Required | Type | Description |
|---|---|---|---|
| message | No | string | Static user message passed to the first (root) node. |
If neither input.message nor --input is provided, the CLI exits with an error.
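A static input could be declared as in this short sketch; the message text is a made-up example.

```yaml
input:
  message: "My invoice from last month was charged twice."   # overridden by --input at runtime
```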
state
Optional. Initial state to seed the workflow context before execution begins.
The values are loaded into the working and output context buckets at startup. Nodes can read and overwrite these values during execution.
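A seeded context might look like the sketch below. The nesting under working and output keys is an assumption based on the bucket names, and the keys inside each bucket are hypothetical.

```yaml
state:
  working:                 # ASSUMED layout: one key per context bucket
    customer_tier: gold    # hypothetical value readable by nodes as working.customer_tier
    attempts: 0
  output:
    reply: ""              # hypothetical placeholder that a node can overwrite later
```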
env_file
Optional. Path to a .env file (relative to the workflow file) loaded into os.environ before execution. This lets provider clients and {{ env.* }} templates pick up credentials without exporting them manually.
The file is loaded inside load_workflow(), before the Workflow object is returned, so provider clients that read os.environ at construction time see the values regardless of call order. Existing environment variables are never overwritten. If a key in the .env file is already present in os.environ as an empty string, an EnvFileShadowWarning is emitted (the existing empty value wins, which is usually a misconfiguration). A missing env_file raises FileNotFoundError.
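As a sketch, an env_file entry and a template that reads from it could look like this. The file name, the SUPPORT_EMAIL variable, the agent ID, and the model URI are hypothetical.

```yaml
env_file: .env             # resolved relative to this workflow file

agents:
  helper:                  # hypothetical agent ID
    model: "openai:gpt-4o-mini"    # assumed provider:model value
    system: "Escalations go to {{ env.SUPPORT_EMAIL }}."   # hypothetical variable defined in .env
```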
defaults
Optional. Workflow-wide defaults for retry and failure handling, inherited by all nodes that do not specify their own.
guardrails
Optional. A list of guardrail names (or guardrail specs with configuration) applied globally to all agents. Defaults to ["injection"] if omitted.
Simple guardrails (no configuration)
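A minimal sketch of the plain-name form, using guardrails that need no configuration:

```yaml
guardrails:
  - injection    # applied by default when the guardrails field is omitted
  - length       # runs with its default settings when listed by name only
```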
Guardrails with configuration
Some guardrails require configuration. Use the name and config fields:
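The sketch below mixes the plain-name form with the {name, config} form. The schema contents and the token ceiling are illustrative values only; the config keys shown are the ones this page documents as required for schema and cost_cap.

```yaml
guardrails:
  - injection
  - name: schema
    config:
      schema:                  # a JSON Schema Draft 7 dict, written as YAML
        type: object
        required: [intent]
        properties:
          intent:
            type: string
  - name: cost_cap
    config:
      max_tokens: 20000        # at least one of max_usd or max_tokens is required
```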
| Guardrail | Description | Configuration |
|---|---|---|
| injection | Detects prompt-injection patterns. Applied by default. | None |
| length | Truncates output to 4000 characters. | Optional: max_chars, mode |
| pii | Detects and redacts (or blocks on) email, phone, SSN, and credit-card data. | Optional: entities, action, replacement |
| schema | Validates JSON output against a JSON Schema Draft 7 dict. | Required: schema |
| cost_cap | Enforces token and/or USD budget ceilings. | Required: at least one of max_usd or max_tokens; optional: action |
An empty list ([]) disables all guardrails for the entire workflow. Individual agents can override this with their own guardrails field.
See Guardrails for full details.
budget
Optional. Workflow-level cumulative budget for token, USD, and wall-clock spend. The executor checks the running totals after every node and enforces the ceilings declared here. At least one of max_tokens, max_cost_usd, or max_duration_s must be set — an empty budget: block is rejected at validation time.
| Field | Required | Type | Description |
|---|---|---|---|
| max_tokens | One of the three | integer | Maximum total tokens across all nodes in a run. |
| max_cost_usd | One of the three | number | Maximum estimated USD spend across all nodes. Falls back to None for models without pricing entries (e.g. Ollama). |
| max_duration_s | One of the three | number | Maximum wall-clock seconds for the full run. |
| on_exceeded | No | "abort" \| "warn" \| "skip_remaining" | Action when any ceiling is hit. Default "abort" raises BudgetExceededError. "warn" logs and continues. "skip_remaining" finishes without new LLM calls. |
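
A budget block using these fields might be written as follows; the numeric ceilings are arbitrary example values.

```yaml
budget:
  max_tokens: 50000              # total tokens across all nodes in a run
  max_cost_usd: 0.50             # estimated spend; unavailable for models without pricing entries
  max_duration_s: 120            # wall-clock seconds for the full run
  on_exceeded: skip_remaining    # finish without making new LLM calls
```

Any one of the three ceilings is enough to pass validation; setting several means the first one hit triggers on_exceeded.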
Budget status in the trace
When a budget: block is configured, the trace summary includes a budget block.
Per-node max_tokens_per_call
Agent nodes also support a per-call ceiling:
max_tokens_per_call is forwarded to the provider as the max_tokens API parameter so the LLM truncates its own response. Combined with the workflow budget: block, this gives you two layers of protection: each individual call is bounded and the cumulative spend is bounded.
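The two layers could be combined as in this sketch, where the node and agent IDs and the numeric limits are hypothetical.

```yaml
budget:
  max_tokens: 50000              # cumulative ceiling for the whole run

nodes:
  summarize:                     # hypothetical node ID
    agent: writer                # hypothetical agent ID from the agents map
    writes: output.summary
    max_tokens_per_call: 512     # bounds the completion for each individual call
```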