Changelog

All notable changes to the OpenArmature specification are documented in this file.

The format is adapted from Keep a Changelog — subsection labels render as bold paragraphs (rather than H3) to keep the rendered docs-site right-rail TOC focused on releases, and there is no [Unreleased] section since the spec tags after every acceptance PR. The spec follows Semantic Versioning.

[0.20.1] — 2026-05-24

Changed

  • llm-provider §8 framing gained a Per-mapping subsection structure paragraph recommending the canonical §8.X subsection template (Request mapping / Response mapping / Error mapping / Concurrency / Structured output, in that order) used by §8.1. Provider-specific sub-subsections (e.g., §8.X.1.1 for content-block wire mapping, §8.X.5.1 for fallback) are permitted and expected; providers MAY add additional top-level subsections at the end of the canonical five for features without §8.1 analogues (e.g., §8.X.6 Caching). SHOULD-level rather than MUST-level — when a §8.X proposal diverges, the proposal text SHOULD explain the divergence in its Detailed design so reviewers can confirm it's structural rather than ergonomic. Resolves 0019's open-question #2 (per-mapping section structure). (proposal 0026)

Notes

  • Pre-1.0 PATCH bump. Purely textual structural recommendation. No new types, no new error categories, no behavioral change. All v0.20.0 conformance fixtures pass unchanged. §8.1 already follows the template by construction (it IS the template source). Matches the v0.16.1 / v0.17.1 precedent for spec-text clarifications.
  • Cross-language consistency story. The template lock-in is sequenced so §8.2 Anthropic and §8.3 Gemini follow-ons land against the same canonical structure — readers who know §8.1's organization can navigate §8.X by reflex, fixture sidecars reference subsection numbers predictably across mappings, and cross-language consistency (Python ↔ TypeScript siblings) extends to the spec-text structure as well as the wire shapes.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.20.0 may target v0.20.1 directly.

[0.20.0] — 2026-05-24

Added

  • llm-provider §5 complete() gained an optional tool_choice parameter. Four modes: "auto" (model decides), "required" (model MUST call at least one tool), "none" (model MUST NOT call tools), and {type: "tool", name: <string>} (model MUST call the named tool). When omitted (None / absent), the engine omits the wire-level tool_choice field and the provider's own default applies — preserving v0.4.0 behavior exactly. Pre-send validation routes three new failure modes through provider_invalid_request (§7): (1) required with empty / absent tools; (2) force-specific with empty / absent tools; (3) force-specific with name not in supplied tools. The framework does NOT enforce the constraint post-hoc — whether the model honored it is observable from Response.finish_reason / Response.message.tool_calls but is not framework-policed (per §6's transparency principle). (proposal 0025)
  • llm-provider §8.1.1 OpenAI request mapping gains a tool_choice row covering the four modes plus the None-omitted-from-wire case. The spec {type: "tool", name: X} discriminator renames to OpenAI's {type: "function", function: {name: X}} wire shape (implementation performs the rename when constructing the wire body). (proposal 0025)
  • Conformance fixtures 029-tool-choice-modes, 030-tool-choice-force-specific, 031-tool-choice-validation (llm-provider). New harness primitive: expected_wire_request_checks.tool_choice_absent: true (sibling-to-expected_wire_request block asserting a key is absent from the wire body, distinct from present-with-null; follows fixture 027's expected_wire_request_checks.response_format_absent precedent). Fixture 029 establishes the precedent that the mock provider returns constraint-compliant responses for the required and none cases; assertions verify end-to-end response mapping, not framework enforcement.
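
The §8.1.1 tool_choice row can be pictured with a small sketch. The helper name to_openai_tool_choice is hypothetical, and passing the three string modes through unchanged is an assumption based on OpenAI's published wire format rather than on the spec text; only the rename of the force-specific shape and the omitted-when-None rule are taken from the entries above.

```python
from typing import Any, Optional, Union

# Spec-level tool_choice: a string mode or the force-specific dict.
ToolChoice = Union[str, dict]


def to_openai_tool_choice(tool_choice: Optional[ToolChoice]) -> dict[str, Any]:
    """Return the fragment to merge into the wire body (hypothetical helper)."""
    if tool_choice is None:
        # Caller omitted the parameter: leave the key off the wire entirely,
        # which is distinct from sending it with a null value.
        return {}
    if tool_choice in ("auto", "required", "none"):
        # Assumption: the string modes are sent as-is.
        return {"tool_choice": tool_choice}
    if isinstance(tool_choice, dict) and tool_choice.get("type") == "tool":
        # Rename the spec discriminator to OpenAI's function wire shape.
        return {
            "tool_choice": {
                "type": "function",
                "function": {"name": tool_choice["name"]},
            }
        }
    raise ValueError(f"unsupported tool_choice: {tool_choice!r}")
```

Returning a fragment instead of a value keeps "absent" and "present with null" distinguishable, which is the property the tool_choice_absent harness check asserts.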

Changed

  • llm-provider §7 provider_invalid_request description extended to enumerate the three new validation failure modes for tool_choice (required-with-empty-tools, force-with-empty-tools, force-name-not-in-list). No new category — the existing surface absorbs the new failure modes. (proposal 0025)
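
As an illustration of the three pre-send failure modes, a minimal sketch follows. The exception class name, the function name, and the assumption that each tool is a dict with a "name" key are all hypothetical; a real implementation would raise whatever type it uses for the provider_invalid_request category.

```python
class ProviderInvalidRequest(Exception):
    """Stand-in for the provider_invalid_request category (hypothetical name)."""


def validate_tool_choice(tool_choice, tools):
    """Run the three pre-send tool_choice checks before any wire call."""
    # Assumption: tools is a list of dicts carrying a "name" key, or None.
    tool_names = [tool["name"] for tool in (tools or [])]

    # (1) "required" with empty or absent tools.
    if tool_choice == "required" and not tool_names:
        raise ProviderInvalidRequest("tool_choice 'required' needs at least one tool")

    if isinstance(tool_choice, dict) and tool_choice.get("type") == "tool":
        # (2) force-specific with empty or absent tools.
        if not tool_names:
            raise ProviderInvalidRequest("forced tool_choice needs at least one tool")
        # (3) force-specific naming a tool that was not supplied.
        if tool_choice["name"] not in tool_names:
            raise ProviderInvalidRequest(
                f"tool {tool_choice['name']!r} is not in the supplied tools"
            )
```

Nothing here inspects the response: consistent with the entries above, the checks run only before the request is sent.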

Notes

  • Pre-1.0 MINOR bump. Implementations passing the v0.19.0 fixtures need actual work to pass the new fixtures (extend complete() with the new parameter, add pre-send validation, add the §8.1.1 wire mapping row). The no-tool_choice path is backward-compatible: existing callers passing no tool_choice continue to see the same wire shape they did in v0.4.0. Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.19.0 may target v0.20.0 directly.
  • Sequenced ahead of §8.2 Anthropic + §8.3 Gemini follow-ons. Adding tool_choice to complete() BEFORE the per-provider mappings ship avoids retrofitting three §8.X subsections in lockstep. The §8.1.1 mapping row lands here; future §8.X follow-ons (per the §8.X subsection template proposed in 0026) add their own per-provider tool_choice mapping rows.
  • Framework does NOT enforce "none" post-hoc. Per §5's clarifying paragraph, if a provider returns tool calls despite tool_choice="none", the implementation MUST surface what the provider returned without re-validating. Provider compliance is observable from finish_reason / tool_calls but is not framework-policed.

[0.19.0] — 2026-05-24

Changed

  • graph-engine §6 drain operation gained an optional timeout parameter and now MUST return a summary. Drain returns once all observer events deliver OR once the caller-supplied timeout elapses, whichever happens first. When the timeout fires, workers MUST be cancelled or otherwise terminated such that the compiled graph remains usable for subsequent invocations — partial delivery state from one drain MUST NOT leak into the next invocation. The summary MUST include at minimum undelivered_count (the count of events still queued or in-flight when the timeout fired) and timeout_reached (a boolean flag). Implementations MAY provide richer detail (per-observer counts, sampled event metadata). When called without a timeout, drain still waits indefinitely (the existing v0.3.0 behavior) and the summary's undelivered_count is 0, timeout_reached is false — callers receive a consistent shape regardless of whether they supplied a timeout. (proposal 0010)
  • Conformance fixtures 022-drain-timeout-elapses-with-undelivered, 023-drain-timeout-not-reached-fast-observers, 024-drain-timeout-clean-state-for-next-invocation, 025-drain-no-timeout-waits-for-all (graph-engine). New harness primitives: observers[].sleep_ms_per_event (uniform or {first_invocation, subsequent_invocations} form), invoke.drain.timeout_seconds, expected.drain_summary.{timeout_reached, undelivered_count, undelivered_count_min}, multi-invocation invocations: block for cross-drain state-isolation testing, and invariants drain_returned_within_timeout / graph_state_intact_after_timeout / second_invocation_drain_independent_of_first / drain_waited_for_all_events.

Notes

  • Pre-1.0 MINOR bump. Implementations passing the v0.18.0 fixtures need actual work to pass the new drain fixtures (return a summary, accept a timeout, cancel cleanly under timeout, preserve graph state across cross-drain boundaries). The no-timeout drain path is backward-compatible — existing callers passing no timeout continue to get "wait until everything delivers" — but the return type now carries a summary where v0.3.0 returned nothing. Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.18.0 may target v0.19.0 directly.
  • Cancellation mechanism is implementation-defined. When the timeout elapses while an observer is mid-call, the implementation MUST terminate the call in time to honor the deadline. How it does so — task.cancel() in Python, an AbortSignal in TypeScript, refusing to hand the worker the next event once the deadline is within an observer's expected latency budget — is implementation-defined and SHOULD be documented per-impl. The hard deadline itself is not negotiable. Observers SHOULD be written to be cancellation-safe (idempotent writes, try/finally cleanup).
  • Summary shape is language-idiomatic. The two required fields (undelivered_count, timeout_reached) are mandated; the shape that carries them (a Python dict/dataclass, a TypeScript object, etc.) is per-language ergonomics. Implementations MAY add additional fields (per-observer counts, sampled event metadata) as long as the minimum two are present.
  • Downstream interactions (informative — no normative changes to other capabilities): under a timeout, late observer events may be lost. The OTel observer (observability §6) may have openarmature spans that never reach the exporter; downstream OTel exporters' own buffer/retry settings cover this. Checkpoint save events (pipeline-utilities §10.8) may not surface as observer-stream spans under timeout, but the underlying checkpoint save was synchronous and durable per §10.3 / §10.1.1 — resume correctness is unaffected. Event production remains deterministic (graph-engine §5); only event delivery is bounded by the timeout.
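
One way to picture the drain contract in Python is the asyncio sketch below. ObserverQueue, DrainSummary, and the single-worker design are illustrative assumptions, not any implementation's real classes; only the two summary field names and the "usable after timeout" requirement come from the entries above. It also assumes the observer responds promptly to cancellation.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class DrainSummary:
    undelivered_count: int
    timeout_reached: bool


class ObserverQueue:
    """Toy single-worker delivery queue (hypothetical, for illustration)."""

    def __init__(self, observer: Callable[[object], Awaitable[None]]):
        self._observer = observer
        self._queue: asyncio.Queue = asyncio.Queue()
        self._worker: Optional[asyncio.Task] = None
        self._pending = 0  # events emitted but not yet delivered

    def emit(self, event: object) -> None:
        self._pending += 1
        self._queue.put_nowait(event)
        if self._worker is None or self._worker.done():
            self._worker = asyncio.create_task(self._run())

    async def _run(self) -> None:
        while True:
            event = await self._queue.get()
            try:
                await self._observer(event)
                self._pending -= 1  # only counted once delivery finished
            finally:
                self._queue.task_done()

    async def drain(self, timeout: Optional[float] = None) -> DrainSummary:
        try:
            # timeout=None waits indefinitely, matching the no-timeout path.
            await asyncio.wait_for(self._queue.join(), timeout)
            return DrainSummary(undelivered_count=0, timeout_reached=False)
        except asyncio.TimeoutError:
            undelivered = self._pending  # queued plus in-flight events
            if self._worker is not None:
                self._worker.cancel()
                await asyncio.gather(self._worker, return_exceptions=True)
                self._worker = None
            # Fresh state so nothing leaks into the next invocation.
            self._queue = asyncio.Queue()
            self._pending = 0
            return DrainSummary(undelivered_count=undelivered, timeout_reached=True)
```

Both paths return the same summary shape, so callers do not need to branch on whether they supplied a timeout.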

[0.18.0] — 2026-05-24

Added

  • pipeline-utilities §10.11 — per-instance fan-out resume contract. Defines fan_out_progress field semantics (per-fan-out-node mapping with per-instance status, result field carrying the durable accumulator contribution, completed_inner_positions for in_flight capture). The completed state is a correctness guarantee that exactly one accumulator entry per instance heads into the fan-in step. Sub-sections cover reducer interaction (§10.11.1, with append being the load-bearing correctness case), error_policy composition (§10.11.2 fail_fast and collect modes), instance_middleware composition (§10.11.3, retry budget resets on resume), and configurable Checkpointer-level batching for fan-out internal saves (§10.11.4, with explicit cost trade-off — buffered-but-unflushed saves lost on crash are acceptable because re-execution under §10.11.1's rules contributes for the first time, not as a double-merge). (proposal 0009)
  • Conformance fixtures 048-checkpoint-fan-out-per-instance-resume-skips-completed through 054-checkpoint-fan-out-batching-buffered-saves-lost-on-crash (pipeline-utilities). New harness primitives: fan_out_progress matchers under saved_record_assertions (with state, result, completed_inner_positions, state_one_of for execution-mode variation), instances_executed_during_resume / instances_skipped_during_resume resume assertions, instance_N_attempt_index_on_resume per-instance attempt assertions, abort_after_instance fan-out abort directive, and a batched-Checkpointer primitive (kind: in_memory_batched with fan_out_internal_save_batching.flush_every).

Changed

  • pipeline-utilities §10.7 fan-out resume contract changed from atomic-restart to per-instance. When a fan-out is in flight at crash time, resume re-runs only the instances that did not complete-and-record their contribution. Completed instances are skipped; their accumulator entries (fan_out_progress[].instances[].result) roll forward to the fan-in step (per §9.3) unchanged. The atomic-restart behavior from v1 (a crash mid-fan-out re-running the entire fan-out) is superseded. (proposal 0009)
  • pipeline-utilities §10.3 — save granularity extended to fan-out instance internal nodes. The engine now fires Checkpointer.save at every completed event from inside a fan-out instance (in addition to outermost-graph nodes, subgraph-internal nodes, and the fan-out node itself). The v1 "engine does NOT save during fan-out instance execution" elision is removed. Fan-out node's own completion save now also finalizes fan_out_progress to mark all instances complete. Volume concerns for high-instance-count fan-outs are addressed via the configurable batching knob in §10.11.4 (opt-in, off by default). (proposal 0009)
  • pipeline-utilities §10.2 fan_out_progress field — promoted from reserved to populated. The v1 placeholder language ("reserved field for the v2 per-instance fan-out resume follow-on proposal") is replaced; the field now carries per-fan-out-node entries when one or more fan-outs are in flight at save time, per §10.11. The field shape is fully specified in §10.11. (proposal 0009)
  • pipeline-utilities: the existing §10.11 "Reference implementations and backend layering" is renumbered to §10.13 to accommodate the new §10.11. The §10.12.1 cross-reference (the "SQLiteCheckpointer reference implementation (per §10.11)" mention) now points to §10.13. (proposal 0009)
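
The per-instance resume rule can be sketched roughly as follows. The record layout (a list of dicts with "state" and "result" keys) and the function names are simplifying assumptions for illustration; the normative field shape lives in §10.11.

```python
def resume_fan_out(instances, run_instance):
    """Re-run only the instances that lack a recorded contribution.

    `instances` loosely mimics fan_out_progress[].instances[]: each entry is a
    dict with a "state" key and, once completed, a durable "result".
    `run_instance` is a hypothetical callable that executes one instance from
    its inner entry node and returns its accumulator contribution.
    """
    contributions = []
    executed = []
    for index, entry in enumerate(instances):
        if entry["state"] == "completed":
            # Skipped on resume: the saved result rolls forward unchanged.
            contributions.append(entry["result"])
        else:
            # In-flight or never started: the instance restarts as a unit.
            result = run_instance(index)
            entry.update(state="completed", result=result)
            contributions.append(result)
            executed.append(index)
    # Exactly one contribution per instance reaches the fan-in step.
    return contributions, executed
```

Because a completed instance is never re-run, an append-style reducer sees each contribution once, which is the case §10.11.1 calls out as load-bearing.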

Removed

  • Conformance fixture 028-checkpoint-fan-out-atomic-restart (pipeline-utilities). The v1 atomic-restart contract it verified no longer applies under the per-instance resume model. Replaced by fixtures 048–054. (proposal 0009)

Notes

  • Pre-1.0 MINOR bump. The fan-out resume contract changes (atomic → per-instance) and the engine's save granularity changes (now saves inside fan-out instances) are implementation-visible: a v1-compliant implementation that does atomic restart fails the new per-instance fixtures. Matches the v0.16.0 precedent for behavioral category changes being MINOR pre-1.0. Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.17.1 may target v0.18.0 directly.
  • Batching default is off. §10.11.4 configurable batching for fan-out internal saves is opt-in per Checkpointer instance. The default behavior is "every fan-out internal save is synchronously durable" — the simpler correctness story. Backends document their batching defaults and configuration shape; users opt in with eyes open.
  • completed_inner_positions is observational, not state-restore. §10.11's per-instance completed_inner_positions field captures how far an in_flight instance had progressed within its inner subgraph at save time. On resume, the instance re-enters its inner subgraph at the declared entry node; the completed_inner_positions field does NOT serve as a per-inner-node resume point. This is the deliberate scope cut: per-instance resume treats the instance as an atomic unit, not as a re-entry point for inner nodes. Per-inner-node resume inside a fan-out instance would require a different contract and significantly complicate §10.11.1's reducer-interaction story.
  • Parallel branches (§11) atomic-restart unchanged. The §11.9 composition-with-checkpointing note has been tightened to remove the "deferred alongside per-instance fan-out resume" framing — per-branch resume is its own follow-on and inherits whatever lessons fall out of the per-instance fan-out work.

[0.17.1] — 2026-05-24

Changed

  • llm-provider §8 reframed from "OpenAI-compatible wire format" to "Wire-format mappings". The existing OpenAI-compatible body is now nested under §8.1 "OpenAI-compatible mapping"; its subsections renumber §8.1 (Request mapping) → §8.1.1, §8.2 (Response mapping) → §8.1.2, §8.3 (Error mapping) → §8.1.3, §8.4 (Concurrency) → §8.1.4, §8.5 (Structured output) → §8.1.5, with the deeper §8.1.1 (Content-block wire mapping) → §8.1.1.1, §8.5.1 (Fallback) → §8.1.5.1, §8.5.2 (Response mapping) → §8.1.5.2. A new §8 framing paragraph catalogs the wire-format mapping section as the home for cross-language provider mappings, establishes the default placement rule (any mapping intended for implementation across multiple OA language implementations MUST land in §8.X), reserves out-of-tree for genuinely single-language / opt-out / experimental cases, and carries over the "compliance label" opt-in. (proposal 0019)
  • Conformance-fixture sidecars under spec/llm-provider/conformance/ updated to reference the new section numbers (§8.1.1, §8.1.2, §8.1.3, §8.1.5, §8.1.5.1, §8.1.1.1). Fixture YAML and behavior are unchanged.

Notes

  • Pre-1.0 PATCH bump. Purely textual reframing — no new types, no new error categories, no behavioral change. All v0.17.0 conformance fixtures pass under the renumbered structure without modification. Matches the v0.16.1 precedent (spec-text clarification with no fixture changes). The §3 / §4 / §5 / §6 / §7 contract remains the normative cross-provider surface; §8 is reorganized as a catalog of concrete mappings.
  • Per-mapping subsection structure is not normatively prescribed. §8.1 (the OpenAI-compatible mapping) uses Request / Response / Error / Concurrency / Structured-output subsections; follow-on proposals adding §8.2+ (Anthropic Messages, Google Gemini, Mistral, …) MAY mirror this structure or diverge per provider. The first follow-on may establish a recommended template if reviewer signal warrants.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.17.0 may target v0.17.1 directly.

[0.17.0] — 2026-05-22

Added

  • observability §5.5 expanded with LLM input/output payload attributes (default-off). New openarmature.llm.input.messages (JSON-encoded §3 message list), openarmature.llm.output.content (assistant content verbatim), and openarmature.llm.request.extras (RuntimeConfig extras JSON-encoded). Gated by a new observer-level disable_llm_payload: bool = True flag, so payload emission is off by default for privacy and storage-cost safety; users who want message rendering in LLM-aware backends (Langfuse, Phoenix, Honeycomb LLM lens) flip the flag once at integration. (proposal 0024)
  • observability §5.5.2 — RuntimeConfig request parameters emitted under the OpenTelemetry GenAI semantic conventions (gen_ai.request.temperature, gen_ai.request.max_tokens, gen_ai.request.top_p, gen_ai.request.seed). Direct emission under the GenAI namespace (no OA-prefixed parallels) because these cross-vendor LLM parameters have no OpenArmature-specific semantics. Establishes a precedent for future spec touchpoints: OA-prefix for OA-specific state; GenAI semconv for cross-vendor LLM parameters and response metadata when the semconv name is stable. Absence of an attribute means "the field was not supplied," distinct from "supplied with a zero value." (proposal 0024)
  • observability §5.5.3 — GenAI semconv response attributes (gen_ai.system, gen_ai.request.model, gen_ai.response.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons as one-element array, gen_ai.response.id). Emitted by default so LLM-aware OTel backends render generations correctly out of the box without per-user attribute-mapping shims. The OpenAI-compatible provider defaults gen_ai.system to "openai"; callers using the provider with a non-OpenAI endpoint (vLLM, LM Studio, llama.cpp) MUST be able to override per provider instance. Suppressible via a new disable_genai_semconv: bool = False flag. (proposal 0024)
  • observability §5.5.5 — truncation contract for the §5.5.1 payload attributes. Default 64 KiB per-attribute cap, configurable per observer with a 256-byte minimum. Five-step truncation algorithm: compute the marker, compute target prefix size N = cap - L_marker, backtrack from N to the nearest UTF-8 code-point boundary (preventing split multi-byte sequences for CJK / emoji / combining marks), emit prefix + marker. Marker is the literal suffix …[truncated, M bytes total] appended outside any JSON encoding so backends get a clean truncation signal without a flag attribute. Image content blocks with inline base64 sources MUST be replaced with a redacted placeholder ({type: "image", source: {type: "inline_redacted", byte_count}, media_type, detail?}) before JSON encoding — media_type and detail stay at the image-block level per llm-provider §3.1.2; inline image bytes MUST NOT appear on the span under any configuration. (proposal 0024)
  • observability §5.5.6 — cross-implementation consistency rules for §5.5.1 through §5.5.5. Implementations MUST agree on attribute names, value types, JSON serialization shape (sorted keys, UTF-8, no insignificant whitespace, within-implementation determinism), truncation marker string, inline-image placeholder shape, and the three opt-out flag defaults. Cross-implementation bytewise stability is NOT mandated — JSON encoding rules vary across language standard libraries; conformance fixtures assert parse-shape equivalence rather than bytewise equality. A follow-on MAY adopt a canonical JSON scheme (e.g., RFC 8785 JCS) if cross-impl bytewise stability becomes load-bearing. (proposal 0024)
  • Conformance fixtures 012-otel-llm-payload-default-off through 021-otel-llm-disable-genai-semconv (observability), covering the default-off payload behavior, payload-enabled emission, truncation, image redaction, request-parameter emission (full and partial), RuntimeConfig extras, the GenAI semconv minimum set, gen_ai.system caller-set override, and the disable_genai_semconv opt-out. New harness primitives: disable_llm_payload, disable_genai_semconv, attributes_absent, attribute_parses_as_messages, attribute_parses_as_object, attribute_truncation, attribute_does_not_contain, content_repeat, base64_data_synthetic, provider.genai_system, and a config block under calls_llm (with temperature, max_tokens, top_p, seed, and an extras sub-block for the §6 extra="allow" pass-through fields).
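
The §5.5.5 truncation steps can be approximated in a few lines of Python. The function name is hypothetical, and two details are assumptions not stated in the entry above: that values already within the cap pass through untouched, and that M in the marker is the byte length of the original value.

```python
def truncate_attribute(value: str, cap: int = 64 * 1024) -> str:
    """Truncate a span attribute to at most `cap` UTF-8 bytes (sketch)."""
    if cap < 256:
        raise ValueError("cap must be at least 256 bytes")
    raw = value.encode("utf-8")
    if len(raw) <= cap:
        return value  # assumption: short values are emitted unchanged

    # Compute the marker; M is assumed to be the original byte length.
    marker = f"…[truncated, {len(raw)} bytes total]".encode("utf-8")

    # Target prefix size N = cap - L_marker.
    n = cap - len(marker)

    # Backtrack while the byte at the cut point is a UTF-8 continuation byte
    # (0b10xxxxxx), so no multi-byte sequence is split.
    while n > 0 and (raw[n] & 0xC0) == 0x80:
        n -= 1

    # Emit prefix + marker; the marker sits outside any JSON encoding.
    return (raw[:n] + marker).decode("utf-8")
```

Working on the encoded bytes rather than on characters is what keeps the result within the byte cap for CJK and emoji content.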

Notes

  • Pre-1.0 MINOR bump. Additions only — no existing attribute is renamed and no v0.7.0 behavior is removed. Implementations currently passing the v0.16.1 fixtures continue to pass; the new fixtures (012–021) extend the suite with cases for the additions. Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.16.1 may target v0.17.0 directly.
  • Default-on by feature, default-off by privacy. §5.5.2 request parameters and §5.5.3 response attributes emit by default (disable_genai_semconv = False); this default is what makes LLM-aware backend rendering work without per-user shims. §5.5.1 payload attributes (messages, response content, extras) emit only when explicitly opted in (disable_llm_payload = True by default), which protects users from inadvertent PII leakage and surprise storage costs. This is a deliberate divergence from the industry default-on convention (OpenInference, LangSmith, and Phoenix all default to emitting content); the proposal's Alternatives considered section records the rationale.
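
To make the two defaults concrete, here is a small sketch. The class name LlmSpanOptions, the helper, and the group labels are hypothetical; only the two flag names and their default values are taken from this release's entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LlmSpanOptions:
    """Hypothetical observer options; only the flag names come from the spec."""

    disable_llm_payload: bool = True     # payload attributes off unless opted in
    disable_genai_semconv: bool = False  # gen_ai.* attributes on unless opted out


def emitted_attribute_groups(options: LlmSpanOptions) -> list[str]:
    """List which attribute groups an LLM span would carry (illustrative labels)."""
    groups = []
    if not options.disable_genai_semconv:
        groups.append("genai_semconv")  # §5.5.2 request + §5.5.3 response
    if not options.disable_llm_payload:
        groups.append("llm_payload")    # §5.5.1 messages, content, extras
    return groups
```

With no arguments the options emit only the GenAI semconv group; a caller who wants message rendering passes disable_llm_payload=False explicitly.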

[0.16.1] — 2026-05-16

Changed

  • graph-engine §6 attempt_index description clarified. The original text ("For nodes not wrapped by retry middleware … attempt_index MUST be 0. For nodes wrapped by retry middleware that re-attempts execution, attempt_index increments per attempt…") was ambiguous on whether "wrapped" included transitive wrapping via middleware on a containing subgraph. Tightened to make explicit that attempt_index increments per attempt for nodes wrapped by retry middleware EITHER directly (the node's own per-node middleware chain) OR transitively (via §9.7 instance middleware or §11.7 branch middleware). Fixture 036 (pipeline-utilities/036-parallel-branches-with-branch-middleware-retry) already encoded the transitive-wrapping reading via its alpha_inner_attempt_indices_seen: [0, 1] invariant and its companion .md prose; the spec text now matches what the fixture has required since v0.11.0.
  • pipeline-utilities §5 attempt-index paragraph clarified. Parallel tightening to the graph-engine §6 change. Also notes that the propagation mechanism is implementation-defined (Python contextvars.ContextVar set by the retry middleware before each next call, TypeScript AsyncLocalStorage or equivalent) so the retry middleware can publish its current attempt counter to events emitted from inner nodes of any subgraph the retry re-invokes. A cross-reference to graph-engine §6's nested-retry precedence rule (innermost-wins) is added at the end of the paragraph.
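
The contextvars-based propagation mentioned above might look like the following. This is a synchronous toy with hypothetical names (with_retry, current_attempt_index); a real retry middleware would also classify errors and apply backoff.

```python
import contextvars

# Published by the retry middleware, read by anything emitting events beneath it.
_attempt_index = contextvars.ContextVar("attempt_index", default=0)


def current_attempt_index() -> int:
    """What an inner node's event emitter would read (0 when no retry wraps it)."""
    return _attempt_index.get()


def with_retry(next_call, max_attempts=3):
    """Toy retry middleware that sets the attempt counter before each call."""

    def wrapped(*args, **kwargs):
        last_error = None
        for attempt in range(max_attempts):
            token = _attempt_index.set(attempt)
            try:
                return next_call(*args, **kwargs)
            except Exception as exc:  # a real classifier would filter here
                last_error = exc
            finally:
                # Restore the outer value so nested retries resolve innermost-wins.
                _attempt_index.reset(token)
        raise last_error

    return wrapped
```

Because the counter travels through the context instead of through call arguments, a node several subgraph levels below the retry still observes the current attempt, which is the transitive reading the clarified text requires.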

Notes

  • Pre-1.0 PATCH bump. Spec-text clarification to match existing fixture behavior. Implementations that already passed fixture 036 (alpha_inner_attempt_indices_seen: [0, 1]) under v0.11.0 see no behavior change. Implementations that read the §6 text as direct-wrapping-only — and therefore would have failed fixture 036 — need to add transitive propagation of the retry's attempt counter through the wrapping chain. The spec text now explicitly mandates what the fixture already required.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.16.0 may target v0.16.1 directly without implementing v0.16.0 first.

[0.16.0] — 2026-05-15

Added

  • pipeline-utilities §10.10 — new canonical configuration-time category checkpoint_state_migration_chain_ambiguous. Raised when the registered migration set contains an ambiguity that prevents the engine from picking a unique chain. Two cases trigger the category: a duplicate (from_version, to_version) pair at registration (per §10.12.1) and multiple distinct shortest paths between a source / target version pair at chain resolution (per §10.12.2). Non-transient. Mutually exclusive with the other three migration-related categories (checkpoint_record_invalid, checkpoint_state_migration_missing, checkpoint_state_migration_failed) on any given resume; chain-ambiguous routes first because it fires at build or load time before any migration runs or deserialization is attempted. (proposal 0018)
  • Conformance fixture 047-state-migration-chain-ambiguous (pipeline-utilities), covering both the duplicate-pair-at-registration case and the ambiguous-shortest-paths-at-resolution case via the new expected_chain_ambiguity_error harness primitive. The primitive accepts the named category surfacing at either build time or during resume, preserving §10.12.2's compile-time-SHOULD / load-time-acceptable carve-out so implementations detecting ambiguity at either point pass the same fixture.
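
Both triggers for the new category can be detected with a short sketch: a duplicate check at registration and a breadth-first search that counts shortest paths at resolution. The exception class and function names are hypothetical, and migrations are reduced to bare (from_version, to_version) pairs.

```python
from collections import deque


class ChainAmbiguous(Exception):
    """Stand-in for checkpoint_state_migration_chain_ambiguous (hypothetical name)."""


def register(migrations):
    """Build an adjacency map, rejecting duplicate (from_version, to_version) pairs."""
    edges = {}
    for src, dst in migrations:
        targets = edges.setdefault(src, [])
        if dst in targets:
            raise ChainAmbiguous(f"duplicate migration {src} -> {dst}")
        targets.append(dst)
    return edges


def resolve_chain(edges, source, target):
    """Return the unique shortest chain, or None when no chain exists."""
    dist = {source: 0}
    count = {source: 1}  # number of distinct shortest paths reaching each version
    parent = {}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = count[node]
                parent[nxt] = node
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                count[nxt] += count[node]  # another equally short route
    if target not in dist:
        return None  # a missing chain belongs to a different category
    if count[target] > 1:
        raise ChainAmbiguous(f"{count[target]} shortest paths {source} -> {target}")
    chain, node = [], target
    while node != source:
        chain.append((parent[node], node))
        node = parent[node]
    return chain[::-1]
```

Running resolve_chain over the registered version pairs at build time would correspond to the compile-time SHOULD; calling it lazily during resume corresponds to the load-time carve-out.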

Changed

  • pipeline-utilities §10.12.1 — duplicate-pair sentence names the category. "MUST raise a configuration-time error (the chain is ambiguous)" → "MUST raise checkpoint_state_migration_chain_ambiguous (per §10.10) at registration or compile time, before any resume attempt." (proposal 0018)
  • pipeline-utilities §10.12.2 step 2 — multi-shortest-path clause names the category. "MUST raise a configuration-time error — the same category §10.12.1 raises for duplicate (from_version, to_version) pairs" → "MUST raise checkpoint_state_migration_chain_ambiguous (per §10.10)." The "Implementations SHOULD detect ambiguity at compile time when feasible" guidance immediately following remains unchanged. (proposal 0018)
  • pipeline-utilities §10.10 — mutual-exclusion paragraph rewritten to list all four migration-related categories with the new routing precedence (registry well-formedness → version compatibility → chain application → deserialization). (proposal 0018)

Notes

  • Pre-1.0 MINOR bump. Although v0.15.0 already mandated "a configuration-time error" for both ambiguity cases, naming a canonical category that didn't exist before is implementation-visible: implementations that previously raised an arbitrary configuration error (a language-native ValueError, a generic Error, etc.) must now surface checkpoint_state_migration_chain_ambiguous to pass fixture 047. Matches the precedent set by proposal 0014's category additions (checkpoint_state_migration_missing / _failed), which shipped as the v0.12.0 MINOR bump. The change is small in scope (rename the category surfaced for one specific case) but is correctly classified MINOR per pre-1.0 SemVer.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.15.0 may target v0.16.0 directly without implementing v0.15.0 first.

[0.15.0] — 2026-05-14

Added

  • New capability: prompt-management. Creates spec/prompt-management/spec.md. Defines the contract by which named, versioned templates are fetched from one or more backends, rendered with caller-supplied variables, and turned into LLM-ready message sequences. Core abstractions: Prompt (unrendered template + identity metadata), PromptResult (rendered output + identity + content hashes), PromptManager (user-facing API; composes backends, fetches, renders), PromptBackend (fetch-only protocol; backends plug in), PromptGroup (tracing-grouping primitive for related prompts, N≥2 members). Specifies fetch/render separability with a convenience get(), strict-undefined-by-default variable handling (§7), composite-backend fallback semantics (§8 — fall back only on infrastructure failure, not on logical absence), three canonical error categories (prompt_not_found, prompt_render_error, prompt_store_unavailable), cross-spec touchpoints to llm-provider §3 (message shape) and observability §5.5 (prompt-identity span attributes including openarmature.prompt.name/version/label/template_hash/rendered_hash/group_name), and a deterministic-render contract (§12). (proposal 0017)
  • Conformance fixtures 001-fetch-success through 012-prompt-result-rendered-hash-stability (prompt-management), covering local-backend fetch success, prompt-not-found, prompt-store-unavailable, render success, render-undefined-variable, render determinism, composite-manager fallback on infrastructure unavailability, composite-manager NO-fallback on prompt_not_found, composite-manager all-unavailable, the get() convenience equivalence, PromptGroup shape, and within-implementation rendered_hash stability (cross-implementation stability deferred pending a follow-on tightening of the hash algorithm and canonical serialization).
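
The composite-backend fallback rule (§8) is easy to get backwards, so a rough sketch may help. The exception classes, DictBackend, and fetch_with_fallback are hypothetical names; the real PromptBackend protocol is defined in the capability spec.

```python
class PromptNotFound(Exception):
    """Stand-in for the prompt_not_found category."""


class PromptStoreUnavailable(Exception):
    """Stand-in for the prompt_store_unavailable category."""


class DictBackend:
    """Toy in-memory backend with a switch to simulate an outage."""

    def __init__(self, prompts, available=True):
        self.prompts = prompts
        self.available = available

    def fetch(self, name):
        if not self.available:
            raise PromptStoreUnavailable(name)
        if name not in self.prompts:
            raise PromptNotFound(name)
        return self.prompts[name]


def fetch_with_fallback(backends, name):
    """Try backends in order, falling back only on infrastructure failure."""
    last_error = None
    for backend in backends:
        try:
            return backend.fetch(name)
        except PromptStoreUnavailable as exc:
            last_error = exc  # outage: try the next backend
        # PromptNotFound is deliberately not caught: logical absence in a
        # reachable store is an answer, not a reason to fall back.
    raise PromptStoreUnavailable(f"all backends unavailable for {name!r}") from last_error
```

The asymmetry keeps a stale copy in a secondary store from masking a prompt that was intentionally removed from the primary.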

Notes

  • New capability — no existing-behavior implications. The prompt-management capability is wholly new; no existing capability changes. Implementations MAY adopt it incrementally.
  • The capability composes with llm-provider and observability via cross-spec touchpoints in §11; it does not modify either of those specs in this version. A follow-on observability proposal MAY tighten the MAY propagation guidance in §11 once cross-implementation propagation mechanisms settle.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.14.0 may target v0.15.0 directly without implementing v0.14.0 first.

[0.14.0] — 2026-05-14

Added

  • llm-provider §5 — response_schema parameter on complete(). Optional JSON Schema describing the expected output shape. When None/absent, the call behaves as in v0.4.0 (free-form text content; no parsed value). When present, the top-level schema MUST be an object schema (type: "object" at the root), matching §4 Tool.parameters and OpenAI's strict-mode wire format. Single-method design — same complete() operation handles both free-form and structured-output calls; the response carries a new parsed field when applicable. (proposal 0016)
  • llm-provider §6 — parsed field on Response. Holds the parsed-and-validated structured value when the call supplied a response_schema and the model returned structured content. Absent on free-form calls and on finish_reason: "tool_calls" responses (regardless of whether message.content is also populated, per the §3 assistant-message contract). message.content carries the provider's content string preserved verbatim — implementations MUST NOT re-serialize parsed back into message.content. (proposal 0016)
  • llm-provider §7 — new error category structured_output_invalid. Raised when complete() was called with a response_schema and the provider returned content that could not be parsed as JSON OR did not validate against the schema. The error MUST expose the requested schema, the raw response content, and a description of the parse/validation failure. Non-transient by default — a model that fails schema compliance on a given prompt usually fails the same way on retry; users wanting retry semantics MAY include the category in a RetryMiddleware classifier's transient set. Distinct from provider_invalid_response (which covers wire-shape malformation, not content validation against the caller's schema). (proposal 0016)
  • llm-provider §8.5 Structured output wire mapping. OpenAI request body includes a response_format: { type: "json_schema", json_schema: { name, schema, strict } } field when response_schema is supplied. strict: true enables OpenAI's schema-constrained decoding when the schema satisfies strict-mode constraints; implementations SHOULD fall back to strict: false otherwise. §8.5.1 specifies a prompt-augmentation fallback for providers without native response_format support (construct a modified copy of the message list with a JSON-only directive — caller's messages MUST NOT be mutated). §8.5.2 documents the response mapping (message.content verbatim; parsed is its deserialization against response_schema). (proposal 0016)
  • Conformance fixtures 021-structured-output-success through 028-structured-output-no-schema-regression (llm-provider), covering happy-path success, JSON-parse failure routing, schema-validation failure routing, non-transient retry classification, tool-calls path with schema set (parsed absent), native wire-format mapping, prompt-augmentation fallback path, and the no-schema regression (v0.4.0 behavior preserved when response_schema is absent).
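As a rough illustration of the structured-output entries above, the following Python sketch builds the OpenAI `response_format` entry and routes parse or validation failures to a single error. The names `StructuredOutputInvalid`, `build_response_format`, and `parse_structured` are hypothetical, the validation is deliberately minimal (root type and `required` keys only), and raising `ValueError` for a non-object root schema is an assumption rather than something the spec text here confirms.

```python
import json
from typing import Any


class StructuredOutputInvalid(Exception):
    """Hypothetical error type for the structured_output_invalid category."""

    def __init__(self, schema: dict, raw_content: str, description: str):
        super().__init__(description)
        self.schema = schema            # the requested schema
        self.raw_content = raw_content  # provider content, verbatim
        self.description = description  # parse/validation failure detail


def build_response_format(name: str, schema: dict, strict: bool = True) -> dict:
    """Build the response_format entry sent when response_schema is supplied."""
    if schema.get("type") != "object":
        # Assumption: the sketch rejects non-object roots with ValueError.
        raise ValueError("response_schema root must be an object schema")
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": strict},
    }


def parse_structured(content: str, schema: dict) -> Any:
    """Parse content as JSON, then run a tiny stand-in for schema validation."""
    try:
        value = json.loads(content)
    except json.JSONDecodeError as exc:
        raise StructuredOutputInvalid(schema, content, f"not valid JSON: {exc}") from exc
    if not isinstance(value, dict):
        raise StructuredOutputInvalid(schema, content, "root value is not an object")
    missing = [key for key in schema.get("required", []) if key not in value]
    if missing:
        raise StructuredOutputInvalid(schema, content, f"missing required keys: {missing}")
    return value
```

A real implementation would swap the `required` check for a full JSON Schema validator; the point of the sketch is that `message.content` stays untouched while `parsed` is derived from it.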

Changed

  • llm-provider §10 Out of scope — structured output deferral removed. The single "Structured output — JSON mode, schema-constrained decoding, response_format" entry is removed; §5/§6/§7/§8.5 collectively cover the capability. Other §10 entries (streaming, audio/video, token counting, provider-native wire formats, agent loop, retry/rate-limit, prompt template rendering, embeddings) unchanged. (proposal 0016)

Notes

  • Additive change to complete() signature and Response shape (pre-1.0 MINOR). Existing callers that don't supply response_schema see no behavior change — the parsed field is absent on free-form responses, and the wire body MUST NOT include response_format. The new structured-output path is fully opt-in.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.13.0 may target v0.14.0 directly without implementing v0.13.0 first.

[0.13.0] — 2026-05-14

Added

  • llm-provider §3.1 Content blocks. New subsection defining text and image blocks for use in user-message content. Text blocks carry a single text string; image blocks carry a source (url or inline base64), a conditional media_type (required for inline sources, ignored for URL sources; required to be one of image/png, image/jpeg, image/webp at minimum), and an optional detail hint ("auto" / "low" / "high"). A user message MAY mix text and image blocks freely; block order is preserved through the wire. v1 scope: image input on user messages only — assistant-output images, audio, and video remain deferred. (proposal 0015)
  • llm-provider §7 — new error category provider_unsupported_content_block. Raised when the bound model does not support a content block type used in the request (e.g., text-only model received an image block, or media_type/source variant unsupported). Pre-send validation or post-receive mapping; non-transient. (proposal 0015)
  • llm-provider §8.1.1 Content-block wire mapping. Each spec content block maps to one OpenAI content-array entry: TextBlock → { "type": "text", ... }; ImageBlock with URL source → { "type": "image_url", "image_url": { "url": ... } }; ImageBlock with inline source → { "type": "image_url", "image_url": { "url": "data:<media_type>;base64,<base64_data>" } } per RFC 2397. The detail hint maps to image_url.detail. Empty blocks are rejected pre-send via provider_invalid_request. (proposal 0015)
  • Conformance fixtures 009-content-blocks-text-only-equivalence through 020-content-blocks-inline-image-missing-media-type (llm-provider), covering text-only equivalence with the string form, URL-image and inline-base64 image mapping, the detail hint, mixed-order preservation, empty-sequence and empty-text-block validation, image-block-missing-source structural rejection, invalid detail-value enum rejection, inline-image-missing-media-type rejection, unsupported-by-model error routing, and the user-only restriction.
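The content-block wire mapping described above can be sketched in Python as follows. The dataclass field names, the `InvalidRequest` exception, and the `to_openai_content` helper are hypothetical. The sketch routes every pre-send rejection to one exception, whereas the entries above only confirm `provider_invalid_request` for empty blocks.

```python
from dataclasses import dataclass
from typing import Optional, Union


class InvalidRequest(ValueError):
    """Hypothetical stand-in for a pre-send provider_invalid_request."""


@dataclass
class TextBlock:
    text: str


@dataclass
class ImageBlock:
    url: Optional[str] = None          # URL source
    base64_data: Optional[str] = None  # inline source
    media_type: Optional[str] = None   # required for inline sources
    detail: Optional[str] = None       # "auto" / "low" / "high"


Block = Union[TextBlock, ImageBlock]


def to_openai_content(blocks: list) -> list:
    """Map content blocks to OpenAI content-array entries, preserving order."""
    if not blocks:
        raise InvalidRequest("empty content-block sequence")
    out = []
    for block in blocks:
        if isinstance(block, TextBlock):
            if not block.text:
                raise InvalidRequest("empty text block")
            out.append({"type": "text", "text": block.text})
            continue
        if block.detail not in (None, "auto", "low", "high"):
            raise InvalidRequest("invalid detail value")
        if block.url is not None:
            url = block.url  # media_type is ignored for URL sources
        elif block.base64_data is not None:
            if not block.media_type:
                raise InvalidRequest("inline image needs media_type")
            # RFC 2397 data URL built from the inline source.
            url = f"data:{block.media_type};base64,{block.base64_data}"
        else:
            raise InvalidRequest("image block has no source")
        image_url = {"url": url}
        if block.detail is not None:
            image_url["detail"] = block.detail
        out.append({"type": "image_url", "image_url": image_url})
    return out
```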

Changed

  • llm-provider §3 Message shape — user-role content constraint. content on user messages MAY be either a non-empty string (the v1 form) OR a non-empty ordered sequence of content blocks per §3.1. All other roles remain text-string-only in this version. (proposal 0015)
  • llm-provider §8.1 Request mapping — user row. Updated to reflect the dual-shape input: string content maps directly to OpenAI's content string; content-block sequence maps to OpenAI's content-array form per §8.1.1. (proposal 0015)
  • llm-provider §10 Out of scope — multi-modal entry split. The single "multi-modal content (image, audio, video inputs and outputs)" entry split into two: "Multi-modal audio and video" (audio and video each warrant their own proposal — formats, codecs, wire mappings differ enough) and "Image outputs" (assistant-message-borne images; v1 image support is user-input-only). Image inputs are now covered by §3.1. (proposal 0015)

Notes

  • Additive change to §3 user-message content shape (pre-1.0 MINOR). Existing callers that pass content as a string continue to work unchanged; the new content-block sequence form is opt-in. Implementations that previously rejected non-string content via provider_invalid_request now accept the content-block sequence form when the message is a user message — an observable behavior change for that specific case, classified pre-1.0 MINOR.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.12.0 may target v0.13.0 directly without implementing v0.12.0 first.

[0.12.0] — 2026-05-14

Added

  • pipeline-utilities §10.12 State migrations. Activates the schema_version field that proposal 0008 reserved on CheckpointRecord and adds a registration surface for user-supplied transformations that run on checkpoint load when the stored record's schema_version does not match the current state schema's version. Specifies migration registration (§10.12.1, including backend-constraint requirements for class-bound serialization formats and the configuration-time-error rejection of duplicate (from_version, to_version) pairs), chain resolution (§10.12.2, including migration-function-failure handling), no-op fast path on matching versions (§10.12.3), and composition with checkpoint_record_invalid (§10.12.4). (proposal 0014)
  • pipeline-utilities §10.10 — two new error categories. checkpoint_state_migration_missing (raised on version mismatch when no migration chain connects stored to current; non-transient; carries the registered migration set in the error description) and checkpoint_state_migration_failed (raised when a registered migration function itself raises; non-transient; preserves the underlying exception as cause). The three migration-related categories (checkpoint_record_invalid, ..._missing, ..._failed) are mutually exclusive on any given resume per the §10.10 ordering. (proposal 0014)
  • Conformance fixtures 039-state-migration-additive-field through 046-state-migration-function-raises (pipeline-utilities), covering additive-field migration, chain application, missing/no-path registry, no-op when versions match, parent-state migration, post-migration deserialization failure routing to checkpoint_record_invalid, and migration-function-raise routing to checkpoint_state_migration_failed.
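A minimal Python sketch of the migration flow described above: registration keyed by `(from_version, to_version)`, chain resolution, the no-op fast path, and the two failure routes. `MigrationRegistry`, `MigrationMissing`, and `MigrationFailed` are hypothetical names, state is modeled as a plain dict, and the shortest-chain breadth-first search is an assumption about how chain resolution might work, not the rule §10.12.2 prescribes.

```python
from collections import deque
from typing import Callable

Migration = Callable[[dict], dict]


class MigrationMissing(Exception):
    """Hypothetical stand-in for checkpoint_state_migration_missing."""


class MigrationFailed(Exception):
    """Hypothetical stand-in for checkpoint_state_migration_failed."""


class MigrationRegistry:
    def __init__(self) -> None:
        self._steps = {}  # (from_version, to_version) -> migration function

    def register(self, from_version: str, to_version: str, fn: Migration) -> None:
        key = (from_version, to_version)
        if key in self._steps:
            # Duplicate pairs are rejected at configuration time.
            raise ValueError(f"duplicate migration {key}")
        self._steps[key] = fn

    def _chain(self, stored: str, current: str) -> list:
        # Breadth-first search for the shortest chain stored -> current.
        queue = deque([(stored, [])])
        seen = {stored}
        while queue:
            version, path = queue.popleft()
            if version == current:
                return path
            for (src, dst) in self._steps:
                if src == version and dst not in seen:
                    seen.add(dst)
                    queue.append((dst, path + [(src, dst)]))
        raise MigrationMissing(
            f"no chain {stored} -> {current}; registered: {sorted(self._steps)}"
        )

    def migrate(self, state: dict, stored: str, current: str) -> dict:
        if stored == current:
            return state  # no-op fast path on matching versions
        for key in self._chain(stored, current):
            try:
                state = self._steps[key](state)
            except Exception as exc:
                # Preserve the underlying exception as the cause.
                raise MigrationFailed(f"migration {key} raised") from exc
        return state
```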

Changed

  • pipeline-utilities §10.2 schema_version description. Reframed as a user-facing identifier carried on the user's state schema, not an implementation-internal backend version. State classes that do not declare a schema_version carry an implementation-defined sentinel and are not migration-eligible. Users intending to evolve their schema across deploys MUST declare an explicit identifier so migrations can register against it. (proposal 0014)
  • pipeline-utilities §10.10 checkpoint_record_invalid description. Removed "incompatible schema_version" from the list of structural-failure reasons; raw schema_version mismatches now route through the migration system per §10.12. Added "post-migration state that fails to deserialize against the current state class per §10.12.4" as a covered case. The category remains non-transient. (proposal 0014)

Notes

  • Additive change to §10.10's category list (pre-1.0 MINOR). Resumes where the stored and current schema_version match see no behavior change. Resumes with a version mismatch observe the new routing: implementations that previously raised checkpoint_record_invalid on raw schema_version mismatch now route through checkpoint_state_migration_missing (when no migration chain connects), checkpoint_state_migration_failed (when a registered migration raises), or checkpoint_record_invalid (when the backend cannot support migration per §10.12.1). An observable behavior change for the version-mismatch case, classified pre-1.0 MINOR.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.11.0 may target v0.12.0 directly without implementing v0.11.0 first.

[0.11.0] — 2026-05-13

Added

  • pipeline-utilities §11 Parallel branches. A topology-driven concurrency primitive: a parallel-branches node dispatches M heterogeneous compiled subgraphs concurrently within a single parent invocation. Each branch is a separately compiled subgraph with potentially different state schema, different middleware, and different topology; per-branch projection in (inputs) and out (outputs) lets each branch read and write parent-state fields. Complements the §9 fan-out primitive (data-driven, N instances of one subgraph). Specifies configuration (§11.1, §11.1.1), per-branch projection (§11.2, §11.4), concurrent execution (§11.3), error policy (§11.5), composition with parent and per-branch middleware (§11.6, §11.7), determinism (§11.8), and the new error categories parallel_branches_no_branches (compile-time) and parallel_branches_branch_failed (runtime, non-transient) (§11.9). (proposal 0011)
  • graph-engine §3 Execution model — concurrency exception extended to parallel branches. The single-threaded execution rule now carves out two bounded exceptions: fan-out (§9) and parallel-branches (§11). Both may execute multiple subgraphs concurrently; single-threaded execution resumes for the parent run after the concurrent node completes. (proposal 0011)
  • graph-engine §6 Observer hooks — branch_name field on NodeEvent. Optional non-empty string, populated only on events from nodes inside a parallel-branches branch. Carries the branch's name as declared in the parallel-branches node's branches mapping. The event-source uniqueness invariant is extended to include branch_name: the combination of namespace, branch_name, fan_out_index, attempt_index, and phase uniquely identifies an event source. branch_name and fan_out_index are independent and MAY both be present simultaneously when a fan-out node executes inside a parallel-branches branch (or vice versa). (proposal 0011)
  • Conformance fixtures 032-parallel-branches-basic through 038-parallel-branches-compose-with-fan-out (pipeline-utilities) and 021-observer-branch-name (graph-engine).
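To make the parallel-branches shape concrete, here is a small asyncio sketch under stated assumptions: each branch is modeled as a dict with a `run` coroutine function plus `inputs` and `outputs` field mappings, parent state is a plain dict, and updates are merged with `dict.update` in declaration order. All names are hypothetical, and error policy, middleware, and observer events are left out.

```python
import asyncio
from typing import Awaitable, Callable


class ParallelBranchesNoBranches(Exception):
    """Hypothetical stand-in for parallel_branches_no_branches."""


def parallel_branches(branches: dict) -> Callable[[dict], Awaitable[dict]]:
    """Build a node that runs every branch concurrently.

    `inputs` maps parent field -> branch field; `outputs` maps
    branch field -> parent field.
    """
    if not branches:
        # Rejected when the node is built, mirroring a compile-time error.
        raise ParallelBranchesNoBranches("at least one branch is required")

    async def run_one(spec: dict, parent_state: dict) -> dict:
        # Project parent fields into the branch's own state shape.
        branch_state = {dst: parent_state[src] for src, dst in spec["inputs"].items()}
        result = await spec["run"](branch_state)
        # Project branch results back out as a parent partial update.
        return {dst: result[src] for src, dst in spec["outputs"].items()}

    async def node(parent_state: dict) -> dict:
        updates = await asyncio.gather(
            *(run_one(spec, parent_state) for spec in branches.values())
        )
        merged = {}
        for update in updates:  # gather returns results in declaration order
            merged.update(update)
        return merged

    return node
```

Because `asyncio.gather` returns results in argument order regardless of completion order, the merged update is the same from run to run in this sketch.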

Notes

  • Additive change to the §6 NodeEvent shape (pre-1.0 MINOR). Existing observers that ignore the new branch_name field continue to function unchanged; the field is absent on events from nodes not inside any parallel-branches branch. The change is backwards-compatible at the struct level.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.10.0 may target v0.11.0 directly without implementing v0.10.0 first.

[0.10.0] — 2026-05-09

Added

  • graph-engine §6 Observer hooks — fan_out_config field on NodeEvent. Optional structured value populated only on a fan-out node's own started and completed events. Carries the resolved values for the four observability §5.4 fan-out attributes: item_count (non-negative int), concurrency (positive int or null; null = unbounded, matching pipeline-utilities §9.2's resolved type), error_policy ("fail_fast" or "collect"), parent_node_name (string, equal to the event's node_name). Absent on all other events. When fan_out_config is populated, all four keys are always present (observers can rely on key presence); only concurrency is nullable, with the other three keys always non-null. The field is the canonical surfacing mechanism — observers source the §5.4 attributes from event.fan_out_config rather than from any implementation-private mechanism. The 0 sentinel in observability §5.4's openarmature.fan_out.concurrency OTel attribute is an attribute-mapping pragmatism (OTel primitives can't carry null) and does not appear on the canonical field. (proposal 0013)
  • observability §5.4 Fan-out span attributes — editorial cross-reference paragraph. Specifies how the existing §5.4 attributes are sourced from the new graph-engine §6 fan_out_config field, preserving §5.4's two-span-category distinction: item_count/concurrency/error_policy go on the fan-out node span and source from fan_out_config on the fan-out node's events; parent_node_name goes on per-instance instance spans (also surfaced via fan_out_config on the fan-out node's started event but cached by the observer and applied when synthesizing per-instance spans, since per-instance events don't carry fan_out_config); fan_out_index continues to source from event.fan_out_index on inner-node events. The paragraph also notes that §4's per-instance fan-out instance span layout applies regardless of detached mode (already true in §4's prose; the cross-reference makes it explicit for §5.4 readers). No new normative behavior in §5.4.
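The relationship between the canonical `fan_out_config` field and the OTel attribute mapping might look like the sketch below. `FanOutConfig` and `fan_out_node_span_attributes` are hypothetical names. Only `openarmature.fan_out.concurrency` is named in the entries above; the other two attribute keys are assumed to follow the same pattern.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FanOutConfig:
    item_count: int             # non-negative
    concurrency: Optional[int]  # positive, or None for unbounded
    error_policy: str           # "fail_fast" or "collect"
    parent_node_name: str       # equals the event's node_name


def fan_out_node_span_attributes(config: FanOutConfig) -> dict:
    """Attributes for the fan-out node span, sourced from fan_out_config."""
    return {
        # Assumed key name, following the documented concurrency key.
        "openarmature.fan_out.item_count": config.item_count,
        # OTel attribute values cannot be null, so unbounded maps to 0.
        "openarmature.fan_out.concurrency": (
            0 if config.concurrency is None else config.concurrency
        ),
        # Assumed key name, following the documented concurrency key.
        "openarmature.fan_out.error_policy": config.error_policy,
    }
```

`parent_node_name` is deliberately absent from the returned mapping: per the entry above it belongs on the per-instance spans, so an observer would cache it from the fan-out node's started event and apply it there.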

Notes

  • Additive change to the §6 NodeEvent shape (pre-1.0 MINOR). Existing observers that ignore the new field continue to function unchanged; the field is absent on non-fan-out events. The change is backwards-compatible at the struct level.
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.9.0 may target v0.10.0 directly without implementing v0.9.0 first.
  • No new conformance fixtures. Conformance fixture observability/006-otel-fan-out-instance-attribution already exercises both pieces of the change — the four fan-out node-span attributes (now sourced from fan_out_config) and the per-instance subgraph span layout (already required by §4).
  • Cross-spec impact verified: pipeline-utilities §9 fan-out node configuration unchanged (the new field is sourced from the existing config; no shape change at the configuration boundary). observability §4 per-instance layout requirement unchanged (this proposal cross-references it without altering it). llm-provider §1-§9 untouched.
  • Surfaced during Phase 6.1 PR-C.2 scoping in openarmature-python — the initially recommended implementation-private ContextVar pattern (per coordination thread phase-6-1-pr-c-conformance-fixtures round 06) does not survive the observer's worker-task boundary because async-runtime context-copy semantics freeze the worker's context at task creation. ContextVar mutations on the engine side after worker creation are invisible to the worker. The data must flow through the canonical event payload to cross the queue. Three alternatives considered (ContextVar, typed pre_state subclass, sidecar extra mapping); fan_out_config field on canonical NodeEvent chosen for typed, language-portable surfacing.

[0.9.0] — 2026-05-09

Changed

  • graph-engine §3 Execution model — completed event fires after edge evaluation (BREAKING, but pre-1.0). Step 3 of the execution loop is amended: the completed observer event MUST be dispatched after the merge in step 2 AND the edge evaluation in step 4 both complete, rather than between them. The dispatched event captures the node's complete transition: body execution, reducer merge, and outgoing edge resolution. The failure list in step 3 extends to include routing_error (no matching edge) and edge_exception (edge function raised) — both now populate the error field of the preceding node's completed event rather than propagating without an event. (proposal 0012)
  • graph-engine §6 Observer hooks — routing_error and edge_exception share the preceding node's event pair (BREAKING, but pre-1.0). Replaces the v0.6.0 wording "routing_error does NOT produce its own node event pair" with a uniform "edge-resolution failures land on the preceding node's completed event with error populated; observer applies its standard §4.2 status-mapping path." All five §4 runtime error categories now land via the same mechanism. No new event flow; no implementation-side post-end span mutation; no observer code path additions for edge-resolution errors.
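The reordered loop can be pictured with a toy synchronous engine. This is a simplified sketch, not the spec's execution loop: `run`, `RoutingError`, and the dict-shaped events are hypothetical, the reducer is a plain dict merge, and a routing failure is modeled as an edge function returning a node name that does not exist.

```python
from typing import Callable, Optional


class RoutingError(Exception):
    """Hypothetical stand-in for the routing_error category."""


def run(nodes: dict, edges: dict, start: str, state: dict,
        observe: Callable[[dict], None]) -> dict:
    """Per node: body, merge, edge evaluation, and only then `completed`."""
    current: Optional[str] = start
    while current is not None:
        observe({"node_name": current, "phase": "started"})
        error: Optional[Exception] = None
        next_node: Optional[str] = None
        try:
            state = {**state, **nodes[current](state)}  # body + merge
            next_node = edges[current](state)           # edge evaluation
            if next_node is not None and next_node not in nodes:
                raise RoutingError(f"no matching edge from {current!r}")
        except Exception as exc:
            error = exc
        # Dispatched after merge AND edge evaluation, so edge-resolution
        # failures land on this node's completed event.
        event = {"node_name": current, "phase": "completed"}
        if error is not None:
            event["error"] = error
            observe(event)
            raise error
        event["post_state"] = state
        observe(event)
        current = next_node
    return state
```

With this ordering an observer needs no extra code path for edge-resolution errors: it sees one started/completed pair for the preceding node, with `error` populated, and the downstream node never starts.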

Added

  • Conformance fixture 020-observer-edge-error-events (graph-engine). Two sub-cases — routing_error_lands_on_preceding_node_completed, edge_exception_lands_on_preceding_node_completed — verify that edge-resolution failures share the preceding node's started/completed pair with error populated, the downstream node never runs, and the error category on the completed event matches the §4 category propagated to the invoke() caller.

Notes

  • Breaking change to v0.6.0+ §6 event-shape contract permitted by pre-1.0 SemVer (per GOVERNANCE.md). Same shape as v0.6.0's pair-model breaking bump (also pre-1.0 MINOR).
  • Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.8.2 may target v0.9.0 directly without implementing the v0.8.2 ordering first. openarmature-python's Phase 6.1 PR-C.1 is the canonical first implementation of this contract.
  • Cross-spec impact verified: observability §4.2 status mapping picks up routing_error and edge_exception automatically (the existing error-populated completed-event handler covers them). No changes required in observability §4.2/§5/§6, pipeline-utilities §6/§9/§10, or llm-provider §1-§9.
  • Surfaced during Phase 6.1 PR-C scoping in openarmature-python — conformance fixture observability/004-otel-routing-error-attribution could not drive cleanly under the v0.8.2 §3/§6 ordering. Two paths considered (sentinel routing_error event vs. ordering swap); swap was chosen for uniform §4 category treatment and to avoid implementation-defined post-end span mutation.

[0.8.2] — 2026-05-06

Fixed

  • Conformance fixture 029-checkpoint-subgraph-resume (pipeline-utilities) used namespace: ["inner"] (the subgraph's name) in its expected completed_positions entry where it should have used namespace: ["dispatch"] (the wrapper node's name in the parent graph). Per graph-engine §6 and the convention established by fixture 013-observer-subgraph-namespacing-and-ordering, namespace is the chain of containing-graph node names, not subgraph names. NodePosition.namespace excludes the node's own name, so for step_one inside subgraph "inner" dispatched by outer node "dispatch", the saved position carries namespace: ["dispatch"]. Bug introduced when fixture 029 was first written; caught during Phase 5 (checkpointing) implementation in openarmature-python — the engine implementation correctly follows §6's convention; the fixture was inconsistent. Without this fix, fixture 029 would reject any conformant implementation. Fixture-only correction; no spec text or contract changes. (PR #30)

[0.8.1] — 2026-05-05

Added

  • Conformance fixture 019-subgraph-two-level-nesting (graph-engine). Regression coverage at depth 3 — existing subgraph fixtures (006, 011, 013) only exercised depth 1, leaving the §6 len(parent_states) == len(namespace) - 1 invariant and the §2 default-projection chain untested at namespace length 3 / parent_states length 2. First graph-engine fixture using the plural subgraphs: form (already in use in observability and pipeline-utilities). No spec text or contract changes. (PR #28)

[0.8.0] — 2026-05-04

Added

  • pipeline-utilities §10 Checkpointing (created). A normative Checkpointer protocol — save / load / list / delete keyed by invocation_id — that lets a graph invocation persist state at well-defined save points and resume from a prior invocation_id without restarting from scratch. The protocol is backend-agnostic: §10 defines the contract; reference implementations (InMemoryCheckpointer, SQLiteCheckpointer) ship in core; durable-execution adapters (Temporal, DBOS, Restate, Redis) plug in as sibling packages. The engine fires a save at every graph-engine §6 completed event for outermost-graph nodes, subgraph-internal nodes, and the fan-out node itself (when the fan-out has fully completed). Fan-out instance internals do NOT save in v1, since v1 fan-out resume is atomic-restart and saving inner-instance state the engine cannot resume from would be dead weight. (proposal 0008)
  • §10.1.1 Registration and default behavior. Checkpointing is opt-in via Checkpointer registration at graph build time. Without a registered Checkpointer the engine never calls save() and invoke(resume_invocation=...) raises checkpoint_not_found. Mirrors the §6 observer-registration pattern and matches OpenArmature's broader "contract is normative; activation is an explicit choice" pattern.
  • §10.4 Resume model. invoke(resume_invocation=invocation_id) loads the prior record, restores state, mints a new invocation_id for the resumed run, preserves the original correlation_id as the cross-attempt join key, and resumes from the first node in graph topological order whose position is not in completed_positions. Subgraph re-entry uses parent_states. State-restore (not event-replay) — sufficient because graph-engine §5's determinism contract makes state at any boundary equivalent to "all prior nodes' merged contributions."
  • §10.5 Idempotency contract. Nodes MUST be idempotent under re-execution; mid-node crashes restart the node from its entry on resume. Three explicit escape hatches for nodes that cannot be made idempotent: application-level idempotency (idempotency keys, conditional writes — recommended); a sentinel-based skip middleware on top of pipeline-utilities §6; or skip checkpoint registration entirely.
  • §10.6 Retry on resume. attempt_index resets to 0 on resume; retry budgets restart fresh. Consistent with the "resume is a new execution attempt" framing (§10.4 step 4).
  • §10.7 Fan-out resume — atomic in v1. A crash mid-fan-out causes the entire fan-out to re-run on resume. Couples directly to §10.3's "no fan-out internal saves" rule. A follow-on proposal will add per-instance fan-out resume with configurable backend batching for fan-out internal saves.
  • §10.8 Composition with §6 observer hooks. Checkpointer.save calls SHOULD emit a §6-style observer event so the observability mapping can surface saves as spans (openarmature.checkpoint.save recommended). SHOULD-level to allow high-throughput backends to suppress event emission.
  • §10.9 Composition with detached trace mode. Detached trace mode (observability §4.4) and checkpoint scope are independent. Detached trace mode is purely about trace UI organization; checkpoint scope is about execution recovery. One invoke() call produces one Checkpointer record set keyed by one invocation_id, regardless of how many detached traces it produced.
  • §10.10 New canonical runtime error categories. checkpoint_not_found (non-transient — raised when Checkpointer.load returns None); checkpoint_save_failed (engine behavior implementation-defined — transient via middleware OR raise to caller; implementation MUST document its choice); checkpoint_record_invalid (non-transient — raised when a loaded record's schema is incompatible with the current graph).
  • §10.11 Reference implementations and backend layering. Core ships InMemoryCheckpointer (not durable; tests, short-lived runs) and SQLiteCheckpointer (durable on a single host, WAL-mode, accepts pickleable or JSON-native state). Sibling-package adapters for Temporal, DBOS, Restate, and Redis are informative — not specified normatively.
  • Eight conformance fixtures, 024-031 (pipeline-utilities): save-on-every-completed-event, resume-from-completed-position, record-shape, attempt-index-resets-on-resume, fan-out-atomic-restart, subgraph-resume, checkpoint-not-found, correlation-id-preserved-across-resume.
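A rough Python picture of the Checkpointer contract and the resume rule follows. The method signatures, the `CheckpointRecord` fields shown, and `resume_point` are assumptions made for illustration: the entries above name the four operations and the `invocation_id` key but not their exact shapes. Positions are simplified to plain node names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Protocol


@dataclass
class CheckpointRecord:
    invocation_id: str
    correlation_id: str
    state: dict
    completed_positions: List[str] = field(default_factory=list)


class Checkpointer(Protocol):
    """Assumed shape of the save / load / list / delete protocol."""

    def save(self, record: CheckpointRecord) -> None: ...
    def load(self, invocation_id: str) -> Optional[CheckpointRecord]: ...
    def delete(self, invocation_id: str) -> None: ...
    def list(self) -> List[str]: ...


class InMemoryCheckpointer:
    """Not durable: a dict keyed by invocation_id."""

    def __init__(self) -> None:
        self._records = {}

    def save(self, record: CheckpointRecord) -> None:
        self._records[record.invocation_id] = record

    def load(self, invocation_id: str) -> Optional[CheckpointRecord]:
        return self._records.get(invocation_id)

    def delete(self, invocation_id: str) -> None:
        self._records.pop(invocation_id, None)

    def list(self) -> List[str]:
        return sorted(self._records)


def resume_point(topological_order: List[str],
                 completed_positions: List[str]) -> Optional[str]:
    """First node in topological order that has not completed, if any."""
    done = set(completed_positions)
    for name in topological_order:
        if name not in done:
            return name
    return None
```

A `load` that returns `None` is the case the spec routes to `checkpoint_not_found`; the sketch leaves that translation to the caller.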

[0.7.0] — 2026-04-29

Added

  • observability capability (created). Establishes the observability surface; the first backend mapping is OpenTelemetry. Defines a span hierarchy rooted at an openarmature.invocation span with node, subgraph, fan-out instance, retry attempt, and LLM-provider child spans (§4); span status mapping (§4.2) where engine-raised errors per graph-engine §4 produce ERROR status with exception_recorded; the openarmature.* attribute namespace covering invocation, node, subgraph, fan-out, LLM-provider, and cross-cutting attributes (§5); opt-in detached trace mode per subgraph or per fan-out node (§4.4) for very large fan-outs and long-running subgraphs, where the dispatch span carries an OTel Link to a new trace_id; canonical span-name table (§4.5); a normative §6 TracerProvider isolation rule — openarmature MUST emit through its own private TracerProvider, never the OTel global one, preventing duplicate signals when callers run their own auto-instrumentation; a §5.5 LLM-provider span MUST emit rule with a disable_llm_spans opt-out for callers who prefer external instrumentation; OTel Logs Bridge integration so log records emitted during an invocation carry the active trace_id/span_id (§7); and a §8 determinism contract that asserts deterministic span content (hierarchy, names, attributes minus timing, status) while carving out IDs and timestamps. (proposal 0007)
  • §3 Cross-backend correlation ID — first-class architectural concept. A per-invocation correlation_id propagated across every backend the implementation emits to: caller-supplied verbatim or auto-generated UUIDv4 when absent; propagated via the language's idiomatic context primitive (Python ContextVar, TypeScript AsyncLocalStorage); reset between invocations; flows unchanged across detached subgraphs/fan-outs (invocation-scoped, not trace-scoped). For the OTel mapping it surfaces as openarmature.correlation_id on every span (§5.6) and every log record (§7); future backend mappings (Langfuse, etc.) follow the same per-backend "correlation ID realization" pattern.
  • §5.1 openarmature.invocation_id MUST UUIDv4. Framework-generated, canonical 36-character UUIDv4. Distinct from correlation_id: invocation_id ties spans of one invocation together within one backend; correlation_id is the cross-backend join key. Backends MUST NOT conflate them.
  • Conformance fixture suite 001-011 for observability: basic trace shape, subgraph hierarchy, error status, routing-error attribution to the preceding node span, LLM-provider span nested under the calling node (with disable_llm_spans and external-auto-instrumentation isolation sub-cases), fan-out instance attribution via fan_out_index, retry attempt spans (sibling-level), detached trace mode for both subgraph and fan-out, correlation_id cross-cutting + UUIDv4 + context-reset, log correlation including the detached-trace interaction, and determinism over the deterministic portion of span content.
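The correlation-ID lifecycle from §3 (caller-supplied verbatim, UUIDv4 when absent, reset between invocations) maps naturally onto Python's `contextvars`. The sketch below is illustrative only; `invoke` and `current_correlation_id` are hypothetical names, not the implementation's API.

```python
import uuid
from contextvars import ContextVar
from typing import Callable, Optional

_correlation_id: ContextVar[Optional[str]] = ContextVar("correlation_id", default=None)


def current_correlation_id() -> Optional[str]:
    """Read the correlation_id bound to the current invocation, if any."""
    return _correlation_id.get()


def invoke(run: Callable[[], dict], correlation_id: Optional[str] = None) -> dict:
    """Run one invocation with a correlation_id bound for its duration."""
    # Caller-supplied values are used verbatim; otherwise mint a UUIDv4.
    token = _correlation_id.set(correlation_id or str(uuid.uuid4()))
    try:
        return run()
    finally:
        _correlation_id.reset(token)  # nothing leaks into the next invocation
```

Any span or log record emitted inside `run` can read `current_correlation_id()` without the value being threaded through every call signature.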

[0.6.0] — 2026-04-28

Added

  • pipeline-utilities §9 Parallel fan-out (created). A fan_out node type that executes a compiled subgraph (or async callable) once per item in a parent state field, with bounded concurrency, and collects per-instance results back into a parent collection field. Two modes: items_field (data-driven; instance count = len(items_field_value), items projected per-instance via item_field) and count (count-driven; literal int OR callable (state) -> int; no per-item data). Mutually exclusive. Default concurrency: 10 (also int-or-callable). Default error_policy: "fail_fast" (cancel siblings on first failure); alternative "collect" (run all, omit failed slots, record errors in errors_field). New instance_middleware config wraps each instance's invocation as a unit (the seam for whole-instance retry vs. per-inner-node retry). Empty fan-out (items_field == [] or count == 0) raises fan_out_empty by default (on_empty: "raise"); user opts in to silent no-op via on_empty: "noop". Optional count_field writes the resolved instance count to a parent state field for programmatic inspection. New compile error categories fan_out_field_not_list, fan_out_count_mode_ambiguous. New runtime error categories fan_out_invalid_count, fan_out_invalid_concurrency, fan_out_empty (non-transient — does not auto-resolve via retry). (proposal 0005)
  • graph-engine §3 Execution model — fan-out concurrency exception. Single-threaded execution rule carved out so a fan-out node may execute multiple subgraph instances concurrently. Single-threaded execution resumes for the parent run after the fan-out completes.
  • graph-engine §6 — fan_out_index field on the node event shape. Optional non-negative integer; populated only on events from nodes inside a fan-out instance. The combination of namespace, fan_out_index, attempt_index, and phase uniquely identifies an event source.
  • graph-engine §6 — per-observer phase subscription. Optional phases parameter on observer registration. Accepted values: {"started", "completed"} (default), {"completed"} (v0.5.0-style; useful for metrics/log aggregators), {"started"} (useful for stuck-node alerting). Empty phase sets raise at registration. Engine filters delivery; phase filter applies at delivery, not dispatch.
  • Conformance fixtures for pipeline-utilities 017-023 (fan-out basic, fail-fast, collect, retry-middleware, instance-middleware-retry, count-and-concurrency-modes, empty-input) and for graph-engine 017-018 (fan-out index, phase subscription).
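For readers new to the fan-out primitive, the following asyncio sketch shows bounded concurrency with the two error policies and the empty-input rule. It is a simplification under stated assumptions: `fan_out`, `FanOutEmpty`, and the `results`/`errors` return keys are hypothetical, and the real node projects items from and collects results into parent state fields.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class FanOutEmpty(Exception):
    """Hypothetical stand-in for the fan_out_empty category."""


async def fan_out(
    items: list,
    worker: Callable[[object], Awaitable[object]],
    concurrency: Optional[int] = 10,
    error_policy: str = "fail_fast",
    on_empty: str = "raise",
) -> dict:
    """Run `worker` once per item with bounded concurrency."""
    if not items:
        if on_empty == "raise":
            raise FanOutEmpty("fan-out received no items")
        return {"results": [], "errors": []}  # on_empty == "noop"
    limit = asyncio.Semaphore(concurrency) if concurrency else None

    async def run_one(item: object) -> object:
        if limit is None:
            return await worker(item)
        async with limit:  # at most `concurrency` workers at once
            return await worker(item)

    tasks = [asyncio.ensure_future(run_one(item)) for item in items]
    if error_policy == "fail_fast":
        try:
            return {"results": list(await asyncio.gather(*tasks)), "errors": []}
        except Exception:
            for task in tasks:
                task.cancel()  # cancel siblings on first failure
            await asyncio.gather(*tasks, return_exceptions=True)
            raise
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    return {
        # "collect": run all, omit failed slots, record the errors.
        "results": [o for o in outcomes if not isinstance(o, BaseException)],
        "errors": [o for o in outcomes if isinstance(o, BaseException)],
    }
```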

Changed

  • graph-engine §6 Event dispatch — replaced single-event-per-attempt with started/completed pairs (BREAKING, but pre-1.0). Each node attempt now produces TWO events: a started event before the node executes, and a completed event after the reducer merge (or after a failure is captured). Both events share node_name, namespace, step, attempt_index, fan_out_index, pre_state, parent_states. started events have post_state and error absent; completed events have exactly one of post_state or error populated. Required new phase field on the event shape. The pair model makes span boundaries cleaner for OpenTelemetry mapping and other observability backends; doubled event volume is mitigated by per-observer phase subscription.
  • graph-engine §6 — removed the v0.5.0 "Middleware-dispatched events" subsection. Under the pair model, the engine instruments at the inner-node-call level: each invocation of the wrapped node function produces a started/completed pair from the engine. Retry middleware no longer dispatches its own events — engine handles per-attempt events naturally. The "Middleware-dispatched events" mechanism added in v0.5.0 is no longer needed and is removed.
  • pipeline-utilities §6.1 Retry middleware — manual dispatch removed. Pseudocode simplified: no more dispatch_failed_attempt_event(...) calls. Each call to next(state) triggers a fresh started/completed pair from the engine. The "Per-attempt observer events" subsection rewritten to reflect engine-handled events.
  • pipeline-utilities §8 Out of scope — removed "Parallel fan-out / fan-in" (now in §9).
  • Existing v0.5.0 conformance fixtures updated for the pair model: graph-engine/conformance/012-016 (5 fixtures) and pipeline-utilities/conformance/011, 015 — every event in expected.observer_events split into a started/completed pair; delivery_order updated to include phase field.

Notes

  • Breaking change to v0.5.0 §6 contract permitted by pre-1.0 SemVer (per GOVERNANCE.md). Per the new "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.5.0 may target v0.6.0 directly without implementing the v0.5.0 contract first.

[0.5.0] — 2026-04-28

Added

  • pipeline-utilities capability (created). Establishes the foundational pipeline-utilities surface. §2 specifies the middleware primitive: an async wrapper around node execution with the shape (state, next) -> partial_update, supporting pre-node and post-node phases, short-circuit, exception recovery, and reentrant next calls. §3 mandates per-node and per-graph registration with per-graph-outside-per-node composition. §4 mandates strict bidirectional subgraph-boundary locality (parent middleware sees the subgraph as a single dispatch; subgraph middleware never sees parent state). §6 specifies two canonical middleware implementations that MUST ship: retry (§6.1) with a default classifier aligned to llm-provider §7 transient categories, exponential-with-full-jitter backoff, explicit cancellation propagation, and per-attempt observer event dispatch; and timing (§6.2) with a monotonic-clock duration record, an on_complete callback, and per-node node_name capture. (proposal 0004)
  • New RetryMiddleware.classifier signature (exception, state) -> bool. Default classifier ignores state and matches purely on §7 transient categories; user-supplied classifiers MAY consult pre-merge state for context-dependent retry policies.
  • Conformance fixture suite 001-016 for pipeline-utilities, exercising basic firing, composition ordering, per-graph-vs-per-node nesting, short-circuit, error propagation, error recovery, retry success/exhaustion/passthrough/determinism, subgraph isolation, timing basic firing/failure path, timing+retry composition, retry per-attempt observer events, and retry state-aware classifier.
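
As a rough illustration of the retry middleware described above, the following sketch combines the (state, next) -> partial_update shape, the (exception, state) -> bool classifier, and exponential backoff with full jitter. `ProviderError`, `make_retry`, the parameter names, and the default values are hypothetical. The transient set lists only provider_model_not_loaded because that is the one category this changelog explicitly marks as transient; the full set is defined by llm-provider §7.

```python
import asyncio
import random

class ProviderError(Exception):
    """Hypothetical error type carrying a canonical category string."""
    def __init__(self, category):
        super().__init__(category)
        self.category = category

# Assumption: placeholder subset; the authoritative transient set lives in llm-provider §7.
TRANSIENT_CATEGORIES = {"provider_model_not_loaded"}

def default_classifier(exception, state):
    """Ignores state and matches purely on the transient categories."""
    return isinstance(exception, ProviderError) and exception.category in TRANSIENT_CATEGORIES

def make_retry(max_attempts=3, base_delay=0.1, max_delay=5.0,
               classifier=default_classifier, rng=random.random, sleep=asyncio.sleep):
    """Build a middleware with the shape (state, call_next) -> partial_update."""
    async def middleware(state, call_next):
        attempt = 0
        while True:
            try:
                return await call_next(state)
            except Exception as exc:
                # asyncio.CancelledError derives from BaseException, so cancellation
                # is never caught here and propagates to the caller.
                attempt += 1
                if attempt >= max_attempts or not classifier(exc, state):
                    raise
                # Full jitter: a uniform delay between 0 and the capped exponential ceiling.
                ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
                await sleep(rng() * ceiling)
    return middleware
```

The `rng` and `sleep` parameters are injected so that a test harness can make the backoff deterministic. A user-supplied classifier can read `state` to make context-dependent decisions, for example giving up early when the state records that a fallback has already been used.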

Changed

  • graph-engine §6 Observer hooks — attempt_index field added to node event shape. Non-negative integer, default 0. For nodes wrapped by retry middleware (pipeline-utilities §6.1) that re-attempts execution, attempt_index increments per attempt; combined with node_name and namespace it uniquely identifies events from a retried node. The len(parent_states) == len(namespace) - 1 invariant is unaffected. (proposal 0004)
  • graph-engine §6 Event dispatch — events fire per attempt, not per node execution. For nodes not wrapped by re-attempting middleware, this is exactly once per node execution (unchanged from v0.4.0). For nodes wrapped by retry middleware, one event fires per attempt: the engine dispatches the final attempt's event; the retry middleware dispatches events for any preceding failed attempts via the new "Middleware-dispatched events" subsection.
  • graph-engine §6 — new "Middleware-dispatched events" subsection. Middleware MAY dispatch additional node events through the engine's delivery queue. The pipeline-utilities canonical retry middleware MUST do so for non-final attempts. The dispatch mechanism is implementation-defined; middleware-dispatched events follow the same delivery-queue rules, observer-error isolation, and §5 determinism contract as engine-dispatched events.
  • Graph-engine conformance fixture 016-observer-attempt-index-default — verifies the new attempt_index field defaults correctly to 0 for non-retry workflows.

Notes

  • Open question deferred from proposal 0004: per-conditional-branch middleware. Documented as an Out-of-scope item in pipeline-utilities §8 with workarounds (state markers + per-node middleware).

[0.4.0] — 2026-04-28

Added

  • llm-provider capability (created). Establishes the foundational LLM provider abstraction: typed Message (system/user/assistant/tool), Tool, ToolCall, and Response shapes; stateless async complete() operation; pre-flight ready() check with a strong "next call expected to succeed" contract; seven canonical error categories (provider_authentication, provider_unavailable, provider_invalid_model, provider_model_not_loaded, provider_rate_limit, provider_invalid_response, provider_invalid_request); a normative OpenAI-compatible wire format mapping (§8) covering vLLM, LM Studio, llama.cpp, and the OpenAI hosted API. Charter §3.1 principle 8 ("Transparency over abstraction") is realized by Response.raw (verbatim provider response, always populated) and by surfacing partial/malformed tool calls under finish_reason: "error" for application-level repair. (proposal 0006)
  • New canonical runtime category provider_model_not_loaded — distinct from provider_invalid_model. The model is configured but not currently serving (local-server warmup pattern); marked transient (retry MAY succeed once loading completes).
  • Response.raw field — the parsed provider response verbatim, MUST be populated on every successful complete() return. Provider-specific extensions (logprobs, vendor stats) surface here unchanged.
  • Tool-call id verbatim preservation rule — implementations MUST NOT rewrite or normalize provider-supplied ids. Documents cross-provider id round-tripping behavior for applications behind LLM gateways or routers.
  • Conformance fixture suite 001-008 for llm-provider, exercising basic completion, tool-call roundtrip with verbatim id preservation, pre-send message validation, error category mapping, OpenAI wire-format mapping with raw passthrough, usage accounting, the strengthened ready() contract, and partial/malformed tool calls under finish_reason: "error".
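
To make the Response.raw and verbatim tool-call id rules concrete, here is a minimal sketch that maps an OpenAI-compatible chat completion payload onto typed shapes. The dataclass field names and `parse_response` are assumptions for illustration and are not taken from the spec. The handling of partial or malformed tool calls under finish_reason: "error" is deliberately left out.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ToolCall:
    id: str                    # provider-supplied id, copied verbatim
    name: str
    arguments: Dict[str, Any]

@dataclass
class Response:
    content: Optional[str]
    tool_calls: List[ToolCall]
    finish_reason: str
    raw: Dict[str, Any]        # parsed provider response, always populated

def parse_response(raw: Dict[str, Any]) -> Response:
    """Hypothetical mapping from an OpenAI-compatible payload to a Response."""
    choice = raw["choices"][0]
    message = choice["message"]
    tool_calls = [
        ToolCall(
            id=call["id"],  # no rewriting or normalization of the id
            name=call["function"]["name"],
            arguments=json.loads(call["function"]["arguments"]),
        )
        for call in (message.get("tool_calls") or [])
    ]
    return Response(
        content=message.get("content"),
        tool_calls=tool_calls,
        finish_reason=choice["finish_reason"],
        raw=raw,  # vendor extensions such as logprobs stay reachable here unchanged
    )
```

Keeping `raw` as the same parsed object means an application can read provider-specific fields without the abstraction having to model them.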

[0.3.1] — 2026-04-28

Fixed

  • Conformance fixture 013-observer-subgraph-namespacing-and-ordering was syntactically invalid YAML and could not be parsed by spec-conforming loaders (PyYAML, libyaml). The four parent_states: values inside the flow-style event mappings used block-style sub-sequences (- {...}), which YAML 1.2 §8.1.2 forbids inside a flow context. Converted those four sub-sequences to flow style ([{...}]); the parsed semantic content is unchanged. No spec text or fixture expectations changed.

[0.3.0] — 2026-04-27

Added

  • graph-engine §6 Observer hooks (promoted from informative to normative). Compiled graphs MUST expose a way to register observers (graph-attached and invocation-scoped, at minimum). Observers are async, fire-and-forget, and receive node events with node_name, namespace (ordered sequence), step (monotonic across the invocation including subgraph-internal nodes), pre_state, exactly one of post_state or error, and parent_states (ordered sequence of containing-graph state snapshots, outermost first; empty for outermost-graph events; len(parent_states) == len(namespace) - 1). pre_state/post_state carry the node-level state shape — outer state for outermost-graph nodes, subgraph state for inner nodes. Per-invocation delivery is strictly serial across all observers and all events; per-event order is graph-attached outermost→innermost, then invocation-scoped. Observer errors MUST NOT interrupt the graph run, prevent other observers from receiving the same event, or prevent subsequent events from being delivered. Compiled graphs MUST expose a drain operation. (proposal 0003)
  • graph-engine §3 Execution model — observer dispatch step. Between the reducer merge and the outgoing-edge evaluation, the engine MUST dispatch the node event onto the observer delivery queue. On a failed merge step, the event is dispatched (with error populated) before the failure propagates to the caller.
  • Conformance fixture 012-observer-basic-firing — linear graph with one graph-attached and one invocation-scoped observer; verifies per-node event firing, monotonic step, single-element namespace, and graph-attached-before-invocation-scoped delivery order.
  • Conformance fixture 013-observer-subgraph-namespacing-and-ordering — outer + subgraph each with an attached observer; verifies chained namespace, step monotonicity across the subgraph boundary, and outermost-first delivery for subgraph-internal events.
  • Conformance fixture 014-observer-error-event — failing-node event has error populated and post_state absent; engine still propagates the §4 node_exception to the caller after dispatch.
  • Conformance fixture 015-observer-error-isolation — first-registered observer raises on every event; verifies the second observer still receives every event, the graph run completes, and the raised exceptions do not propagate to invoke().
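
The delivery rules above (fire-and-forget dispatch, strictly serial delivery, observer-error isolation, and a drain operation) can be sketched as a small asyncio queue. `ObserverQueue` and its method names are hypothetical, and a real engine would likely log observer errors rather than drop them silently.

```python
import asyncio
import collections

class ObserverQueue:
    """Hypothetical per-invocation delivery queue for node events."""

    def __init__(self, graph_attached, invocation_scoped):
        # Per-event order: graph-attached observers first, then invocation-scoped.
        self._observers = list(graph_attached) + list(invocation_scoped)
        self._pending = collections.deque()
        self._worker = None

    def dispatch(self, event):
        """Fire-and-forget: enqueue the event and return without awaiting observers."""
        self._pending.append(event)
        if self._worker is None or self._worker.done():
            self._worker = asyncio.get_running_loop().create_task(self._deliver())

    async def _deliver(self):
        # A single worker delivers one event to one observer at a time (strictly serial).
        while self._pending:
            event = self._pending.popleft()
            for observer in self._observers:
                try:
                    await observer(event)
                except Exception:
                    # Isolation: a failing observer affects neither the run,
                    # the other observers, nor later events.
                    pass

    async def drain(self):
        """Wait until every dispatched event has been delivered."""
        while self._worker is not None and not self._worker.done():
            await self._worker

async def demo():
    seen = []

    async def failing(event):
        raise RuntimeError("observer bug")

    async def recording(event):
        seen.append(event["node_name"])

    queue = ObserverQueue(graph_attached=[failing], invocation_scoped=[recording])
    for name in ("a", "b"):
        queue.dispatch({"node_name": name})
    await queue.drain()
    return seen
```

In `demo`, the first observer raises on every event, mirroring fixture 015, and the second observer still receives both events in dispatch order.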

[0.2.0] — 2026-04-27

Added

  • graph-engine §2 Subgraph — explicit input/output mapping. A subgraph-as-node MAY declare optional inputs (subgraph field name → parent field name) and/or outputs (parent field name → subgraph field name) mappings. inputs is additive over the §2 default of no projection in; outputs replaces (does not extend) the §2 default of field-name matching for projection out. (proposal 0002)
  • New canonical compile-error category mapping_references_undeclared_field — added to the §2 Compiled graph mandated identifier list. Compilation MUST fail with this category when an inputs or outputs mapping names a field that is not declared in the relevant state schema.
  • Conformance fixture 011-subgraph-explicit-mapping — composes the same subgraph at three sites with different mapping configurations (both / inputs-only / outputs-only) and verifies projection-in copies, projection-out replacement vs. fallback, and per-site mapping independence.
  • Conformance fixture 007-compile-errors adds case mapping_references_undeclared_field.
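
The mapping rules above can be sketched with plain dictionaries standing in for typed state. `validate_mapping`, `project_in`, and `project_out` are hypothetical helpers, and raising ValueError with the category string is only a stand-in for whatever compile-error type an implementation uses.

```python
def validate_mapping(inputs, outputs, parent_fields, subgraph_fields):
    """Compile-time check: every mapped name must be declared in the relevant schema."""
    for sub_field, parent_field in (inputs or {}).items():
        if sub_field not in subgraph_fields or parent_field not in parent_fields:
            raise ValueError("mapping_references_undeclared_field")
    for parent_field, sub_field in (outputs or {}).items():
        if parent_field not in parent_fields or sub_field not in subgraph_fields:
            raise ValueError("mapping_references_undeclared_field")

def project_in(parent_state, subgraph_defaults, inputs=None):
    """Projection in: start from the subgraph's own defaults, then copy mapped fields.

    inputs maps subgraph field name -> parent field name and is additive over
    the default of no projection in.
    """
    state = dict(subgraph_defaults)
    for sub_field, parent_field in (inputs or {}).items():
        state[sub_field] = parent_state[parent_field]
    return state

def project_out(subgraph_state, parent_fields, outputs=None):
    """Projection out: build the partial update handed to the parent's reducers.

    outputs maps parent field name -> subgraph field name and replaces
    (does not extend) the default field-name matching.
    """
    if outputs is None:
        return {name: value for name, value in subgraph_state.items()
                if name in parent_fields}
    return {parent_field: subgraph_state[sub_field]
            for parent_field, sub_field in outputs.items()}
```

With outputs set to {"query": "text"}, a subgraph field named result no longer flows back to a parent field of the same name, because the explicit mapping replaces the name-matching default instead of extending it.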

[0.1.1] — 2026-04-18

Changed

  • graph-engine §2 Subgraph (clarification, non-behavioral). Rewrote the Subgraph section to align with conformance fixture 006-subgraph-composition, which already encoded the intended behavior. The corrected defaults: projection in is off (a subgraph runs from its own schema's field defaults, independent of the parent), and projection out uses field-name matching (subgraph fields whose names match parent fields merge back via the parent's reducers; non-matching subgraph fields are discarded). The previous wording said parent fields were copied into the subgraph's initial state by field-name matching at entry, which contradicted fixture 006. No fixtures change.
  • proposal 0002 (Draft) — Summary, Motivation, and Detailed design. Reworded so inputs is additive over the clarified "no projection in" default, while outputs continues to replace the default field-name matching for projection out. Added an asymmetry note explaining the design choice; tightened the Precedence rationale to outputs-only.

[0.1.0] — 2026-04-16

Added

  • Initial graph-engine capability: typed state, async nodes, static and conditional edges, reducers (last_write_wins, append, merge), subgraph composition, and the baseline execution model. (proposal 0001)
  • Conformance fixtures for graph-engine under spec/graph-engine/conformance/ (10 fixture pairs covering linear flow, conditional routing, each reducer, subgraph composition, compile-time errors, routing errors, node exception propagation, and determinism).
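
A minimal sketch of how the three reducers might combine a node's partial update with the current state. The exact semantics are assumptions here: append is taken to concatenate sequences, merge is taken to be a shallow dictionary merge, and last_write_wins is used when no reducer is declared for a field. `apply_update` is a hypothetical helper, not a spec operation.

```python
def last_write_wins(current, update):
    """The updated value replaces the current one."""
    return update

def append(current, update):
    """Assumption: both values are sequences and the update is concatenated."""
    return list(current) + list(update)

def merge(current, update):
    """Assumption: shallow merge where keys in the update override existing keys."""
    return {**current, **update}

def apply_update(state, partial_update, reducers):
    """Return a new state with each updated field combined by its reducer."""
    new_state = dict(state)
    for field, value in partial_update.items():
        reducer = reducers.get(field, last_write_wins)  # assumed default
        new_state[field] = reducer(state[field], value)
    return new_state
```

Returning a new dictionary rather than mutating the input keeps the pre-merge state intact, which is convenient for anything that needs to compare state before and after a node runs.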

Notes

  • Mandated error-category identifiers (proposal 0001 supplement). §2 fixes the canonical compile-time categories (no_declared_entry, unreachable_node, dangling_edge, multiple_outgoing_edges, conflicting_reducers), and §4 fixes the canonical runtime categories (node_exception, edge_exception, reducer_error, routing_error, state_validation_error). Proposal 0001 described these cases but did not mandate identifier strings. The identifiers were added pragmatically during the initial implementation PR because no spec version had been released; from 0.1.0 onward, comparable changes require a follow-on proposal.
  • Routing error recoverable state (proposal 0001 supplement). §4 now requires that routing errors carry recoverable state, matching the node-exception contract. Proposal 0001 required recoverable state for node exceptions only. Same pragmatic-pre-release rationale as above.
  • Subgraph projection. Defaults to field-name matching for projection out, as clarified in §2. Alternative projection strategies (e.g., explicit input/output mapping) are deferred to proposal 0002 (Draft).