Background
Forge v0.14.0 shipped OpenTelemetry Tracing v1 (initiative #108, PRs #122-#128). Phase 3 (#104) added span instrumentation across the A2A dispatcher, executor loop, LLM completions, and tool calls, but metadata-only: no prompt text, no completion text, no tool args, no tool results recorded as span attributes.
Phase 2 (#103) plumbed capture_content (bool, default false) and redact (bool, default true) through the config schema (observability.tracing in forge.yaml, --otel-capture-content + --otel-redact CLI flags, env propagation). Operators can set these knobs today and silently get nothing — the resolver populates observability.TracingConfig.CaptureContent / Redact, but the Phase 3 instrumentation in forge-core/runtime/loop.go never reads them.
The PR descriptions on #125 (Phase 3) and #126 (Phase 4) called out this gap explicitly:
Phase 3 is METADATA-ONLY. Tool args / results, prompts, completions are NOT recorded as span attributes. The config schema's CaptureContent + Redact knobs (Phase 2) are plumbed but not yet honored by the instrumentation — content capture will reuse the FWS-8 audit redactor in a follow-up so the same PII scrub passes both pipelines.
This issue is that follow-up.
Why one follow-up, not silently never
The config knobs are the bug. An operator who reads docs/core-concepts/observability-tracing.md and sets capture_content: true with redact: true under observability.tracing in forge.yaml reasonably expects span attributes to carry prompts / completions / tool I/O with PII scrubbed. Today they get metadata-only spans and no error. That's the worst kind of config surface: load-bearing-looking, silently inert.
Scope
Honor TracingConfig.CaptureContent and TracingConfig.Redact in the Phase 3 instrumentation sites:
| Site | Span | Content attribute(s) to add when CaptureContent=true |
| --- | --- | --- |
| forge-core/runtime/loop.go LLM call | llm.completion | request messages serialized, response text; keys: gen_ai.prompt + gen_ai.completion per OTel GenAI semconv |
| forge-core/runtime/loop.go tool call | tool.<name> | forge.tool.args + forge.tool.result |
Each attribute value MUST go through the existing FWS-8 redactor pipeline — same PII / secret scrubbing the audit payload-capture path already uses — before being set on the span. The audit pipeline and the trace pipeline must produce identical scrubbed content for the same logical event so an operator who sees a redacted token in an audit row sees the same redaction on the linked span.
Reuse — do NOT re-implement
The FWS-8 redactor already exists. Canonical hooks:
forge-core/runtime/audit_payload_capture.go — AuditPayloadCapture struct (the per-field opt-in surface) + TruncateForAudit(s, max) (the byte-cap + …[truncated:N] marker helper).
forge-core/runtime/audit.go line ~471 — the call site that gates capture on AuditPayloadCapture flags.
Phase 3.5's job is to:
Extract the redact-and-cap logic from audit_payload_capture.go into a small package-internal helper that takes (content string, redact bool, maxBytes int) -> string and apply byte caps appropriate to span attributes (OTel attribute values have a soft cap around 8 KiB before backends start truncating themselves — pin a Forge-side cap below that).
Call the helper from each of the four content capture points in loop.go (prompt, completion, tool args, tool result) when e.tracingCfg.CaptureContent is true.
Match the audit …[truncated:N] marker so an operator grepping for truncation signals across both pipelines gets identical output.
Redact=false is the enterprise opt-in — raw capture, still capped. Redact=true (default) runs the FWS-8 scrubber.
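A minimal sketch of the helper shape described above, under stated assumptions. The names captureForSpan, scrub, and truncate are hypothetical; scrub is a one-rule stand-in for the FWS-8 redactor (it only matches an AWS access key shape), and the meaning of N in the marker (bytes dropped) is a guess, since the real TruncateForAudit may define it differently. The real helper would call the existing redactor and truncation code rather than these placeholders.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// awsKeyShape is a stand-in rule only; the real FWS-8 scrubber is not reproduced here.
var awsKeyShape = regexp.MustCompile(`AKIA[0-9A-Z]{16}`)

// scrub is a hypothetical placeholder for the FWS-8 redactor.
func scrub(s string) string {
	return awsKeyShape.ReplaceAllString(s, "[REDACTED]")
}

// truncate caps s at max bytes and appends the truncation marker.
// Assumption: N is the number of bytes dropped.
func truncate(s string, max int) string {
	if max < 0 {
		max = 0
	}
	if len(s) <= max {
		return s
	}
	cut := max
	// Back up to a rune boundary so a UTF-8 sequence is never split.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return fmt.Sprintf("%s…[truncated:%d]", s[:cut], len(s)-cut)
}

// captureForSpan follows the proposed signature:
// (content string, redact bool, maxBytes int) -> string.
// Redaction runs before the cap so a secret straddling the cut
// cannot survive as a partial match.
func captureForSpan(content string, redact bool, maxBytes int) string {
	if redact {
		content = scrub(content)
	}
	return truncate(content, maxBytes)
}

func main() {
	fmt.Println(captureForSpan("key AKIAABCDEFGHIJKLMNOP here", true, 1024))
	fmt.Println(captureForSpan("abcdefghij", false, 4))
}
```

Routing both the audit path and the span path through one function of this shape is what would make the scrubbed output and the truncation marker identical across the two pipelines.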
Plumbing
The executor needs to see the resolved observability.TracingConfig. Today LLMExecutorConfig does not carry it. Phase 3.5 adds:
    type LLMExecutorConfig struct {
        // ...
        TracingConfig observability.TracingConfig // new
    }
Populated in forge-cli/runtime/runner.go from the same resolver call Phase 2 already does (runtime.ResolveTracingConfig).
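The gating this plumbing enables might look roughly like the sketch below. It is self-contained, so TracingConfig and LLMExecutorConfig here are trimmed local stand-ins for the real types, and llmContentAttrs, prepareFunc, and spanContentCap (4096) are hypothetical names and values chosen for illustration. A plain map stands in for the span so the example runs without the OTel SDK.

```go
package main

import "fmt"

// TracingConfig mirrors only the two fields named in this issue.
type TracingConfig struct {
	CaptureContent bool
	Redact         bool
}

// LLMExecutorConfig is trimmed to the newly added field.
type LLMExecutorConfig struct {
	TracingConfig TracingConfig
}

// spanContentCap is a hypothetical Forge-side cap, kept below 8 KiB.
const spanContentCap = 4096

// prepareFunc has the helper signature proposed in this issue.
type prepareFunc func(content string, redact bool, maxBytes int) string

// llmContentAttrs returns the content attributes for an llm.completion
// span, or nil when capture is off, so the keys are absent rather than
// present with empty values.
func llmContentAttrs(cfg LLMExecutorConfig, prepare prepareFunc, prompt, completion string) map[string]string {
	tc := cfg.TracingConfig
	if !tc.CaptureContent {
		return nil
	}
	return map[string]string{
		"gen_ai.prompt":     prepare(prompt, tc.Redact, spanContentCap),
		"gen_ai.completion": prepare(completion, tc.Redact, spanContentCap),
	}
}

func main() {
	// passthrough stands in for the redact-and-cap helper.
	passthrough := func(content string, redact bool, maxBytes int) string { return content }

	off := LLMExecutorConfig{}
	on := LLMExecutorConfig{TracingConfig: TracingConfig{CaptureContent: true, Redact: true}}

	fmt.Println(len(llmContentAttrs(off, passthrough, "hi", "hello"))) // 0
	fmt.Println(len(llmContentAttrs(on, passthrough, "hi", "hello")))  // 2
}
```

In the real instrumentation each entry would be set on the span, for example with span.SetAttributes(attribute.String(key, value)) from the OTel Go API, instead of being collected in a map.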
Tests
In forge-core/runtime/loop_spans_test.go (the file Phase 3 added):
TestExecute_CaptureContentTrue_StampsRedactedPromptOnLLMSpan — set CaptureContent=true, Redact=true, send a prompt containing an obviously redactable secret (e.g. AWS access key shape), assert the gen_ai.prompt attribute exists and does NOT contain the raw key.
TestExecute_CaptureContentTrue_RedactFalse_StampsRawPromptOnLLMSpan — the enterprise raw path, asserts the attribute is present and unredacted (still capped at max bytes).
TestExecute_CaptureContentFalse_NoContentAttribute — default; the keys must not appear at all (pinned by omitempty semantics on the attribute set, not by empty string).
TestExecute_LargePrompt_TruncatesWithSameMarkerAsAudit — the byte cap fires and the marker is byte-identical to what the audit payload-capture path produces for the same input.
Same set for tool args / results on tool.<name> spans.
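The assertions these tests share can be sketched as one checker. This is an illustration only: checkContentAttrs is a hypothetical helper, the map stands in for the attributes read back from a recorded span, and the real tests would live in loop_spans_test.go and use whatever span recorder Phase 3 already set up. The point it shows is the difference between a key being absent and a key being present with an empty value.

```go
package main

import (
	"fmt"
	"strings"
)

// contentKeys are the four content attribute keys named in this issue.
var contentKeys = []string{
	"gen_ai.prompt", "gen_ai.completion",
	"forge.tool.args", "forge.tool.result",
}

// checkContentAttrs returns the first violated expectation, or nil.
// With capture off, a content key must be absent (an empty value still
// counts as present). With capture on, no value may contain rawSecret.
func checkContentAttrs(attrs map[string]string, captureContent bool, rawSecret string) error {
	for _, k := range contentKeys {
		v, present := attrs[k]
		if !captureContent {
			if present {
				return fmt.Errorf("%s present with capture off", k)
			}
			continue
		}
		if present && rawSecret != "" && strings.Contains(v, rawSecret) {
			return fmt.Errorf("%s leaks the raw secret", k)
		}
	}
	return nil
}

func main() {
	secret := "AKIAABCDEFGHIJKLMNOP"

	// Capture off, key present but empty: still a violation.
	fmt.Println(checkContentAttrs(map[string]string{"gen_ai.prompt": ""}, false, secret))

	// Capture on, value scrubbed: passes.
	fmt.Println(checkContentAttrs(map[string]string{"gen_ai.prompt": "key [REDACTED]"}, true, secret))
}
```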
Out of scope
New attribute keys beyond the four above. gen_ai.system_instructions (system prompt content) and forge.tool.args.<field> per-field decomposition are interesting but out of scope; pick them up in a follow-follow-up if operators ask.
Sampling-aware capture ("only capture content on dropped traces"). The metadata-only default already handles the storage-cost concern.
Documentation updates
docs/core-concepts/observability-tracing.md § Phase 3 is metadata-only — flip to past tense, describe the new behavior, note the parity with audit.
docs/security/audit-logging.md § Trace cross-link — add a paragraph noting that with capture_content: true set, prompt / completion / tool I/O content appears on both the audit row and the linked span, with the same redaction applied.
.claude/skills/forge.md § 12.9 — drop the "Phase 3 ships metadata-only" caveat sentence; replace with a paragraph on the capture surface.