Skip to content

telemetry: log structured tool and inference timing events#30334

Open
bolinfest wants to merge 1 commit into
mainfrom
pr30334
Open

telemetry: log structured tool and inference timing events#30334
bolinfest wants to merge 1 commit into
mainfrom
pr30334

Conversation

@bolinfest

@bolinfest bolinfest commented Jun 27, 2026

Copy link
Copy Markdown
Collaborator

Why

App-server deployments can emit JSON logs that downstream applications use for diagnostics and operational measurements. Tool latency is especially useful when it can answer more than "how long did the handler run?": consumers may need to distinguish dispatch/queue time from handler time, correlate nested code-mode calls, and avoid double-counting tools that execute concurrently.

Structured INFO events provide those measurements through the existing logging pipeline without requiring an OTEL exporter. For example, an app-server consumer can:

  • identify whether latency accumulated before or during tool execution;
  • correlate events by conversation, turn, call, and trace identifiers;
  • union the intervals from direct tool calls to measure active tool time without double-counting parallel or nested code-mode calls;
  • compare narrow tool execution with complete model-inference sections and the gaps between them;
  • compare client-visible tool latency with exec-server request and process lifetimes.

What changed

  • Added a codex.tool_call completion event covering the full outer tool lifecycle. It records dispatch, handler, and total durations together with conversation, turn, call, source, code-mode, and trace correlation fields.
  • Added a codex.inference completion event for each model sampling section. It reuses the existing turn timing guard and records a per-turn inference index, result, model/provider identity, correlation fields, and total duration.
  • Enriched codex.tool_result with trace/turn/source correlation fields and a numeric duration_seconds field while preserving all existing fields. In particular, arguments and output predate this change and remain unchanged for compatibility.
  • Added codex.exec_server_request and codex.exec_server_process events for request handling and actual process lifetime.
  • Deferred event-only allocation and field computation until the corresponding INFO target is enabled; expressions that can remain inside tracing macros are evaluated there.
  • Added reusable JSONL capture and validation to the app-server test harness, including structured-envelope and RFC 3339 timestamp validation and waiting for repeated named events.
  • Added an end-to-end app-server test that launches the real binary with LOG_FORMAT=json, drives an exec_command through the public v2 JSON-RPC API, and verifies the tool events plus both inference boundaries.

Suggested logging filters

Timing-only app-server consumers can enable the two compact targets without enabling general INFO logging:

LOG_FORMAT=json \
RUST_LOG='warn,codex_core::tools::parallel=info,codex_core::turn_timing=info' \
codex app-server

Exec-server timing can be enabled independently:

RUST_LOG='warn,codex_exec_server::telemetry=info' codex-exec-server

codex.tool_result is emitted on codex_otel.log_only. That pre-existing target includes tool arguments and output, so timing-only consumers should leave it disabled unless they specifically need those verbose fields.

Event examples

These are representative records produced with LOG_FORMAT=json. Identifier and timing values are illustrative, and session metadata on codex.tool_result varies with the caller.

codex.tool_call

This event describes the complete outer tool-call lifetime, including time spent waiting to dispatch and time spent in the handler.

{
  "timestamp": "2026-06-27T03:45:20.443Z",
  "level": "INFO",
  "fields": {
    "message": "tool call completed",
    "event.name": "codex.tool_call",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "conversation.id": "67e55044-10b1-426f-9247-bb680e5fe0c8",
    "turn_id": "019f04f8-6ac2-78f1-8625-f04a6d35af18",
    "tool_name": "exec_command",
    "call_id": "call_7b8483",
    "tool_source": "direct",
    "code_mode_cell_id": "",
    "code_mode_runtime_tool_call_id": "",
    "execution_started": true,
    "dispatch_duration_seconds": 0.012,
    "handler_duration_seconds": 0.431,
    "total_duration_seconds": 0.443
  },
  "target": "codex_core::tools::parallel"
}

codex.inference

This event describes one complete model sampling section. The completion timestamp and duration make both inference boundaries reconstructable, while inference_index orders repeated requests within one turn.

{
  "timestamp": "2026-06-27T03:45:20.882Z",
  "level": "INFO",
  "fields": {
    "message": "inference completed",
    "event.name": "codex.inference",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "conversation.id": "67e55044-10b1-426f-9247-bb680e5fe0c8",
    "turn_id": "019f04f8-6ac2-78f1-8625-f04a6d35af18",
    "model": "gpt-5.1-codex",
    "provider_name": "OpenAI",
    "inference_index": 2,
    "result": "success",
    "duration_seconds": 1.284
  },
  "target": "codex_core::turn_timing"
}

codex.tool_result

This existing event now carries the correlation fields needed to join it to codex.tool_call, plus a numeric duration in seconds. Existing fields such as arguments, output, and session metadata remain available.

{
  "timestamp": "2026-06-27T03:45:20.443Z",
  "level": "INFO",
  "fields": {
    "event.name": "codex.tool_result",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "turn_id": "019f04f8-6ac2-78f1-8625-f04a6d35af18",
    "tool_name": "exec_command",
    "call_id": "call_7b8483",
    "tool_source": "direct",
    "code_mode_cell_id": "",
    "code_mode_runtime_tool_call_id": "",
    "arguments": "{\"cmd\":\"git status\"}",
    "duration_ms": "431",
    "duration_seconds": 0.431,
    "success": "true",
    "output": "Chunk ID: 14f9ac\nProcess exited with code 0",
    "mcp_server": "",
    "mcp_server_origin": "",
    "event.timestamp": "2026-06-27T03:45:20.443Z",
    "conversation.id": "67e55044-10b1-426f-9247-bb680e5fe0c8",
    "app.version": "0.1.0",
    "auth_mode": "chatgpt",
    "originator": "codex_app_server_rs",
    "terminal.type": "unknown",
    "model": "gpt-5.1-codex",
    "slug": ""
  },
  "target": "codex_otel.log_only"
}

codex.exec_server_request

This event measures exec-server handling for one JSON-RPC request and retains the incoming trace context when present.

{
  "timestamp": "2026-06-27T03:45:20.447Z",
  "level": "INFO",
  "fields": {
    "message": "exec-server request completed",
    "event.name": "codex.exec_server_request",
    "request_id": "request-42",
    "method": "process/start",
    "result": "success",
    "duration_seconds": 0.018,
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
  },
  "target": "codex_exec_server::telemetry"
}

codex.exec_server_process

This event measures the lifetime of the actual process, which can outlive the request that started it.

{
  "timestamp": "2026-06-27T03:45:20.861Z",
  "level": "INFO",
  "fields": {
    "message": "exec-server process completed",
    "event.name": "codex.exec_server_process",
    "process_id": "process-018f27",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "result": "success",
    "duration_seconds": 0.414
  },
  "target": "codex_exec_server::telemetry"
}

Test plan

  • just test -p codex-app-server
  • just test -p codex-exec-server
  • just test -p codex-otel
  • just test -p codex-core tools::parallel::tests
  • just test -p codex-core tools::registry::tests
  • just test -p codex-cli app_server_emits_json_info_events

@bolinfest bolinfest changed the title telemetry: log structured tool timing events telemetry: log structured tool and inference timing events Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant