Read faster model from safety buffering events #30325
@codex review
Codex Review: Didn't find any major issues. 🎉
Summary
Direct Codex third-party traffic receives safety-buffering metadata from the Responses WebSocket without the Codex first-party treatment headers. This change reads the new optional safety_buffering.faster_model wire field, forwards it through the existing app-server notification, and leaves the existing explicit user retry flow unchanged.
API review: https://app.notion.com/p/38c8e50b62b081308e4ae4719443db6b
Paired Responses API producer: https://github.com/openai/openai/pull/1082770
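As an illustration of the forwarding step in the summary, here is a minimal sketch, not the code in this PR. The type names `SafetyBufferingWire` and `SafetyBufferingUpdated` and the function `to_notification` are hypothetical; the sketch assumes the wire payload has already been decoded, with a missing or null field decoding to `None`.

```rust
/// Hypothetical decoded form of the safety-buffering wire metadata.
/// Assumption: a missing field and an explicit null both decode to `None`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct SafetyBufferingWire {
    pub faster_model: Option<String>,
}

/// Hypothetical payload carried by the app-server notification.
#[derive(Debug, Clone, PartialEq)]
pub struct SafetyBufferingUpdated {
    pub faster_model: Option<String>,
}

/// Forwards the optional wire value unchanged. No retry is started here;
/// the notification only reports which model a later retry could use.
pub fn to_notification(wire: &SafetyBufferingWire) -> SafetyBufferingUpdated {
    SafetyBufferingUpdated {
        faster_model: wire.faster_model.clone(),
    }
}
```

Keeping the field an `Option` end to end means older producers that never send it need no special handling in this sketch.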
Behavior
- faster_model fallback.
- response.create, even when the WebSocket connection is reused.
- null wire fields remain compatible and produce no retry target.
The retry remains explicit and user initiated. It creates an ordinary request with the selected model; all ordinary access, safety, and rate-limit checks still apply.
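The per-response scoping and null handling described here could look roughly like the sketch below. It is an assumption about the shape of the logic, not the implementation in this PR: `Connection`, `begin_response`, `on_safety_buffering`, and `retry_target` are hypothetical names.

```rust
/// Hypothetical state for one reused WebSocket connection.
#[derive(Debug, Default)]
pub struct Connection {
    /// Retry target reported for the response currently in flight, if any.
    current_faster_model: Option<String>,
}

impl Connection {
    /// Assumed to run whenever a new `response.create` is sent, so that a
    /// value from the previous response cannot leak into this one.
    pub fn begin_response(&mut self) {
        self.current_faster_model = None;
    }

    /// Records the optional wire value; `None` stands for a missing or
    /// null field and yields no retry target.
    pub fn on_safety_buffering(&mut self, faster_model: Option<&str>) {
        self.current_faster_model = faster_model.map(str::to_owned);
    }

    /// The model an explicit, user-initiated retry would use, if any.
    pub fn retry_target(&self) -> Option<&str> {
        self.current_faster_model.as_deref()
    }
}
```

Resetting at the start of each response, rather than at connection setup, is what keeps the state correct when the connection is reused.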
Validation
- just test -p codex-api: 134 passed.
- just test -p codex-app-server --test all direct_websocket_safety_buffering_reaches_app_server_notification: passed. This uses a mock WebSocket server and proves the wire value reaches the real model/safetyBuffering/updated notification without leaking a disabled warmup treatment across responses on the same connection.
- just test -p codex-tui safety_buffering_offers_one_retry_with_app_wording: passed and proves the exact model becomes the existing RetrySafetyBufferedTurn action.
- just fix -p codex-api, just fix -p codex-app-server, and just fmt: passed.
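For readers unfamiliar with the TUI check, the property it asserts can be sketched as follows. `UiAction` and `retry_actions` are hypothetical stand-ins; the real RetrySafetyBufferedTurn action in the codebase may carry different fields.

```rust
/// Hypothetical stand-in for the existing retry action.
#[derive(Debug, Clone, PartialEq)]
pub enum UiAction {
    RetrySafetyBufferedTurn { model: String },
}

/// Offers at most one retry action, carrying the exact model name from the
/// notification, and offers nothing when no retry target is known.
pub fn retry_actions(faster_model: Option<&str>) -> Vec<UiAction> {
    faster_model
        .map(|model| UiAction::RetrySafetyBufferedTurn {
            model: model.to_owned(),
        })
        .into_iter()
        .collect()
}
```

The action is only offered to the user; nothing in this sketch sends a request on its own.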