[codex] fix shell runtime value quoting#30
Conversation
|
Codex review: needs maintainer review before merge. Reviewed June 1, 2026, 4:58 PM ET / 20:58 UTC. Summary Reproducibility: yes. from source inspection: current main renders shell-mode commands with Review metrics: 3 noteworthy metrics.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Merge a maintainer-approved literal-data shell placeholder contract with Do we have a high-confidence way to reproduce the issue? Yes, from source inspection: current main renders shell-mode commands with Is this the best way to solve the issue? Yes, assuming maintainers accept the compatibility contract: quote-aware literal rendering is the narrow fix for runtime values as data, and AGENTS.md: not found in the target repository. Codex review notes: model gpt-5.5, reasoning high; reviewed against 1ca672acf013. Label changesLabel changes:
Label justifications:
Evidence reviewedWhat I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
Summary
shell=Trueexecution checks in both verifier implementations{name:raw}as an explicit escape hatch for benchmark-owned shell fragmentsRoot Cause
ExecutionCheck.shelldefaults toTrue, but the shell path rendered placeholders with plain string substitution and then passed the command to/bin/sh -c. Runtime values containing whitespace were split by the shell, and metacharacters could be interpreted as shell syntax.Compatibility
The new renderer treats
{name}as literal data and quotes or escapes it according to shell quote context. Templates that intentionally need a runtime value to expand as a shell fragment can opt in at that insertion point with{name:raw}. The spec documents that{name:raw}is only for benchmark-owned fragments and should not be used for agent-produced or otherwise untrusted values.Fixes #29.
Validation
/tmp/clawbench-shell-render-venv/bin/python -m pytest -q tests/test_execution_shell_rendering.py tests/test_environment_files.py tests/test_environment.py tests/test_services.py/tmp/clawbench-shell-render-venv/bin/python -m ruff check clawbench/render.py clawbench/environment.py clawbench/environment_files.py clawbench/services.py tests/test_execution_shell_rendering.py tests/test_services.py/tmp/clawbench-shell-render-venv/bin/python -m compileall clawbench/render.py clawbench/environment.py clawbench/environment_files.py clawbench/services.pygit diff --check