Skip to content

fix(ai): compute ERC20 transfer amounts deterministically for summaries#266

Merged
spalen0 merged 2 commits into
mainfrom
fix-summary
Jun 4, 2026
Merged

fix(ai): compute ERC20 transfer amounts deterministically for summaries#266
spalen0 merged 2 commits into
mainfrom
fix-summary

Conversation

@spalen0
Copy link
Copy Markdown
Collaborator

@spalen0 spalen0 commented Jun 4, 2026

Problem

The AI Summary line in Safe multisig alerts reported the wrong magnitude while the linked "Full details" showed the correct figure. Example:

Total outflow ~50.8k USDC-equivalent (raw sum ~50.78M units, assuming 6 decimals)

50.78M raw / 1e6 = 50.78, not 50.8k — the headline was off by ~1000x, but the detail was right.

Root cause

Safe multisig monitoring calls explain_batch_transaction(..., skip_simulation=True) because Safe multiSend batches use DELEGATECALL, which our plain-CALL Tenderly simulator can't model. With simulation skipped there are no Tenderly asset_changes rows (which carry pre-normalized amounts), so the LLM had to do raw / 10**decimals arithmetic itself — and it mis-scaled the value in the short TLDR while computing it correctly in the detailed analysis.

Fix

Take the arithmetic away from the LLM. New deterministic helper _collect_token_flows():

  • Detects known ERC20 movement calls (transfer, transferFrom, mint, burn, approve) by signature.
  • Looks up decimals via the existing cached fetch_erc20_metadata.
  • Normalizes amounts with Decimal (exact — 50_780000 / 1e650.78, no float error) and computes a per-token Total moved.
  • Injects an authoritative --- Token Flows (computed — authoritative amounts) --- prompt section.

The system prompt and self-critique checklist now instruct the model to use these amounts verbatim in both TLDR and DETAIL rather than re-deriving them. approve is listed but excluded from the total (allowance, not a balance move). Falls back silently for non-ERC20 targets / undiscoverable decimals, so nothing regresses. Wired into both explain_transaction and explain_batch_transaction.

Testing

  • Added 5 unit tests (the exact 50.78-vs-50.8k case, totals, non-ERC20 skip, approve-not-summed, prompt injection).
  • All 71 tests in tests/test_ai_explainer.py pass; ruff format + ruff check clean.

🤖 Generated with Claude Code

codex and others added 2 commits June 4, 2026 21:05
Safe multisig alerts run with skip_simulation=True (DELEGATECALL batches our
plain-CALL simulator can't model), so there are no Tenderly asset-change rows
with pre-normalized amounts. The LLM was left to divide raw calldata values by
10**decimals itself and mis-scaled the figure in the short TLDR (e.g. ~50.8k
for a ~50.78-token transfer) while the detail computed it correctly.

Add _collect_token_flows(): detect known ERC20 movement calls, look up decimals
via the cached fetch_erc20_metadata, normalize with Decimal (no float error),
and inject an authoritative "Token Flows (computed)" section plus a per-token
total. System prompt and refine checklist now require the model to use these
amounts verbatim in both TLDR and DETAIL instead of re-deriving them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make the Telegram-visible summary the single source of truth and derive the
full report from it, so the two artifacts the team sees can never disagree on
the headline number or risk verdict. Previously summary and detail were two
fields of one structured call, so the model did the same arithmetic twice and
could diverge (the ~50.8k summary vs the correct ~50.78 detail).

Split _generate_draft into a two-stage _generate_explanation:
- _generate_summary: structured summary + risk_tag only (SUMMARY_SCHEMA).
- _expand_detail: a second call that writes the detail FROM the confirmed
  summary (DETAIL_EXPANSION_TASK), required to stay consistent with it.
- _refine_summary (when refine=True) now critiques the summary alone, before
  expansion, since the summary is authoritative.

The text-fallback path keeps its single joint TLDR+DETAIL completion (no second
call); the derive-from-summary guarantee applies to the structured production
path. Token Flows from the prior commit still make the number correct; this
keeps the two outputs in sync. README updated; tests cover both stages.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@spalen0 spalen0 merged commit 0e87dbf into main Jun 4, 2026
2 checks passed
@spalen0 spalen0 deleted the fix-summary branch June 4, 2026 21:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant