fix(ai): compute ERC20 transfer amounts deterministically for summaries by spalen0 · Pull Request #266 · yearn/monitoring

spalen0 · 2026-06-04T21:05:54Z

Problem

The AI Summary line in Safe multisig alerts reported the wrong magnitude while the linked "Full details" showed the correct figure. Example:

Total outflow ~50.8k USDC-equivalent (raw sum ~50.78M units, assuming 6 decimals)

50.78M raw / 1e6 = 50.78, not 50.8k — the headline was off by ~1000x, but the detail was right.

Root cause

Safe multisig monitoring calls explain_batch_transaction(..., skip_simulation=True) because Safe multiSend batches use DELEGATECALL, which our plain-CALL Tenderly simulator can't model. With simulation skipped there are no Tenderly asset_changes rows (which carry pre-normalized amounts), so the LLM had to do raw / 10**decimals arithmetic itself — and it mis-scaled the value in the short TLDR while computing it correctly in the detailed analysis.

Fix

Take the arithmetic away from the LLM. New deterministic helper _collect_token_flows():

Detects known ERC20 movement calls (transfer, transferFrom, mint, burn, approve) by signature.
Looks up decimals via the existing cached fetch_erc20_metadata.
Normalizes amounts with Decimal (exact — 50_780000 / 1e6 → 50.78, no float error) and computes a per-token Total moved.
Injects an authoritative --- Token Flows (computed — authoritative amounts) --- prompt section.

The system prompt and self-critique checklist now instruct the model to use these amounts verbatim in both TLDR and DETAIL rather than re-deriving them. approve is listed but excluded from the total (allowance, not a balance move). Falls back silently for non-ERC20 targets / undiscoverable decimals, so nothing regresses. Wired into both explain_transaction and explain_batch_transaction.

Testing

Added 5 unit tests (the exact 50.78-vs-50.8k case, totals, non-ERC20 skip, approve-not-summed, prompt injection).
All 71 tests in tests/test_ai_explainer.py pass; ruff format + ruff check clean.

🤖 Generated with Claude Code

Safe multisig alerts run with skip_simulation=True (DELEGATECALL batches our plain-CALL simulator can't model), so there are no Tenderly asset-change rows with pre-normalized amounts. The LLM was left to divide raw calldata values by 10**decimals itself and mis-scaled the figure in the short TLDR (e.g. ~50.8k for a ~50.78-token transfer) while the detail computed it correctly. Add _collect_token_flows(): detect known ERC20 movement calls, look up decimals via the cached fetch_erc20_metadata, normalize with Decimal (no float error), and inject an authoritative "Token Flows (computed)" section plus a per-token total. System prompt and refine checklist now require the model to use these amounts verbatim in both TLDR and DETAIL instead of re-deriving them. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Make the Telegram-visible summary the single source of truth and derive the full report from it, so the two artifacts the team sees can never disagree on the headline number or risk verdict. Previously summary and detail were two fields of one structured call, so the model did the same arithmetic twice and could diverge (the ~50.8k summary vs the correct ~50.78 detail). Split _generate_draft into a two-stage _generate_explanation: - _generate_summary: structured summary + risk_tag only (SUMMARY_SCHEMA). - _expand_detail: a second call that writes the detail FROM the confirmed summary (DETAIL_EXPANSION_TASK), required to stay consistent with it. - _refine_summary (when refine=True) now critiques the summary alone, before expansion, since the summary is authoritative. The text-fallback path keeps its single joint TLDR+DETAIL completion (no second call); the derive-from-summary guarantee applies to the structured production path. Token Flows from the prior commit still make the number correct; this keeps the two outputs in sync. README updated; tests cover both stages. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

codex and others added 2 commits June 4, 2026 21:05

spalen0 merged commit 0e87dbf into main Jun 4, 2026
2 checks passed

spalen0 deleted the fix-summary branch June 4, 2026 21:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(ai): compute ERC20 transfer amounts deterministically for summaries#266

fix(ai): compute ERC20 transfer amounts deterministically for summaries#266
spalen0 merged 2 commits into
mainfrom
fix-summary

spalen0 commented Jun 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

spalen0 commented Jun 4, 2026

Problem

Root cause

Fix

Testing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant