fix(ai): compute ERC20 transfer amounts deterministically for summaries #266
Merged
Safe multisig alerts run with skip_simulation=True (DELEGATECALL batches our plain-CALL simulator can't model), so there are no Tenderly asset-change rows with pre-normalized amounts. The LLM was left to divide raw calldata values by 10**decimals itself and mis-scaled the figure in the short TLDR (e.g. ~50.8k for a ~50.78-token transfer) while the detail computed it correctly.

Add _collect_token_flows(): detect known ERC20 movement calls, look up decimals via the cached fetch_erc20_metadata, normalize with Decimal (no float error), and inject an authoritative "Token Flows (computed)" section plus a per-token total. System prompt and refine checklist now require the model to use these amounts verbatim in both TLDR and DETAIL instead of re-deriving them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make the Telegram-visible summary the single source of truth and derive the full report from it, so the two artifacts the team sees can never disagree on the headline number or risk verdict. Previously summary and detail were two fields of one structured call, so the model did the same arithmetic twice and could diverge (the ~50.8k summary vs the correct ~50.78 detail).

Split _generate_draft into a two-stage _generate_explanation:
- _generate_summary: structured summary + risk_tag only (SUMMARY_SCHEMA).
- _expand_detail: a second call that writes the detail FROM the confirmed summary (DETAIL_EXPANSION_TASK), required to stay consistent with it.
- _refine_summary (when refine=True) now critiques the summary alone, before expansion, since the summary is authoritative.

The text-fallback path keeps its single joint TLDR+DETAIL completion (no second call); the derive-from-summary guarantee applies to the structured production path. Token Flows from the prior commit still make the number correct; this keeps the two outputs in sync. README updated; tests cover both stages.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
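The two-stage flow described in this commit can be sketched roughly as follows. This is an illustration only, not the code from the PR: `generate_explanation`, the `call_llm` callable, and the prompt strings are hypothetical stand-ins, and the real SUMMARY_SCHEMA and DETAIL_EXPANSION_TASK are not reproduced here.

```python
from typing import Callable, Dict


def generate_explanation(
    call_llm: Callable[[str], str],  # hypothetical: takes a prompt, returns text
    context: str,
    refine: bool = False,
) -> Dict[str, str]:
    """Summary first, then a detail derived from the confirmed summary."""
    # Stage 1: produce only the short summary (with its risk verdict).
    summary = call_llm(f"Summarize with a risk tag:\n{context}")

    # Optional critique runs on the summary alone, before any expansion,
    # because the summary is treated as the authoritative artifact.
    if refine:
        summary = call_llm(
            f"Critique and correct this summary:\n{summary}\n\nContext:\n{context}"
        )

    # Stage 2: the detail is written FROM the confirmed summary, so the
    # headline number is copied forward rather than recomputed.
    detail = call_llm(
        "Expand this confirmed summary into a full report and stay "
        f"consistent with it:\n{summary}\n\nContext:\n{context}"
    )
    return {"summary": summary, "detail": detail}
```

Because the second call receives the final summary text as input, the detail has no independent opportunity to redo the arithmetic, which is the property the commit is after.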
Problem
The AI Summary line in Safe multisig alerts reported the wrong magnitude while the linked "Full details" showed the correct figure. Example:
50.78M raw / 1e6 = 50.78, not 50.8k: the headline was off by ~1000x, but the detail was right.
Root cause
Safe multisig monitoring calls explain_batch_transaction(..., skip_simulation=True) because Safe multiSend batches use DELEGATECALL, which our plain-CALL Tenderly simulator can't model. With simulation skipped there are no Tenderly asset_changes rows (which carry pre-normalized amounts), so the LLM had to do the raw / 10**decimals arithmetic itself, and it mis-scaled the value in the short TLDR while computing it correctly in the detailed analysis.
Fix
Take the arithmetic away from the LLM. New deterministic helper _collect_token_flows():
- Detects known ERC20 movement calls (transfer, transferFrom, mint, burn, approve) by signature.
- Looks up decimals via the cached fetch_erc20_metadata.
- Normalizes with Decimal (exact: 50_780000 / 1e6 → 50.78, no float error) and computes a per-token Total moved.
- Injects an authoritative --- Token Flows (computed — authoritative amounts) --- prompt section.
The system prompt and self-critique checklist now instruct the model to use these amounts verbatim in both TLDR and DETAIL rather than re-deriving them.
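A minimal sketch of the normalization and per-token total, assuming a simplified input shape. The function names `normalize` and `collect_flows`, the tuple layout, and the token symbol `TOK` are hypothetical and are not the PR's actual `_collect_token_flows()` implementation; in the PR the decimals come from the cached `fetch_erc20_metadata`, whereas here they are passed in directly.

```python
from collections import defaultdict
from decimal import Decimal

# Calls that count toward "Total moved". approve is listed in the output
# but sets an allowance rather than moving a balance, so it is not totalled.
MOVEMENT_CALLS = {"transfer", "transferFrom", "mint", "burn"}


def normalize(raw: int, decimals: int) -> Decimal:
    """Scale a raw on-chain integer to token units using Decimal, not float."""
    return Decimal(raw) / (Decimal(10) ** decimals)


def collect_flows(calls):
    """calls: iterable of (symbol, function_name, raw_amount, decimals_or_None)."""
    lines = []
    totals = defaultdict(Decimal)  # Decimal() is zero
    for symbol, func, raw, decimals in calls:
        if decimals is None:
            continue  # decimals not discoverable: skip silently
        amount = normalize(raw, decimals)
        lines.append(f"{func}: {amount:f} {symbol}")
        if func in MOVEMENT_CALLS:
            totals[symbol] += amount
    return lines, dict(totals)


# Hypothetical sample batch (values chosen for illustration only).
sample_calls = [
    ("TOK", "transfer", 50_780_000, 6),
    ("TOK", "transferFrom", 1_220_000, 6),
    ("TOK", "approve", 999_000_000, 6),
    ("UNKNOWN", "transfer", 5, None),
]
flow_lines, flow_totals = collect_flows(sample_calls)
print("\n".join(flow_lines))
print(flow_totals)
```

With the default decimal context this division is exact for amounts up to 28 significant digits; a production helper handling very large raw values might need a wider context.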
approve is listed but excluded from the total (allowance, not a balance move). Falls back silently for non-ERC20 targets / undiscoverable decimals, so nothing regresses. Wired into both explain_transaction and explain_batch_transaction.
Testing
tests/test_ai_explainer.py pass; ruff format + ruff check clean.
🤖 Generated with Claude Code