Fix: extend encoded-payload-redact with text-cipher encodings (#203) by twschiller · Pull Request #216 · pixiebrix/agent-browser-shield

twschiller · 2026-06-07T20:59:45Z

Summary

Extends encoded-payload-redact beyond byte encodings (base64 / hex / percent) to cover six text ciphers: ROT13, Atbash, reverse, leetspeak, NATO phonetic, and Morse.
Text ciphers can't use the printable-ASCII-ratio qualifier (the encoded form is already printable), so detection is gated by a distinct-common-English-word count on the decoded output. Substitution ciphers additionally skip candidates whose source text is already English.
NATO runs that decode to a sequential alphabet (ABCDE…) are intentionally left alone — instructional content, not a payload.
Adds 13 example tests + 5 property tests covering each new cipher: positive paths, length/substitution-floor guards, alphabet-drill carve-out, ASCII-art Morse, and a no-false-fire property on plain English prose. Test sources contain only ciphertext / symbolic runs — benign filler is encoded at test time so adversarial phrasing never appears in plaintext.
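The word-count gate described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the word list, the `MIN_DISTINCT_COMMON_WORDS` threshold, and the `mapLetters` and `countDistinctCommonWords` helpers are hypothetical, and the bodies of `alreadyEnglish` and `tryCipherDecode` are guesses based only on how the summary and the review diff use those names.

```typescript
// Illustrative sketch only: the word list and threshold are hypothetical,
// not the values used in the PR.
const COMMON_WORDS = new Set([
  "the", "and", "you", "that", "this", "with",
  "for", "have", "are", "not", "from", "all",
]);
const MIN_DISTINCT_COMMON_WORDS = 3;

type Decoder = (text: string) => string;

// Map each ASCII letter through `f` (0-25 in, 0-25 out), preserving case.
function mapLetters(text: string, f: (n: number) => number): string {
  return text.replace(/[A-Za-z]/g, (c) => {
    const base = c <= "Z" ? 65 : 97;
    return String.fromCharCode(base + f(c.charCodeAt(0) - base));
  });
}

// All three transforms are their own inverse, so "decode" is the same function.
const rot13: Decoder = (text) => mapLetters(text, (n) => (n + 13) % 26);
const atbash: Decoder = (text) => mapLetters(text, (n) => 25 - n);
const reverseText: Decoder = (text) => text.split("").reverse().join("");

// Count how many different common English words appear in the text.
function countDistinctCommonWords(text: string): number {
  const words: string[] = text.toLowerCase().match(/[a-z]+/g) ?? [];
  return new Set(words.filter((w) => COMMON_WORDS.has(w))).size;
}

// Assumed meaning: the source text already passes the English gate as-is.
function alreadyEnglish(candidate: string): boolean {
  return countDistinctCommonWords(candidate) >= MIN_DISTINCT_COMMON_WORDS;
}

// Returns the decoded text only when it looks like English, otherwise null.
function tryCipherDecode(candidate: string, decode: Decoder): string | null {
  const decoded = decode(candidate);
  return countDistinctCommonWords(decoded) >= MIN_DISTINCT_COMMON_WORDS
    ? decoded
    : null;
}
```

Because the ciphertext is itself printable, the gate looks only at the decoded output; the `alreadyEnglish` check is what keeps ordinary prose from being flagged just because some decoder happens to be applicable to it.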

Part of #203.
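The NATO alphabet-drill carve-out from the summary could look something like the sketch below. The tokenization rules, the accepted spelling variants, the three-letter minimum, and the names `decodeNato`, `isSequentialAlphabet`, and `shouldRedactNato` are all assumptions made for illustration; the PR's actual NATO matcher is not shown on this page and presumably applies further qualifiers.

```typescript
// Hypothetical sketch of the alphabet-drill carve-out; not the PR's code.
const NATO_WORDS = [
  "alfa", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel",
  "india", "juliett", "kilo", "lima", "mike", "november", "oscar", "papa",
  "quebec", "romeo", "sierra", "tango", "uniform", "victor", "whiskey",
  "xray", "yankee", "zulu",
];

const NATO_TO_LETTER = new Map<string, string>();
NATO_WORDS.forEach((word, i) => {
  NATO_TO_LETTER.set(word, String.fromCharCode(97 + i));
});
// Common spelling variants (assumed to be worth accepting).
NATO_TO_LETTER.set("alpha", "a");
NATO_TO_LETTER.set("juliet", "j");

// Decode a run of NATO code words; null if any token is not a code word.
function decodeNato(run: string): string | null {
  const tokens = run
    .toLowerCase()
    .split(/[\s,]+/)
    .map((t) => t.replace(/-/g, "")) // "x-ray" -> "xray"
    .filter((t) => t.length > 0);
  if (tokens.length === 0) {
    return null;
  }
  let decoded = "";
  for (const token of tokens) {
    const letter = NATO_TO_LETTER.get(token);
    if (letter === undefined) {
      return null;
    }
    decoded += letter;
  }
  return decoded;
}

const ALPHABET = "abcdefghijklmnopqrstuvwxyz";

// True when the decoded letters are a consecutive slice of the alphabet.
function isSequentialAlphabet(decoded: string): boolean {
  return decoded.length >= 3 && ALPHABET.includes(decoded);
}

// Redact decodable runs unless they are just an alphabet drill.
function shouldRedactNato(run: string): boolean {
  const decoded = decodeNato(run);
  return decoded !== null && !isSequentialAlphabet(decoded);
}
```

Under these assumptions, a run such as "alfa bravo charlie delta echo" is left alone, while a run that spells out a word is treated as a candidate payload.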

Test plan

bun run check in extension/
bun run test in extension/ — encoded-payload-redact.test.ts (31 cases) and encoded-payload-redact.property.test.ts (12 cases) all pass
Manually verify a ROT13 / Morse / NATO snippet on a sample page gets the click-to-reveal placeholder
Manually verify ordinary English prose containing digits or scattered Morse-like punctuation is not redacted
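For the Morse checks in the plan, a strict decoder with a length floor is one way to avoid firing on scattered punctuation. The sketch below is an assumption-laden illustration: `encodeMorse`, `decodeMorse`, the " / " word separator, and `MIN_MORSE_LETTERS` are hypothetical and may not match the PR's matcher. The encoder also shows how a test could build its ciphertext at run time instead of embedding plaintext.

```typescript
// Hypothetical sketch; the PR's Morse matcher is not shown on this page.
const MORSE: Record<string, string> = {
  a: ".-", b: "-...", c: "-.-.", d: "-..", e: ".", f: "..-.", g: "--.",
  h: "....", i: "..", j: ".---", k: "-.-", l: ".-..", m: "--", n: "-.",
  o: "---", p: ".--.", q: "--.-", r: ".-.", s: "...", t: "-", u: "..-",
  v: "...-", w: ".--", x: "-..-", y: "-.--", z: "--..",
};

const FROM_MORSE = new Map<string, string>();
for (const [letter, code] of Object.entries(MORSE)) {
  FROM_MORSE.set(code, letter);
}

// Assumed floor: runs with fewer decoded letters are ignored.
const MIN_MORSE_LETTERS = 4;

// Encode letters only; words are separated by " / ", letters by a space.
function encodeMorse(plain: string): string {
  return plain
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .map((word) =>
      word
        .split("")
        .map((ch) => MORSE[ch] ?? "")
        .filter((code) => code !== "")
        .join(" "),
    )
    .join(" / ");
}

// Decode a run; null if any token is not valid Morse or the run is too short.
function decodeMorse(run: string): string | null {
  const decodedWords: string[] = [];
  let letterCount = 0;
  for (const word of run.trim().split(/\s*\/\s*/)) {
    let decoded = "";
    for (const token of word.split(/\s+/)) {
      const letter = FROM_MORSE.get(token);
      if (letter === undefined) {
        return null;
      }
      decoded += letter;
      letterCount += 1;
    }
    decodedWords.push(decoded);
  }
  return letterCount >= MIN_MORSE_LETTERS ? decodedWords.join(" ") : null;
}
```

In this sketch, prose such as "Wait... what - no." fails on the first token that mixes letters with dots, and a stray "- - ." falls under the length floor, so neither would be redacted.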

🤖 Generated with Claude Code

Add detection for ROT13, Atbash, reverse, leetspeak, NATO phonetic, and Morse alongside the existing base64 / hex / percent matchers. Text-cipher decodes are gated by a distinct common-English-word count since the encoded form is itself printable; substitution ciphers additionally skip candidates whose source is already English.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-06-07T20:59:51Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agent-browser-shield-demo-site	Ready	Preview, Comment	Jun 7, 2026 9:22pm

unblocked

3 issues found.

unblocked · 2026-06-07T21:05:38Z

+function collectReverse(text: string, matches: InlineMatch[]): void {
+  for (const m of text.matchAll(TEXT_CIPHER_CANDIDATE)) {
+    const candidate = m[0];
+    if (alreadyEnglish(candidate)) {
+      continue;
+    }
+    if (tryCipherDecode(candidate, reverseText) !== null) {
+      matches.push({
+        start: m.index,
+        end: m.index + candidate.length,
+        label: "[encoded payload hidden]",
+      });
+    }
+  }
+}


collectReverse is structurally identical to calling collectSubstitutionCipher(text, reverseText, matches). If collectSubstitutionCipher is later modified (e.g., an extra guard is added), this copy won't be updated in lock-step.

The call site at line 808 can simply become:

collectSubstitutionCipher(text, reverseText, matches);

and this function can be deleted.

Fixed in ed4e025 — folded collectReverse into the new collectSubstitutionCiphers (single pass over TEXT_CIPHER_CANDIDATE driving a [rot13, atbash, reverseText] decoder list), and dropped the standalone wrapper. — Claude Code, on behalf of @twschiller

unblocked · 2026-06-07T21:05:38Z

+function deleet(text: string): string {
+  return text.replaceAll(
+    new RegExp(LEET_SUBSTITUTION_CHAR_CLASS, "g"),
+    (c) => LEET_MAP[c] ?? c,
+  );
+}


deleet (and countLeetSubstitutions at line 487) construct a fresh RegExp from the static LEET_SUBSTITUTION_CHAR_CLASS string on every invocation. These sit inside the per-candidate inner loop of collectLeet, so on a page with many 40+-char text groups each call allocates two throwaway regex objects.

Pre-compile once at module level:

const LEET_SUBSTITUTION_RE = new RegExp(LEET_SUBSTITUTION_CHAR_CLASS, "g");

Then use LEET_SUBSTITUTION_RE in both helpers (reset lastIndex or use matchAll / spread to avoid stale-state issues with the shared g-flag regex).

Fixed in ed4e025 — pre-compiled LEET_SUBSTITUTION_RE at module level and routed both deleet (via replaceAll) and countLeetSubstitutions (via match) through it. Both call sites use the shared instance safely — String.prototype.replaceAll and String.prototype.match with a g-flag regex manage lastIndex internally, so no manual reset is needed. — Claude Code, on behalf of @twschiller
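The `lastIndex` point in the reply above can be checked with a small sketch. The `LEET_MAP` contents and the character class here are hypothetical stand-ins, since the PR's actual table is not shown; only the shape of `deleet` and `countLeetSubstitutions` follows the review thread. The `probe` regex at the end is an extra illustration of the stateful behavior the reviewer warned about, which applies to `test` and `exec` rather than to `replaceAll` and `match`.

```typescript
// Hypothetical leet table and character class; the PR's values are not shown.
const LEET_MAP: Record<string, string> = {
  "0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t",
};
const LEET_SUBSTITUTION_CHAR_CLASS = "[013457]";

// Compiled once at module level and shared by both helpers.
const LEET_SUBSTITUTION_RE = new RegExp(LEET_SUBSTITUTION_CHAR_CLASS, "g");

function deleet(text: string): string {
  // replaceAll with a g-flag regex starts from index 0 on every call.
  return text.replaceAll(LEET_SUBSTITUTION_RE, (c) => LEET_MAP[c] ?? c);
}

function countLeetSubstitutions(text: string): number {
  // match with a g-flag regex also resets lastIndex before scanning.
  return (text.match(LEET_SUBSTITUTION_RE) ?? []).length;
}

// Contrast: test() on a g-flag regex is stateful across calls.
const probe = new RegExp(LEET_SUBSTITUTION_CHAR_CLASS, "g");
const firstTest = probe.test("h3"); // true, leaves lastIndex at 2
const secondTest = probe.test("h3"); // false, the search resumed at index 2
```

If either helper were later changed to use `test` or `exec` on the shared instance, it would need to reset `lastIndex` or use a non-global regex.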

unblocked · 2026-06-07T21:05:38Z

+  collectSubstitutionCipher(text, rot13, matches);
+  collectSubstitutionCipher(text, atbash, matches);
+  collectReverse(text, matches);
+  collectLeet(text, matches);
+  collectNato(text, matches);
+  collectMorse(text, matches);


collectSubstitutionCipher is called for rot13 and atbash, and collectReverse runs the same TEXT_CIPHER_CANDIDATE regex a third time. Each call also re-runs alreadyEnglish (itself a regex + set lookup) on the same set of candidates.

Consider collecting TEXT_CIPHER_CANDIDATE matches once and testing all three decoders inside a single loop. This would cut the regex work and the alreadyEnglish checks to one-third on every text group, which matters now that MIN_TEXT_LENGTH is 20 (previously 120) and far more groups enter collectMatches.

Fixed in ed4e025 — collectSubstitutionCiphers now walks TEXT_CIPHER_CANDIDATE once, runs alreadyEnglish once per candidate, and tries each decoder in SUBSTITUTION_DECODERS ([rot13, atbash, reverseText]) until one succeeds. Cuts the regex + qualifier work to one third per text group, which matters now that MIN_TEXT_LENGTH is 20. — Claude Code, on behalf of @twschiller
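A rough sketch of the single-pass shape described in this thread is given below. Only the names `collectSubstitutionCiphers`, `SUBSTITUTION_DECODERS`, `TEXT_CIPHER_CANDIDATE`, `alreadyEnglish`, `tryCipherDecode`, `InlineMatch`, and the label string come from the page; the candidate regex, the word list, the threshold, and every helper body are assumptions, so this shows the control flow rather than the merged implementation.

```typescript
interface InlineMatch {
  start: number;
  end: number;
  label: string;
}
type Decoder = (text: string) => string;

// Hypothetical stand-ins for the PR's candidate regex and English qualifier.
const TEXT_CIPHER_CANDIDATE = /[A-Za-z][A-Za-z ,.'-]{19,}/g;
const COMMON_WORDS = new Set(["the", "and", "you", "that", "with", "have", "all"]);
const MIN_DISTINCT_COMMON_WORDS = 3;

function countDistinctCommonWords(text: string): number {
  const words: string[] = text.toLowerCase().match(/[a-z]+/g) ?? [];
  return new Set(words.filter((w) => COMMON_WORDS.has(w))).size;
}
const alreadyEnglish = (text: string): boolean =>
  countDistinctCommonWords(text) >= MIN_DISTINCT_COMMON_WORDS;

function tryCipherDecode(candidate: string, decode: Decoder): string | null {
  const decoded = decode(candidate);
  return alreadyEnglish(decoded) ? decoded : null;
}

// Apply `f` to each letter's 0-25 offset, preserving case.
function mapLetters(text: string, f: (n: number) => number): string {
  return text.replace(/[A-Za-z]/g, (c) => {
    const base = c <= "Z" ? 65 : 97;
    return String.fromCharCode(base + f(c.charCodeAt(0) - base));
  });
}
const rot13: Decoder = (t) => mapLetters(t, (n) => (n + 13) % 26);
const atbash: Decoder = (t) => mapLetters(t, (n) => 25 - n);
const reverseText: Decoder = (t) => t.split("").reverse().join("");

const SUBSTITUTION_DECODERS: Decoder[] = [rot13, atbash, reverseText];

// One regex walk and one alreadyEnglish check per candidate; decoders are
// tried in order and the loop stops at the first one that qualifies.
function collectSubstitutionCiphers(text: string, matches: InlineMatch[]): void {
  for (const m of text.matchAll(TEXT_CIPHER_CANDIDATE)) {
    const candidate = m[0];
    if (alreadyEnglish(candidate)) {
      continue;
    }
    const hit = SUBSTITUTION_DECODERS.some(
      (decode) => tryCipherDecode(candidate, decode) !== null,
    );
    if (hit) {
      const start = m.index ?? 0;
      matches.push({
        start,
        end: start + candidate.length,
        label: "[encoded payload hidden]",
      });
    }
  }
}
```

Using `some` gives the "until one succeeds" behavior: each candidate produces at most one match, even if more than one decoder would qualify.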

…edact

Adds 13 example tests and 5 property tests for the text-cipher detection paths. Source files contain only ciphertext or symbolic runs — benign English filler is encoded at test time so adversarial phrasing never appears in plaintext.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t regex

Address review feedback on #216:
- Collapse rot13/atbash/reverse passes into one TEXT_CIPHER_CANDIDATE iteration with a decoder list; cuts regex + alreadyEnglish work to one-third per text group.
- Pre-compile LEET_SUBSTITUTION_RE at module level so deleet and countLeetSubstitutions stop allocating a regex per call inside the inner candidate loop.
- Drop the now-duplicate collectReverse wrapper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

unblocked Bot reviewed Jun 7, 2026

vercel Bot deployed to Preview June 7, 2026 21:13

Lint: fix biome/eslint findings in encoded-payload-redact

0aaa0dc

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot deployed to Preview June 7, 2026 21:18

vercel Bot deployed to Preview June 7, 2026 21:22

twschiller merged commit 8603b73 into main Jun 7, 2026
7 checks passed

twschiller deleted the fix/encoded-payload-extra-encodings branch June 7, 2026 21:24

twschiller merged 4 commits into main from fix/encoded-payload-extra-encodings