Improve false positive rate on real-world packages#769
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 497c7199d0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
7ddd42b to
fcc8a70
Compare
YARA string identifiers are file-wide, so the previous $ip and not $ip_internal condition suppressed the entire $ip branch whenever any loopback/private URL appeared in a file, even when a real public-IP C2 endpoint was also present. Bake the loopback/private range exclusion (0/8, 10/8, 127/8, 169.254/16, 172.16-31, 192.168/16) into the $ip regex so it applies per match. Add a regression test for a file that mixes a loopback URL with a public-IP endpoint.
fcc8a70 to
af1f25d
Compare
The YARA rule unit tests (tests/analyzer/sourcecode), including the per-rule compilation test, were defined in the Makefile but never invoked by CI, so rules that fail to compile (e.g. unsupported regex lookbehind) could merge undetected. Add a test-yara-rules step to the checks workflow and include its coverage in the combined report.
f5eb001 to
dc67dcb
Compare
Four rules (threat-process-download-exec, threat-runtime-obfuscation-base64exec, threat-runtime-dynamic-loader, threat-runtime-obfuscation-steganography) used a negative lookbehind (?<![.\w]) to match bare exec(/eval( builtins. YARA's regex engine does not support lookbehind, so these rules failed to compile and never fired (missing beautifulsup4, pylibutil, weecoder in the recall eval). Replace with a consuming negative class [^.\w], which keeps the same intent (skip method calls like model.eval()). Add positive and benign regression tests for the four rules.
fce610a to
092675f
Compare
| // so dict keys (result['hostname']), XML lookups (findall("hostname")) | ||
| // and API paths (/workers/whoami) no longer match. | ||
| $whoami = /\b(popen|system|getoutput|getstatusoutput|check_output|check_call|Popen|run|call|spawn\w*|execl?p?e?|exec_command)\s*\(\s*\[?\s*['"]whoami\b/ nocase | ||
| $hostname = /\b(popen|system|getoutput|getstatusoutput|check_output|check_call|Popen|run|call|spawn\w*|execl?p?e?|exec_command)\s*\(\s*\[?\s*['"]hostname\b/ nocase |
There was a problem hiding this comment.
the idea here was that hostname is a lolbas (like an actual OS command).
popen, check_call etc are python dependant, i can live with this, but goes against the spirit of the .meta rule
There was a problem hiding this comment.
Good point. Opened #773 (base v3) to address this: it adds language-agnostic invocation patterns (tool-specific flags like whoami /all / hostname -I, and absolute paths like /usr/bin/whoami) alongside the exec wrapper, matching the approach already used by lolbas-proc.meta and lolbas-net.meta. The exec-wrapper patterns stay so the bare no-arg invocation is still caught and the FP fixes from this PR are preserved.
In this PR
Run Yara compilation tests in CI
We had a Makefile target but it wasn't running in CI, resulting in previously-silently failing Yara rule. Fixed in dc67dcb, which allowed to identify broken rules that we fixed in 092675f.
Rule improvements
threat-runtime-obfuscation.yar$b64_1requires an opening quote and a 150+ char run (long URLs / generated identifiers are ≤108); dropped the[A-Z]{10,}branch of$random_varsthat matched ALLCAPS words (MERCHANTABILITY,INTERACTIVE)requests,mistralai,mypy-boto3-securityhub,pytensorthreat-filesystem-autostart.yar.profilenow requires quoted-path context (noself.profile/.profile_manager);open(→\bopen\s*\(sourlopen(no longer countssemgrep,dbt-snowflake,pytensor,testcontainerslolbas-sysinfo.metawhoami/hostnameonly match inside an exec call (no dict keys,findall(), API paths)mistralai,botocore,censysthreat-network-exfiltration.yar$ipexcludes loopback / RFC1918 ranges ($ip_internalcarve-out); added source-onlypath_include; renamed rule to match file idmistralai(mcp.py),timing-asgithreat-filesystem-read.yarpath_includeso METADATA / changelogs are not scannedbotocore,python-dotenvthreat-network-outbound-shady-links.yardiscord.comtodiscord.com/api/webhooks; added source-onlypath_includepytestEvals
I do think this makes the rules better and the hit on recall is worth the gain in accuracy in our context. In a follow-up PR I'll analyze the recall drop and see if we can solve it.