Perf: speed up `task --list` on projects with many includes #2819

romnn wants to merge 4 commits into go-task:main

Conversation
@romnn The first commit as a separate PR would be useful. Please, without AI-generated comments. It's reasonable to assume that if you make your build infrastructure more modular (Task calls Task), things would naturally be faster. I also think that simply caching content for the task list would actually be much, much faster. Here I mean: run Task to generate the list, then put a script frontend on it (adapted from one of the existing methods), and you would probably get much faster performance.
@trulede thanks for the quick feedback! The numbers are from an 11-core M3 Pro MacBook with 18GB of RAM, so it's a fast multi-core machine that I feel should be able to list tasks in under a second. Caching the output of `task --list` […]. I agree that the first commit is the most obvious, but it's still 5 seconds due to the heavy allocation churn. I will open a separate PR #2820 with just the first optimization regardless. Best, Roman
I work in an org with a fairly large monorepo and we use `task` heavily as our cross-language entry point. The taskfile setup looks like this:

- `task --list` shows 3,547 tasks (total before filtering internals)
- `env:` entries and 13 merged `vars:` at taskfile scope

`task --list` here takes about 13.7 seconds.

Four commits, applied in order of biggest gain first. Each is scoped to a single change and should be readable on its own — they accumulate, but they're largely independent and can be cherry-picked individually if any one of them seems too risky:
1. Skip templating for values that contain no `{{` (13.7s → 5.0s). Most env values are static literals; running them through `template.Parse` + `Execute` is pure overhead.
2. Reuse one `templater.Cache` across the `getVariables` rangeFunc loop (5.0s → 3.1s). The previous code allocated a fresh `templater.Cache` per variable, so `cacheMap` got rebuilt from scratch for every entry.
3. Hand templates the `cacheMap` directly when the template can't mutate it; pool clones for the cases that can (3.1s → 0.44s). The shallow `maps.Clone` was there to defend against sprig's `set`/`unset` writing back into `cacheMap`. A cheap substring check on the template tells us when that's possible — for everything else (and for nil values, which can't run a template at all) we just hand `tpl.Execute` the `cacheMap` directly. When we do need to clone, the clone goes through a `sync.Pool`.
4. Memoize parsed templates (0.44s → 0.25s). `template.New("").Funcs(templateFuncs).Parse(text)` copies the ~170-entry sprig FuncMap into a fresh template every time. With many tasks processing the same merged env/vars, the same source strings get parsed over and over. Parsed templates are documented as safe for concurrent `Execute`, so memoizing them in a `sync.Map` is straightforward.

End-to-end: 13.7s → 0.25s, ~55× faster on the repo above on the same 11-core M3 Pro MacBook with 18GB RAM.
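Commits 1 and 4 can be sketched together. This is an illustrative standalone version under my own names (`render`, `parsedCache`) — not the PR's actual code, and it omits the sprig FuncMap for brevity:

```go
package main

import (
	"strings"
	"sync"
	"text/template"
)

// parsedCache memoizes parsed templates by their source string.
// Parsed templates are documented as safe for concurrent Execute,
// so a sync.Map of them can be shared freely.
var parsedCache sync.Map // map[string]*template.Template

// render skips templating for static strings and reuses parsed
// templates for everything else.
func render(text string, data any) (string, error) {
	// Commit-1 idea: no "{{" means no template actions can fire,
	// so the string is returned untouched with zero allocations.
	if !strings.Contains(text, "{{") {
		return text, nil
	}
	// Commit-4 idea: each distinct source string is parsed once.
	var tpl *template.Template
	if cached, ok := parsedCache.Load(text); ok {
		tpl = cached.(*template.Template)
	} else {
		parsed, err := template.New("").Parse(text)
		if err != nil {
			return "", err
		}
		actual, _ := parsedCache.LoadOrStore(text, parsed)
		tpl = actual.(*template.Template)
	}
	var sb strings.Builder
	if err := tpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}
```

`LoadOrStore` keeps the cache race-free when two goroutines parse the same string concurrently: both get the same `*template.Template` back.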
Correctness
This is best effort. I've validated:

- `task --list` output is byte-identical to v3.50.0 on our repo.
- Taskfiles that mutate `.` via sprig's `set`/`unset` produce the same output as the system `task` binary in targeted reproductions.

The risk areas, in case I missed something:

- Commit 3 uses a substring scan (looking for `"set"` anywhere in the template) to decide whether to clone. False positives are fine — they just clone when not strictly necessary. False negatives would silently leak template-side mutations back into `cacheMap`. I went through the templater's func map and the only top-level dict mutators I found were sprig's `set` and `unset` (`unset` is also caught by the substring scan for `"set"`). Push/append/prepend mutate slice values, which the original shallow clone never protected against either. If there are mutators I missed — especially any that get added later — this could regress.
- The reused `Cache` in commit 2 is per-`getVariables`-call, not shared across goroutines; each task compilation still gets its own cache. But it does mean that if a template mutates `cacheMap` (i.e. the clone path didn't run because the substring check missed it), pollution would persist across subsequent variables in the same compilation. The combination of the always-present clone in commit 1 and the COW logic in commit 3 should prevent this in practice — but it's worth flagging.

So: tests pass, behavior matches on our workspace, but I can't 100% guarantee no edge case in some other repo's taskfiles.
On the four-commit structure
The contributing guide prefers a single commit. I deliberately split this into four because each one is a self-contained change with its own measurable win and its own correctness argument. If any single commit turns out to be unsafe in some repo I haven't thought of, you can revert just that one without losing the others. Happy to squash before merge if you'd rather have a single commit.
On the doc comments
The comments I added on the new functions are wordy — I left "why" in there explicitly because I wanted the trade-offs visible. Happy to trim, reword, or drop them entirely.
Also open to feedback on the substring approach in commit 3. A more rigorous version would parse the template AST and look for function-call (identifier) nodes matching `set`/`unset`. That felt like over-engineering for a static set of two function names, but if you'd prefer that approach I can add it.

AI disclosure
Per the contributing guide: I used Claude as a profiling/research tool while digging into where the time was actually going and validating correctness against the upstream test suite. The code went through several rounds of review and adjustment by me before landing here.