perf(scheduler): reverse-dep-aware ready-queue sort (PR-D of 5) by Exelord · Pull Request #91 · vznjs/vx

Exelord · 2026-05-16T14:20:07Z

Summary

When more than one task is ready at the same scheduling tick, pick the one that blocks the most transitive downstream work — not whatever happened to come first in graph insertion order. Matches Nx's tasks-schedule.ts:166-207 pattern: schedule blockers early so the worker pool keeps draining instead of going idle at the end of a run.

How

computeReverseDepCount(nodes) — one DFS to compute, for each task, the size of its transitive reverse-dependency set.
Sort node IDs once at runGraph start by that count (descending). Static for the duration of a run, so no per-tick re-sort.
The scheduling tick walks the pre-sorted order, skipping not-yet-remaining / in-flight entries.
Ties break in graph-insertion order (= the topo order buildTaskGraph produces).
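
The counting and sorting steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in this PR: the `TaskNode` shape and the `sortByBlockingPower` helper are hypothetical names, and the graph is assumed to be acyclic. Only `computeReverseDepCount` is a name taken from the description.

```typescript
// Hypothetical node shape for illustration; the real vx task-graph type may differ.
interface TaskNode {
  id: string;
  deps: string[]; // ids of tasks this task depends on
}

// For each task, count the distinct tasks that transitively depend on it.
// Assumes the graph is acyclic.
function computeReverseDepCount(nodes: TaskNode[]): Map<string, number> {
  // Invert the edges: dependency id -> ids of its direct dependents.
  const dependents = new Map<string, string[]>();
  for (const n of nodes) dependents.set(n.id, []);
  for (const n of nodes) {
    for (const d of n.deps) dependents.get(d)?.push(n.id);
  }

  // Memoized DFS. Sets rather than sums, so a task reachable through two
  // paths (a diamond) is counted once.
  const memo = new Map<string, Set<string>>();
  const visit = (id: string): Set<string> => {
    const cached = memo.get(id);
    if (cached) return cached;
    const reached = new Set<string>();
    for (const child of dependents.get(id) ?? []) {
      reached.add(child);
      visit(child).forEach((t) => reached.add(t));
    }
    memo.set(id, reached);
    return reached;
  };

  const counts = new Map<string, number>();
  for (const n of nodes) counts.set(n.id, visit(n.id).size);
  return counts;
}

// Sort ids once: highest count first, ties broken by insertion index.
function sortByBlockingPower(nodes: TaskNode[]): string[] {
  const counts = computeReverseDepCount(nodes);
  return nodes
    .map((n, index) => ({ id: n.id, index, count: counts.get(n.id) ?? 0 }))
    .sort((a, b) => b.count - a.count || a.index - b.index)
    .map((e) => e.id);
}

const demoNodes: TaskNode[] = [
  { id: "leaf", deps: [] },
  { id: "root", deps: [] },
  { id: "a", deps: ["root"] },
  { id: "b", deps: ["a"] },
  { id: "c", deps: ["b"] },
  { id: "d", deps: ["c"] },
];
console.log(sortByBlockingPower(demoNodes)); // [ 'root', 'a', 'b', 'c', 'leaf', 'd' ]
```

The explicit index tie-break makes the insertion-order rule visible in the comparator instead of relying on sort stability.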

Behavioral tests

Blocker priority — graph root → a → b → c → d plus isolated leaf, with leaf deliberately inserted BEFORE root, concurrency=1: root runs first because it blocks 4 descendants vs leaf's 0.
Tie break — two roots each blocking one descendant: earlier insertion wins.
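
A minimal sketch of how these two cases could be checked with a serial (concurrency=1) run. The `ReadyTask` type, its precomputed `blocks` field, and `serialRunOrder` are hypothetical stand-ins rather than the project's test helpers, and readiness is assumed to mean "not yet run and all dependencies done".

```typescript
// Hypothetical task shape; `blocks` stands in for the precomputed
// transitive reverse-dependency count.
interface ReadyTask {
  id: string;
  deps: string[];
  blocks: number;
}

// Simulate a concurrency=1 run: each tick walks the pre-sorted order and
// starts the first task that has not run yet and whose deps are all done.
function serialRunOrder(tasks: ReadyTask[]): string[] {
  const order = tasks
    .map((t, index) => ({ t, index }))
    .sort((a, b) => b.t.blocks - a.t.blocks || a.index - b.index)
    .map((e) => e.t);

  const done = new Set<string>();
  const ran: string[] = [];
  while (ran.length < tasks.length) {
    const next = order.find(
      (t) => !done.has(t.id) && t.deps.every((d) => done.has(d)),
    );
    if (!next) throw new Error("cycle or missing dependency");
    done.add(next.id);
    ran.push(next.id);
  }
  return ran;
}

// Blocker priority: leaf is inserted before root but blocks nothing.
const blockerCase: ReadyTask[] = [
  { id: "leaf", deps: [], blocks: 0 },
  { id: "root", deps: [], blocks: 4 },
  { id: "a", deps: ["root"], blocks: 3 },
  { id: "b", deps: ["a"], blocks: 2 },
  { id: "c", deps: ["b"], blocks: 1 },
  { id: "d", deps: ["c"], blocks: 0 },
];
console.log(serialRunOrder(blockerCase)[0]); // root

// Tie break: both roots block one descendant, so earlier insertion wins.
const tieCase: ReadyTask[] = [
  { id: "r1", deps: [], blocks: 1 },
  { id: "r2", deps: [], blocks: 1 },
  { id: "x1", deps: ["r1"], blocks: 0 },
  { id: "x2", deps: ["r2"], blocks: 0 },
];
console.log(serialRunOrder(tieCase)[0]); // r1
```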

Magnitude

Depends on graph shape. Maximum payoff when the critical-path task is also one of the first ready (so it doesn't sit waiting while the worker pool burns cycles on isolated leaves). On a 200-package build with N tasks per project, 5-15% off makespan is realistic when concurrency < ready-set width.
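
To make the effect concrete, here is a toy simulation under strong simplifying assumptions: every task takes one time unit, and the worker pool starts up to `concurrency` ready tasks per round. The `UnitTask` type and `roundsToFinish` function are hypothetical, and the toy graph is not representative of the 5-15% estimate above. It only shows the direction of the effect.

```typescript
// Hypothetical unit-duration task for a toy makespan comparison.
interface UnitTask {
  id: string;
  deps: string[];
}

// Count rounds needed when each round starts up to `concurrency` ready
// tasks, taken in the given priority order, and all of them finish together.
function roundsToFinish(order: UnitTask[], concurrency: number): number {
  const done = new Set<string>();
  let rounds = 0;
  while (done.size < order.length) {
    const batch = order
      .filter((t) => !done.has(t.id) && t.deps.every((d) => done.has(d)))
      .slice(0, concurrency);
    if (batch.length === 0) throw new Error("cycle or missing dependency");
    batch.forEach((t) => done.add(t.id));
    rounds++;
  }
  return rounds;
}

const chainRoot: UnitTask = { id: "root", deps: [] };
const chainA: UnitTask = { id: "a", deps: ["root"] };
const chainB: UnitTask = { id: "b", deps: ["a"] };
const leaves: UnitTask[] = ["leaf1", "leaf2", "leaf3"].map((id) => ({ id, deps: [] }));

// Insertion order: the isolated leaves happen to come first.
const insertionOrder = [...leaves, chainRoot, chainA, chainB];
// Blocker-first order: the chain is scheduled ahead of the leaves.
const blockerFirst = [chainRoot, chainA, chainB, ...leaves];

console.log(roundsToFinish(insertionOrder, 2)); // 4
console.log(roundsToFinish(blockerFirst, 2)); // 3
```

With insertion order, the chain starts one round late and the second worker idles during the last two rounds. With blocker-first order, leaves fill the spare worker while the chain advances.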

Test plan

460/460 tests pass
Lint clean
Format clean
No CACHE_VERSION / SCHEMA_VERSION change — pure scheduling tweak

What's next

This is PR-D of 5. PR-E remaining: batch cache-hit lookups in prepareRun.

https://claude.ai/code/session_016HXj6HW6bxSn8EYuKcxTD9

Generated by Claude Code


Captures findings from a Turbo + Nx code review focused on the correctness / robustness dimensions we hadn't systematically checked. Six concrete gaps, each with a verified source link in either repo and a fix sketch. Ordered by severity × ease so we can ship the small-but-high-value ones first.

Headline gaps:

1. No SIGINT/SIGTERM handler in run(): Ctrl+C orphans child tasks and skips cache.close() (Nx forwards signals via IPC).
2. Path-traversal hole in extractOutputs: a malicious tar entry name with `../` would escape destDir (Turbo gates this via lexical canonicalization in the symlink restore path).
3. No content verification on restore: bit-flips, partial writes, and manual tampering are all silent. Cheap fix: xxh3(compressed_bytes) stored in the entries row.
4. No HMAC on remote artifacts: Turbo gates this behind TURBO_REMOTE_CACHE_SIGNATURE_KEY; we don't have an equivalent.
5. No machine-ID gate: Nx hashes the machine GUID into entries to reject cross-OS restores. Only matters for a shared <cacheDir>.
6. No retry on transient FS failures: Nx wraps FS ops in exponential backoff (Math.random()*2+2 base exponent, 6 attempts max).

Recommended ship order: items 1–4 as small focused PRs; 5–6 deferred until a user actually runs into shared-cache or flaky-FS scenarios. The document records the threat model and the Turbo/Nx source references so future agents have the context.

It also documents what we already cover (PRs #88, #91, #92, #95) and what we explicitly won't ship (TUI mode selection, flake tracking, per-task .env hashing), to keep this doc as a single source of truth for the integrity backlog.

Exelord force-pushed the claude/reverse-dep-schedule branch from 7fba695 to 96eccee on May 16, 2026 19:44

Exelord merged commit 496c20b into main May 16, 2026
1 check passed

Exelord mentioned this pull request May 16, 2026

docs: integrity & robustness audit (May 2026) #96

Merged

2 tasks

Exelord merged 1 commit into main from claude/reverse-dep-schedule
