interp: bound-check elision for provably in-range array/vector access by aleksisch · Pull Request #3327 · GaijinEntertainment/daScript

aleksisch · 2026-07-01T09:43:52Z

What

Bound-check elision for the interpreter, mirroring the JIT's [hint(unsafe_range_check)] — but per-access and auto-deduced. Opt-in via options bound_check_elision (default off). WIP / draft, pushed with [skip ci].

When the compiler can prove an array/vector index is in range, the access is marked noBoundCheck and lowered to an unchecked simulate node, dropping the per-access idx >= size compare+branch.

How

Unchecked simulate nodes — SimNode_AtU / SimNode_AtR2VU (fixed arrays, simulate_nodes.h), SimNode_AtVectorU (vectors), SimNode_ArrayAtU / SimNode_ArrayAtR2VU (dynamic arrays, runtime_array.h). Pointers are already unchecked. A noBoundCheck:1 flag on ExprAt (serialized, cloned) selects them at simulate.

Fused unchecked nodes — a generic unchecked node would fall off the interpreter's fusion fast path (the ArgLoc/LocLoc/… specializations), which costs more than the removed check. So the fusion generators emit unchecked families (AtU/ArrayAtU/…) alongside the checked ones by toggling the bounds check as a macro (DAS_AT_CHECK / DAS_ARRAYAT_CHECK) — one node-body macro, instantiated check-ON for the checked op-name and check-OFF for the unchecked one. An elided access fuses to e.g. ArrayAtUArgLoc.
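The macro-toggle idea can be illustrated with a minimal sketch. Everything named `Sketch*` / `SKETCH_*` here is hypothetical and stands in for the real node-body and check macros (DAS_AT_CHECK / DAS_ARRAYAT_CHECK), whose actual definitions are not shown in this PR text; throwing `std::out_of_range` is likewise an assumption made only so the sketch is self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical check macros: ON emits the compare+branch, OFF emits nothing.
#define SKETCH_CHECK_ON(idx, size) \
    if ((idx) >= (size)) throw std::out_of_range("index out of range")
#define SKETCH_CHECK_OFF(idx, size) ((void)0)

// One node-body macro. CHECK is the name of the check macro to expand inside it,
// so the struct body is written once and instantiated twice.
#define SKETCH_DEFINE_AT(NAME, CHECK)                                    \
    struct NAME {                                                        \
        static int32_t eval(const std::vector<int32_t>& a, uint32_t i) { \
            CHECK(i, static_cast<uint32_t>(a.size()));                   \
            return a.data()[i];                                          \
        }                                                                \
    };

SKETCH_DEFINE_AT(SketchArrayAt, SKETCH_CHECK_ON)    // checked op-name
SKETCH_DEFINE_AT(SketchArrayAtU, SKETCH_CHECK_OFF)  // unchecked op-name

// Reports whether the checked variant rejects index i.
bool checked_throws(const std::vector<int32_t>& a, uint32_t i) {
    try {
        SketchArrayAt::eval(a, i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Usage: both variants agree on in-range indices; only the checked one
// rejects idx >= size. The unchecked one must only be used when the index
// has already been proven in range.
void sketch_usage() {
    std::vector<int32_t> a = {10, 20, 30};
    assert(SketchArrayAt::eval(a, 1) == SketchArrayAtU::eval(a, 1));
    assert(checked_throws(a, 3));
}
```

Because both structs come from the same body macro, a fix to the access logic lands in the checked and unchecked variants at once, which is the "no duplicated struct bodies" property the commit message describes.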

Fact analysis (CFG dataflow) — a forward must analysis over each function's CFG. A fact is 0 <= idx < BOUND (BOUND = a constant or length(arrayVar)) plus a set of proven-nonnegative index vars. Facts are:

genned by loop induction at the for-body block (range(N), range(length(x))) and by branch guards on the taken edge (if (i < length(a)) …, if (i >= length(a)) return; …);
merged by intersection at CFG joins — so an else establishing ¬cond, or code dominated by an early-exit guard, carries the fact;
killed by any array mutation (any resize/erase/push/rebind — we assume everything may alias everything, so any length-changing op invalidates every length(·) fact) and by reassigning the index. Element writes (a[i]=v) don't change length and are excused.

A constant index into a fixed dim is marked directly. Non-truncating numeric casts (uint/int/int64/uint64) around length() or the index are seen through.
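The gen / intersect / kill rules above can be sketched as a small forward must-analysis. This is a simplified model under stated assumptions, not the pass from this PR: facts are reduced to (index, array) pairs meaning `0 <= index < length(array)`, gen is applied at block entry, kills apply to the whole block, and all type and function names (`SketchBlock`, `solve`, ...) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical fact "0 <= index < length(array)", stored as (index, array).
using Fact = std::pair<std::string, std::string>;
using FactSet = std::set<Fact>;

struct SketchBlock {
    std::vector<std::size_t> preds;    // predecessor block ids
    FactSet gen;                       // facts established on entry (induction / guard edge)
    bool changesLength = false;        // block contains resize/erase/push/rebind
    std::set<std::string> reassigned;  // index variables written in the block
};

struct SketchFacts {
    std::vector<FactSet> in;   // facts live at block entry (after gen)
    std::vector<FactSet> out;  // facts live at block exit (after kills)
};

// Meet operator of a must-analysis: keep only facts present on both edges.
FactSet intersect(const FactSet& a, const FactSet& b) {
    FactSet r;
    for (const Fact& f : a)
        if (b.count(f)) r.insert(f);
    return r;
}

SketchFacts solve(const std::vector<SketchBlock>& cfg) {
    const std::size_t n = cfg.size();
    std::vector<std::optional<FactSet>> out(n);  // nullopt = not computed yet
    SketchFacts res{std::vector<FactSet>(n), std::vector<FactSet>(n)};
    bool changed = true;
    while (changed) {  // iterate to a fixpoint so back edges are accounted for
        changed = false;
        for (std::size_t b = 0; b < n; ++b) {
            FactSet live;
            bool first = true;
            for (std::size_t p : cfg[b].preds) {
                assert(p < n);
                if (!out[p]) continue;  // edge not computed yet: revisit next round
                live = first ? *out[p] : intersect(live, *out[p]);
                first = false;
            }
            live.insert(cfg[b].gen.begin(), cfg[b].gen.end());
            res.in[b] = live;
            if (cfg[b].changesLength) live.clear();  // everything may alias everything
            for (auto it = live.begin(); it != live.end();)
                it = cfg[b].reassigned.count(it->first) ? live.erase(it) : std::next(it);
            if (!out[b] || *out[b] != live) { out[b] = live; changed = true; }
            res.out[b] = live;
        }
    }
    return res;
}

// Diamond CFG: B0 -> {B1 (guard taken), B2 (else)} -> B3 (join).
SketchFacts sketch_diamond() {
    std::vector<SketchBlock> cfg(4);
    cfg[1].preds = {0};
    cfg[1].gen.insert(Fact{"i", "a"});  // taken edge of a guard, i already known >= 0
    cfg[2].preds = {0};
    cfg[3].preds = {1, 2};
    return solve(cfg);
}
```

In the diamond, the fact is live inside B1 (an `a[i]` there could be elided) but is dropped at the join B3, because the else edge does not carry it. That is the intersection rule at work.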

Pipeline — the CFG is a single shared pass: built once at the post-infer stable point (only if a consumer is enabled) and handed to two consumers as a const pointer — the unsafe-index pass (read-only, only sets flags) runs first, then the flow-sensitive escape pass reads the same CFG before it inserts scope_free. noBoundCheck is a runtime-semantic property, so it survives the later fold loop and re-infer. (CfgBlock gained an additive loopSource anchor so induction survives the flattened cond-less loop header.)
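A rough sketch of the gating and ordering described above, assuming hypothetical names throughout (`SketchProgramCfg`, `runSharedCfgStage`, and the two consumer functions are illustrative stand-ins, not the PR's ProgramCfg / buildProgramCfg API):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in for the per-function CFG cache.
struct SketchProgramCfg { int functions = 0; };

struct SketchOptions {
    bool boundCheckElision = false;       // models options bound_check_elision
    bool forcePartialEscapeFree = false;  // models force_partial_escape_free
};

// Consumers take the CFG by const pointer, so neither can mutate the shared graph.
void markUnsafeIndex(const SketchProgramCfg* cfg, std::vector<std::string>& trace) {
    assert(cfg != nullptr);
    trace.push_back("unsafe-index");  // read-only: would only set noBoundCheck flags
}

void escapePass(const SketchProgramCfg* cfg, std::vector<std::string>& trace) {
    assert(cfg != nullptr);
    trace.push_back("escape");  // would insert scope_free after reading the CFG
}

// Returns the sequence of steps that ran, for illustration.
std::vector<std::string> runSharedCfgStage(const SketchOptions& opt) {
    std::vector<std::string> trace;
    if (!opt.boundCheckElision && !opt.forcePartialEscapeFree)
        return trace;  // no consumer enabled: the CFG is never built
    auto cfg = std::make_unique<SketchProgramCfg>();  // built once
    trace.push_back("build-cfg");
    if (opt.boundCheckElision) markUnsafeIndex(cfg.get(), trace);    // runs first
    if (opt.forcePartialEscapeFree) escapePass(cfg.get(), trace);    // runs second
    return trace;
}
```

Running the flag-setting pass before the pass that rewrites the AST means the read-only consumer always sees the CFG in the state it was built in.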

Logging — options log_bound_check_elision reports every elided access (function, source location, access, reason).

Benchmark

benchmarks/micro/bound_check_elision.das — index-heavy loops in plain functions (dynamic range(length) read/write + sum, fixed-array constant range). Release, interpreter, ns/op (median of 5):

bench	checked (fused)	elided (fused)
array_rw/100000	3.8	3.4
array_sum/100000	2.0	1.8
fixed_rw/256	3.5	3.3

~6–11% on tight index loops. The bounds check is a small slice of per-element interpreter cost, so the win is modest but consistent — and, importantly, not a regression (the earlier generic-unchecked-node version regressed ~70% by losing fusion, which the fused variants fix).
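For reference, the quoted range follows from the table as a relative reduction in ns/op, (checked - elided) / checked. The helper names below are illustrative only:

```cpp
#include <cassert>
#include <cmath>

// Relative reduction in ns/op when going from the checked to the elided build.
double relative_gain(double checked_ns, double elided_ns) {
    assert(checked_ns > 0.0);
    return (checked_ns - elided_ns) / checked_ns;
}

// Same value as a percentage, rounded to one decimal place.
double gain_percent(double checked_ns, double elided_ns) {
    return std::round(relative_gain(checked_ns, elided_ns) * 1000.0) / 10.0;
}
```

With the table's numbers this gives roughly 10.5% for array_rw, 10.0% for array_sum, and 5.7% for fixed_rw.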

How much does it catch (corpus statistics)

Measured over 696 modules (daslib + tests + examples + tutorials, each compiled once) by forcing the pass on and counting candidate accesses vs elided:

5,500 array/vector index accesses carry a runtime bounds check ("candidates").
763 (≈14%) are provably in range and get elided.

Bimodal, not uniform — most array-indexing files elide nothing (they use unsafe, iterators, or computed indices the analysis can't prove), but a tail of ~25 files with tight range(length) / fixed-dim loops are 75–100% elidable.

14% is a conservative floor, chiefly because accesses inside block/lambda arguments (foreach/run/comprehension bodies — very common in daslib) are not analysed: the CFG is a function's CFG, and facts can't soundly cross into a deferred lambda body. Running the dataflow per block/lambda scope (future work) would raise this.

Limitations / notes

Interpreter only: AOT keeps full checks (the AOT C++ emitter ignores the flag), and so does the interpreter fusion of I64/U64-indexed arrays (no *U fusion was added for those).
Function-body loops: accesses nested inside a block/lambda argument (e.g. the body of a run/foreach block) are left checked, because the analysis walks only a function's own CFG.
Guard facts need idx >= 0: this is satisfied by an unsigned index or by a lower-bound fact. A bare signed x < len won't elide, because a negative x satisfies the guard yet is out of range. Loop induction carries >= 0 for free.
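A small sketch of why the lower bound matters. The function names are hypothetical, and treating the runtime check as a single unsigned `idx >= size` compare is an assumption made for the illustration:

```cpp
#include <cassert>
#include <cstdint>

// A bare signed guard: true for every negative x as well.
bool guard_signed(int32_t x, int32_t len) { return x < len; }

// Assumed shape of the runtime check: one unsigned compare, so a negative
// signed index wraps to a huge value and is rejected.
bool in_range_unsigned(int32_t x, uint32_t size) {
    return static_cast<uint32_t>(x) < size;
}

// Guard combined with a lower-bound fact: sufficient to prove the access.
bool guard_with_lower_bound(int32_t x, int32_t len) { return x >= 0 && x < len; }

// Usage: x = -1 passes the bare guard but is not a valid index.
void demo_negative_index() {
    assert(guard_signed(-1, 4));
    assert(!in_range_unsigned(-1, 4));
    assert(!guard_with_lower_bound(-1, 4));
}
```

Eliding the check on the strength of `x < len` alone would therefore let `x = -1` through, which is why the analysis also tracks a set of proven-nonnegative index vars.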

🤖 Generated with Claude Code

Add unchecked At/ArrayAt/AtVector simulate node variants and a noBoundCheck flag on ExprAt. A conservative optimizer pass, enabled by options bound_check_elision, marks accesses whose index is provably in range, mirroring the JIT unsafe_range_check hint for the interpreter:

- constant index into a fixed-size array or vector
- induction var over a constant range, into a fixed-size array/vector
- induction var over range(length(x)), indexing that same dynamic array x (symbolic length fact)

Facts carry either a constant [lo,hi) interval or a symbolic upper bound length(var). A constant-index fact is immune to array mutation (fixed dims never change). A symbolic length(x) fact is killed by ANY length-changing array op (resize/erase/push/move/rebind) on ANY array in the loop body: we assume everything may alias everything, so aliased mutation can never invalidate an elided access. Element writes (x[i]=v) do not change length and are excused. The index var must not be reassigned.

Also make the CFG a shared pass: ProgramCfg / buildProgramCfg build a per-function CFG cache once, handed to consumers as a const pointer. Escape analysis's flow-sensitive pass now reads the shared cache instead of building its own CFG per function; each consumer is gated independently (force_partial_escape_free / bound_check_elision).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Replace the AST-visitor fact stack with a forward "must" dataflow over each function's CFG. A fact is 0 <= idx < BOUND (BOUND = a constant or length(arrayVar)), plus a set of provably-nonnegative index vars. Facts are:

- genned by loop induction at the for-loop body block (range(N) / range(length(x))), and by branch guards on the taken edge (if (i < length(a)) ... , if (i >= length(a)) return; ...),
- merged by INTERSECTION at CFG joins (a fact must hold on every incoming edge - so an else-branch that establishes ~cond, or code dominated by an early-exit guard, carries the guard fact),
- killed by any array mutation (any resize/rebind - everything may alias everything) and by reassigning the index.

An ExprAt whose (index, array/dim) matches a live fact is marked noBoundCheck; a constant index into a fixed dim is marked directly.

Pipeline: the CFG is built ONCE at the post-infer stable point and shared by two consumers - the unsafe-index pass (read-only: only sets flags) runs first, then the flow-sensitive escape pass reads the same CFG before it inserts scope_free. noBoundCheck is a runtime-semantic property, so it survives the later fold loop and the scope_free re-infer. Gated by bound_check_elision / force_partial_escape_free; the CFG is built only if a consumer is enabled.

CFG change (additive): a for-loop body CfgBlock records its ExprFor (loopSource) so induction facts survive the flattened cond-less header.

Non-truncating numeric casts (uint/int/int64/uint64) around length() or the index are seen through; truncating casts are not.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…skip ci]

Without this, marking an access noBoundCheck emitted a generic SimNode_AtU / ArrayAtU that the fusion optimizer does not recognize, so the access fell off the fused ArgLoc/LocLoc fast path - a net regression (the lost fusion cost far more than the removed bounds-check branch).

Generate fused unchecked node families (AtU / AtR2VU / ArrayAtU / ArrayAtR2VU) alongside the checked ones by parameterizing the bounds check as a toggled macro (DAS_AT_CHECK / DAS_ARRAYAT_CHECK) - one node-body macro, instantiated ON for the checked op-name and OFF for the unchecked one, no duplicated struct bodies. The elided access now fuses to e.g. ArrayAtUArgLoc.

Logging: `options log_bound_check_elision` reports every elided access (function, source location, access, reason).

Benchmark: benchmarks/micro/bound_check_elision.das - index-heavy loops in plain functions (dynamic range(length) r/w + sum, fixed-array const range). Release, interp: elided vs checked ~ array_rw 3.8->3.4, array_sum 2.0->1.8, fixed_rw 3.5->3.3 ns/op (~6-11%).

The pass targets function-body loops; accesses nested in a block/lambda argument stay checked.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

aleksisch force-pushed the aleksisch/bound-check branch 2 times, most recently from 4f4714f to dde1ed5 Compare July 1, 2026 10:20

aleksisch force-pushed the aleksisch/bound-check branch from dde1ed5 to 5f645a4 Compare July 1, 2026 12:36

aleksisch and others added 2 commits July 1, 2026 16:30

aleksisch wants to merge 3 commits into master from aleksisch/bound-check
