perf(felt252): compute felt252_div via a runtime binding#1648
Conversation
|
✅ Code is now correctly formatted. |
b8d151f to
b7d8b96
Compare
abc0202 to
6cc4024
Compare
orizi
left a comment
There was a problem hiding this comment.
@orizi made 2 comments.
Reviewable status: 0 of 3 files reviewed, 2 unresolved discussions (waiting on TomerStarkware).
src/metadata/runtime_bindings.rs line 326 at r1 (raw file):
/// little-endian felt252 buffers; the result is written to `dst_ptr`. #[allow(clippy::too_many_arguments)] pub fn felt252_div<'c, 'a>(
put near mul.
src/metadata/runtime_bindings.rs line 333 at r1 (raw file):
dst_ptr: Value<'c, '_>, lhs_ptr: Value<'c, '_>, rhs_ptr: Value<'c, '_>,
and remove the allow.
Suggestion:
[dst_ptr, lhs_ptr, rhs_ptr]: [Value<'c, '_>; 3],f876fb2 to
1cb3979
Compare
orizi
left a comment
There was a problem hiding this comment.
@orizi made 1 comment.
Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on TomerStarkware).
-- commits line 1 at r2:
rebase
orizi
left a comment
There was a problem hiding this comment.
@orizi made 1 comment.
Reviewable status: 0 of 3 files reviewed, 4 unresolved discussions (waiting on TomerStarkware).
src/runtime.rs line 139 at r1 (raw file):
/// /// `rhs` originates from a `NonZero<felt252>`, so it is guaranteed nonzero; /// `field_div` uses the optimized Montgomery inverse.
move near mul
1f69923 to
79423e0
Compare
Route felt252 division through a runtime binding (cairo_native__felt252_div) instead of lowering it inline as an extended-euclidean modular inverse plus an i512 multiply-and-remainder. The inline form legalizes into huge limb sequences (the i512 remainder being the worst), making O3 codegen of division-heavy contracts pathologically slow — the same hazard the felt252 multiply binding removed. The binding uses starknet-types-core's field_div (optimized Montgomery inverse); rhs is a NonZero<felt252>, so it is guaranteed nonzero. Also removes the now-dead ExtendedEuclideanWidth::U252 variant, which was only ever constructed by the old division path (U31/U256/U384 remain in use by egcd / u256-invmod). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
79423e0 to
4757922
Compare
TomerStarkware
left a comment
There was a problem hiding this comment.
@TomerStarkware made 3 comments.
Reviewable status: 0 of 3 files reviewed, 4 unresolved discussions (waiting on orizi).
Previously, orizi wrote…
rebase
Done.
src/runtime.rs line 139 at r1 (raw file):
Previously, orizi wrote…
move near mul
Done.
src/metadata/runtime_bindings.rs line 333 at r1 (raw file):
Previously, orizi wrote…
and remove the allow.
Done.
Benchmarking resultsBenchmark for program
|
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
10.312 ± 0.083 | 10.249 | 10.536 | 5.78 ± 0.07 |
cairo-native (embedded AOT) |
1.785 ± 0.017 | 1.761 | 1.811 | 1.00 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
1.788 ± 0.010 | 1.770 | 1.803 | 1.00 ± 0.01 |
Benchmark for program dict_snapshot
Open benchmarks
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
514.0 ± 2.5 | 508.1 | 517.3 | 1.00 |
cairo-native (embedded AOT) |
1600.3 ± 7.1 | 1592.4 | 1613.0 | 3.11 ± 0.02 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
1636.6 ± 6.8 | 1627.1 | 1650.4 | 3.18 ± 0.02 |
Benchmark for program factorial_2M
Open benchmarks
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
4.614 ± 0.034 | 4.589 | 4.688 | 2.76 ± 0.02 |
cairo-native (embedded AOT) |
1.672 ± 0.006 | 1.663 | 1.681 | 1.00 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
1.692 ± 0.010 | 1.676 | 1.709 | 1.01 ± 0.01 |
Benchmark for program fib_2M
Open benchmarks
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
4.524 ± 0.011 | 4.506 | 4.543 | 2.79 ± 0.03 |
cairo-native (embedded AOT) |
1.621 ± 0.017 | 1.606 | 1.663 | 1.00 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
1.655 ± 0.019 | 1.633 | 1.707 | 1.02 ± 0.02 |
Benchmark for program linear_search
Open benchmarks
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
554.6 ± 6.7 | 547.4 | 566.3 | 1.00 |
cairo-native (embedded AOT) |
1631.2 ± 14.2 | 1611.8 | 1661.5 | 2.94 ± 0.04 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
1662.7 ± 8.0 | 1648.0 | 1672.1 | 3.00 ± 0.04 |
Benchmark for program logistic_map
Open benchmarks
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
Cairo-vm (Rust, Cairo 1) |
469.5 ± 2.7 | 466.5 | 474.0 | 1.00 |
cairo-native (embedded AOT) |
1610.4 ± 5.0 | 1604.7 | 1617.9 | 3.43 ± 0.02 |
cairo-native (embedded JIT using LLVM's ORC Engine) |
1645.1 ± 11.5 | 1631.1 | 1665.5 | 3.50 ± 0.03 |
Benchmark results Main vs HEAD.Base
Head
Base
Head
Base
Head
Base
Head
Base
Head
Base
Head
|
orizi
left a comment
There was a problem hiding this comment.
@orizi reviewed 3 files and all commit messages, made 1 comment, and resolved 4 discussions.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on TomerStarkware).
Stacked on #1647 (felt252_mul binding). Review/merge that first; this PR's base is
tomer/felt252_mul_pr, so it shows only the division change.What
Route
felt252division through a runtime binding (cairo_native__felt252_div) instead of lowering it inline as an extended-euclidean modular inverse followed by an i512 multiply-and-remainder.Why
The inline form has the same hazard the mul binding removed, and then some: it legalizes into huge limb sequences — the i512 remainder being the worst offender — which makes O3 register allocation of division-heavy contracts pathologically slow. It also computed the modular inverse via a large MLIR extended-euclidean loop.
The binding uses starknet-types-core's
field_div(optimized Montgomery inverse).rhsoriginates from aNonZero<felt252>, so it is guaranteed nonzero andfrom_felt_uncheckedis safe. No builtin/range-check accounting is involved (felt division is a pure field op).Changes
src/runtime.rs—cairo_native__felt252_divviafield_divsrc/metadata/runtime_bindings.rs—Felt252Divbinding (symbol, fn-ptr, emitter,setup_runtime); removed the now-deadExtendedEuclideanWidth::U252, which was constructed only by the old division path (U31/U256/U384remain, used by egcd / u256-invmod)src/libfuncs/felt252.rs—Divrouted to the binding; inline euclidean + i512 multiply/remainder removedAll
felt252libfunc tests pass (incl.felt252_div/felt252_div_const: division-by-zero panics, valid quotients, and negative operands).🤖 Generated with Claude Code
This change is