perf(spark): use 256-entry byte-pair table in hex encoding #21836
Open
Scolliq wants to merge 3 commits into apache:main from perf/spark-hex-byte-table
Conversation
The bytes path looked up two nibbles and pushed two bytes per input
byte. Replace it with a precomputed `[[u8; 2]; 256]` table built at
compile time, so each input byte becomes one indexed load and one
two-byte extend_from_slice. The int64 path now consumes two nibbles
per loop iteration via the same table, with a fall-through for the
single high nibble.
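The bytes path described above can be sketched like this. `HEX_LOOKUP_UPPER` matches the table name mentioned later in this PR; `hex_bytes` is an illustrative helper, not the actual DataFusion function, and the uppercase output is an assumption based on Spark's `hex` semantics:

```rust
// Sketch of the bytes path: a 256-entry byte-pair table built at
// compile time, then one indexed load + one 2-byte extend per byte.

const HEX_CHARS_UPPER: &[u8; 16] = b"0123456789ABCDEF";

// Each byte maps to its two ASCII hex digits, computed at compile time.
const HEX_LOOKUP_UPPER: [[u8; 2]; 256] = {
    let mut table = [[0u8; 2]; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = [HEX_CHARS_UPPER[i >> 4], HEX_CHARS_UPPER[i & 0x0F]];
        i += 1;
    }
    table
};

/// One indexed load and one two-byte extend_from_slice per input byte.
fn hex_bytes(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() * 2);
    for &b in input {
        out.extend_from_slice(&HEX_LOOKUP_UPPER[b as usize]);
    }
    out
}

fn main() {
    assert_eq!(hex_bytes(b"Spark"), b"537061726B");
}
```

Compared with pushing two nibbles separately, this halves the number of bounds-checked writes into the output vector.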
Existing benchmarks in `datafusion/spark/benches/hex.rs` cover the
hot paths (Int64, Utf8, Utf8View, LargeUtf8, Binary, LargeBinary,
plus dictionary variants).
Adds tests covering all 256 byte values against `format!("{:02x}")` / `format!("{:02X}")`,
and i64 edge cases (`0`, `i64::MAX`, `i64::MIN`, `-1`).
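An exhaustive check of that kind could look roughly like the following (the table is redefined locally so the snippet is self-contained; the real test lives next to the const tables in the PR):

```rust
// Illustrative version of the exhaustive table test: every entry in the
// 256-entry table must agree with what format! produces for that byte.

const HEX_CHARS_UPPER: &[u8; 16] = b"0123456789ABCDEF";

const HEX_LOOKUP_UPPER: [[u8; 2]; 256] = {
    let mut t = [[0u8; 2]; 256];
    let mut i = 0;
    while i < 256 {
        t[i] = [HEX_CHARS_UPPER[i >> 4], HEX_CHARS_UPPER[i & 0x0F]];
        i += 1;
    }
    t
};

fn main() {
    for b in 0..=255u8 {
        let expected = format!("{:02X}", b);
        let got = std::str::from_utf8(&HEX_LOOKUP_UPPER[b as usize])
            .unwrap()
            .to_string();
        assert_eq!(got, expected, "table mismatch at byte {b}");
    }
}
```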
Refs apache#15986
Contributor

> run benchmark hex

🤖 Criterion benchmark running (GKE), comparing perf/spark-hex-byte-table (7fc5a40) to 794f30e (merge-base).

🤖 Criterion benchmark completed (GKE).
Refs #15986.
Why: `spark_hex` walked one nibble at a time: two `HEX_CHARS[i]` lookups and two `Vec::push` calls per input byte. With a precomputed table, the hot loop flattens into one indexed load and one `extend_from_slice` per byte.

What changed: added `HEX_LOOKUP_LOWER`/`HEX_LOOKUP_UPPER` as `[[u8; 2]; 256]` const tables built at compile time. The bytes path now does a single lookup plus a 2-byte extend per input byte. The int64 path consumes two nibbles per loop iteration via the same table, with a fall-through for the single high nibble. Behaviour for `0`, `i64::MAX`, `i64::MIN`, `-1` is preserved.

Tests: extended `test_hex_int64` to cover edge values; new `test_hex_lookup_table_covers_all_bytes` cross-checks every entry against `format!("{:02x}")` / `format!("{:02X}")`; new `test_spark_hex_binary_round_trip_all_bytes` feeds all 256 byte values through `spark_hex` and verifies the result. `cargo test -p datafusion-spark --lib hex` → 8 pass. `cargo clippy --all-features --all-targets` clean. `cargo bench --no-run` builds; existing `benches/hex.rs` already covers Int64/Utf8/Utf8View/LargeUtf8/Binary/LargeBinary plus dict paths.

Not in this PR: the #15947 review also flagged Utf8View output and dictionary-key reuse; those felt worth their own PRs to keep this focused on the per-byte hot path.
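The int64 path with its fall-through for an odd leading nibble can be sketched as follows. `hex_int64` is a hypothetical standalone function, not the PR's actual code; it assumes Spark's `hex(bigint)` semantics of treating the value as an unsigned 64-bit pattern and dropping leading zeros:

```rust
// Sketch of the int64 path: count significant nibbles, emit a lone high
// nibble if the count is odd (the fall-through), then consume two
// nibbles per loop iteration through the byte-pair table.

const HEX_CHARS_UPPER: &[u8; 16] = b"0123456789ABCDEF";

const HEX_LOOKUP_UPPER: [[u8; 2]; 256] = {
    let mut t = [[0u8; 2]; 256];
    let mut i = 0;
    while i < 256 {
        t[i] = [HEX_CHARS_UPPER[i >> 4], HEX_CHARS_UPPER[i & 0x0F]];
        i += 1;
    }
    t
};

fn hex_int64(v: i64) -> String {
    let u = v as u64; // negative values print as their two's-complement pattern
    if u == 0 {
        return "0".to_string();
    }
    // Number of significant nibbles, leading zeros dropped.
    let nibbles = 16 - (u.leading_zeros() / 4) as usize;
    let mut out = Vec::with_capacity(nibbles);
    let mut remaining = nibbles;
    if remaining % 2 == 1 {
        // Fall-through: odd count, emit the single high nibble first.
        let nib = ((u >> ((remaining - 1) * 4)) & 0xF) as usize;
        out.push(HEX_LOOKUP_UPPER[nib][1]);
        remaining -= 1;
    }
    while remaining > 0 {
        // Two nibbles (one byte) per iteration via the table.
        let byte = ((u >> ((remaining - 2) * 4)) & 0xFF) as usize;
        out.extend_from_slice(&HEX_LOOKUP_UPPER[byte]);
        remaining -= 2;
    }
    String::from_utf8(out).unwrap()
}

fn main() {
    assert_eq!(hex_int64(0), "0");
    assert_eq!(hex_int64(-1), "FFFFFFFFFFFFFFFF");
    assert_eq!(hex_int64(i64::MAX), "7FFFFFFFFFFFFFFF");
    assert_eq!(hex_int64(i64::MIN), "8000000000000000");
    assert_eq!(hex_int64(256), "100");
}
```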