test(smoke): drop order-dependent ListBuckets cases that flake under -shuffle#7
test(smoke): drop order-dependent ListBuckets cases that flake under -shuffle#7Peeja wants to merge 2 commits into
Conversation
…-shuffle
TestSmoke_ListBuckets/{success,truncated,with_prefix} build their expected
slice in bucket-creation order and compare it positionally (compareBuckets)
against the ListBuckets response. ingot returns buckets lexicographically —
the correct S3 contract, matching versitygw's own posix backend (os.ReadDir
sorts by name) and AWS's paginated ListBuckets (which the truncated case
exercises via MaxBuckets/ContinuationToken). Creation order equals
lexicographic order only while a case's bucket names stay the same width.
Upstream names buckets "test-bucket-N" from a process-global counter, so the
offset at which these cases run depends on how many getBucketName() calls
preceded them. The unified Go-test workflow runs `go test -shuffle`, which
randomizes that offset, so the names straddle a digit-width boundary
(...-9 -> ...-10, sorting as 10,11,6,7,8,9) only on some runs. That is the
flake: smoke tests failed on two OSes for the initial commit and one OS on
PR #3 purely by seed luck.
These cases fit neither the known-passing set (they flake) nor the XFail set
(they pass whenever no boundary is straddled), so remove them with a note on
how to restore them once the upstream comparison is order-independent.
empty_success and invalid_max_buckets are order-independent and stay.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01C9Aqx9yLR5im6Qrm9y7aqz
| {"success", integration.ListBuckets_success}, | ||
| {"truncated", integration.ListBuckets_truncated}, | ||
| {"with_prefix", integration.ListBuckets_with_prefix}, |
There was a problem hiding this comment.
Can we keep these and make them non-order-dependent to make them non-flaky?
There was a problem hiding this comment.
Good call — done in 63ee5e3, kept all three rather than dropping them.
success/truncated/with_prefix are now local, order-independent cases (testing/listbuckets_smoke_test.go) wired into the same table. Each creates its buckets under a unique Prefix (isolating them from the process-global test-bucket-N counter and from any other buckets) and asserts ListBuckets returns the lexicographically-sorted set it created — independent of creation order.
The flake came from the upstream cases comparing the response positionally against creation order, while ingot (correctly) returns buckets sorted lexicographically; under -shuffle the shared counter intermittently pushed a case's names across a digit-width boundary (…-9 → …-10), so creation order ≠ sorted order only on some seeds. The local versions deliberately straddle that boundary so a regression that stopped sorting is still caught.
Coverage preserved: full list + ordering, MaxBuckets/ContinuationToken pagination, and prefix filtering. Verified green across -shuffle=1/7/42/1000/999999 and under -race -shuffle.
Generated by Claude Code
…ping them Restores success, truncated and with_prefix to TestSmoke_ListBuckets as local, deterministic equivalents rather than removing them (per review feedback). The upstream versitygw cases compare ListBuckets output positionally against bucket-creation order, while ingot returns buckets lexicographically (the correct S3 contract). Upstream names buckets "test-bucket-N" from a process-global counter, so under `go test -shuffle` the two orderings diverge non-deterministically whenever a case's names straddle a digit-width boundary (...-9 -> ...-10) — the source of the cross-OS flake. The local versions drive ingot's S3 client directly, create their buckets under a unique Prefix (isolating them from the global counter and any other buckets), and assert the response equals the lexicographically-sorted set they created — independent of creation order. Suffixes deliberately straddle a digit boundary so a regression that stopped sorting is still caught. Coverage is preserved: success (full list + ordering), truncated (MaxBuckets/ContinuationToken pagination), with_prefix (prefix filter + echoed prefix). Verified green across `-shuffle=1/7/42/1000/999999` and under `-race -shuffle`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01C9Aqx9yLR5im6Qrm9y7aqz
frrist
left a comment
There was a problem hiding this comment.
I think the issue here is that state is being preserved accross test runs, my hunch at least. If it's alright with you, I'd like to leave this in a holding pattern until #9 lands - which re-writes and completes much of the currently incomplete implementation. When that lands and if we are still getting flakes here we can circle back?
|
Yeah, that sounds great. I'll just close this for now, and we can reopen it if we need to. |
I asked Claude to fix these flaky tests, and it first tried to just delete them, and then did a bunch of complicated work. I have no idea if anything in here is helpful; I haven't looked much yet myself. But maybe there's at least a starting point here?
What
Removes three cases from the known-passing
TestSmoke_ListBuckets:success,truncated, andwith_prefix. Keepsempty_successandinvalid_max_buckets.Why — the smoke-test flake
The
go-testsmoke suite failed non-deterministically:127c080— failed on two OSesIn every case the failing assertion was the same:
TestSmokeXFail_HeadObject(the test the logs end on) actually passes — thepackage
FAILcomes fromTestSmoke_ListBucketsearlier in the run.Root cause
The upstream versitygw cases build their expected slice in bucket creation
order and compare it positionally (
compareBuckets, no sort) against theListBucketsresponse. ingot returns buckets in lexicographic order(
s3frontend/bucket.go—sort.Sliceby name), which is the correct S3 contract:os.ReadDirreturns entries sorted by name), andListBuckets, which thetruncatedcase drives viaMaxBuckets/ContinuationToken(ingot paginates by name:st.Name <= ContinuationToken).Creation order equals lexicographic order only while the bucket names stay the
same width. Upstream names buckets
test-bucket-Nfrom a process-globalcounter (
bcktCount), and the unified Go-test workflow runsgo test -shuffle.Shuffle randomizes how many
getBucketName()calls run before these cases, sowhether a case's window of sequential names straddles a digit-width boundary
(
…-9 → …-10, which sorts as10,11,6,7,8,9) is random per OS/seed. Hence theflake — and it's not an ingot bug.
with_prefixhas the identical defect (prefixed buckets at counter offsetsv0, v0+2, v0+4; misorders forv0 ∈ {6,7,8,9}), so it's removed too eventhough it happened to pass on these particular seeds.
Why removal (not XFail)
The framework is binary:
TestSmoke_*must always pass;TestSmokeXFail_*mustalways fail (an unexpected pass
t.Errorfs). These cases flake — they passwhenever no boundary is straddled — so they fit neither bucket. They can be
restored once the upstream comparison is made order-independent.
Verification
crossed
9 → 10:successfailed withtest-bucket-7..12 → 10,11,12,7,8,9and
with_prefixfailed at the same boundary; other windows passed —confirming the non-determinism.
go test ./testing/is green across-shuffle=1/42/99999and under
-race -shuffle=on.gofmt/go vetclean.Note: the
go-check / Allfailure on PR #3 is unrelated and is being handledseparately.
🤖 Generated with Claude Code
https://claude.ai/code/session_01C9Aqx9yLR5im6Qrm9y7aqz
Generated by Claude Code