Refactor: umbp metadata store#394
Open
TianDi101 wants to merge 8 commits into
Open
Conversation
Preparatory type changes ahead of the IMasterMetadataStore refactor. No
behavior change; the rest of the refactor becomes purely structural.
0a. Hoist NodeTierKey (from GlobalBlockIndex) and NodeMatch (from
ExternalKvBlockIndex, with MatchedHashCount) to types.h. Temporary
`using` aliases left in both classes so existing callers compile;
removed in Phase 5.
0b. Add ClientRegistration and HeartbeatResult (with nested
enum Status { APPLIED, SEQ_GAP, UNKNOWN }) to types.h. Additive;
these are the input/result vocabulary the Phase 1 interface uses.
0c. Migrate boundary-crossing timestamps from steady_clock to
system_clock (hazard #7): ClientRecord {last_heartbeat,registered_at},
BlockMetrics {created_at,last_accessed_at}, EvictionCandidate
last_accessed_at, BlockEntry lease/access atomics + GrantLease/
BatchLookupForRouteGet durations, Router lease_duration_, and
master_server NowNs(). Process-local duration/timeout clocks (rpc
latency timer, ssd read-lease, peer allocator deadlines) intentionally
stay steady_clock. Fixes the one audit fallout in
test_route_put_strategy.cpp that constructed the migrated fields.
Full UMBP C++ unit suite (local + distributed) green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Land the abstract IMasterMetadataStore interface header consolidating the four master-side state holders (GlobalBlockIndex, ClientRegistry, ExternalKvBlockIndex, ExternalKvHitIndex) behind one contract. No consumers wired yet. - master_metadata_store.h: lifted from the draft, depends only on types.h. Adds the two hit-count methods the draft dropped (GetExternalKvHitCounts, GarbageCollectHits) so the live GetExternalKvHitCounts RPC path is preserved, and adds a `now` parameter to MatchExternalKv so count_as_hit=true can stamp last_seen. - Hoist EvictionCandidate from global_block_index.h into types.h (part of the store contract; mirrors the phase 0 NodeTierKey/NodeMatch hoist) so the interface depends only on types.h. - Compile/instantiation gates: self-compile TU, GMock MockMasterMetadataStore (reused in phase 3), and a signature-completeness test exercising every method through IMasterMetadataStore&. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Implement IMasterMetadataStore in-process by folding the four former state holders (GlobalBlockIndex, ClientRegistry, ExternalKvBlockIndex, ExternalKvHitIndex) behind a single std::shared_mutex. - Block locations + LRU/lease (per-entry atomics mutated under a shared lock, keeping the RouteGet hot path concurrent); lease/access timestamps are now caller-supplied system_clock (hazard #7). - ApplyHeartbeat single-seq CAS; SEQ_GAP keeps liveness but not caps/seq (hazard #1). ExpireStaleClients flips ALIVE->EXPIRED and keeps the row, cascading block + external-KV cleanup atomically (hazard #3). - RegisterExternalKvIfAlive fuses the alive-check with the write (TOCTOU fix). - MatchExternalKv(count_as_hit=true) is the one formerly-shared path that takes the unique lock; hit last_seen is system_clock and feeds GarbageCollectHits. - EnumerateLruForEviction = Option A (full scan + sort + greedy byte budget, no maintained index) so tie-timestamps never drop candidates. Adds the §6a behavioral suite (31 cases) written against IMasterMetadataStore& so it is reused for the Redis backend; all green, concurrency cases clean under ThreadSanitizer. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the direct GlobalBlockIndex / ClientRegistry / ExternalKvBlockIndex / ExternalKvHitIndex references in every master-side consumer with a single IMasterMetadataStore&. The four old classes are still present, so this isolates rewire regressions from the Phase 4 deletion. - Router: holds one store_ instead of index_ + registry_. RoutePut / BatchRoutePut read ListAliveClients() and BatchExistsBlock(); BatchRouteGet now passes an explicit system_clock::now() into BatchLookupBlockForRouteGet (the old GlobalBlockIndex read the clock internally — the timestamp now crosses the store boundary, hazard #7). - EvictionManager: constructor takes the store. RunOnce passes its existing per-(node,tier) byte budget (already computed down to the low watermark) straight into EnumerateLruForEviction and drops its own std::sort + greedy budget walk, since the store returns candidates LRU-ordered and budget-trimmed. The standalone overloaded-set is gone — the budget map's keys already identify the overloaded buckets. - MasterServer: the four state members collapse to one unique_ptr<IMasterMetadataStore> (InMemoryMasterMetadataStore), declared before router_/service_ so it outlives their references. gRPC handlers rewired to store_ calls. The Heartbeat handler flattens EventBundle[] into per-bundle single-seq ApplyHeartbeat calls and short-circuits on SEQ_GAP (§3e). The client-expiry reaper moves out of ClientRegistry into MasterServer (per-tick store_->ExpireStaleClients(cutoff)); the hit-index GC tick becomes store_->GarbageCollectHits(now - max_age) — both cutoffs on the system_clock basis so they compare against last_heartbeat / last_seen. - Add UnregisterExternalKvByNode to the interface, in-memory impl, and mock: it backs the live RevokeAllExternalKvBlocksForNode RPC (whole-node external-KV wipe that leaves the client record + block locations intact), which the §1b draft had dropped. Tests: - test_router_dedup migrated to construct InMemoryMasterMetadataStore instead of GlobalBlockIndex + ClientRegistry. - Store suite extended with the previously-uncovered methods: UnregisterExternalKvByNode (distinct from UnregisterClient), GetPeerAddress (ALIVE + EXPIRED + unknown), GetClientTags, and ListAliveClients content; the interface "every method callable" gate now names UnregisterExternalKvByNode. - InMemoryMasterMetadataStore behavioral suite green (36 cases) via the standalone g++ build. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
UMBP C++ tests were split across two trees with two CMake wirings and a duplicate GoogleTest FetchContent: the standard tests/cpp/umbp and a second suite under src/umbp/tests (built unconditionally, with its own gtest fetch). Move the src/umbp/tests suite into tests/cpp/umbp/distributed/ (master/peer/ index/store/router logic = distributed layer), merge its target definitions into distributed/CMakeLists.txt (keeping gtest_discover_tests registration), and drop the src/umbp/tests subdirectory and its duplicate googletest fetch. The whole UMBP suite now builds under the single BUILD_TESTS + BUILD_UMBP gate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Now that all consumers route through IMasterMetadataStore (Phase 3),
remove the four superseded master-side state holders and their
class-specific test suites:
- GlobalBlockIndex, ClientRegistry, ExternalKvBlockIndex,
ExternalKvHitIndex (header + cpp each)
- their Phase 0a `using NodeTierKey` / `using NodeMatch` aliases
(lived inside the deleted headers)
- GlobalBlockIndex::GetMetrics (zero callers outside the class)
Behavioral coverage moved to the store suite
(test_in_memory_master_metadata_store), so the five class-specific
tests are dropped. Two suites that still constructed the old classes
as fixtures are migrated to InMemoryMasterMetadataStore instead:
- test_ssd_reliability: ApplyEvents/Lookup/BatchLookupForRouteGet ->
RegisterClient + ApplyHeartbeat (ascending seq) + LookupBlock /
BatchLookupBlockForRouteGet
- test_umbp_tags Suite 1: ClientRegistry -> store
RegisterClient/ApplyHeartbeat/GetClientTags/UnregisterClient
BlockMetrics and EvictionCandidate stay in types.h (still used by the
store's BlockEntry, eviction_manager, and test_types).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n docker The UMBP C++ tests were consolidated under tests/cpp/umbp/distributed (commit 00652fa); remove the now-duplicate originals left behind under src/umbp/tests (their CMakeLists was already deleted in that commit, so these files were orphaned and unbuilt). Also add .rocprofv3/ to .dockerignore so profiler output stays out of the docker build context. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.