feat: cache lifecycle — consistent cleanup, warm/cold TTL, long-URL fix#319
Merged
Conversation
Consistency: clearCache() empties BOTH the in-memory NSCache and the on-disk files; reset() = clearCache() + wipe credentials/headers/fakes. The static deleteCachedFiles() (disk-only — it left memory serving deleted data) is gone. Warm/cold expiry: on-disk entries carry a sliding TTL whose clock is last use. Every read/write records access into a lock-synchronized in-memory CacheExpiry map (never persisted, files never touched on read); objectFromCache drops a disk entry idle past cacheTTL (default 7 days), and a detached startup sweep bulk-removes aged files off the request path. Memory is the warm tier — no exposed NSCache limits; pressure eviction handles it. Long-URL fix: destinationURL hashes the filename component past the filesystem 255-byte limit (readable prefix + SHA-256 suffix) so a long URL caches instead of throwing. Flipped testCachingALong... from documenting the failure to asserting success. cacheTTL: Duration (non-optional, default 7 days) via init + setCacheTTL. README, repo notes, CHANGELOG, roadmap #10 updated. Breaking (v8).
The README claimed a verb request with .none drops a single cached entry, but .none is the *default* for get — making it purge would wipe deliberately-cached entries on every plain get(path). The code is right (.none bypasses: no read, no write, no purge); the doc was wrong. Reword to: no per-path purge (use clearCache() or let the TTL expire entries); .none forces a refresh only on the download path. Add testNoneCachingLevelDoesNotPurgeAnExistingEntry to guard it.
Replace the in-memory access map (which didn't survive launches) with the file's modification date as the sliding-TTL clock: objectFromCache expires a disk entry whose mtime is older than cacheTTL and re-warms on a disk hit by bumping mtime. NSCache absorbs repeat reads, so the touch is ~once per entry per launch — no explicit debounce. CacheExpiry collapses to a synchronized TTL holder (isExpired(fileDate:)). Shard the cache dir from the start: destinationURL lays files under domain/<shard>/<file> (one hex nibble of the key hash, 16 shards); the startup sweep does one shard per launch via a .sweep-shard cursor, so per-launch work is O(N/16) and never spikes (cf. benchmark #320). Sweep and deleteCacheFolder serialize on cacheMutationLock so the background sweep can't race clearCache. Also clears pre-sharding strays at the domain root (one-time migration). Tests updated/added (mtime expiry, disk-hit re-warm, sharding layout); full suite green (176). README/repo notes updated.
…test The persisted (mtime) clock is refreshed by disk reads and writes, not by in-memory NSCache hits — so an entry kept warm only in memory past cacheTTL can read as cold once evicted. The README claimed reading always keeps an entry warm, which overstated it. Narrow the wording (rather than add a per-hit disk touch / throttle map for a rare edge), document the accepted behavior in the repo notes, and add testMemoryHitDoesNotReWarmFileDate to pin it alongside the disk-hit re-warm test.
…istic) testMemoryHitDoesNotReWarmFileDate assumed the entry stays in NSCache between the write and the read, but NSCache can evict at any time (it did on CI) — on eviction the read becomes a disk hit that legitimately re-warms the mtime, so the assertion failed. A memory hit can't be forced deterministically, so the behavior stays documented (README/repo notes) + covered on the disk side by testDiskHitReWarmsFileDate.
- init sweep comment described a removed in-memory access log (stale from the pre-mtime design) — rewrite to the actual sharded-file-age sweep - cacheTTL doc said 'each read re-warms' but memory reads don't (the same overclaim flagged on the README) — narrow to 'a disk read or write' - trim clearCache's historical aside, a narration line in objectFromCache, and a doc/inline dup in the sweep
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Implements roadmap #10: consistent cache cleanup + warm/cold TTL, plus the long-URL fix. Persistence/layout reflect the benchmark in #320 (
mtime+ sharding beat SQLite for this profile). One PR — all the same cache surface.Consistency (latent bug)
reset()cleared memory + disk; the staticdeleteCachedFiles()cleared only disk → the in-memoryNSCachekept serving "deleted" data.clearCache()empties both tiers;reset()=clearCache()+ wipe credentials/headers/fakes. The disk-only static is removed.Warm/cold expiry — persisted, no manifest
On-disk entries carry a sliding TTL (
cacheTTL: Duration, non-optional, default 7 days;init(…, cacheTTL:)/setCacheTTL).objectFromCachedrops a disk entry whose mtime is past the TTL; a disk hit re-warms by bumping mtime.NSCacheis the debounce: it absorbs repeat reads, so the mtime touch happens ~once per entry per launch — no explicit debounce logic.NSCachecount/cost limits exposed (pressure eviction handles it).Sharded from the start
destinationURLlays files underdomain/<shard>/<file>(shard = one hex nibble of the key hash, 16 shards) — unconditionally, so there's no size threshold or runtime migration..sweep-shardcursor) → per-launch work is O(N/16), so the sweep never spikes (cf. spike: benchmark cache-access persistence — file mtime vs SQLite #320: 864 ms full-scan vs 53 ms sharded at 200k). Trade: a never-requested dead file is GC'd within ~16 launches (lazy-on-read expires anything you actually fetch immediately).deleteCacheFolderserialize on a lock so the background sweep can't race a clear. Pre-sharding strays at the domain root are cleaned once (v8 invalidates existing on-disk caches).Long-URL fix
destinationURLhashes the filename component past the filesystem's 255-byte limit (readable prefix + SHA-256). Red-first: flippedtestCachingALongURL…from asserting the throw to asserting success.Tests
CacheExpiry.isExpired(fileDate:), cold-disk-entry-expires-on-read, disk-hit-re-warms-mtime, sharded layout,clearCacheclears memory + folder-scoping,.nonedoesn't purge. Full suite green (176).Breaking vs 7.0.0 (v8 bump). README, repo notes, CHANGELOG, roadmap #10 updated.