Skip to content

feat: cache lifecycle — consistent cleanup, warm/cold TTL, long-URL fix#319

Merged
3lvis merged 6 commits into
masterfrom
feature/cache-lifecycle
Jun 17, 2026
Merged

feat: cache lifecycle — consistent cleanup, warm/cold TTL, long-URL fix#319
3lvis merged 6 commits into
masterfrom
feature/cache-lifecycle

Conversation

@3lvis

@3lvis 3lvis commented Jun 17, 2026

Copy link
Copy Markdown
Owner

Implements roadmap #10: consistent cache cleanup + warm/cold TTL, plus the long-URL fix. Persistence/layout reflect the benchmark in #320 (mtime + sharding beat SQLite for this profile). One PR — all the same cache surface.

Consistency (latent bug)

reset() cleared memory + disk; the static deleteCachedFiles() cleared only disk → the in-memory NSCache kept serving "deleted" data.

  • clearCache() empties both tiers; reset() = clearCache() + wipe credentials/headers/fakes. The disk-only static is removed.

Warm/cold expiry — persisted, no manifest

On-disk entries carry a sliding TTL (cacheTTL: Duration, non-optional, default 7 days; init(…, cacheTTL:) / setCacheTTL).

  • The clock is the file's modification date — persisted by the filesystem, so warmth survives launches with no in-memory map and no manifest. objectFromCache drops a disk entry whose mtime is past the TTL; a disk hit re-warms by bumping mtime.
  • The NSCache is the debounce: it absorbs repeat reads, so the mtime touch happens ~once per entry per launch — no explicit debounce logic.
  • Memory is the warm tier; no NSCache count/cost limits exposed (pressure eviction handles it).

Sharded from the start

  • destinationURL lays files under domain/<shard>/<file> (shard = one hex nibble of the key hash, 16 shards) — unconditionally, so there's no size threshold or runtime migration.
  • The startup sweep does one shard per launch (rotated via a .sweep-shard cursor) → per-launch work is O(N/16), so the sweep never spikes (cf. spike: benchmark cache-access persistence — file mtime vs SQLite #320: 864 ms full-scan vs 53 ms sharded at 200k). Trade: a never-requested dead file is GC'd within ~16 launches (lazy-on-read expires anything you actually fetch immediately).
  • Sweep and deleteCacheFolder serialize on a lock so the background sweep can't race a clear. Pre-sharding strays at the domain root are cleaned once (v8 invalidates existing on-disk caches).

Long-URL fix

destinationURL hashes the filename component past the filesystem's 255-byte limit (readable prefix + SHA-256). Red-first: flipped testCachingALongURL… from asserting the throw to asserting success.

Tests

CacheExpiry.isExpired(fileDate:), cold-disk-entry-expires-on-read, disk-hit-re-warms-mtime, sharded layout, clearCache clears memory + folder-scoping, .none doesn't purge. Full suite green (176).

Breaking vs 7.0.0 (v8 bump). README, repo notes, CHANGELOG, roadmap #10 updated.

3lvis added 2 commits June 17, 2026 07:55
Consistency: clearCache() empties BOTH the in-memory NSCache and the on-disk
files; reset() = clearCache() + wipe credentials/headers/fakes. The static
deleteCachedFiles() (disk-only — it left memory serving deleted data) is gone.

Warm/cold expiry: on-disk entries carry a sliding TTL whose clock is last use.
Every read/write records access into a lock-synchronized in-memory CacheExpiry
map (never persisted, files never touched on read); objectFromCache drops a
disk entry idle past cacheTTL (default 7 days), and a detached startup sweep
bulk-removes aged files off the request path. Memory is the warm tier — no
exposed NSCache limits; pressure eviction handles it.

Long-URL fix: destinationURL hashes the filename component past the filesystem
255-byte limit (readable prefix + SHA-256 suffix) so a long URL caches instead
of throwing. Flipped testCachingALong... from documenting the failure to
asserting success.

cacheTTL: Duration (non-optional, default 7 days) via init + setCacheTTL.
README, repo notes, CHANGELOG, roadmap #10 updated. Breaking (v8).
The README claimed a verb request with .none drops a single cached entry, but
.none is the *default* for get — making it purge would wipe deliberately-cached
entries on every plain get(path). The code is right (.none bypasses: no read,
no write, no purge); the doc was wrong. Reword to: no per-path purge (use
clearCache() or let the TTL expire entries); .none forces a refresh only on the
download path. Add testNoneCachingLevelDoesNotPurgeAnExistingEntry to guard it.
3lvis added 4 commits June 17, 2026 13:06
Replace the in-memory access map (which didn't survive launches) with the
file's modification date as the sliding-TTL clock: objectFromCache expires a
disk entry whose mtime is older than cacheTTL and re-warms on a disk hit by
bumping mtime. NSCache absorbs repeat reads, so the touch is ~once per entry
per launch — no explicit debounce. CacheExpiry collapses to a synchronized TTL
holder (isExpired(fileDate:)).

Shard the cache dir from the start: destinationURL lays files under
domain/<shard>/<file> (one hex nibble of the key hash, 16 shards); the startup
sweep does one shard per launch via a .sweep-shard cursor, so per-launch work
is O(N/16) and never spikes (cf. benchmark #320). Sweep and deleteCacheFolder
serialize on cacheMutationLock so the background sweep can't race clearCache.
Also clears pre-sharding strays at the domain root (one-time migration).

Tests updated/added (mtime expiry, disk-hit re-warm, sharding layout); full
suite green (176). README/repo notes updated.
…test

The persisted (mtime) clock is refreshed by disk reads and writes, not by
in-memory NSCache hits — so an entry kept warm only in memory past cacheTTL can
read as cold once evicted. The README claimed reading always keeps an entry
warm, which overstated it. Narrow the wording (rather than add a per-hit disk
touch / throttle map for a rare edge), document the accepted behavior in the
repo notes, and add testMemoryHitDoesNotReWarmFileDate to pin it alongside the
disk-hit re-warm test.
…istic)

testMemoryHitDoesNotReWarmFileDate assumed the entry stays in NSCache between
the write and the read, but NSCache can evict at any time (it did on CI) — on
eviction the read becomes a disk hit that legitimately re-warms the mtime, so
the assertion failed. A memory hit can't be forced deterministically, so the
behavior stays documented (README/repo notes) + covered on the disk side by
testDiskHitReWarmsFileDate.
- init sweep comment described a removed in-memory access log (stale from the
  pre-mtime design) — rewrite to the actual sharded-file-age sweep
- cacheTTL doc said 'each read re-warms' but memory reads don't (the same
  overclaim flagged on the README) — narrow to 'a disk read or write'
- trim clearCache's historical aside, a narration line in objectFromCache, and
  a doc/inline dup in the sweep
@3lvis 3lvis marked this pull request as ready for review June 17, 2026 21:11
@3lvis 3lvis merged commit 2306383 into master Jun 17, 2026
1 check passed
@3lvis 3lvis deleted the feature/cache-lifecycle branch June 17, 2026 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant