Skip to content

fix(journal): stamp header timestamp on watcher appends (#997)#1001

Merged
carlos-alm merged 6 commits intomainfrom
fix/journal-header-lag
Apr 23, 2026
Merged

fix(journal): stamp header timestamp on watcher appends (#997)#1001
carlos-alm merged 6 commits intomainfrom
fix/journal-header-lag

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Fixes #997.

Summary

  • The watcher appended journal entries but never touched the header timestamp, so after any watch session journal.timestamp stayed frozen at the previous build's finalize time while DB file_hashes.mtime advanced. The next build's Tier 0 check then saw journal.timestamp < MAX(file_hashes.mtime), bailed out, and fell through to the expensive mtime+size and SHA256 rescans — a silent performance cliff on the first build after every watch.
  • Introduces appendJournalEntriesAndStampHeader in src/domain/graph/journal.ts. It reads the existing entries, writes a new header + existing entries + new entries to a tmp file, then atomically renames over the journal. Entries are preserved; the header advances in the same write.
  • Watcher uses it in both journal write paths: the debounced batch handler (processPendingFiles) and the SIGINT flush handler. The "entries newer than header" invariant violation described in bug(journal): header timestamp lags entries during watch → Tier 0 invalidation forces expensive rescan #997 can no longer occur.

Scoped to the header-lag bug only — the parallel-append locking issue tracked in #996 is out of scope.

Test plan

  • npx vitest run tests/unit/journal.test.ts — 22 tests, all pass (added 5 new tests covering creation, header advance, empty-entries stamping, atomic rename, and a bug(journal): header timestamp lags entries during watch → Tier 0 invalidation forces expensive rescan #997 regression guard comparing journal.timestamp against a simulated latestDbMtime).
  • npx vitest run tests/builder/detect-changes.test.ts — Tier 0 path unaffected.
  • npx vitest run tests/integration/watcher-rebuild.test.ts tests/incremental/watcher-incremental.test.ts — watcher tests pass.
  • npx biome check src/domain/graph/journal.ts src/domain/graph/watcher.ts tests/unit/journal.test.ts — clean.

The watcher appended journal entries without touching the header timestamp,
so after any watch session the header remained frozen at the previous build's
finalize time while DB file_hashes.mtime advanced. Tier 0 change detection
then rejected the journal (timestamp stale) and fell through to the expensive
mtime+size + SHA256 rescan — a silent performance cliff on the first build
after every watch.

Introduces appendJournalEntriesAndStampHeader, an atomic tmp+rename that
preserves prior entries and advances the header in one write. Watcher uses
it on both the debounced batch path and the SIGINT flush path, so the
"entries newer than header" invariant violation described in #997 can no
longer occur.

Impact: 4 functions changed, 8 affected
@claude
Copy link
Copy Markdown

claude Bot commented Apr 21, 2026

Claude finished @carlos-alm's task —— View job


Pull Request Review Complete

Reviewing PR #1001: fix(journal): stamp header timestamp on watcher appends

Root Cause Analysis

Problem Being Solved: The watcher appended journal entries but never updated the header timestamp, causing it to remain frozen at the previous build's finalize time. This created a performance cliff where the next build's Tier 0 check would see journal.timestamp < MAX(file_hashes.mtime), reject the journal as stale, and fall back to expensive mtime+size and SHA256 rescans instead of using the fast incremental path.

Why This Approach: The solution introduces appendJournalEntriesAndStampHeader that atomically writes header + existing entries + new entries via tmp file rename. This ensures the "entries newer than header" invariant violation can never occur. The approach is sound - it preserves existing entries while advancing the timestamp in the same atomic operation.

Risk Assessment: Low risk. The implementation uses standard atomic write patterns (tmp file + rename), has comprehensive test coverage including regression guards, and the change is narrowly scoped to fix a specific performance issue without affecting correctness.

Backlog Compliance

  • Zero-dep: ✓ — no new dependencies, builds on existing fs/path modules
  • Foundation-aligned: ✓ — directly supports P1 (always-current graph) by fixing performance degradation in incremental builds
  • Problem-fit: 3 — performance optimization that keeps the graph rebuild fast, which indirectly helps AI agents by maintaining quick feedback loops, but doesn't directly address token waste or hallucinated dependencies
  • Breaking: No — purely internal implementation improvement, no API/CLI changes
  • Tier: 1

Code Quality Assessment

Strengths:

Implementation Details:

  • Line 133-135: Correctly extracts existing body after header line using indexOf('\n') and slice()
  • Line 139: Ensures proper newline handling for existing content
  • Line 143: Builds complete content with new timestamp header + preserved entries + new entries
  • Line 146-147: Standard atomic write pattern
  • Line 149-155: Proper error handling with tmp file cleanup

Watcher Integration:

  • Line 103: Used in debounced batch processing (processPendingFiles)
  • Line 287: Used in SIGINT flush handler for graceful shutdown
  • Both paths now maintain the timestamp invariant correctly

Test Coverage:

No Red Flags:

  • No configuration changes, linting rule relaxations, or test assertion weakening
  • No eslint-disable or workaround comments
  • No breaking changes to existing APIs

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Journal header timestamp lag causing performance cliff in incremental builds
**Why This Approach**: Atomic write operation that preserves entries while advancing timestamp  
**Risk Assessment**: Low - narrow scope, proper atomic write pattern, comprehensive tests

## Backlog Compliance
- Zero-dep: ✓ — no new dependencies
- Foundation-aligned: ✓ — supports P1 (always-current graph)
- Problem-fit: 3 — performance optimization, indirectly helps agent experience
- Breaking: No
- Tier: 1

## Critical Concerns
None. This is a well-implemented performance fix with proper atomic operations and thorough test coverage.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)  
- Action: APPROVE
- Reasoning: Solid fix for a real performance issue. Code quality is high with proper atomic writes, comprehensive testing including regression guards, and clear documentation. Narrowly scoped change with no breaking implications. The only reason it's not 5/5 is that it's "just" a performance fix rather than a major feature, but for what it is, it's excellent work.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR fixes the header-lag bug from #997 by introducing appendJournalEntriesAndStampHeader, which atomically reads the existing journal body, prepends an updated header, and renames a .tmp file into place — ensuring the Tier 0 timestamp check never sees a stale header after a watch session. It also adds buildFlushEntriesFromPending to detect deletions via existence check in the SIGINT flush path, directly addressing the "deleted files journaled as changed" concern from the prior review.

  • The debounce timer cancellation issue flagged as P1 in the prior review (ctx.timer not cleared before closeDb/process.exit in the SIGINT handler) is still unaddressed in this PR; it remains a live race where async DB work started by the timer could be killed mid-operation by process.exit(0) before SQLite commits.

Confidence Score: 4/5

Safe to merge with awareness that the debounce-timer/SIGINT race from the prior review remains open.

The core header-lag fix is correct and well-tested; the deletion-detection P1 from the prior review is now addressed. The outstanding P1 (debounce timer not cancelled before closeDb/process.exit) was already flagged in the previous review cycle and is explicitly out of scope for this PR — it keeps the score at 4 rather than 5.

src/domain/graph/watcher.ts — specifically the SIGINT handler in setupShutdownHandler where ctx.timer is not cleared before closeDb is called.

Important Files Changed

Filename Overview
src/domain/graph/journal.ts Adds appendJournalEntriesAndStampHeader: reads existing body, writes header + existingBody + newEntries to .tmp, then atomically renames — correctly fixes the frozen-header bug and preserves the read-header-strip-rewrite logic cleanly.
src/domain/graph/watcher.ts Switches both journal write paths to appendJournalEntriesAndStampHeader; adds buildFlushEntriesFromPending with existence-check deletion detection. The debounce timer (ctx.timer) is still not cancelled before closeDb/process.exit in the SIGINT handler — a pre-existing P1 flagged in the prior review that remains unaddressed.
tests/unit/journal.test.ts Five new tests added for appendJournalEntriesAndStampHeader: creation, header advance with entry preservation, empty-entry stamping, atomicity (no leftover .tmp), and the #997 regression guard — solid coverage.
tests/unit/watcher-flush.test.ts New test file for buildFlushEntriesFromPending; covers deletion detection, platform-normalized paths, and empty-pending-set guard — directly validates the prior-review P1 fix.

Sequence Diagram

sequenceDiagram
    participant W as Watcher (debounce / SIGINT)
    participant JN as journal.ts
    participant FS as Filesystem

    note over W,FS: Debounced batch path
    W->>W: processPendingFiles(files)
    W->>W: rebuildFile() per file
    W->>JN: appendJournalEntriesAndStampHeader(rootDir, entries, Date.now())
    JN->>FS: readFileSync(journal) — grab existing body
    JN->>FS: writeFileSync(journal.tmp, header + existingBody + newEntries)
    JN->>FS: renameSync(journal.tmp → journal) [atomic]

    note over W,FS: SIGINT flush path
    W->>W: buildFlushEntriesFromPending(rootDir, pending)
    W->>FS: existsSync(filePath) per pending path — detect deletions
    W->>JN: appendJournalEntriesAndStampHeader(rootDir, entries, Date.now())
    JN->>FS: readFileSync(journal)
    JN->>FS: writeFileSync(journal.tmp, header + existingBody + newEntries)
    JN->>FS: renameSync(journal.tmp → journal) [atomic]

    note over W,FS: Next incremental build
    W->>JN: readJournal(rootDir)
    JN-->>W: {timestamp >= MAX(file_hashes.mtime), changed[], removed[]}
    note over W: Tier 0 check passes — fast path preserved
Loading

Reviews (4): Last reviewed commit: "Merge branch 'main' into fix/journal-hea..." | Re-trigger Greptile

Comment thread src/domain/graph/journal.ts Outdated
Comment on lines +127 to +129
if (!fs.existsSync(dir)) {
fs.mkdirSync(dir, { recursive: true });
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 existsSync + mkdirSync redundant check

fs.mkdirSync(dir, { recursive: true }) is a no-op when the directory already exists, so the existsSync guard is unnecessary. The same pattern appears in appendJournalEntries and writeJournalHeader; removing it here would make the code more consistent with idiomatic Node.js and eliminate a minor TOCTOU window between the check and the create.

Suggested change
if (!fs.existsSync(dir)) {
fs.mkdirSync(dir, { recursive: true });
}
fs.mkdirSync(dir, { recursive: true });

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 2274eda — replaced the existsSync+mkdirSync pair with bare fs.mkdirSync(dir, { recursive: true }) at all three callsites in journal.ts (appendJournalEntries, writeJournalHeader, appendJournalEntriesAndStampHeader). Closes the TOCTOU window and matches idiomatic Node usage.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 21, 2026

Codegraph Impact Analysis

7 functions changed12 callers affected across 5 files

  • appendJournalEntries in src/domain/graph/journal.ts:62 (0 transitive callers)
  • writeJournalHeader in src/domain/graph/journal.ts:83 (4 transitive callers)
  • appendJournalEntriesAndStampHeader in src/domain/graph/journal.ts:114 (4 transitive callers)
  • writeJournalAndChangeEvents in src/domain/graph/watcher.ts:97 (4 transitive callers)
  • buildFlushEntriesFromPending in src/domain/graph/watcher.ts:289 (1 transitive callers)
  • setupShutdownHandler in src/domain/graph/watcher.ts:300 (2 transitive callers)
  • event:SIGINT in src/domain/graph/watcher.ts:301 (0 transitive callers)

ctx.pending is an untyped Set<string> with no event-type metadata. The
SIGINT flush previously mapped every entry without a deleted flag, so a
file removed during a watch session was journaled as 'changed' and the
next incremental build would try to re-parse the missing path instead of
removing it from the graph.

Extract buildFlushEntriesFromPending which probes each path with
existsSync and emits { deleted: !exists }. Mirrors the deletion detection
in rebuildFile (builder/incremental.ts).

Addresses Greptile P1.

Impact: 3 functions changed, 3 affected
fs.mkdirSync(dir, { recursive: true }) is a no-op when the directory
already exists, so the existsSync guard is dead weight and introduces a
minor TOCTOU window between check and create.

Apply to all three callsites in journal.ts (appendJournalEntries,
writeJournalHeader, appendJournalEntriesAndStampHeader).

Addresses Greptile P2.

Impact: 3 functions changed, 8 affected
@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed Greptile's review feedback:

  • P1 (SIGINT flush journals deletes as changed) — fixed in 2073c2e. Extracted buildFlushEntriesFromPending in watcher.ts which probes each pending path with existsSync and emits { deleted: !exists }. Mirrors the deletion detection in rebuildFile (builder/incremental.ts), so a file removed during a watch session now lands in the journal as DELETED instead of being treated as changed. Added tests/unit/watcher-flush.test.ts covering deletion detection, path normalization, and empty-set handling.
  • P2 (existsSync + mkdirSync redundancy) — fixed in 2274eda. Replied inline.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 552c4eb into main Apr 23, 2026
14 checks passed
@carlos-alm carlos-alm deleted the fix/journal-header-lag branch April 23, 2026 05:19
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 23, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(journal): header timestamp lags entries during watch → Tier 0 invalidation forces expensive rescan

1 participant