fix: handle empty context lines in unified diffs #51

Merged
sergeyt merged 1 commit into sergeyt:master from andyfeller:af/handle-empty-lines
Apr 17, 2026

Contributor

@andyfeller andyfeller commented Apr 17, 2026

Summary

Fix empty context lines being silently dropped by the unified diff parser, which causes hunk line counter desynchronization and can cascade into losing entire hunks and files from parsed output.

Fixes #50

Problem

The schemaContent array defines the context (unchanged) line pattern as /^\s+/, which requires one or more whitespace characters. In the unified diff format, context lines are prefixed with a single space — but when blank context lines have that leading space stripped, they become empty strings ("") that match none of the four content patterns and are silently dropped.

This commonly occurs when:

  • Git is configured with diff.suppressBlankEmpty = true
  • Diff output is post-processed to strip trailing whitespace
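
The mismatch described above can be checked directly. The following is a minimal sketch that tests only the two regular expressions quoted in this PR against a normal context line and a stripped blank one. The variable names are illustrative and do not come from parse.js.

```javascript
// The two context-line patterns quoted in the PR description.
const strictContext = /^\s+/;  // before the fix: one or more whitespace characters
const lenientContext = /^\s*/; // after the fix: zero or more whitespace characters

// Illustrative inputs (hypothetical, not taken from the test suite).
const normalContextLine = ' const x = 1;'; // context line with its leading space intact
const strippedBlankLine = '';              // blank context line whose leading space was stripped

const results = {
  strictNormal: strictContext.test(normalContextLine),
  strictBlank: strictContext.test(strippedBlankLine),
  lenientNormal: lenientContext.test(normalContextLine),
  lenientBlank: lenientContext.test(strippedBlankLine),
};

console.log(results);
// { strictNormal: true, strictBlank: false, lenientNormal: true, lenientBlank: true }
```

Only the `strictBlank` case fails, which is the case that `diff.suppressBlankEmpty` and whitespace-stripping tools produce.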

The Cascade

Dropping a context line means the oldLines/newLines counters never reach 0, so the parser stays in content mode. Subsequent @@ hunk headers and diff --git file boundaries are consumed as content instead of triggering transitions — entire hunks and files are lost.
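
To make the cascade concrete, here is a toy model of a counter-driven hunk reader. It is a sketch under stated assumptions, not the actual parse.js code: `countHunks`, `hunkHeader`, and `diffLines` are hypothetical names, and the model only counts hunks rather than building parsed output.

```javascript
// Toy model of a counter-driven hunk reader. NOT the real parse.js implementation.
function countHunks(lines, contextPattern) {
  // Captures the old and new line counts from "@@ -a,b +c,d @@" (a count defaults to 1 if omitted).
  const hunkHeader = /^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@/;
  let oldLines = 0;
  let newLines = 0;
  let hunks = 0;

  for (const line of lines) {
    if (oldLines > 0 || newLines > 0) {
      // Content mode: the line is tried against content patterns only.
      if (/^-/.test(line)) oldLines--;
      else if (/^\+/.test(line)) newLines--;
      else if (contextPattern.test(line)) { oldLines--; newLines--; }
      // A line matching nothing is silently dropped, so the counters stay stuck.
      continue;
    }
    const match = hunkHeader.exec(line);
    if (match) {
      hunks++;
      oldLines = match[1] === undefined ? 1 : Number(match[1]);
      newLines = match[2] === undefined ? 1 : Number(match[2]);
    }
  }
  return hunks;
}

// Two hunks; the first contains a blank context line with its leading space stripped.
const diffLines = [
  '@@ -1,3 +1,3 @@',
  '-old',
  '',
  '+new',
  ' tail',
  '@@ -10,2 +10,2 @@',
  '-foo',
  '+bar',
  ' baz',
];

console.log(countHunks(diffLines, /^\s+/)); // 1 (second header swallowed in content mode)
console.log(countHunks(diffLines, /^\s*/)); // 2
```

With the strict pattern, the dropped blank line leaves both counters at 1 when the second `@@` header arrives, so the header is treated as hunk content and the second hunk never starts.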

Fix

One-character change in parse.js: /^\s+/ → /^\s*/

This is safe because schemaContent patterns are checked in order:

  1. /^\ No newline/ — eof marker
  2. /^-/ — deletion
  3. /^\+/ — addition
  4. /^\s*/ — context line (catch-all for remaining hunk body lines)

Within a correctly-formed hunk body, anything that isn't a deletion, addition, or eof marker can only be a context line.
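
The ordering argument can be sketched as a first-match-wins lookup. The table below mirrors the four patterns listed above, but `contentSchema` and `classify` are hypothetical names and the real schemaContent entries may be shaped differently. The eof pattern is written as `/^\\ No newline/` so that it matches a literal backslash, which is an assumption about the intended pattern.

```javascript
// Hypothetical ordered table mirroring the four content patterns; first match wins.
const contentSchema = [
  [/^\\ No newline/, 'eof'], // assumed to match a literal "\ No newline..." marker
  [/^-/, 'del'],
  [/^\+/, 'add'],
  [/^\s*/, 'context'],       // catch-all: matches any string, including ''
];

// Return the kind of the first pattern that matches the line.
function classify(line) {
  for (const [pattern, kind] of contentSchema) {
    if (pattern.test(line)) return kind;
  }
  return null; // unreachable with the catch-all in place
}

console.log(classify('\\ No newline at end of file')); // 'eof'
console.log(classify('-removed'));                     // 'del'
console.log(classify('+added'));                       // 'add'
console.log(classify(' unchanged'));                   // 'context'
console.log(classify(''));                             // 'context'
```

Because `/^\s*/` matches every string, it is only safe as the last entry. Placed earlier, it would shadow the deletion and addition patterns.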

Tests Added

7 new tests in an "empty context lines (suppressBlankEmpty)" describe block:

| Test | What it covers |
| --- | --- |
| between del/add | Empty context line between deletion and addition |
| start of hunk | Empty context line as first line after @@ header |
| end of hunk | Empty context line as last line in hunk |
| multiple consecutive | Several empty context lines in a row |
| second hunk preserved | Cascade: empty line in hunk 1 doesn't lose hunk 2 |
| second file preserved | Cascade: empty line in file 1 doesn't lose file 2 |
| regression guard | Standard context lines with leading space still work |

All 30 tests pass (23 existing + 7 new), zero lint errors.

Empty context lines (lines with no leading space) are silently dropped
by the parser because the context line regex /^\s+/ requires one or more
whitespace characters. This causes hunk line counters to desynchronize,
which can cascade into losing entire hunks and files from parsed output.

This commonly occurs when git is configured with
diff.suppressBlankEmpty=true or when diff output is post-processed to
strip trailing whitespace.

Fix: change /^\s+/ to /^\s*/ so the context line pattern matches zero
or more whitespace characters. This is safe because schemaContent
patterns are checked in order — deletions and additions are matched
first, so /^\s*/ acts as a catch-all for remaining hunk body lines.

Fixes sergeyt#50

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@andyfeller andyfeller force-pushed the af/handle-empty-lines branch from 22e402a to a3180a5 Compare April 17, 2026 13:34
Owner

@sergeyt sergeyt left a comment

wow 😄 . thanks

@sergeyt sergeyt merged commit 251d359 into sergeyt:master Apr 17, 2026
1 check passed
@andyfeller
Contributor Author

> wow 😄 . thanks

Thank you, Sergey! 🙇

Development

Successfully merging this pull request may close these issues.

Empty context lines are silently dropped, causing lost hunks and corrupted parsing

2 participants