Bug Report
Description of the problem
When generating PDFs with more than 256 unique characters, PDFKit 0.17.2 generates a ToUnicode CMap with multiple bfrange entries (correctly split at 256-character boundaries), but the beginbfrange declaration is hardcoded to 1 instead of the actual number of ranges.
Impact:
- PDFs display correctly in all viewers
- Text copying works in WPS Office
- Text copying fails in Chrome/Edge (PDFium-based browsers) - produces garbled text even for numbers
- PDFium strictly validates PDF specifications and rejects the entire ToUnicode CMap when the count doesn't match
Root Cause:
In lib/font/embedded.js (or js/pdfkit.js in the compiled build), the toUnicodeCmap() method correctly splits the Unicode mappings into chunks of 256 characters, but it hardcodes the count as 1 beginbfrange instead of emitting ${ranges.length} beginbfrange.
Example:
When a PDF contains 377 unique characters:
- Code generates: 2 bfrange entries (0x0000-0x00ff and 0x0100-0x0178)
- CMap declares:
1 beginbfrange
- PDFium detects mismatch and rejects the CMap
- Result: Text copying uses fallback encoding → garbled output
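The arithmetic in this example can be reproduced with a small sketch. The helper below is hypothetical and is not PDFKit's actual code: `buildBfrangeSection` and the placeholder code points starting at U+4E00 are assumptions made for illustration. It only shows how 377 mappings split into two ranges and why the declared count should come from the length of the ranges array.

```javascript
// Hypothetical helper, not PDFKit's implementation: groups consecutive
// glyph-to-Unicode mappings into bfrange entries of at most 256 mappings each.
function buildBfrangeSection(unicodeHex, chunkSize = 256) {
  const ranges = [];
  for (let start = 0; start < unicodeHex.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, unicodeHex.length) - 1;
    const lo = start.toString(16).padStart(4, '0');
    const hi = end.toString(16).padStart(4, '0');
    const targets = unicodeHex
      .slice(start, end + 1)
      .map((hex) => `<${hex}>`)
      .join(' ');
    ranges.push(`<${lo}> <${hi}> [${targets}]`);
  }
  // The declared count is derived from ranges.length rather than a literal 1.
  return `${ranges.length} beginbfrange\n${ranges.join('\n')}\nendbfrange`;
}

// 377 mappings, each pointing at a placeholder code point from U+4E00 onward.
const mappings = Array.from({ length: 377 }, (_, i) =>
  (0x4e00 + i).toString(16).padStart(4, '0')
);
const section = buildBfrangeSection(mappings);
console.log(section.split('\n')[0]); // 2 beginbfrange
```

With 377 mappings the loop produces the ranges 0x0000-0x00ff and 0x0100-0x0178, so the header line reads 2 beginbfrange.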
Related Issue:
This appears to be a regression introduced while fixing issue #1498 (256-character boundary problem). The fix correctly implemented chunking but forgot to update the count declaration.
Code sample
Minimal Reproduction
const PDFDocument = require('pdfkit');
const fs = require('fs');
// Create a PDF with more than 256 unique characters
const doc = new PDFDocument();
doc.pipe(fs.createWriteStream('test.pdf'));
// Register a font (SimSun or any Unicode font)
doc.registerFont('SimSun', './path/to/SimSun.ttf');
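// 300 distinct CJK characters starting at U+4E00; repeating a single character would embed only one glyph
const uniqueChars = Array.from({ length: 300 }, (_, i) => String.fromCodePoint(0x4e00 + i)).join('');
doc.font('SimSun')
  .fontSize(12)
  .text('测试文本:' + uniqueChars, 100, 100); // more than 256 unique characters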
doc.end();
Generated ToUnicode CMap (incorrect)
1 beginbfrange
<0000> <00ff> [<...>]
<0100> <0178> [<...>]
endbfrange
Problem: Declares 1 but has 2 entries.
Expected ToUnicode CMap (correct)
2 beginbfrange
<0000> <00ff> [<...>]
<0100> <0178> [<...>]
endbfrange
Fix
In lib/font/embedded.js, line ~2587 (or equivalent in compiled version):
Before:
cmap.end(`\
...
1 beginbfrange
${ranges.join('\n')}
endbfrange
...
`);
After:
cmap.end(`\
...
${ranges.length} beginbfrange
${ranges.join('\n')}
endbfrange
...
`);
Verification
- Generate PDF with >256 unique characters
- Open in Chrome
- Try to copy text
- Expected: Text copies correctly
- Actual (pdfkit 0.17.2): Pasted text is garbled
To verify the CMap issue:
# Extract and decompress PDF streams
# Search for "beginbfrange" in decompressed content
# Count actual entries vs declared count
Your environment
- pdfkit version: 0.17.2
- Node version: v18.18.2
- Browser version (if applicable):
- Chrome 120+ (PDFium)
- Edge 120+ (Chromium-based, PDFium)
- Operating System: macOS (Darwin kernel 25.2.0)
Additional Information
PDF Specification Reference
According to PDF specification (ISO 32000-1:2008), the number after beginbfrange must exactly match the number of bfrange entries that follow:
The number after beginbfrange indicates how many bfrange entries follow. This number must match the actual count of entries.
Workaround
Temporary workaround: Patch node_modules/pdfkit/js/pdfkit.js:
- Find:
1 beginbfrange
- Replace:
${ranges.length} beginbfrange
Suggested Fix
Change line ~2587 in lib/font/embedded.js:
- 1 beginbfrange
+ ${ranges.length} beginbfrange
This ensures the count always matches the actual number of ranges generated.