Benchmark
Wide table → client-ready PDF
The hard job: turn a wide spreadsheet into a PDF a client can actually read, with every column kept, headers repeated, and nothing shrunk to illegibility or clipped off the page. Generic “print the HTML table” tools and one-shot LLM rendering fail this in two predictable ways: they shrink the font until it is illegible, or they collapse and clip columns. Here’s the measurement, on real files, with one ruler for every tool.
Results
| Dataset | Size (columns × rows) | FitForPDF | Naive headless-Chrome |
|---|---|---|---|
| Invoices (horizontal): 100+ rows × 12 columns; needs multi-page pagination with the header row repeated on every page. | 12×100 | 100 OK · 26 pages | 60 WARN · 2 pages · font shrunk to illegible |
| Operations log (wide + long): 181 rows × 15 columns; many rows AND wide, so both axes paginate. | 15×180 | 100 OK · 48 pages | 60 WARN · 4 pages · font shrunk to illegible |
| 24 wide columns: 24 columns with long header names; overflows a single page, so it must split horizontally and stay legible. | 24×11 | 100 OK · 10 pages | 40 FAIL · 1 page · columns collapsed/clipped · font shrunk to illegible |
| 35 columns (extreme): 35 columns; the widest case, an extreme horizontal split into many labelled sections. | 35×17 | 100 OK · 10 pages | 60 WARN · 1 page · font shrunk to illegible |
| Contacts (email / phone / company): 16 typed columns that each need a minimum readable width (email, phone, company); splitting without clipping is the test. | 16×1 | 100 OK · 3 pages | 60 WARN · 1 page · font shrunk to illegible |
| Unicode / accents: non-ASCII values and headers; tests font fallback and width safety (no clipping or replacement characters). | 6×7 | 100 OK · 4 pages | 45 FAIL · 1 page · font shrunk to illegible · columns collapsed/clipped |
| One giant text column: a single very long free-text column; tests overflow, wrapping and row-height limits. | 8×2 | 100 OK · 3 pages | 60 WARN · 1 page · font shrunk to illegible |
Score is 0–100 with an OK / WARN / FAIL verdict. FitForPDF typically uses more pages — it splits wide tables into sections to stay legible rather than cramming everything onto one clipped page. That trade-off is shown, not hidden.
How it’s measured
- One ruler for everyone. Every output PDF is scored by the same deterministic scorer, which reads the rendered PDF — it has no idea which tool produced it. It judges column preservation, header repetition, legible font size, clipped/blank pages, and pagination clarity.
- Real runs only. The two tools here were actually run: FitForPDF’s engine, and a plain `<table>` printed to PDF by headless Chromium, the generic Puppeteer/Gotenberg approach, used the normal way. No numbers are invented.
- The scorer is ours, and that’s disclosed. It’s FitForPDF’s production quality gate, open in the harness, so you can audit it, challenge it, or swap your own. The corpus and method don’t change.
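
To make "deterministic scorer" concrete, here is a minimal sketch of how the five checks listed above could fold into a 0–100 score and a verdict. It is not the production scorer: the `PdfMetrics` fields, the penalty weights, the 7 pt legibility floor and the verdict cut-offs are all assumptions chosen for illustration, and the sketch assumes the measurements have already been read from the rendered PDF.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PdfMetrics:
    """Measurements taken from a rendered PDF (field names are hypothetical)."""
    columns_expected: int
    columns_found: int
    pages: int
    pages_with_header: int
    min_font_pt: float
    clipped_or_blank_pages: int
    sections_labelled: bool


# Assumed values, not the real scorer's thresholds.
MIN_LEGIBLE_PT = 7.0


def score(m: PdfMetrics) -> Tuple[int, str]:
    """Return (score, verdict). Same metrics in, same result out."""
    penalty = 0
    if m.columns_found < m.columns_expected:      # column preservation
        penalty += 30
    if m.pages_with_header < m.pages:             # header repetition
        penalty += 15
    if m.min_font_pt < MIN_LEGIBLE_PT:            # legible font size
        penalty += 40
    if m.clipped_or_blank_pages > 0:              # clipped/blank pages
        penalty += 20
    if m.pages > 1 and not m.sections_labelled:   # pagination clarity
        penalty += 10
    total = max(0, 100 - penalty)
    verdict = "OK" if total >= 80 else "WARN" if total >= 60 else "FAIL"
    return total, verdict
```

The function takes only measurements, never a tool name, which is what keeps the ruler blind to whoever produced the PDF.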
Not yet run (method published, contribute a real run): DocRaptor, WeasyPrint, Gotenberg.
Reproduce it / read it as data
The corpus, the runner, and the scorer are open. Run it yourself or add your tool to the table — same corpus, same scorer, same options is the whole contract.
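
As a rough sketch of that contract, the loop below feeds every tool the same files and the same options, then passes each output to one shared scorer. The names `run_benchmark`, `Renderer` and `Scorer` are hypothetical and are not the harness's actual API; a real renderer would wrap a tool invocation and a real scorer would parse the PDF.

```python
from typing import Callable, Dict, List, Mapping

# Hypothetical type aliases for illustration only.
Renderer = Callable[[str, Mapping[str, str]], bytes]  # (file path, options) -> PDF bytes
Scorer = Callable[[bytes], int]                       # PDF bytes -> score from 0 to 100


def run_benchmark(
    corpus: List[str],
    tools: Dict[str, Renderer],
    scorer: Scorer,
    options: Mapping[str, str],
) -> Dict[str, Dict[str, int]]:
    """Score every tool on every corpus file with one shared scorer."""
    results: Dict[str, Dict[str, int]] = {}
    for path in corpus:
        row: Dict[str, int] = {}
        for name, render in tools.items():
            pdf = render(path, options)  # identical input and options for each tool
            row[name] = scorer(pdf)      # the scorer sees only the PDF bytes
        results[path] = row
    return results
```

Adding a tool under this shape means adding one entry to `tools`; the corpus, options and scorer stay fixed.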
Run your own wide file
Drop a wide spreadsheet and see the client-ready PDF — no cleanup, no clipped columns.
