Benchmark

Wide table → client-ready PDF

The hard job: turn a wide spreadsheet into a PDF a client can actually read, with every column kept, headers repeated, and nothing shrunk to illegibility or clipped off the page. Generic “print the HTML table” tools and one-shot LLM rendering fail this in two predictable ways: they shrink the font until it is illegible, or they collapse and clip columns. Here’s the measurement, on real files, with one ruler for every tool.

On 7 real wide-table files: FitForPDF wins 7/7, all OK.

Results

Each entry gives the dataset and its size (columns × rows), then each tool's score, verdict, and page count.

Invoices (horizontal), 12×100
100+ rows x 12 columns - needs multi-page pagination with the header row repeated on every page.
  • FitForPDF: 100 OK, 26 pages
  • Naive headless-Chrome: 60 WARN, 2 pages, font shrunk to illegible

Operations log (wide + long), 15×180
181 rows x 15 columns - many rows AND wide; both axes paginate.
  • FitForPDF: 100 OK, 48 pages
  • Naive headless-Chrome: 60 WARN, 4 pages, font shrunk to illegible

24 wide columns, 24×11
24 columns with long header names - overflows a single page; must split horizontally and stay legible.
  • FitForPDF: 100 OK, 10 pages
  • Naive headless-Chrome: 40 FAIL, 1 page, columns collapsed/clipped · font shrunk to illegible

35 columns (extreme), 35×17
35 columns - the widest case; an extreme horizontal split into many labelled sections.
  • FitForPDF: 100 OK, 10 pages
  • Naive headless-Chrome: 60 WARN, 1 page, font shrunk to illegible

Contacts (email / phone / company), 16×1
16 typed columns that each need a minimum readable width (email, phone, company) - splitting without clipping is the test.
  • FitForPDF: 100 OK, 3 pages
  • Naive headless-Chrome: 60 WARN, 1 page, font shrunk to illegible

Unicode / accents, 6×7
Non-ASCII values and headers - font fallback and width safety (no clipping / replacement chars).
  • FitForPDF: 100 OK, 4 pages
  • Naive headless-Chrome: 45 FAIL, 1 page, font shrunk to illegible · columns collapsed/clipped

One giant text column, 8×2
A single very long free-text column - overflow, wrapping and row-height limits.
  • FitForPDF: 100 OK, 3 pages
  • Naive headless-Chrome: 60 WARN, 1 page, font shrunk to illegible

Score is 0–100 with an OK / WARN / FAIL verdict. FitForPDF typically uses more pages — it splits wide tables into sections to stay legible rather than cramming everything onto one clipped page. That trade-off is shown, not hidden.
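
The results above can also be read as data. Here is a minimal Python sketch that transcribes the published rows and tallies them. The tuple layout and the names `RESULTS` and `summarize` are ours for illustration, not the harness's schema.

```python
from collections import Counter

# Rows transcribed from the results above: (dataset, tool, score, verdict, pages).
# The layout is illustrative only; it is not the harness's output format.
FIT, NAIVE = "FitForPDF", "Naive headless-Chrome"
RESULTS = [
    ("Invoices (horizontal)", FIT, 100, "OK", 26),
    ("Invoices (horizontal)", NAIVE, 60, "WARN", 2),
    ("Operations log (wide + long)", FIT, 100, "OK", 48),
    ("Operations log (wide + long)", NAIVE, 60, "WARN", 4),
    ("24 wide columns", FIT, 100, "OK", 10),
    ("24 wide columns", NAIVE, 40, "FAIL", 1),
    ("35 columns (extreme)", FIT, 100, "OK", 10),
    ("35 columns (extreme)", NAIVE, 60, "WARN", 1),
    ("Contacts (email / phone / company)", FIT, 100, "OK", 3),
    ("Contacts (email / phone / company)", NAIVE, 60, "WARN", 1),
    ("Unicode / accents", FIT, 100, "OK", 4),
    ("Unicode / accents", NAIVE, 45, "FAIL", 1),
    ("One giant text column", FIT, 100, "OK", 3),
    ("One giant text column", NAIVE, 60, "WARN", 1),
]


def summarize(results):
    """Return (wins per tool, verdict counts per tool, total pages per tool)."""
    scores_by_dataset = {}
    verdicts = Counter()
    pages = Counter()
    for dataset, tool, score, verdict, page_count in results:
        scores_by_dataset.setdefault(dataset, {})[tool] = score
        verdicts[(tool, verdict)] += 1
        pages[tool] += page_count
    # A "win" is the highest score on a dataset.
    wins = Counter(max(s, key=s.get) for s in scores_by_dataset.values())
    return wins, verdicts, pages


wins, verdicts, pages = summarize(RESULTS)
print("wins:", dict(wins))
print("pages:", dict(pages))
```

The page totals put a number on the trade-off: across the seven files, FitForPDF produced 104 pages and the naive print produced 11.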

How it’s measured

  • One ruler for everyone. Every output PDF is scored by the same deterministic scorer, which reads the rendered PDF — it has no idea which tool produced it. It judges column preservation, header repetition, legible font size, clipped/blank pages, and pagination clarity.
  • Real runs only. The two tools here were actually run: FitForPDF’s engine, and a plain <table> printed to PDF by headless Chromium — the generic Puppeteer/Gotenberg approach, used the normal way. No numbers are invented.
  • The scorer is ours, and that’s disclosed. It’s FitForPDF’s production quality gate, and it is open in the harness, so you can audit it, challenge it, or swap in your own. The corpus and method don’t change.
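
To make "the scorer has no idea which tool produced it" concrete, here is a minimal Python sketch of a scorer over the five criteria listed above. It is not FitForPDF's scorer: the `PdfMeasurements` fields, the penalty weights, the 6 pt legibility floor, and the verdict cut-offs are all assumptions made up for illustration. What it shows is the structure: the input carries measurements of the rendered PDF and nothing about its origin.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PdfMeasurements:
    """Hypothetical measurements read from a rendered PDF. No tool name here."""
    expected_columns: int
    columns_found: int
    pages: int
    pages_with_header: int
    min_font_pt: float
    clipped_or_blank_pages: int
    sections_labelled: bool


# Assumed values, for illustration only.
MIN_LEGIBLE_PT = 6.0
PENALTIES = {
    "columns lost": 30,
    "header not repeated": 15,
    "font too small": 40,
    "clipped or blank pages": 20,
    "unclear pagination": 10,
}


def score(m: PdfMeasurements) -> tuple[int, str, list[str]]:
    """Deterministic: the same measurements always give the same result."""
    issues = []
    if m.columns_found < m.expected_columns:
        issues.append("columns lost")
    if m.pages_with_header < m.pages:
        issues.append("header not repeated")
    if m.min_font_pt < MIN_LEGIBLE_PT:
        issues.append("font too small")
    if m.clipped_or_blank_pages > 0:
        issues.append("clipped or blank pages")
    if m.pages > 1 and not m.sections_labelled:
        issues.append("unclear pagination")
    total = max(0, 100 - sum(PENALTIES[i] for i in issues))
    verdict = "OK" if total >= 80 else "WARN" if total >= 50 else "FAIL"
    return total, verdict, issues


# Two made-up inputs: a paginated, legible PDF and a two-page PDF with tiny type.
paginated = PdfMeasurements(12, 12, 26, 26, 9.0, 0, True)
cramped = PdfMeasurements(12, 12, 2, 2, 3.5, 0, True)
print(score(paginated))
print(score(cramped))
```

Because the function takes only measurements, swapping in your own scorer means replacing `score` and leaving everything upstream of it alone.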

Not yet run (method published, contribute a real run): DocRaptor, WeasyPrint, Gotenberg.

Reproduce it / read it as data

The corpus, the runner, and the scorer are open. Run it yourself or add your tool to the table — same corpus, same scorer, same options is the whole contract.
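
The "same corpus, same scorer, same options" contract can be sketched as a small runner loop. This is a Python illustration under assumptions, not the harness's real interface: the `Tool` and `Scorer` signatures, the function name `run_benchmark`, the file name, and the option key are all hypothetical.

```python
from __future__ import annotations

from typing import Callable, Mapping

# Hypothetical signatures: a tool turns (source path, options) into PDF bytes;
# a scorer turns PDF bytes into a 0-100 score. Neither is the harness's real API.
Tool = Callable[[str, Mapping[str, str]], bytes]
Scorer = Callable[[bytes], int]


def run_benchmark(
    tools: Mapping[str, Tool],
    corpus: list[str],
    options: Mapping[str, str],
    scorer: Scorer,
) -> dict[str, dict[str, int]]:
    """Run every tool on every file with identical options and one scorer."""
    table: dict[str, dict[str, int]] = {}
    for path in corpus:
        table[path] = {}
        for name, render in tools.items():
            pdf_bytes = render(path, options)
            # The scorer receives only the bytes, never the tool name.
            table[path][name] = scorer(pdf_bytes)
    return table


# Stand-in tools and scorer so the sketch runs without any PDF library.
fake_tools: dict[str, Tool] = {
    "tool_a": lambda path, opts: b"%PDF-" + path.encode(),
    "tool_b": lambda path, opts: b"%PDF-",
}
fake_scorer: Scorer = lambda pdf: min(100, len(pdf))

table = run_benchmark(fake_tools, ["invoices.csv"], {"page_size": "A4"}, fake_scorer)
print(table)
```

Adding a tool under this shape is one more entry in `tools`; the corpus, options, and scorer arguments stay the same for every entry.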

Run your own wide file

Drop a wide spreadsheet and see the client-ready PDF — no cleanup, no clipped columns.