Image-based and scanned PDFs need OCR before table reconstruction. pdfintoexcel runs optical character recognition first, then rebuilds rows and columns geometrically, the same way we do for digital PDFs.
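To give a feel for what "rebuilding rows and columns geometrically" can mean, here is a minimal sketch. It is not pdfintoexcel's actual code. It assumes OCR has already produced word boxes with a left edge and a vertical center, and every name in it (`Word`, `group_rows`, `column_anchors`, `rebuild_table`, and the pixel tolerances) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the OCR bounding box (assumed unit: pixels)
    y: float  # vertical center of the OCR bounding box


def group_rows(words, y_tol=5.0):
    """Cluster words into rows: a word joins the current row when its
    y value is within y_tol of the first word placed in that row."""
    rows = []
    for w in sorted(words, key=lambda w: w.y):
        if rows and abs(w.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    return rows


def column_anchors(words, x_tol=10.0):
    """Collect column left edges, treating x positions within x_tol
    of the previous anchor as the same column."""
    anchors = []
    for x in sorted(w.x for w in words):
        if not anchors or x - anchors[-1] > x_tol:
            anchors.append(x)
    return anchors


def rebuild_table(words, y_tol=5.0, x_tol=10.0):
    """Place each word in the cell of its row and nearest column anchor."""
    anchors = column_anchors(words, x_tol)
    table = []
    for row in group_rows(words, y_tol):
        cells = [""] * len(anchors)
        for w in sorted(row, key=lambda w: w.x):
            col = min(range(len(anchors)), key=lambda i: abs(anchors[i] - w.x))
            cells[col] = (cells[col] + " " + w.text).strip()
        table.append(cells)
    return table


# Hypothetical OCR output for a small two-column table.
words = [
    Word("Item", 10, 20), Word("Qty", 120, 21),
    Word("Widget", 11, 50), Word("3", 122, 49),
    Word("Gadget", 10, 80),
]
table = rebuild_table(words)
```

With the sample words above, `table` holds three rows of two cells each, and the last row has an empty second cell because no word sits near that column. A real reconstructor would also have to deal with skewed scans, multi-line cells, and merged regions, which this sketch ignores.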
Choose Scanned (OCR) when your PDF is a photo or a scan, such as a page captured on a flatbed scanner. Native text PDFs should stay on Normal PDF for faster, more accurate results.
Read the full tutorial: extract tables from scanned PDF.
Which languages does OCR support?
We support 25+ languages for scanned table extraction.
Will merged header cells survive OCR?
Yes. After OCR, our reconstructor preserves merged regions when the grid lines or column alignment are visible.