// PDF TO EXCEL
PDF Table to Excel Converter
Convert any table inside a PDF into a clean Excel sheet, preserving headers, merged cells, and units.
// EXAMPLE INPUT
$ convert tables.pdf --to xlsx
// EXAMPLE OUTPUT
Pages: 26
Tables detected: 11
Merged-header tables: 4
Multi-page tables: 2 (stitched)
Output: tables.xlsx
Sheet 1 Table_p3_t1
Sheet 2 Table_p6_t1
Sheet 3 Table_p9_t1 (multi-page)
...
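The summary reports multi-page tables as stitched. The page does not describe how stitching is done, so the following is only a minimal sketch of one plausible approach: concatenate the fragments found on consecutive pages and drop any header row that repeats the first one. The function name stitch and the sample rows are hypothetical.

```python
Table = list[list[str]]  # a table is a list of rows; the first row is the header


def stitch(fragments: list[Table]) -> Table:
    """Concatenate table fragments, dropping header rows repeated on later pages."""
    fragments = [f for f in fragments if f]  # ignore empty fragments
    if not fragments:
        return []
    header = fragments[0][0]
    stitched = [header]
    for fragment in fragments:
        # Skip the first row of a fragment when it repeats the header.
        body = fragment[1:] if fragment[0] == header else fragment
        stitched.extend(body)
    return stitched


# Illustrative fragments only, not real tool output.
page_9 = [["Item", "2023", "2022"], ["Revenue", "120", "100"]]
page_10 = [["Item", "2023", "2022"], ["Costs", "80", "70"]]
page_11 = [["Profit", "40", "30"]]  # continuation page without a header
for row in stitch([page_9, page_10, page_11]):
    print(row)
```

A data row that happens to equal the header would also be dropped. A real implementation would likely also compare column positions before deciding that two fragments belong together.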
// EXTRACTION LOGIC
Table detection works on both text-based and image-based PDFs, using OCR as a fallback for image-based pages. Merged header cells become merged ranges in Excel. Numeric formatting is preserved.
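To illustrate the merged-header step, here is a minimal sketch that turns a header cell's position and span into an A1-style range string. The function names and the sample header are assumptions made for illustration. The converter's internal representation is not documented here.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column index to an Excel column label (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def merged_range(row: int, col: int, row_span: int = 1, col_span: int = 1) -> str:
    """Return the A1-style range covered by a header cell with the given spans."""
    start = f"{column_letter(col)}{row}"
    end = f"{column_letter(col + col_span - 1)}{row + row_span - 1}"
    return start if start == end else f"{start}:{end}"


# Hypothetical detected header: a label spanning columns 2 to 4 on row 1.
print(merged_range(row=1, col=2, col_span=3))  # B1:D1
```

A range string in this form is what spreadsheet libraries typically accept when merging cells. For example, openpyxl's Worksheet.merge_cells takes such a string.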
// SOURCE-LINKED OUTPUT
Each output sheet includes a header row that points to the source PDF page and table index. A per-cell coordinate mapping is stored in sources.json.
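As a rough illustration of what one entry in sources.json might look like, the sketch below models a record with the field names listed in this section. The field types, the JSON array layout, and every sample value are assumptions. Only the field names come from the page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SourceRecord:
    # Field names follow the record shape listed in this section;
    # the types are assumptions.
    file: str
    page: int
    table_id: str
    row_id: int
    cell_id: str
    label: str
    value: str
    unit: str
    period: str


def to_sources_json(records: list[SourceRecord]) -> str:
    """Serialize records as a JSON array (the real file layout is not documented)."""
    return json.dumps([asdict(r) for r in records], indent=2)


# Illustrative values only, not real tool output.
example = SourceRecord(
    file="tables.pdf", page=3, table_id="Table_p3_t1", row_id=2,
    cell_id="B3", label="Revenue", value="1,234.50", unit="USD", period="FY2023",
)
print(to_sources_json([example]))
```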
{ file, page, table_id, row_id, cell_id, label, value, unit, period }
// FAQ
Does it work on scanned PDFs?
Yes. Image-based PDFs are OCR'd before table detection.
Can it merge multi-page tables?
Yes. Tables that continue across pages are stitched into a single sheet.
Are numeric formats preserved?
Yes. Currency symbols, thousands separators, and decimal precision are preserved in the Excel output.
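One way to keep a number's appearance while still storing it as a number is to split the displayed text into a numeric value and an Excel number-format code. The sketch below is a hypothetical illustration, not the converter's actual logic. It assumes a leading currency symbol, "," as the thousands separator, and "." as the decimal point.

```python
import re
from decimal import Decimal

# Optional leading symbol, then digits with optional "," grouping and decimals.
_NUMBER = re.compile(r"^(?P<symbol>[^\d\-.,\s]*)\s*(?P<digits>-?[\d,]*\.?\d+)$")


def parse_number(text: str) -> tuple[Decimal, str]:
    """Split a displayed number into its value and an Excel-style format code."""
    match = _NUMBER.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised number: {text!r}")
    symbol, digits = match["symbol"], match["digits"]
    value = Decimal(digits.replace(",", ""))
    # Keep the same number of decimal places that the source text showed.
    decimals = len(digits.split(".")[1]) if "." in digits else 0
    code = "#,##0" if "," in digits else "0"
    if decimals:
        code += "." + "0" * decimals
    if symbol:
        code = f'"{symbol}"' + code  # quote the symbol as a literal
    return value, code


value, code = parse_number("$1,234.50")
print(value, code)  # 1234.50 "$"#,##0.00
```

The value can then be written to the cell as a number with the format code applied, so the sheet displays the original text while sums and comparisons still work. Other locales and grouping styles would need their own rules.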
// RELATED TOOLS
Annual Reports
Annual Report Table Extractor
Extract every table from a company annual report into clean, structured rows and columns.
PDF to Excel
Quarterly Results PDF Extractor
Convert quarterly results PDFs into a comparable workbook with standalone, consolidated, and segment views.
PDF to Excel
Shareholding Pattern Extractor
Convert shareholding pattern disclosures into a clean, category-level table.
// EARLY ACCESS
Get early access to the PDF Table to Excel Converter
Paper Data is currently in private beta. Request access to start converting your financial documents into source-linked tables.
