OCR (Optical Character Recognition) is the technology that reads text from a document image. For bank statements, that means identifying each transaction row (date, description, debit or credit amount, and running balance) and turning those rows into structured data you can edit, verify, and export to your accounting software. Documentric combines OCR with AI-powered table understanding, so it does not just read characters; it understands the layout of a bank statement and maps each value to the correct field.
A digital PDF is one you downloaded directly from your bank's online portal. It contains an invisible text layer — characters the computer can copy and paste. Extraction from digital PDFs is fast and nearly perfect.
A scanned PDF is a photograph of a paper statement, saved as a PDF. There is no text layer, just pixels. Documentric runs an OCR engine over each page to read the characters before structuring the data. Accuracy is slightly lower than for digital PDFs but typically exceeds 97% on clean scans. You can review all extracted transactions before exporting, so any outliers are easy to catch and fix inline.
Three stages, fully automated. You upload — Documentric handles the rest.
Upload a digital or scanned PDF — or paste a URL. Documentric accepts any PDF regardless of which bank or software produced it.
For digital PDFs, the text layer is extracted directly. For image-based PDFs, LlamaParse OCR reads each page, detects columns, and corrects page rotation.
Dates, descriptions, debit/credit amounts, and running balances are mapped to a clean, editable transaction table. Review before you export.
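To make the mapping step concrete, here is a minimal sketch of how one line of extracted text could be turned into a structured transaction. It is an illustration only, not Documentric's implementation: the `Transaction` type, the `parse_row` helper, the MM/DD/YYYY date format, and the single signed amount column are all assumptions chosen for the example.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Optional

# Assumed row layout: date, free-text description, signed amount, running balance.
ROW_PATTERN = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+"
    r"(?P<description>.+?)\s+"
    r"(?P<amount>-?[\d,]+\.\d{2})\s+"
    r"(?P<balance>-?[\d,]+\.\d{2})$"
)


@dataclass
class Transaction:
    posted: date
    description: str
    debit: Optional[Decimal]   # money out, stored as a positive value
    credit: Optional[Decimal]  # money in
    balance: Decimal


def to_decimal(text: str) -> Decimal:
    """Strip thousands separators and parse as an exact decimal (no float rounding)."""
    return Decimal(text.replace(",", ""))


def parse_row(line: str) -> Optional[Transaction]:
    """Return a Transaction for a row that fits the assumed layout, else None."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        return None  # headers, footers, and page numbers fall through here
    amount = to_decimal(match["amount"])
    return Transaction(
        posted=datetime.strptime(match["date"], "%m/%d/%Y").date(),
        description=match["description"],
        debit=-amount if amount < 0 else None,
        credit=amount if amount >= 0 else None,
        balance=to_decimal(match["balance"]),
    )
```

Using `Decimal` rather than `float` keeps amounts exact, which matters when the rows are later reconciled against the running balance.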
Built for the variety of PDF bank statements that reach accountants, lenders, and investigators in real practice.
Extracts the embedded text layer from bank-generated PDFs at full fidelity — no image conversion required, no rounding errors.
Runs OCR on image-based PDFs including photocopied or faxed bank statements. Handles low-resolution scans and skewed pages.
Understands debit/credit column splits, running balance columns, and wrapped description text common in bank statement formatting.
Detects and corrects pages scanned sideways or upside-down before OCR runs, preventing misread rows and column shifts.
Powered by LlamaParse, Documentric achieves 99% extraction accuracy on standard digital statements and over 97% on clean scans.
No template configuration. Documentric reads Chase, Wells Fargo, Bank of America, HSBC, Barclays, and hundreds of other formats out of the box.
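The wrapped description text mentioned in the feature list above is a common reason naive line-by-line parsing fails. One simple heuristic, shown here as a sketch and not as the method Documentric uses, is to treat any line that starts with a date as a new transaction and attach the lines that follow to it. The `group_wrapped_rows` helper and the MM/DD/YYYY date prefix are assumptions for illustration.

```python
import re
from typing import List

# Assumption: every transaction row starts with an MM/DD/YYYY date.
DATE_PREFIX = re.compile(r"^\d{2}/\d{2}/\d{4}\b")


def group_wrapped_rows(lines: List[str]) -> List[List[str]]:
    """Group extracted lines so each group is one dated row plus its continuation lines."""
    groups: List[List[str]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if DATE_PREFIX.match(line):
            groups.append([line])      # a dated line opens a new transaction
        elif groups:
            groups[-1].append(line)    # undated line continues the previous description
        # Undated lines before the first transaction (column headers) are ignored.
    return groups
```

A heuristic this simple would also attach page footers to the last row on a page, so a real pipeline needs layout information as well.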
Any professional who regularly receives PDF bank statements from clients or counterparties.
Clients often provide months of paper or scanned statements. Documentric converts them to structured data in seconds — no manual entry, no data-entry errors passed to QuickBooks.
Underwriters need 2–3 months of bank statements to verify income and reserves. Documentric extracts every deposit and debit in seconds, preserving exact dates and amounts for the loan file.
Investigators receive PDFs from multiple banks. Documentric converts every statement to a clean transaction table that can be exported to Excel for pattern analysis or timeline reconstruction.
Self-employed clients rarely categorize transactions. Documentric extracts the full year's activity from bank statements so preparers can categorize quickly rather than re-key from paper.
For digital PDFs (text-layer files produced by a bank), accuracy is consistently 99% or higher. For clean scanned PDFs, accuracy typically exceeds 97%. Heavily degraded scans — faded ink, extreme skew, or very low resolution — may reduce accuracy, and we display a confidence indicator so you can spot-check manually.
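If you want an independent spot-check after export, the running balance column gives you one for free: each balance should equal the previous balance minus the debit plus the credit. The sketch below flags rows where that arithmetic does not hold. It is a suggestion for your own review workflow, not a description of how the confidence indicator works; the `find_balance_breaks` name and the `(debit, credit, balance)` row shape are assumptions.

```python
from decimal import Decimal
from typing import List, Optional, Tuple

# Hypothetical row shape: (debit, credit, running balance); debit/credit may be None.
Row = Tuple[Optional[Decimal], Optional[Decimal], Decimal]


def find_balance_breaks(opening_balance: Decimal, rows: List[Row]) -> List[int]:
    """Return the indexes of rows whose balance does not follow from the previous row."""
    zero = Decimal("0")
    breaks: List[int] = []
    previous = opening_balance
    for index, (debit, credit, balance) in enumerate(rows):
        expected = previous - (debit or zero) + (credit or zero)
        if expected != balance:
            breaks.append(index)
        # Continue from the printed balance so a misread amount flags only its own row.
        previous = balance
    return breaks
```

A misread amount flags one row, while a misread balance flags that row and the next, which helps narrow down which cell to check against the original PDF.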
Yes. Documentric automatically detects whether a PDF contains a text layer or only images. For image-only PDFs it runs full optical character recognition before structuring the data. You do not need to pre-process or convert files — upload the PDF as-is.
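As a rough illustration of what that detection involves, the sketch below decides between the text-layer path and the OCR path from the text already extracted for each page. It is not Documentric's detection logic: the `has_text_layer` and `choose_route` helpers, the 20-character threshold, and the half-of-pages rule are all assumptions picked for the example, and obtaining `page_texts` from a PDF library is left out.

```python
from typing import List


def has_text_layer(page_texts: List[str], min_chars_per_page: int = 20) -> bool:
    """Return True when at least half the pages carry a meaningful amount of text."""
    if not page_texts:
        return False
    pages_with_text = sum(
        1 for text in page_texts if len(text.strip()) >= min_chars_per_page
    )
    return pages_with_text / len(page_texts) >= 0.5


def choose_route(page_texts: List[str]) -> str:
    """Pick the processing path: direct text extraction or OCR."""
    return "text-layer" if has_text_layer(page_texts) else "ocr"
```

A character threshold rather than a simple non-empty check guards against scans that carry a stray watermark or page number as their only embedded text.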
Documentric is optimized for machine-printed bank statements, not handwritten documents. Handwritten entries may be partially or incorrectly extracted. Printed statements with handwritten margin notes are typically fine — the printed transaction rows are extracted correctly and margin text is ignored.
Password-protected PDFs cannot be processed until the password protection is removed. Open the PDF in Adobe Acrobat Reader using its password, print it to a PDF printer, and upload the resulting unlocked file. We do not store passwords and cannot decrypt files on your behalf.
Documentric currently handles bank statements in English. Currency codes (USD, GBP, EUR, CAD, AUD) are extracted as-is alongside amounts. Support for non-English statement layouts is on our roadmap for 2026.