OCR PDF
Extract text from a scanned PDF using on-device OCR.
OCR a Scanned PDF Online
The PDF OCR tool runs optical character recognition on a scanned PDF so the text it contains becomes selectable, searchable, and copyable. Use it for digitising printed receipts, archived contracts, books, and historical documents.
Multi-language support
Select every language that appears in the document: the recogniser is more accurate when it is told which scripts to expect. For Indian invoices in English with Devanagari headers, select both English and Hindi.
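As a rough illustration of how a multi-language selection might reach the engine: Tesseract identifies languages by short codes (`eng`, `hin`, and so on) and accepts several at once joined with `+`. The sketch below assumes a hypothetical helper, `toTesseractLangs`, and a small hand-written lookup table; neither name comes from this tool.

```typescript
// Tesseract language codes for a few languages; extend as needed.
const LANGUAGE_CODES: Record<string, string> = {
  English: "eng",
  Hindi: "hin",
  French: "fra",
  German: "deu",
};

// Hypothetical helper: turn the user's selections into a Tesseract
// language string, e.g. ["English", "Hindi"] becomes "eng+hin".
function toTesseractLangs(selected: string[]): string {
  if (selected.length === 0) {
    throw new Error("Select at least one language");
  }
  const codes = selected.map((name) => {
    const code = LANGUAGE_CODES[name];
    if (code === undefined) {
      throw new Error(`Unsupported language: ${name}`);
    }
    return code;
  });
  // Drop duplicate selections while keeping the original order.
  return Array.from(new Set(codes)).join("+");
}

console.log(toTesseractLangs(["English", "Hindi"])); // eng+hin
```

For the invoice example, selecting English and Hindi would yield `eng+hin`, so both the Latin and the Devanagari models are available during recognition.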
Privacy
Recognition runs in your browser using a WebAssembly build of the open-source Tesseract engine. Language data is downloaded once and cached, after which OCR runs completely offline. Your PDF and the extracted text never leave your device.
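The "download once, then work offline" behaviour can be pictured as a cache-first lookup. This is only a minimal sketch of the pattern, not the tool's actual code: `LangStore`, `MemoryStore`, `Downloader`, and `loadLanguageData` are hypothetical names, and the in-memory store stands in for whatever persistent browser storage the tool really uses.

```typescript
// Minimal storage interface. In a browser this could be backed by
// persistent storage such as IndexedDB (an assumption for illustration).
interface LangStore {
  get(lang: string): Promise<Uint8Array | undefined>;
  set(lang: string, data: Uint8Array): Promise<void>;
}

// In-memory store, used here so the sketch runs anywhere.
class MemoryStore implements LangStore {
  private items = new Map<string, Uint8Array>();

  async get(lang: string): Promise<Uint8Array | undefined> {
    return this.items.get(lang);
  }

  async set(lang: string, data: Uint8Array): Promise<void> {
    this.items.set(lang, data);
  }
}

// Hypothetical function type that fetches language data over the network.
type Downloader = (lang: string) => Promise<Uint8Array>;

// Cache-first lookup: the network is touched only on a cache miss,
// so later runs for the same language need no connection.
async function loadLanguageData(
  lang: string,
  store: LangStore,
  download: Downloader
): Promise<Uint8Array> {
  const cached = await store.get(lang);
  if (cached !== undefined) {
    return cached;
  }
  const data = await download(lang);
  await store.set(lang, data);
  return data;
}
```

Calling `loadLanguageData("eng", store, download)` twice with the same store invokes `download` only on the first call; the second call is served entirely from the store.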