OCR

04-14-2009

Is there any open-source software that OCRs PDFs?

CRGreathouse

04-14-2009

check if this or this is what you want

or maybe this

Yogesh Sawant

04-14-2009

I had downloaded Tesseract earlier, but it had a few problems:
* It wouldn't compile (./configure gave C++ errors).
* It doesn't work on PDF files, only TIFFs.
* It doesn't work on files with multiple columns.
* It doesn't deskew, despeckle, or do other cleanup needed to get sensible output.
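
One common way around the TIFF-only limitation is to rasterize each PDF page to a TIFF first and then run Tesseract on the page images. The sketch below only builds the two command lines and does not run them. It assumes `pdftoppm` from poppler-utils is installed with TIFF output support and that `tesseract` is on the PATH; the function names, file names, and the 300 dpi default are hypothetical choices for illustration. It does nothing about multi-column layout, deskewing, or despeckling.

```python
import os


def rasterize_command(pdf_path, out_prefix, dpi=300):
    """Build a pdftoppm command that writes one TIFF per PDF page.

    Assumption: pdftoppm (poppler-utils) is installed and supports -tiff.
    The page images are named <out_prefix>-<page number>.tif.
    """
    return ["pdftoppm", "-tiff", "-r", str(dpi), pdf_path, out_prefix]


def ocr_command(tiff_path):
    """Build a tesseract command for a single page image.

    Tesseract takes an image and an output base name, and writes the
    recognized text to <output base>.txt.
    """
    out_base = os.path.splitext(tiff_path)[0]
    return ["tesseract", tiff_path, out_base]


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(rasterize_command("scan.pdf", "pages/scan")))
    print(" ".join(ocr_command("pages/scan-1.tif")))
```

Each list can be passed to `subprocess.run` once both tools are actually installed. Keeping the command construction separate from execution makes it easy to check the commands before running them on a large batch.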

I downloaded OCRopus to try to get around some of the limitations, but without being able to compile Tesseract of course that's all for naught.

Code:

checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking for cl.exe... no
checking for g++... no
checking for C++ compiler default output file name... configure: error: C++ compiler cannot create executables
See `config.log' for more details.

config.log.txt (4.9 KB)
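
The line "checking for g++... no" in the output above suggests that configure could not find a C++ compiler on the PATH, which would explain the "cannot create executables" error. A quick way to confirm that before rerunning ./configure is sketched below. The candidate compiler names are an assumption, not a list taken from the Tesseract build scripts, and config.log remains the authoritative place to look.

```python
import shutil


def find_cxx_compiler(candidates=("g++", "c++", "clang++")):
    """Return the full path of the first candidate found on PATH, or None.

    The candidate names are assumptions; adjust them for the system at hand.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    compiler = find_cxx_compiler()
    if compiler is None:
        print("No C++ compiler found on PATH; install one and rerun ./configure")
    else:
        print(f"C++ compiler found: {compiler}")
```

If this reports no compiler, installing the distribution's g++ package is the usual fix. If it does find one, the problem lies elsewhere and config.log should show the failing test program.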

CRGreathouse
