How to scan and OCR like a pro with open source tools


 

Tue, 24 Jun 2008 18:00:00 GMT
With optical character recognition (OCR), you can scan the contents of a document into a single file of editable text. This article, which focuses on scanning books, describes the steps you need to take to prepare pages for optimal OCR results, and compares various free OCR tools to determine which is the best at extracting the text.


doctools::changelog(n)						Documentation tools					    doctools::changelog(n)

NAME
       doctools::changelog - Processing text in Emacs ChangeLog format

SYNOPSIS
       package require Tcl 8.2
       package require textutil
       package require doctools::changelog ?1?

       ::doctools::changelog::scan text
       ::doctools::changelog::toDoctools title module version entries
       ::doctools::changelog::merge entries...

DESCRIPTION
       This package provides Tcl commands for the processing and reformatting
       of text in the "ChangeLog" format generated by emacs.

API
       ::doctools::changelog::scan text
              The command takes the text and parses it under the assumption
              that it contains a ChangeLog as generated by emacs. It returns
              a data structure describing the contents of this ChangeLog.

              This data structure is a list where each element describes one
              entry in the ChangeLog. Each element/entry is then a list of
              three elements describing the date of the entry, its author,
              and the comments made, in this order. The last item in each
              element/entry, the comments, is a list of sections. Each
              section is described by a list containing two elements, a list
              of file names, and a string containing the true comment
              associated with the files of the section.

                  {
                      {
                          date
                          author
                          {
                              {
                                  {file ...}
                                  commenttext
                              }
                              ...
                          }
                      }
                      {...}
                  }

       ::doctools::changelog::toDoctools title module version entries
              This command converts the pre-parsed ChangeLog entries as
              generated by the command ::doctools::changelog::scan into a
              document in doctools format and returns it as the result of
              the command. The other three arguments supply the information
              for the header of that document which is not available from
              the changelog itself.

       ::doctools::changelog::merge entries...
              Each argument of the command is assumed to be a pre-parsed
              Changelog as generated by the command
              ::doctools::changelog::scan. This command merges all of them
              into a single structure, and collapses multiple entries for
              the same date and author into a single entry. The new
              structure is returned as the result of the command.

BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will undoubtedly contain
       bugs and other problems. Please report such in the category doctools
       of the Tcllib SF Trackers
       [http://sourceforge.net/tracker/?group_id=12883]. Please also report
       any ideas for enhancements you may have for either package and/or
       documentation.

KEYWORDS
       changelog, doctools, emacs

CATEGORY
       Documentation tools

COPYRIGHT
       Copyright (c) 2003-2008 Andreas Kupries <andreas_kupries@users.sourceforge.net>
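
EXAMPLE
       The three commands described under API can be chained: scan two
       ChangeLog texts, merge the results, walk the merged structure, and
       render it as doctools. The following is a minimal sketch, not taken
       from the package documentation. The two sample logs, the author and
       file names, and the title, module and version passed to toDoctools are
       invented for illustration. The sketch also assumes that Tcllib is
       installed and that the structure returned by merge has the same shape
       as the one returned by scan.

```tcl
package require doctools::changelog

# Hypothetical sample input in Emacs ChangeLog layout: a header line with
# date and author, followed by tab-indented "* file: comment" lines.
set logA "2008-06-24  Jane Doe  <jane@example.org>\n\n\t* scan.tcl: Fixed the page counter.\n"
set logB "2008-06-24  Jane Doe  <jane@example.org>\n\n\t* ocr.tcl: Added a language option.\n"

# Parse each text into the list structure described under API.
set entriesA [::doctools::changelog::scan $logA]
set entriesB [::doctools::changelog::scan $logB]

# merge accepts any number of pre-parsed changelogs and collapses
# entries that share the same date and author.
set merged [::doctools::changelog::merge $entriesA $entriesB]

# Walk the structure: each entry is {date author sections}, and each
# section is {{file ...} commenttext}. The "foreach ... break" idiom
# unpacks a list and works on Tcl 8.2, which has no lassign.
foreach entry $merged {
    foreach {date author sections} $entry break
    puts "$date  $author"
    foreach section $sections {
        foreach {files comment} $section break
        puts "    [join $files {, }]: $comment"
    }
}

# Title, module and version are placeholders; they fill the document
# header, which cannot be derived from the changelog itself.
puts [::doctools::changelog::toDoctools "Example changes" example 0.1 $merged]
```

       No output is shown here because the exact rendering depends on the
       installed Tcllib version.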