05-05-2013
Search on CPAN for OCR modules/distributions.
Also, search for Tesseract.
EXTRACT(1) General Commands Manual EXTRACT(1)
NAME
extract - determine meta-information about a file
SYNOPSIS
extract [ -bghLnvV ] [ -H hash-algorithm ] [ -i ] [ -l library ] [ -p type ] [ -x type ] file ...
DESCRIPTION
This manual page documents version 0.6.0 of the extract command.
extract tests each file specified in the argument list in an attempt to infer meta-information from it. Each file is subjected to the
meta-data extraction libraries from libextractor.
libextractor classifies meta-information (also referred to as keywords) into types. A list of all types can be obtained with the -L option.
OPTIONS
-b Display the output in BiBTeX format.
-g Use grep-friendly output (all keywords on a single line for each file). Use the verbose option to print the filename first,
followed by the keywords. Use the verbose option twice to also display the keyword types. This option will not print keyword types
or non-textual metadata.
-h Print a brief summary of the options.
-i Run plugins in-process (for debugging). By default, each plugin is run in its own process.
-l libraries
Use the specified libraries to extract keywords. The general format of libraries is [[-]LIBRARYNAME[:[-]LIBRARYNAME]*] where
LIBRARYNAME is a libextractor compatible library and typically of the form jpeg. The minus before the library name indicates that
this library should be removed from the existing list. To run only a few selected plugins, use -l in combination with -n.
-L Print a list of all known keyword types.
-n Do not use the default set of extractors (typically all standard extractors, currently mp3, ogg, jpg, gif, png, tiff, real, html,
pdf and mime-types); use only the extractors specified with the -l option.
-p type
Print only the keywords matching the specified type. By default, all keywords that are found and not removed as duplicates are
printed.
-v Print the version number and exit.
-V Be verbose. This option can be specified multiple times to increase verbosity further.
-x type
Exclude keywords of the specified type from the output. By default, all keywords that are found and not removed as duplicates are
printed.
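The options above can be combined programmatically. The following is a minimal Python sketch, not part of libextractor, that assembles an argument list for extract from the flags documented here. The function name build_extract_command and its parameters are hypothetical, and it assumes at most one -p and one -x type per call.

```python
from typing import List, Optional, Sequence


def build_extract_command(
    files: Sequence[str],
    add_libraries: Sequence[str] = (),
    remove_libraries: Sequence[str] = (),
    use_defaults: bool = True,
    print_type: Optional[str] = None,
    exclude_type: Optional[str] = None,
    verbosity: int = 0,
) -> List[str]:
    """Assemble an argv list for extract(1) from the documented options.

    Hypothetical helper: it only builds the list and never runs the command.
    """
    if not files:
        raise ValueError("at least one file is required")

    cmd = ["extract"]
    if not use_defaults:
        cmd.append("-n")  # skip the default set of extractors
    cmd.extend(["-V"] * verbosity)  # -V may be repeated for more verbosity

    # Library spec format from the man page: [[-]LIBRARYNAME[:[-]LIBRARYNAME]*]
    # A leading minus marks a library to remove from the existing list.
    spec_parts = list(add_libraries) + ["-" + name for name in remove_libraries]
    if spec_parts:
        cmd.extend(["-l", ":".join(spec_parts)])

    if print_type is not None:
        cmd.extend(["-p", print_type])  # print only keywords of this type
    if exclude_type is not None:
        cmd.extend(["-x", exclude_type])  # exclude keywords of this type

    cmd.extend(files)
    return cmd


if __name__ == "__main__":
    print(build_extract_command(
        ["test/test.jpg", "test/test.png"],
        add_libraries=["png.so"],
        use_defaults=False,
        print_type="comment",
        verbosity=1,
    ))
```

The returned list can be passed to subprocess.run on a system where extract is installed. Building a list rather than a shell string avoids quoting problems with file names that contain spaces.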
SEE ALSO
libextractor(3) - description of the libextractor library
EXAMPLES
$ extract test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
mimetype - image/jpeg
$ extract -V -x comment test/test.jpg
Keywords for file test/test.jpg:
mimetype - image/jpeg
$ extract -p comment test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
$ extract -nV -l png.so -p comment test/test.jpg test/test.png
Keywords for file test/test.jpg:
Keywords for file test/test.png:
comment - Testing keyword extraction
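Output in the shape shown above can be post-processed with a few lines of code. The following Python sketch is an illustration only: it assumes the "type - value" and "Keywords for file NAME:" line formats seen in these version 0.6.0 examples, which may differ in other versions or with -b or -g. The function name parse_extract_output is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

FILE_HEADER = "Keywords for file "


def parse_extract_output(text: str) -> Dict[Optional[str], List[Tuple[str, str]]]:
    """Group "type - value" lines by the file header that precedes them.

    Lines that appear before any header (non-verbose output) are stored
    under the key None. Assumes keyword type names do not contain " - ".
    """
    results: Dict[Optional[str], List[Tuple[str, str]]] = {}
    current: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(FILE_HEADER) and line.endswith(":"):
            # Verbose mode prints a header line naming the file.
            current = line[len(FILE_HEADER):-1]
            results.setdefault(current, [])
            continue
        # Split on the first " - " only, so values may contain dashes.
        keyword_type, separator, value = line.partition(" - ")
        if separator:
            results.setdefault(current, []).append((keyword_type, value))
    return results


if __name__ == "__main__":
    sample = (
        "Keywords for file test/test.jpg:\n"
        "Keywords for file test/test.png:\n"
        "comment - Testing keyword extraction\n"
    )
    print(parse_extract_output(sample))
```

Lines that match neither pattern are ignored, so unexpected diagnostics do not raise errors.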
LEGAL NOTICE
libextractor and the extract tool are released under the GPL. libextractor is a GNU package.
BUGS
A couple of file-formats (on the order of 10^3) are not recognized...
AUTHORS
extract was originally written by Christian Grothoff <christian@grothoff.org> and Vidyut Samanta <vids@cs.ucla.edu>. Use
<libextractor@gnu.org> to contact the current maintainer(s).
AVAILABILITY
You can obtain the original author's latest version from http://www.gnu.org/software/libextractor/
libextractor 0.6.0 Dec 20, 2009 EXTRACT(1)