Perl script to extract text from image file
Post 302802885 by elixir_sinari on Sunday 5th of May 2013 10:31:45 AM
Search on CPAN for OCR modules/distributions.

Also, search for Tesseract.
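
As a starting point, here is a minimal sketch of calling the Tesseract command-line tool from a shell script; a Perl script could run the same command through backticks or system(). It assumes tesseract is installed and on PATH, that your version accepts "stdout" as the output base, and that English language data is available. The helper name ocr_image and the default language are made up for illustration and are not part of Tesseract.

```shell
#!/bin/sh
# Sketch only: OCR one image with the tesseract CLI and print the text.
# ocr_image is a hypothetical helper name, not something Tesseract provides.
ocr_image() {
    img=$1
    lang=${2:-eng}    # assumed default; pass another language code as $2

    # Fail early if the image cannot be read.
    if [ ! -r "$img" ]; then
        echo "ocr_image: cannot read '$img'" >&2
        return 1
    fi

    # Fail with a distinct status if tesseract is not installed.
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "ocr_image: tesseract not found on PATH" >&2
        return 2
    fi

    # Using "stdout" as the output base prints the recognised text
    # instead of writing <outputbase>.txt.
    tesseract "$img" stdout -l "$lang"
}

# Usage (assumes scan.png exists):
#   ocr_image scan.png > scan.txt
```

Recognition quality depends heavily on the image, so expect to clean up or rescale scans before feeding them in.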
 

EXTRACT(1)                     General Commands Manual                     EXTRACT(1)

NAME
       extract - determine meta-information about a file

SYNOPSIS
       extract [ -bghLnvV ] [ -H hash-algorithm ] [ -i ] [ -l library ] [ -p type ] [ -x type ] file ...

DESCRIPTION
       This manual page documents version 0.6.0 of the extract command.

       extract tests each file specified in the argument list in an attempt to infer meta-information from it. Each file is subjected to the meta-data extraction libraries from libextractor.

       libextractor classifies meta-information (also referred to as keywords) into types. A list of all types can be obtained with the -L option.

OPTIONS
       -b     Display the output in BiBTeX format.

       -g     Use grep-friendly output (all keywords on a single line for each file). Use the verbose option to print the filename first, followed by the keywords. Use the verbose option twice to also display the keyword types. This option will not print keyword types or non-textual metadata.

       -h     Print a brief summary of the options.

       -i     Run plugins in-process (for debugging). By default, each plugin is run in its own process.

       -l libraries
              Use the specified libraries to extract keywords. The general format of libraries is [[-]LIBRARYNAME[:[-]LIBRARYNAME]*] where LIBRARYNAME is a libextractor compatible library and typically of the form jpeg. The minus before the libraryname indicates that this library should be removed from the existing list. To run only a few selected plugins, use -l in combination with -n.

       -L     Print a list of all known keyword types.

       -n     Do not use the default set of extractors (typically all standard extractors, currently mp3, ogg, jpg, gif, png, tiff, real, html, pdf and mime-types), use only the extractors specified with the -l option.

       -p type
              Print only the keywords matching the specified type. By default, all keywords that are found and not removed as duplicates are printed.

       -v     Print the version number and exit.

       -V     Be verbose. This option can be specified multiple times to increase verbosity further.

       -x type
              Exclude keywords of the specified type from the output. By default, all keywords that are found and not removed as duplicates are printed.

SEE ALSO
       libextractor(3) - description of the libextractor library

EXAMPLES
       $ extract test/test.jpg
       comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
       mimetype - image/jpeg

       $ extract -V -x comment test/test.jpg
       Keywords for file test/test.jpg:
       mimetype - image/jpeg

       $ extract -p comment test/test.jpg
       comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1

       $ extract -nV -l png.so -p comment test/test.jpg test/test.png
       Keywords for file test/test.jpg:
       Keywords for file test/test.png:
       comment - Testing keyword extraction

LEGAL NOTICE
       libextractor and the extract tool are released under the GPL. libextractor is a GNU package.

BUGS
       A couple of file-formats (on the order of 10^3) are not recognized...

AUTHORS
       extract was originally written by Christian Grothoff <christian@grothoff.org> and Vidyut Samanta <vids@cs.ucla.edu>. Use <libextractor@gnu.org> to contact the current maintainer(s).

AVAILABILITY
       You can obtain the original author's latest version from http://www.gnu.org/software/libextractor/

libextractor 0.6.0                        Dec 20, 2009                        EXTRACT(1)
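
The verbose output in the examples of this man page (a "Keywords for file NAME:" header followed by "type - value" lines) is easy to post-process when several files are scanned at once. The following is a rough sketch that turns such output into tab-separated file, type and value columns. It assumes the output looks exactly like those examples, with no leading whitespace; the function name extract_to_tsv is made up for illustration and is not part of libextractor.

```shell
#!/bin/sh
# Sketch only: convert `extract -V` style output (read from stdin)
# into "file<TAB>type<TAB>value" lines.
# extract_to_tsv is a hypothetical helper, not part of libextractor.
extract_to_tsv() {
    awk '
        # Header line: remember which file the following keywords belong to.
        /^Keywords for file .*:$/ {
            # "Keywords for file " is 18 characters; drop it and the final colon.
            file = substr($0, 19, length($0) - 19)
            next
        }
        # Keyword line: split at the first " - " into type and value.
        {
            sep = index($0, " - ")
            if (sep > 0)
                printf "%s\t%s\t%s\n", file, substr($0, 1, sep - 1), substr($0, sep + 3)
        }'
}

# Usage (assumes the extract tool is installed):
#   extract -V test/test.jpg test/test.png | extract_to_tsv
```

Splitting at the first " - " keeps values that themselves contain a dash intact; lines without that separator are skipped.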