03-28-2011
Searching for a string in .PDF files inside .RAR & .ZIP archives.
Hi,
I have a large number of PDF files archived in .RAR and .ZIP files in various directories, and I would like to search for strings inside the PDF files.
I think this needs something that recursively walks the directories, extracts each .RAR/.ZIP archive in memory, reads each PDF in memory, searches it for the given string, and reports the archive filename and PDF filename in which the string was found. Nothing should be left extracted on the hard drive once the script is done: it should discard each archive's contents, then move on to the next .RAR/.ZIP file until all are processed.
Are there any shell scripting wizards who could assist me with this?
Thanks
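One way to approach the steps described above is to stream each archive member to standard output and convert it to text on the fly, so that nothing is written to disk. The following is only a minimal sketch under several assumptions: Info-ZIP unzip, unrar, and pdftotext (from poppler-utils) are installed, and the installed pdftotext accepts "-" as the input file to read the PDF from standard input. The function name search_archives is made up for illustration. PDF text is usually stored in compressed streams, which is why the sketch runs pdftotext instead of grepping the raw bytes.

```shell
#!/bin/sh
# Sketch: search PDFs inside .zip/.rar archives without extracting to disk.
# Assumes unzip (Info-ZIP), unrar, and pdftotext (poppler-utils) are installed.

# search_archives is a hypothetical helper name.
# $1 = fixed string to search for (case-insensitive)
# $2 = directory to search recursively (default: current directory)
# Prints "archive: member" for every PDF member whose text contains the string.
search_archives() {
    needle=$1
    root=${2:-.}
    find "$root" -type f \( -name '*.[Zz][Ii][Pp]' -o -name '*.[Rr][Aa][Rr]' \) |
    while IFS= read -r archive; do
        # Pick the listing and streaming commands for this archive type.
        case $archive in
            *.[Zz][Ii][Pp]) list='unzip -Z1'; stream='unzip -p' ;;
            *)              list='unrar lb';  stream='unrar p -inul' ;;
        esac
        # List member names, keep only PDFs, and test each one in turn.
        $list "$archive" </dev/null 2>/dev/null | grep -i '\.pdf$' |
        while IFS= read -r member; do
            # Member -> stdout -> pdftotext (stdin to stdout) -> grep.
            if $stream "$archive" "$member" </dev/null 2>/dev/null |
                pdftotext - - 2>/dev/null |
                grep -qiF -e "$needle"; then
                printf '%s: %s\n' "$archive" "$member"
            fi
        done
    done
}
```

It could be called as: search_archives 'some phrase' /path/to/archives. Member names containing characters such as [ or * may need escaping, because unzip and unrar treat member arguments as wildcard patterns. Scanned PDFs that contain only images will not match, since there is no text for pdftotext to extract.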
zipgrep
ZIPGREP(1) General Commands Manual ZIPGREP(1)
NAME
zipgrep - search files in a ZIP archive for lines matching a pattern
SYNOPSIS
zipgrep [egrep_options] pattern file[.zip] [file(s) ...] [-x xfile(s) ...]
DESCRIPTION
zipgrep will search files within a ZIP archive for lines matching the given string or pattern. zipgrep is a shell script and requires
egrep(1) and unzip(1) to function. Its output is identical to that of egrep(1).
ARGUMENTS
pattern
The pattern to be located within a ZIP archive. Any string or regular expression accepted by egrep(1) may be used.
file[.zip]
Path of the ZIP archive. (Wildcard expressions for the ZIP archive name are not supported.) If the literal filename is not found, the
suffix .zip is appended. Note that self-extracting ZIP files are supported, as with any other ZIP archive; just specify the .exe
suffix (if any) explicitly.
[file(s)]
An optional list of archive members to be processed, separated by spaces. If no member files are specified, all members of the ZIP
archive are searched. Regular expressions (wildcards) may be used to match multiple members:
* matches a sequence of 0 or more characters
? matches exactly 1 character
[...] matches any single character found inside the brackets; ranges are specified by a beginning character, a hyphen, and an
ending character. If an exclamation point or a caret (`!' or `^') follows the left bracket, then the range of characters within
the brackets is complemented (that is, anything except the characters inside the brackets is considered a match).
(Be sure to quote any character that might otherwise be interpreted or modified by the operating system.)
[-x xfile(s)]
An optional list of archive members to be excluded from processing. Since wildcard characters match directory separators (`/'),
this option may be used to exclude any files that are in subdirectories. For example, ``zipgrep grumpy foo *.[ch] -x */*'' would
search for the string ``grumpy'' in all C source files in the main directory of the ``foo'' archive, but none in any subdirectories.
Without the -x option, all C source files in all directories within the zipfile would be searched.
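The member-selection rules above can be tried out without any archive at hand. The sketch below only mimics them with the shell's own case patterns, which use the same *, ? and [...] syntax and also let * match `/'; it is not how unzip implements matching, and the helper name member_selected and the sample member names are made up for illustration.

```shell
#!/bin/sh
# member_selected is a hypothetical helper that imitates the include/exclude
# rules described above using shell case patterns (not unzip's own matcher).
# $1 = include pattern, $2 = member name, $3 = optional exclude pattern
member_selected() {
    # $1 is left unquoted on purpose so it is treated as a pattern.
    case $2 in
        $1) ;;
        *)  return 1 ;;
    esac
    # An exclude pattern, if given, overrides the include match.
    if [ -n "${3:-}" ]; then
        case $2 in
            $3) return 1 ;;
        esac
    fi
    return 0
}

# Same selection as the ``zipgrep grumpy foo *.[ch] -x */*'' example.
for name in main.c util.h src/parse.c README; do
    if member_selected '*.[ch]' "$name" '*/*'; then
        echo "searched: $name"
    else
        echo "skipped:  $name"
    fi
done
```

With the exclude pattern, only main.c and util.h are reported as searched; src/parse.c matches *.[ch] because * also matches the `/', but it is then excluded by */*. Dropping the third argument selects src/parse.c as well, which corresponds to the behaviour without -x.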
OPTIONS
All options prior to the ZIP archive filename are passed to egrep(1).
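Since wildcard expressions are not supported for the archive name, searching many archives means calling zipgrep once per archive. A minimal sketch of such a loop follows; the wrapper name zipgrep_tree and its argument order are assumptions made for illustration, and everything after the directory argument is handed to zipgrep unchanged, so egrep options placed before the pattern are passed through as described above.

```shell
#!/bin/sh
# zipgrep_tree is a hypothetical wrapper around zipgrep.
# $1 = directory to search recursively
# remaining arguments = [egrep_options] pattern, passed to zipgrep as given
zipgrep_tree() {
    root=$1
    shift
    find "$root" -type f -name '*.[Zz][Ii][Pp]' |
    while IFS= read -r archive; do
        # Run zipgrep on one archive and prefix each hit with the archive path.
        zipgrep "$@" "$archive" </dev/null 2>/dev/null |
        while IFS= read -r hit; do
            printf '%s: %s\n' "$archive" "$hit"
        done
    done
}
```

For example, zipgrep_tree /path/to/archives -i grumpy would run a case-insensitive search in every ZIP archive below that directory. Note that this searches the raw contents of each member line by line, so it suits text members; compressed text inside PDF members will generally not match.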
SEE ALSO
egrep(1), unzip(1), zip(1), funzip(1), zipcloak(1), zipinfo(1), zipnote(1), zipsplit(1)
URL
The Info-ZIP home page is currently at
http://www.info-zip.org/pub/infozip/
or
ftp://ftp.info-zip.org/pub/infozip/ .
AUTHORS
zipgrep was written by Jean-loup Gailly.
Info-ZIP 20 April 2009 ZIPGREP(1)