Post 302508355 by lewk on Monday 28th of March 2011 01:50:41 AM
Searching for a string in .PDF files inside .RAR & .ZIP archives.

Hi,

I have a large number of .PDF files archived in .RAR and .ZIP files in various directories, and I would like to search for strings inside the PDF files.

I think this needs something that can recursively read directories, extract each .RAR/.ZIP file in memory, read each PDF in memory, search it for the given string, and display the result along with the .RAR/.ZIP filename and the PDF it was found in. Everything extracted should then be discarded (e.g. to /dev/null) so that nothing is left on the hard drive when the script is done, before moving on to the next .RAR/.ZIP file, and so on until done.

Are there any shell scripting wizards who could assist me with this?

Thanks
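
The workflow described above could be sketched roughly as follows. This is only a minimal sketch under some assumptions that are not in the post: it assumes unzip, unrar and pdftotext (from poppler-utils) are installed, the function names are made up for illustration, and instead of extracting truly in memory it unpacks each archive into a throwaway mktemp directory that is deleted before the next archive is processed. It also reads file names line by line, so names containing newlines are not handled, and scanned PDFs without a text layer will never match.

```bash
#!/usr/bin/env bash
# Sketch: report which PDFs inside .zip/.rar archives contain a string.
# Assumes unzip, unrar and pdftotext (poppler-utils) are on PATH.

# Unpack one archive into a directory; fails for unknown extensions.
extract_archive() {
    local archive="$1" dest="$2"
    case "$archive" in
        *.[zZ][iI][pP]) unzip -qq -o "$archive" -d "$dest" ;;
        *.[rR][aA][rR]) unrar x -inul -o+ "$archive" "$dest/" ;;
        *) return 1 ;;
    esac
}

# Print "label: relative/path.pdf" for every PDF under dir whose
# extracted text matches pattern (case-insensitive).
search_dir() {
    local pattern="$1" dir="$2" label="$3" pdf
    find "$dir" -type f -iname '*.pdf' | while IFS= read -r pdf; do
        if pdftotext "$pdf" - 2>/dev/null </dev/null | grep -qi -e "$pattern"; then
            printf '%s: %s\n' "$label" "${pdf#"$dir"/}"
        fi
    done
}

# Walk root, unpack each archive into a throwaway directory, search it,
# then delete the directory before moving on to the next archive.
search_archives() {
    local pattern="$1" root="$2" archive workdir
    find "$root" -type f \( -iname '*.zip' -o -iname '*.rar' \) |
    while IFS= read -r archive; do
        workdir=$(mktemp -d) || exit 1
        extract_archive "$archive" "$workdir" >/dev/null 2>&1 </dev/null
        search_dir "$pattern" "$workdir" "$archive"
        rm -rf "$workdir"
    done
}

# Run only when a pattern and a directory are given.
if [ "$#" -ge 2 ]; then
    search_archives "$1" "$2"
fi
```

Saved under a hypothetical name such as pdf-in-archives.sh, it would be called as `./pdf-in-archives.sh "search string" /path/to/archives` and prints one "archive: pdf" line per hit. On Linux systems where /dev/shm is a RAM-backed tmpfs, running it with `TMPDIR=/dev/shm` should keep the temporary extraction off the hard drive, which is about as close to "in memory" as this approach gets.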
 

PDF::API2::Basic::PDF::Filter(3pm)    User Contributed Perl Documentation    PDF::API2::Basic::PDF::Filter(3pm)

NAME
       PDF::API2::Basic::PDF::Filter - Abstract superclass for PDF stream filters

SYNOPSIS
           $f = PDF::API2::Basic::PDF::Filter->new;
           $str = $f->outfilt($str, 1);
           print OUTFILE $str;

           while (read(INFILE, $dat, 4096))
           { $store .= $f->infilt($dat, 0); }
           $store .= $f->infilt("", 1);

DESCRIPTION
       A Filter object contains state information for the process of outputting and inputting data through the filter. The precise state information stored is up to the particular filter and may range from nothing to whole objects created and destroyed.

       Each filter stores different state information for input and output and thus may handle one input filtering process and one output filtering process at the same time.

METHODS
   PDF::API2::Basic::PDF::Filter->new
       Creates a new filter object with empty state information ready for processing data both input and output.

   $dat = $f->infilt($str, $isend)
       Filters from output to input the data. Notice that $isend == 0 implies that there is more data to come and so following it $f may contain state information (usually due to the break-off point of $str not being tidy). Subsequent calls will incorporate this stored state information.

       $isend == 1 implies that there is no more data to follow. The final state of $f will be that the state information is empty. Error messages are most likely to occur here since if there is required state information to be stored following this data, then that would imply an error in the data.

   $str = $f->outfilt($dat, $isend)
       Filter stored data ready for output. Parallels "infilt".

NAME
       PDF::API2::Basic::PDF::ASCII85Decode - Ascii85 filter for PDF streams. Inherits from PDF::API2::Basic::PDF::Filter

NAME
       PDF::API2::Basic::PDF::RunLengthDecode - Run Length encoding filter for PDF streams. Inherits from PDF::API2::Basic::PDF::Filter

NAME
       PDF::API2::Basic::PDF::ASCIIHexDecode - Ascii Hex encoding (very inefficient) for PDF streams. Inherits from PDF::API2::Basic::PDF::Filter

perl v5.14.2                          2011-03-10                          PDF::API2::Basic::PDF::Filter(3pm)