Speeding up a Shell Script (find, grep and a for loop)
Hi all,
I'm having some trouble with a shell script that I have put together to search our web pages for links to PDFs.
The first thing I did was:
Which generates a list of all of the PDFs on the server. For the sake of argument, say it looks like this:
file1.pdf
file2.pdf
file3.pdf
file4.pdf
I then put this info into an array in a shell script, and loop through the array, searching all .htm and .html files in the site
for the value:
This does work.
However, our site is huge (1491 PDFs, and a whole lot of .htm and .html pages). Each iteration through the loop
takes about 55 seconds. I've calculated that this shell script will take 6 days to complete.
Does anyone please know of a better (and significantly faster) way of doing this?
Any help would be greatly appreciated. I'm a bit of a Unix newbie, and it took me hours just to get this far.
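One way the per-PDF loop described above might be collapsed into a single pass, offered as a sketch rather than a tested fix for this site: keep the PDF names in a file, one per line, and let grep read them all at once as fixed strings, so each page is scanned once instead of once per PDF. The function name find_pdf_links and both of its arguments are made up for illustration, and the -o and -H options assume a grep that supports them (GNU and BSD grep do).

```shell
#!/bin/sh
# Sketch: report which PDF names occur in .htm/.html files under a site root.
# Assumptions (not from the original post): the PDF names are stored one per
# line in a file, and a plain substring match on the name counts as a link.

find_pdf_links() {
    site_root=$1   # hypothetical: directory holding the web pages
    pdf_list=$2    # hypothetical: file with one PDF name per line

    # find selects the pages and hands them to grep in large batches.
    # -F treats each line of pdf_list as a fixed string, -f reads all of
    # them at once, -o prints only the matched name, and -H prefixes each
    # hit with the file it was found in.
    find "$site_root" -type f \( -name '*.htm' -o -name '*.html' \) \
        -exec grep -F -o -H -f "$pdf_list" {} +
}
```

Called as find_pdf_links /var/www pdf_list.txt | sort -u (both paths are placeholders), this would print one page:pdfname line per distinct match. PDF names that never show up in that output are not linked from any page.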
PS2PDF(1) Ghostscript PS2PDF(1)
NAME
ps2pdf - Convert PostScript to PDF using ghostscript
ps2pdf12 - Convert PostScript to PDF 1.2 (Acrobat 3-and-later compatible) using ghostscript
ps2pdf13 - Convert PostScript to PDF 1.3 (Acrobat 4-and-later compatible) using ghostscript
SYNOPSIS
ps2pdf [options...] {input.[e]ps|-} [output.pdf|-]
ps2pdf12 [options...] {input.[e]ps|-} [output.pdf|-]
ps2pdf13 [options...] {input.[e]ps|-} [output.pdf|-]
DESCRIPTION
The ps2pdf scripts are work-alikes for nearly all the functionality (but not the user interface) of Adobe's Acrobat(TM) Distiller(TM)
product: they convert PostScript files to Portable Document Format (PDF) files.
If the output filename is not specified, the output is placed in a file of the same name with a '.pdf' extension. Either the input filename
or the output filename can be '-' to request reading from stdin or writing to stdout, respectively, when used as a filter.
The three scripts differ as follows:
- ps2pdf12 will always produce PDF 1.2 output (Acrobat 3-and-later compatible).
- ps2pdf13 will always produce PDF 1.3 output (Acrobat 4-and-later compatible).
- ps2pdf per se currently produces PDF 1.4 output. However, this may change in the future. If you care about the compatibility level
of the output, use ps2pdf12 or ps2pdf13, or use the -dCompatibilityLevel=1.x switch in the command line.
There are some limitations in ps2pdf's conversion. See the HTML documentation for more information. A large number of Adobe Distiller(TM)
parameters which can be used to control the conversion are also documented there, including instructions for generating PDF/X and PDF/A
documents.
EXAMPLES
Converting figure.ps to figure.pdf:
ps2pdf figure.ps
A conversion with more specifics:
ps2pdf -dPDFSETTINGS=/prepress figure.ps proof.pdf
Converting as part of a pipe:
make_report.pl -t ps | ps2pdf -dCompatibilityLevel=1.3 - - | lpr
SEE ALSO
gs(1), ps2pdfwr(1), Ps2pdf.htm in the Ghostscript documentation
BUGS
See http://bugs.ghostscript.com/ and the Usenet news group comp.lang.postscript.
VERSION
This document was last revised for Ghostscript version 8.63.
AUTHOR
Artifex Software, Inc. are the primary maintainers of Ghostscript. This manpage by George Ferguson.
8.63 1 August 2008 PS2PDF(1)