echo "Finding All PDFs..."
ls -R | grep .pdf > /tmp/pdfs/all_pdfs.out
echo "Done."
# Remove rubbish from list
echo "Removing Rubbish From List..."
sed 's|^\./[a-zA-Z0-9_ &./:]*$||g' /tmp/pdfs/all_pdfs.out > /tmp/pdfs/all_pdfs2.out
sed '/^$/d' /tmp/pdfs/all_pdfs2.out > /tmp/pdfs/all_pdfs.out
echo "Done."
You could trim this down to avoid using so many temporary files.
Code:
ls -R | sed -e '/\.pdf$/!d' -e 's|^\./[a-zA-Z0-9_ &./:]*$||g' -e '/^$/d' >/tmp/pdfs/all_pdfs.out
The first sed command is somewhat more specific than just grep .pdf -- instead of accepting any character (sic) followed by "pdf" anywhere in the file name, it looks specifically for a literal .pdf at the end of the line. Maybe that's not what you want; if so, take out the $ perhaps.
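To see the difference between the two patterns, here is a small sketch. The three file names are made up for illustration and do not come from the original directory tree.

```shell
#!/bin/sh
# Hypothetical sample names, one per line (not from the original post).
names='report.pdf
notes_pdf.txt
mypdfs.tar'

# Unescaped dot: matches ANY character followed by "pdf", anywhere in the line.
loose=$(printf '%s\n' "$names" | grep .pdf)

# Escaped dot plus the $ anchor: keeps only lines ending in a literal ".pdf".
strict=$(printf '%s\n' "$names" | sed -e '/\.pdf$/!d')

printf 'grep .pdf keeps:\n%s\n\n' "$loose"
printf 'sed /\\.pdf$/!d keeps:\n%s\n' "$strict"
```

With these sample names the grep version lets all three lines through, while the sed version keeps only report.pdf.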
Quote:
Originally Posted by Dave Stockdale
Code:
echo "Finding All PDFs..."
# List all PDFs Linked to
echo "Gathering List of PDF Links..."
find . -name "*.htm*" -exec grep -o "[a-zA-Z0-9_]\{1,\}\.pdf" {} \; > /tmp/pdfs/all_links.out
find . -name "*.php" -exec grep -o "[a-zA-Z0-9_]\{1,\}\.pdf" {} \; >> /tmp/pdfs/all_links.out
echo "Done."
Also, you could run a single find here; that should reduce running time significantly if the directory tree is big.
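A rough sketch of what the single find could look like. The function name list_pdf_links is made up for this example, and I'm assuming your grep supports -h alongside the -o you already use (GNU and BSD grep both do, but neither option is in POSIX).

```shell
#!/bin/sh
# Hypothetical helper: collect every PDF name referenced in the .htm* and .php
# files under directory $1 and write the list to file $2.
list_pdf_links() {
    # One walk of the tree covers both patterns, grouped with \( ... -o ... \).
    # "{} +" hands many files to each grep call instead of one grep per file.
    # -h suppresses the file-name prefix grep adds when given several files.
    find "$1" -type f \( -name "*.htm*" -o -name "*.php" \) \
        -exec grep -oh "[a-zA-Z0-9_]\{1,\}\.pdf" {} + > "$2"
}
```

You would call it as list_pdf_links . /tmp/pdfs/all_links.out in place of the two find lines. The tree is then read once rather than twice, and far fewer grep processes get started.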