PDF Script to extract PDF Links MOD in Need
Post 302902318 by clx on Tuesday 20th of May 2014 02:52:17 AM
It depends on how lynx accepts pages. I think it should accept multiple pages as a list, but it is better to check its manual.

So, your page is the "$WEBSITE" variable inside the script.
For multiple pages, you could call it like this:

Code:
pdflinkextractor "www.website.com" "www.anotherpage.com"

Inside the script,

Code:
WEBSITE="$@"

In case it doesn't accept multiple pages, loop over them one at a time:

Code:
# Loop over each page given on the command line
for PAGE in "$@"
do
 lynx -cache=0 -dump -listonly "$PAGE" | grep '\.pdf$' | awk '{print $2}' | tee -a pdflinks.txt
done
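
The grep and awk stages in the loop can also be merged into a single awk call. This is only a minimal sketch: the function name pdf_links_from_dump is hypothetical, and it assumes lynx prints each link as a numbered line such as "   1. http://host/file.pdf", which is what the awk '{print $2}' step already relies on.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script).
# Reads "lynx -dump -listonly" output on stdin and prints only the
# URLs (second field) that end in .pdf.
# Assumption: each link line has the form "   1. http://host/file.pdf".
pdf_links_from_dump() {
    awk '$2 ~ /\.pdf$/ { print $2 }'
}
```

Inside the loop it would be used as: lynx -cache=0 -dump -listonly "$PAGE" | pdf_links_from_dump | tee -a pdflinks.txt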


If your page list is long, I would put the pages in a file:

Code:
$ cat mypages.txt
www.website.com
www.anotherpage.com
www.anotherpage2.com
www.anotherpage3.com

And use it like this:

Code:
pdflinkextractor mypages.txt

Inside the script:

Code:
PAGEFILE=$1
# Read one page per line from the file
while IFS= read -r PAGE
do
 lynx -cache=0 -dump -listonly "$PAGE" | grep '\.pdf$' | awk '{print $2}' | tee -a pdflinks.txt
done < "$PAGEFILE"
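
A page file that is edited by hand may pick up empty lines, notes, or a last line without a trailing newline. The sketch below filters the file before the pages reach lynx. The function name list_pages and the convention that lines starting with '#' are comments are assumptions for illustration, not something the original script defines.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script).
# Prints the usable entries of a page list file, one per line.
# Assumptions: empty lines are ignored, and lines starting with '#'
# are treated as comments.
list_pages() {
    # "|| [ -n "$PAGE" ]" keeps a final line that lacks a trailing newline
    while IFS= read -r PAGE || [ -n "$PAGE" ]
    do
        case "$PAGE" in
            ''|'#'*) continue ;;
        esac
        printf '%s\n' "$PAGE"
    done < "$1"
}
```

The reading loop could then take its input from list_pages "$PAGEFILE" through a pipe instead of reading the file directly.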


Last edited by clx; 05-20-2014 at 03:14 PM.. Reason: typos
 
