Perl to extract from a pdf


 
# 3  
04-30-2017
The Perl script below is very close, but I am struggling with extracting the last 4 fields and running a calculation in the last field. I hope I have included enough detail below, and thank you :).

The Reads1, Reads2, Reads3, and NoBarcode fields are extracted from this portion of run.txt:

Code:
Reads1 is IonXpress 004 with the Reads value extracted
Reads2 is IonXpress 005 with the Reads value extracted
Reads3 is IonXpress 006 with the Reads value extracted
NoBarcode is No Barcode with the Reads value extracted

The above Reads1, Reads2, and Reads3 can all have different IonXpress digits, but the format will always be the same. In this example the digits were 004, 005, 006, but the next time they might be 001, 002, 003. The same format below will still apply:
f[0] will always be the barcode and f[4] will always be the reads.
Code:
Barcode Name       Sample             Bases           ≥ Q20           Reads           Mean Read Length

No barcode         none               151,751,086     122,614,710     844,020         180 bp

IonXpress 004      00-0000 Last-      8,373,945,632 7,188,703,690 38,774,136          216 bp

                       First

IonXpress 005      00-0001 LastN-   5,226,515,080 4,502,314,522 24,025,446          218 bp

                       FirstN

IonXpress 006      00-0002 La-    6,651,737,354 5,681,526,265 30,850,757          216 bp

                       Fi
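
A minimal sketch of pulling the Reads value out of rows like the ones above, assuming the commas are stripped first and the line is split on whitespace. Under that split the barcode name itself spans two tokens ("IonXpress" and "004"), so the Reads value is not at index 4; it is always the third token from the end, just before "216 bp". The helper name parse_barcode_row is hypothetical, and the sample lines are copied from the table above.

```perl
use strict;
use warnings;

# Hypothetical helper: return [barcode label, reads] for a barcode table row,
# or nothing for header, blank, and wrapped sample-name lines.
sub parse_barcode_row {
    my ($line) = @_;
    (my $clean = $line) =~ s/[%,]//g;           # drop thousands separators and percent signs
    my @f = split ' ', $clean;                  # whitespace split, ignoring leading blanks
    return unless @f >= 5 && $f[-1] eq 'bp';    # data rows end in "<length> bp"
    return unless $clean =~ /^\s*(IonXpress\s+\d+|No\s+barcode)\b/i;
    (my $label = $1) =~ s/\s+/ /g;              # normalise spacing inside the label
    return [$label, $f[-3]];                    # Reads is third from the end
}

my $sample = <<'END';
No barcode         none               151,751,086     122,614,710     844,020         180 bp
IonXpress 004      00-0000 Last-      8,373,945,632 7,188,703,690 38,774,136          216 bp
                       First
IonXpress 005      00-0001 LastN-   5,226,515,080 4,502,314,522 24,025,446          218 bp
                       FirstN
IonXpress 006      00-0002 La-    6,651,737,354 5,681,526,265 30,850,757          216 bp
                       Fi
END

for my $line (split /\n/, $sample) {
    my $row = parse_barcode_row($line) or next; # skip continuation lines
    print join("\t", @$row), "\n";
}
```

Because the match is on the IonXpress prefix followed by any digits, the same sketch should cope with 001, 002, 003 or any other barcode numbers, as long as the row layout stays the same.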

Calculation in last field:
The very last field is a calculation: the f[4] Reads value from the f[0] No barcode row, divided by the Total Reads, times 100.
In the original code in post 1, /Key Signal/ and ($ks,$tr,$rl)=@p[3..5] extracted the Total Reads (into $tr),
but I am not sure how to perform the calculation (844020 / 94495222) * 100.
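
A minimal sketch of the arithmetic itself, assuming both numbers have already been captured as strings and any commas removed. The helper name percent_of_total is hypothetical; the two values are the ones quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: reads as a percentage of the total, rounded to two decimals.
sub percent_of_total {
    my ($reads, $total) = @_;
    return 'NA' unless defined $total && $total > 0;   # guard against division by zero
    return sprintf '%.2f', ($reads / $total) * 100;
}

# Commas must be stripped before the strings are used as numbers.
(my $no_barcode = '844,020') =~ tr/,//d;
my $total_reads = 94495222;

print percent_of_total($no_barcode, $total_reads), "\n";   # prints 0.89
```

In a one-liner, the same idea would mean saving the No barcode reads in its own variable when that row is seen and applying sprintf in the final print. Whether $tr is already set at that point depends on the order of the sections in run.txt, which is not shown here.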

Code:
perl -ne 'BEGIN{print join("\t","ReadLength", "UsableSequence", "Polyclonal", "LowQuality", "UnalignedBases", "Barcode1", "Barcode2", "Barcode3", "NoBarcode", "Exception"),"\n"};s/[\%\,]//g;@f=split/\s+/;/Key Signal/ and ($ks,$tr,$rl)=@p[3..5];/Usable/ and ($il,$us)=@p[1,2];/Polyclonal/ and $pc=$f[-1];/Low Quality/ and $lq=$f[-1];/Unaligned Reads/ and $ur=$f[-1] and print join("\t",$rl,$us,$pc,$lq,$ur,$f[-1]," "),"\n";@p=@f' run.txt

current output
Code:
ReadLength UsableSequence Polyclonal LowQuality UnalignedBases Barcode1 Barcode2 Barcode3 NoBarcode Exception
216       68              27.5         04.7    0.6           0.6

desired output, tab-delimited
Code:
ReadLength UsableSequence Polyclonal LowQuality UnalignedBases Barcode1 Barcode2 Barcode3 NoBarcode Exception
216       68              27.5         4.7    0.6     38774136  24025446 30850757  0.89   
                                                                                   (Reads / TotalReads) * 100


Last edited by cmccabe; 04-30-2017 at 10:03 AM.. Reason: fixed format
 