Shell Script to Dynamically Extract file content based on Parameters from a pdf file
Post 302804017 by Chubler_XL on Wednesday 8th of May 2013 12:38:41 AM
If you have Python on your system, you could try the PyPDF2 library.

The assumption is that the last page of each invoice contains some text you can match on, such as "Total Due:".

Code:
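#!/usr/bin/env python
# Split a PDF of invoices into Cust_1.pdf, Cust_2.pdf, ... (one file per invoice).
# Uses the PyPDF2 1.x class names; PyPDF2 3.0 and later replaced them
# with PdfReader/PdfWriter.
from PyPDF2 import PdfFileReader, PdfFileWriter
import sys

def write_pdf(writer, filenum):
    # open() instead of file(): file() only exists in Python 2
    with open("Cust_" + str(filenum) + ".pdf", "wb") as output_stream:
        writer.write(output_stream)

filenum = 1
page_added = False
output_pdf = PdfFileWriter()

# The input must stay open until the last output file has been written
with open(sys.argv[1], "rb") as input_stream:
    input_pdf = PdfFileReader(input_stream)

    for i in range(input_pdf.getNumPages()):
        page = input_pdf.getPage(i)
        output_pdf.addPage(page)
        page_added = True

        # "Total Due:" marks the last page of an invoice: write it out
        if "Total Due:" in page.extractText():
            write_pdf(output_pdf, filenum)
            filenum += 1
            output_pdf = PdfFileWriter()
            page_added = False

    # Write any trailing pages that came after the last "Total Due:" page
    if page_added:
        write_pdf(output_pdf, filenum)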

Note: indentation is part of Python syntax, so keep the indent levels exactly as shown. Save the script as split_invoice.py, make it executable (chmod +x split_invoice.py), and call it like this:

Code:
$ ./split_invoice.py Invoice_file.pdf
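
The page-grouping rule the script relies on can be tried without a PDF at hand. The sketch below is only an illustration of that rule: the function name split_after_marker and the sample page texts are made up for this example and are not part of PyPDF2. It takes the text of each page as a plain string and returns the page numbers (counted from 0) that would end up in each output file.

```python
def split_after_marker(page_texts, marker="Total Due:"):
    """Group page indices so that each group ends on a page containing marker.

    page_texts is a list of strings, one per page (hypothetical input that
    stands in for the text extracted from each PDF page).
    """
    groups = []
    current = []
    for index, text in enumerate(page_texts):
        current.append(index)
        if marker in text:
            # Marker found: this page closes the current invoice
            groups.append(current)
            current = []
    if current:
        # Trailing pages with no marker still get their own group
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Made-up page texts, for illustration only
    sample_pages = [
        "Invoice 1001, page 1",
        "Invoice 1001, page 2. Total Due: 50.00",
        "Invoice 1002, page 1. Total Due: 12.50",
        "Invoice 1003, page 1",
    ]
    for number, pages in enumerate(split_after_marker(sample_pages), start=1):
        print("Cust_%d.pdf <- pages %s" % (number, pages))
```

With the sample data, the first two pages go to Cust_1.pdf, the third to Cust_2.pdf, and the last page, which has no "Total Due:" text, still ends up in Cust_3.pdf. If your invoices use a different closing text, only the marker argument needs to change.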

 
