Pdf to xls or csv Linux

# 1  
Old 12-18-2013
I want to convert a PDF to XLS or CSV. I used pdftotext to convert the PDF to text. The complication is that the original data contains spaces and blanks: there is a space between Gala and Apple in the Product field, and the Quantity field is sometimes empty (I put double quotes around it below, but in the real data it is just blank, with no characters).
Date Product Quantity Price
1/1/2013 Gala Apple 100 $1.00
1/2/2013 Gala Apple " " $1.00
1/3/2013 Gala Apple 200 $1.00

I want the final product to be a properly aligned Excel file, so 'Gala Apple' needs to end up in one cell and the blank cell needs to be preserved (or filled with a unique character signifying there was no data to begin with).

Anyone have a simple fix for this?

By the way, I'm not a sophisticated programmer. I mostly write shell scripts with AWK.



Last edited by Franklin52; 12-19-2013 at 03:10 AM.. Reason: Please use code tags
# 2  
Old 12-19-2013
pdftotext - Wikipedia, the free encyclopedia

Use your awk skills to reformat a text file created from pdf -> text.
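
As a rough illustration of that reformatting step (in Python rather than awk, purely as a sketch), the sample lines from post #1 can be split on whitespace and reassembled. The assumptions here are mine, not from the thread: the date is always the first token, the price is always the last, and a purely numeric token right before the price is the quantity. The function names parse_line and text_to_csv are hypothetical.

```python
import csv
import io
import re


def parse_line(line):
    """Split one data line into [date, product, quantity, price]."""
    tokens = line.split()
    date, price = tokens[0], tokens[-1]
    middle = tokens[1:-1]
    # Assumption: a purely numeric token just before the price is the quantity.
    if middle and re.fullmatch(r"\d+", middle[-1]):
        quantity = middle.pop()
    else:
        quantity = ""  # keep the blank cell as an empty CSV field
    # Whatever is left between date and quantity is the product name.
    return [date, " ".join(middle), quantity, price]


def text_to_csv(text):
    """Convert the whitespace-separated text into CSV text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(lines[0].split())  # header row: Date Product Quantity Price
    for line in lines[1:]:
        writer.writerow(parse_line(line))
    return out.getvalue()


# Sample data from post #1; the second row has a blank Quantity.
sample = """Date Product Quantity Price
1/1/2013 Gala Apple 100 $1.00
1/2/2013 Gala Apple  $1.00
1/3/2013 Gala Apple 200 $1.00
"""

if __name__ == "__main__":
    print(text_to_csv(sample), end="")
```

This would break for a product whose name ends in a number, so it is only a starting point. The resulting CSV opens directly in Excel or LibreOffice.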
# 3  
Old 12-19-2013
Jim, don't bother responding to questions if you're not going to offer any useful insight.
# 4  
Old 12-19-2013
A free utility for converting PDF to text is certainly a useful insight to solving your problem. Have you actually tried it, yet? Does it produce text like you have shown? Are you able to install or build it on your system? In short, what have you tried?

There's little point going farther if it does not work or you're unable to do anything. It's far too easy for us to craft solutions that don't work given this minimal information, and we are not a discount coding warehouse. One step at a time.

Last edited by Corona688; 12-19-2013 at 11:16 PM..
# 5  
Old 12-19-2013
In order to use awk, the source file cannot be a pdf; it has to be a text file. That is step 1, and you cannot do anything until that happens. There is no pdfawk-like software. You can buy Nitro or some other pdf editor, or you can use the Poppler API if you can write C code. Apparently those do not apply to you.

If you want real help give simple example input and expected output. We already have what I think is input.
# 6  
Old 12-20-2013
pdf really isn't made to allow this to happen. A pdf can contain many types of content. Shoot, the spreadsheet data could be inside of an image. Attempting pdftotext or another program is probably your best bet, but only as a starting point and even then, as I mentioned, not necessarily a foolproof solution.
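
To illustrate what "starting point" could mean: pdftotext has a -layout option (pdftotext -layout input.pdf output.txt) that tries to keep the physical column alignment of the page. If the columns really do line up under their headers, a blank cell shows up as a run of spaces and can be recovered by position. The Python sketch below assumes exactly that; the layout text is a hand-built stand-in for real pdftotext output, and column_starts and slice_row are hypothetical helper names.

```python
def column_starts(header, names):
    """Return the character offset at which each column name starts."""
    return [header.index(name) for name in names]


def slice_row(line, starts):
    """Cut a fixed-width line at the given offsets and trim each cell."""
    ends = starts[1:] + [None]  # the last column runs to the end of the line
    return [line[s:e].strip() for s, e in zip(starts, ends)]


# Stand-in for "pdftotext -layout" output (an assumption, not real output):
# every column is padded to a fixed width so the cells line up.
FMT = "{:<11}{:<14}{:<11}{}"
layout_lines = [
    FMT.format("Date", "Product", "Quantity", "Price"),
    FMT.format("1/1/2013", "Gala Apple", "100", "$1.00"),
    FMT.format("1/2/2013", "Gala Apple", "", "$1.00"),
    FMT.format("1/3/2013", "Gala Apple", "200", "$1.00"),
]

names = ["Date", "Product", "Quantity", "Price"]
starts = column_starts(layout_lines[0], names)
rows = [slice_row(line, starts) for line in layout_lines[1:]]

if __name__ == "__main__":
    for row in rows:
        print(row)
```

Real output may right-align numbers or shift columns from page to page, in which case the offsets taken from the header would not be enough; that is why a sample of the actual pdf matters.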

With all of that said, if the pdf file is something that is regularly generated in the same way, maybe if a sample were posted somewhere, something could be created (maybe by someone here) to extract the data as a csv.

Recommendation: upload the sample pdf somewhere (or provide a link), and then let's see what is possible.