Pdf to xls or csv Linux

# 1  


I want to convert a PDF to xls or csv. I used pdftotext to convert the PDF to text. A complication with the original dataset is that it contains empty spaces. For example, there is a space between Gala and Apple in the Product field, and the Quantity field is sometimes blank (I put double quotes there in the sample, but in the real data it is just empty space without any characters).
Date Product Quantity Price
1/1/2013 Gala Apple 100 $1.00
1/2/2013 Gala Apple " " $1.00
1/3/2013 Gala Apple 200 $1.00

I want the final product to be a properly aligned Excel file. Therefore, I would need 'Gala Apple' to be in one cell and the blank cell to be preserved (or filled with a unique character signifying there was no data to begin with).

Anyone have a simple fix for this?

By the way, I'm not a sophisticated programmer. I mostly write shell scripts with AWK.



Last edited by Franklin52; 12-19-2013 at 03:10 AM.. Reason: Please use code tags
# 4  
A free utility for converting PDF to text is certainly a useful step toward solving your problem. Have you actually tried it yet? Does it produce text like you have shown? Are you able to install or build it on your system? In short, what have you tried?

There's little point going farther if it does not work or you're unable to do anything. It's far too easy for us to craft solutions that don't work given this minimal information, and we are not a discount coding warehouse. One step at a time.

Last edited by Corona688; 12-19-2013 at 11:16 PM..
# 5  
In order to use awk, the source file cannot be a PDF; it has to be a text file. That is step 1, and you cannot do anything until that happens. There is no pdfawk-like software. You can buy Nitro or some other PDF editor, or you can use the Poppler API if you can write C code. Apparently, those do not apply to you.

If you want real help, give simple example input and expected output. We already have what I think is the input.
# 6  
PDF really isn't made to allow this. A PDF can contain many types of content. Shoot, the spreadsheet data could even be inside an image. Trying pdftotext or another program is probably your best bet, but only as a starting point, and even then, as I mentioned, it is not necessarily a foolproof solution.

With all of that said, if the pdf file is something that is regularly generated in the same way, maybe if a sample were posted somewhere, something could be created (maybe by someone here) to extract the data as a csv.

Recommendation: upload the sample PDF somewhere (or provide a link)... and then let's see what is possible.
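
In the meantime, here is a rough sketch of what the awk step might look like once the PDF is text. It assumes the text came from something like `pdftotext -layout file.pdf file.txt`, that the output is column-aligned, and that every value starts directly under its one-word header. None of that is confirmed for the real file; right-aligned numbers, for instance, would break it. The file names `sample.txt` and `sample.csv` and the sample rows are made up for illustration. It reads the column start positions from the header line and cuts each row at those positions, so a blank Quantity comes out as an empty CSV field and 'Gala Apple' stays in one cell.

```shell
#!/bin/sh
# Hypothetical stand-in for layout-preserving pdftotext output (an assumption,
# not the real file): every value starts under its header word.
cat > sample.txt <<'EOF'
Date      Product      Quantity  Price
1/1/2013  Gala Apple   100       $1.00
1/2/2013  Gala Apple             $1.00
1/3/2013  Gala Apple   200       $1.00
EOF

awk '
# Strip leading and trailing blanks from a string.
function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

NR == 1 {
    # Record the character position where each header word starts.
    n = 0
    rest = $0
    offset = 0
    while (match(rest, /[^ ]+/)) {
        start[++n] = offset + RSTART
        offset += RSTART + RLENGTH - 1
        rest = substr(rest, RSTART + RLENGTH)
    }
}
{
    line = ""
    for (i = 1; i <= n; i++) {
        # Column i runs from its own start up to the start of the next column;
        # the last column runs to the end of the line.
        if (i < n) field = substr($0, start[i], start[i + 1] - start[i])
        else       field = substr($0, start[i])
        line = line (i > 1 ? "," : "") trim(field)
    }
    print line
}' sample.txt > sample.csv

cat sample.csv
```

The resulting `sample.csv` should open directly in Excel. The sketch does no CSV quoting, so a value that itself contains a comma would need extra handling, and a real sample of the pdftotext output is still needed to see whether the alignment assumption holds.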
