Pdf to text


 
# 1  
Old 09-12-2014

Is there a way to use the pdftotext utility to convert all the PDF files in a given directory?

So instead of one at a time:
Code:
pdftotext hp-manual.pdf hp-manual.txt

a directory of 50 pdf files would be converted:
Code:
 pdftotext /home/dnascopev/Desktop/PDF.pdf /home/dnascopev/Desktop/PDF.txt

Thank you.
# 2  
Old 09-12-2014
Take a look at the portable bitmap utilities, such as pdftoppm, and at gocr, the OCR tool that converts images to text. You could convert to PBM, JPEG, etc. and then use gocr to get the text.

I am not sure whether gocr works on PDF files directly, but if not you can run pdftoppm first.
# 3  
Old 09-12-2014
pdftotext works great for converting PDF files to text, but it only seems to do one file at a time. Can the command be modified to handle a directory? Thanks.
# 4  
Old 09-12-2014
Quote:
Originally Posted by cmccabe
pdftotext works great for converting PDF files to text, but it only seems to do one file at a time. Can the command be modified to handle a directory? Thanks.
If you have the source for pdftotext, you can change it to do anything you want. If you don't have source, or if you want a simple solution, write a shell script that calls pdftotext for each PDF file in your current directory:
Code:
for file in *.pdf
do      pdftotext "$file" "${file%.pdf}".txt
done
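
The output name in that loop comes from the `${file%.pdf}` expansion, which strips a trailing `.pdf` from the value before `.txt` is appended. A minimal sketch of just that step, using a hypothetical helper named `txt_name` (not part of pdftotext or of the loop above) so it can be tried without any PDF files present:

```shell
# Hypothetical helper: print the .txt name that the loop would pass to pdftotext.
txt_name() {
    # ${1%.pdf} removes the shortest trailing match of ".pdf" from $1.
    printf '%s\n' "${1%.pdf}.txt"
}

txt_name "hp-manual.pdf"
txt_name "my report.pdf"
```

Quoting the expansion, as the loop does, is what keeps names containing spaces in one piece.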

# 5  
Old 09-12-2014
Code:
 for file in *.pdf
do      pdftotext "$file" "${file%.pdf}".txt
done

So, if the directory is /home/dnascopev/Desktop/PDF, are you saying that I can put that in the shell script, or each PDF name, and if so where? Thank you.
# 6  
Old 09-12-2014
You would use cd to change directory.

Also I'd use [pP][dD][fF] in case any of them were wonky case.

Code:
cd /home/dnascopev/Desktop/PDF
for file in *.[pP][dD][fF]
do
...
done
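
One thing to watch with a pattern like this: if the directory contains no matching files, the shell leaves the pattern as a literal string and the loop body runs once with it. A rough sketch of guarding against that, using a hypothetical helper named `list_pdfs` (an illustration only, not something from the posts above) that just prints the names the loop would visit:

```shell
# Hypothetical helper: list the PDF files in a directory, whatever the case of the extension.
# The body is a subshell, so the cd does not change the caller's working directory.
list_pdfs() (
    cd "$1" || exit 1
    for file in *.[pP][dD][fF]
    do
        # With no matches the pattern itself is the loop value; skip it.
        [ -e "$file" ] || continue
        printf '%s\n' "$file"
    done
)

list_pdfs .
```

Replacing the printf with the pdftotext call would turn the listing into the conversion loop.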

# 7  
Old 09-12-2014
Sorry. By posting in the Shell Programming and Scripting forum, I assumed that you knew how to write and run a shell script.

Making more wild assumptions:
  1. you are using a UNIX or Linux system,
  2. you have more than one directory that contains files you want to process,
  3. you have a bin directory in your home directory, and
  4. $HOME/bin is in your command search path:
then create a file named pdftotextdir in $HOME/bin containing:
Code:
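#!/bin/ksh
# Convert every PDF file in the given directory to text with pdftotext.
if [ $# -eq 1 ]
then    cd "$1" || exit 1       # stop if the directory cannot be entered
else    printf 'Usage: %s directory\n' "${0##*/}" >&2
        exit 1
fi
for file in *.[Pp][Dd][Ff]
do      [ -e "$file" ] || continue      # skip the literal pattern when nothing matches
        pdftotext "$file" "${file%.[Pp][Dd][Ff]}".txt
done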

(If you don't have a Korn shell, you can change /bin/ksh to /bin/bash or the pathname of any shell that understands POSIX required shell variable expansions.)

Then issue the command:
Code:
chmod +x "$HOME/bin/pdftotextdir"

Then you can run your new utility to use pdftotext on every PDF file in whatever directory you want to process by issuing the command:
Code:
pdftotextdir directory

which for your latest request would be:
Code:
pdftotextdir /home/dnascopev/Desktop/PDF
