Full Discussion: Pdf to text
Post 302916967 by Don Cragun on Friday 12th of September 2014 04:54:36 PM
Quote:
Originally Posted by cmccabe
The pdftotext works great for converting pdf files to text, but only seems to do one at a time. Can the command be modified for a directory? Thanks.
If you have the source for pdftotext, you can change it to do anything you want. If you don't have source, or if you want a simple solution, write a shell script that calls pdftotext for each PDF file in your current directory:
Code:
for file in *.pdf
do      pdftotext "$file" "${file%.pdf}".txt
done
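
If you want the loop to take a directory name and to cope with a directory that has no PDF files, something like the following might do. This is only a sketch: the function name convert_pdfs_in_dir is made up for this example, and it assumes pdftotext is on your PATH.

```shell
# convert_pdfs_in_dir is a hypothetical helper name, not part of pdftotext.
# It converts every *.pdf in the given directory (default: current directory)
# and returns non-zero if any conversion failed.
convert_pdfs_in_dir() {
    dir=${1:-.}
    failed=0
    for file in "$dir"/*.pdf
    do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$file" ] || continue
        if ! pdftotext "$file" "${file%.pdf}.txt"
        then
            echo "pdftotext failed on $file" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Call it as `convert_pdfs_in_dir /path/to/pdfs`, or with no argument to process the current directory. Prefixing each name with the directory also keeps a file name that starts with a dash from being taken as an option.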

 

pdftotext(1)						      General Commands Manual						      pdftotext(1)

NAME
       pdftotext - Portable Document Format (PDF) to text converter (version 3.00)

SYNOPSIS
       pdftotext [options] [PDF-file [text-file]]

DESCRIPTION
       Pdftotext converts Portable Document Format (PDF) files to plain text.

       Pdftotext reads the PDF file, PDF-file, and writes a text file, text-file. If text-file is not specified, pdftotext converts file.pdf to file.txt. If text-file is '-', the text is sent to stdout.

OPTIONS
       -f number
              Specifies the first page to convert.

       -l number
              Specifies the last page to convert.

       -r number
              Specifies the resolution, in DPI. The default is 72 DPI.

       -x number
              Specifies the x-coordinate of the crop area top left corner.

       -y number
              Specifies the y-coordinate of the crop area top left corner.

       -W number
              Specifies the width of crop area in pixels (default is 0).

       -H number
              Specifies the height of crop area in pixels (default is 0).

       -layout
              Maintain (as best as possible) the original physical layout of the text. The default is to 'undo' physical layout (columns, hyphenation, etc.) and output the text in reading order.

       -raw   Keep the text in content stream order. This is a hack which often "undoes" column formatting, etc. Use of raw mode is no longer recommended.

       -htmlmeta
              Generate a simple HTML file, including the meta information. This simply wraps the text in <pre> and </pre> and prepends the meta headers.

       -bbox  Generate an XHTML file containing bounding box information for each word in the file.

       -enc encoding-name
              Sets the encoding to use for text output. This defaults to "UTF-8".

       -listenc
              Lists the available encodings.

       -eol unix | dos | mac
              Sets the end-of-line convention to use for text output.

       -nopgbrk
              Don't insert page breaks (form feed characters) between pages.

       -opw password
              Specify the owner password for the PDF file. Providing this will bypass all security restrictions.

       -upw password
              Specify the user password for the PDF file.

       -q     Don't print any messages or errors.

       -v     Print copyright and version information.

       -h     Print usage information. (-help and --help are equivalent.)

BUGS
       Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to extract text from these files.

EXIT CODES
       The Xpdf tools use the following exit codes:

       0      No error.

       1      Error opening a PDF file.

       2      Error opening an output file.

       3      Error related to PDF permissions.

       99     Other error.

AUTHOR
       The pdftotext software and documentation are copyright 1996-2004 Glyph & Cog, LLC.

SEE ALSO
       pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1), pdftohtml(1), pdftoppm(1), pdftops(1)

22 January 2004                                                                                                       pdftotext(1)
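
As a rough illustration of how the options and exit codes listed in the man page above might be used from a script, here is a small sketch. The function names explain_status and convert_first_pages are invented for this example, and the page range 1-2 is an arbitrary choice. Only the -f, -l, and -layout options and the exit codes come from the man page.

```shell
# explain_status is a hypothetical helper; its messages mirror the
# EXIT CODES section of the pdftotext(1) man page.
explain_status() {
    case $1 in
        0)  echo "no error" ;;
        1)  echo "error opening a PDF file" ;;
        2)  echo "error opening an output file" ;;
        3)  echo "error related to PDF permissions" ;;
        99) echo "other error" ;;
        *)  echo "unexpected exit code $1" ;;
    esac
}

# convert_first_pages (also hypothetical) converts pages 1-2 of one PDF,
# keeping the physical layout, and reports the outcome on stderr.
convert_first_pages() {
    pdftotext -f 1 -l 2 -layout "$1" "${1%.pdf}.txt"
    status=$?
    echo "$1: $(explain_status "$status")" >&2
    return "$status"
}
```

For example, `convert_first_pages report.pdf` would write report.txt and print a one-line status. The function passes the exit status of pdftotext back to the caller, so a script can test it.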