dumppdf(1) [debian man page]

DUMPPDF(1)							  PDFMiner Manual							DUMPPDF(1)

NAME
dumppdf - dumps internal contents of PDF files

SYNOPSIS
dumppdf [option...] file...

DESCRIPTION
dumppdf dumps the internal contents of a PDF file in pseudo-XML format. This program is primarily for debugging purposes, but it can also extract some meaningful content, such as an embedded image.

OPTIONS
-a
    Dump all the objects. By default, only the document trailer is printed.

-i objno[,objno,...]
    Specifies PDF object IDs to display. Comma-separated IDs or multiple -i options are accepted.

-p pageno[,pageno,...]
    Specifies a comma-separated list of page numbers to be extracted. Page numbers start at one. By default, text is extracted from all pages.

-r, -b, -t
    Specifies the output format of stream contents. Because the contents of stream objects can be very large, they are omitted when none of these options is specified.

    With the -r option, the "raw" stream contents are dumped without decompression. With the -b option, the decompressed contents are dumped as a binary blob. With the -t option, the decompressed contents are dumped in a text format similar to the output of Python's repr(). When the -r or -b option is given, no stream header is displayed, which makes it easier to save the output to a file.

-T
    Show the table of contents.

-P password
    Provides the user password to access PDF contents.

-d
    Increase the debug level.

EXAMPLES
Dump all the headers and contents, except stream objects:

    $ dumppdf -a test.pdf

Dump the table of contents:

    $ dumppdf -T test.pdf

Extract a JPEG image:

    $ dumppdf -r -i6 test.pdf > image.jpeg

SEE ALSO
pdf2txt(1)

AUTHORS
Jakub Wilk <jwilk@debian.org>
    Wrote this manual page for the Debian system.

Yusuke Shinyama <yusuke@cs.nyu.edu>
    Author of PDFMiner and its original HTML documentation.

dumppdf								    08/24/2011								DUMPPDF(1)
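
The options documented for dumppdf above can also be assembled from a script. What follows is a minimal sketch, not part of PDFMiner: the function names build_dumppdf_argv and save_object_stream and all of their parameters are hypothetical, and it assumes a dumppdf executable is on the PATH. Only flags listed in the OPTIONS section are used, and object and page lists are attached to -i and -p in the same way as the "-i6" example.

```python
import subprocess
from typing import List, Optional, Sequence

# Maps a readable mode name (our own naming) to the documented stream flags.
STREAM_FLAGS = {"raw": "-r", "binary": "-b", "text": "-t"}


def build_dumppdf_argv(
    pdf_path: str,
    object_ids: Sequence[int] = (),
    pages: Sequence[int] = (),
    stream_mode: Optional[str] = None,
    dump_all: bool = False,
    password: Optional[str] = None,
) -> List[str]:
    """Build an argument list for dumppdf from the documented options."""
    argv = ["dumppdf"]
    if dump_all:
        argv.append("-a")
    if stream_mode is not None:
        # Raises KeyError for anything other than "raw", "binary" or "text".
        argv.append(STREAM_FLAGS[stream_mode])
    if object_ids:
        # Comma-separated IDs, attached to the flag as in "-i6".
        argv.append("-i" + ",".join(str(i) for i in object_ids))
    if pages:
        if min(pages) < 1:
            raise ValueError("page numbers start at one")
        argv.append("-p" + ",".join(str(p) for p in pages))
    if password is not None:
        argv += ["-P", password]
    argv.append(pdf_path)
    return argv


def save_object_stream(pdf_path: str, object_id: int, out_path: str) -> None:
    """Write the raw stream of one object to a file (needs dumppdf installed)."""
    argv = build_dumppdf_argv(pdf_path, object_ids=[object_id], stream_mode="raw")
    with open(out_path, "wb") as out:
        subprocess.run(argv, stdout=out, check=True)


print(build_dumppdf_argv("test.pdf", object_ids=[6], stream_mode="raw"))
# -> ['dumppdf', '-r', '-i6', 'test.pdf']
```

Calling save_object_stream("test.pdf", 6, "image.jpeg") would then correspond to the "Extract a JPEG image" example. Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.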

PDFDRAW(1)						      General Commands Manual							PDFDRAW(1)

NAME
pdfdraw - render PDF documents

SYNOPSIS
pdfdraw [options] input.pdf [pages]

DESCRIPTION
pdfdraw renders a PDF document to image files. The supported image formats are pgm, ppm, pam and png. Select the pages to be rendered by specifying a comma-separated list of ranges and individual page numbers (for example: 1,5,10-15). If no pages are specified, all the pages are rendered.

OPTIONS
-o output
    The image format is deduced from the output file name. Embed %d in the name to indicate the page number (for example: "page%d.png").

-p password
    Use the specified password if the file is encrypted.

-r resolution
    Render the page at the specified resolution. The default resolution is 72 dpi.

-R angle
    Rotate clockwise by the given number of degrees.

-a
    Save the alpha channel. The default behavior is to render each page with a white background. With this option, the page background is transparent. Only supported for the pam and png output formats.

-g
    Render in grayscale. The default is to render a full color RGB image. If the output format is pgm or ppm, this option is ignored.

-m
    Show timing information. Measures how long each page takes to render and prints a summary at the end.

-5
    Print an MD5 checksum of the rendered image data for each page.

-t
    Print the text contents of each page in UTF-8 encoding. Give the option twice to print detailed information about the location of each character in XML format.

-x
    Print the display list used to render each page.

-A
    Disable the use of accelerated functions.

-G gamma
    Gamma-correct the output image. Typical values are 0.7 or 1.4, to thin or darken text rendering.

-I
    Invert the output image colors.

pages
    Comma-separated list of ranges to render.

SEE ALSO
mupdf(1), pdfclean(1), pdfshow(1).

AUTHOR
MuPDF was written by Tor Andersson <tor@ghostscript.com>. MuPDF is Copyright 2006-2010 Artifex Software, Inc.

								 September 4, 2011							PDFDRAW(1)
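
The page list syntax accepted by pdfdraw (for example 1,5,10-15) and the %d placeholder in the -o output name can be illustrated with a short sketch. This is not MuPDF code: expand_page_ranges and output_name are hypothetical helpers, and the sketch assumes ascending ranges and one-based page numbers, neither of which the manual page states for pdfdraw.

```python
from typing import List


def expand_page_ranges(spec: str) -> List[int]:
    """Expand a list such as "1,5,10-15" into individual page numbers."""
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first, last = int(first_text), int(last_text)
        else:
            first = last = int(part)
        # Assumption of this sketch: pages are one-based and ranges ascend.
        if first < 1 or last < first:
            raise ValueError(f"invalid page range: {part!r}")
        pages.extend(range(first, last + 1))
    return pages


def output_name(pattern: str, page: int) -> str:
    """Fill the %d placeholder of an output pattern such as "page%d.png"."""
    if "%d" not in pattern:
        return pattern  # no placeholder, so the name is used unchanged
    return pattern % page


for page in expand_page_ranges("1,5,10-15"):
    print(output_name("page%d.png", page))
```

For the example list from the DESCRIPTION section, the loop prints eight names, from page1.png and page5.png through page10.png to page15.png.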