debian man page for cam::pdf::pagetext

CAM::PDF::PageText(3pm) 				User Contributed Perl Documentation				   CAM::PDF::PageText(3pm)

NAME
CAM::PDF::PageText - Extract text from PDF page tree
SYNOPSIS
    my $pdf = CAM::PDF->new($filename);
    my $pageone_tree = $pdf->getPageContentTree(1);
    print CAM::PDF::PageText->render($pageone_tree);
DESCRIPTION
This module attempts to extract sequential text from a PDF page. This is not a robust process, as PDF text is graphically laid out in arbitrary order. This module uses a few heuristics to guess which pieces of text belong next to each other, but it is easily fooled by, say, subscripts, non-horizontal text, changes in font, form fields, etc. All those disclaimers aside, it is useful for a quick dump of text from a simple PDF file.
LICENSE
Same as CAM::PDF
FUNCTIONS
$pkg->render($pagetree)
$pkg->render($pagetree, $verbose)
    Turn a page content tree into a string. This is a class method that should be called like:

        CAM::PDF::PageText->render($pagetree);
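As a rough illustration of calling render() on more than one page, the following sketch walks every page of a file and collects the rendered text. The helper name extract_all_text and the command-line handling are hypothetical and not part of this module; the sketch also assumes that getPageContentTree() returns a false value for a page whose content cannot be parsed, and that CAM::PDF sets $CAM::PDF::errstr when new() fails.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::PageText;

# Hypothetical helper (not part of CAM::PDF): returns one string per
# page, in page order, using the class method documented above.
sub extract_all_text {
    my ($filename) = @_;

    my $pdf = CAM::PDF->new($filename)
        or die "Cannot open $filename: $CAM::PDF::errstr\n";

    my @pages;
    for my $pagenum (1 .. $pdf->numPages()) {
        my $tree = $pdf->getPageContentTree($pagenum);

        # Assumption: a false value means the page content was not parsable.
        next if !$tree;

        push @pages, CAM::PDF::PageText->render($tree);
    }
    return @pages;
}

# Assumed usage: the PDF path is the first command-line argument.
my $file = shift @ARGV or die "usage: $0 file.pdf\n";
print join("\n", extract_all_text($file)), "\n";
```

Because the extraction is heuristic, the text of each page may come out in a different order than it appears visually, so the output is best treated as a quick dump rather than a faithful transcript.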
AUTHOR
See CAM::PDF

perl v5.14.2							    2012-07-08					   CAM::PDF::PageText(3pm)