CAM::PDF::PageText(3pm) 				User Contributed Perl Documentation				   CAM::PDF::PageText(3pm)

NAME
       CAM::PDF::PageText - Extract text from PDF page tree

SYNOPSIS
	  use CAM::PDF;
	  use CAM::PDF::PageText;

	  my $pdf = CAM::PDF->new($filename);
	  my $pageone_tree = $pdf->getPageContentTree(1);
	  print CAM::PDF::PageText->render($pageone_tree);

DESCRIPTION
       This module attempts to extract sequential text from a PDF page.  This is not a robust process, because PDF text is laid out
       graphically and may appear in arbitrary order.  The module uses a few heuristics to guess which pieces of text are adjacent, but it
       is easily fooled by, say, subscripts, non-horizontal text, changes in font, form fields, etc.

       All those disclaimers aside, it is useful for a quick dump of text from a simple PDF file.

LICENSE
       Same as CAM::PDF

FUNCTIONS
       $pkg->render($pagetree)
       $pkg->render($pagetree, $verbose)
	   Turn a page content tree into a string.  This is a class method that should be called like:

	      CAM::PDF::PageText->render($pagetree);
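
	   As a minimal sketch of how this class method might be used across a whole document, the loop below renders each page in turn.
	   It assumes that CAM::PDF provides numPages() and reports constructor failures through $CAM::PDF::errstr, as described in the
	   CAM::PDF documentation.  The script name, the command-line handling and the page separator are illustrative only.

```perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::PageText;

# Hypothetical invocation: perl dump_text.pl file.pdf
my $filename = shift @ARGV
    or die "usage: $0 file.pdf\n";

# Assumption: new() returns undef on failure and sets $CAM::PDF::errstr.
my $pdf = CAM::PDF->new($filename)
    or die "Cannot open $filename: $CAM::PDF::errstr\n";

for my $pagenum (1 .. $pdf->numPages()) {
    my $tree = $pdf->getPageContentTree($pagenum);
    next if !$tree;    # skip pages whose content could not be parsed

    # render() is a class method: call it on the package, not on an object.
    my $text = CAM::PDF::PageText->render($tree);
    print "--- page $pagenum ---\n", $text, "\n";
}
```

	   Because the text order is guessed heuristically, the output of such a loop is best treated as a rough dump rather than a faithful
	   reconstruction of the page.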

AUTHOR
       See CAM::PDF

perl v5.14.2							    2012-07-08						   CAM::PDF::PageText(3pm)
