PDFTextStream 2.2.5 (Default branch)

12-30-2008

PDFTextStream is a PDF text and metadata extraction library available for Java, Python, and .NET. It supports all versions of the PDF document specification (including v1.7, used by Acrobat 8), extraction of text encoded using double-byte character sets (including Chinese, Japanese, and Korean), decryption of 40-bit and 128-bit encrypted documents, and extraction of all document metadata provided by PDF documents (including form data, bookmarks, and annotations). Easy integration with Jakarta Lucene is included, as well as interactive form update capability.

License: Other/Proprietary License with Free Trial

Changes:
This release adds support for extracting XFA forms data as XML, significantly improves the performance of text extraction using VisualOutputTarget, and adds support for PDF documents larger than 2 GB. It fixes a bug where the encodings from embedded Type1 fonts were not applied properly in some circumstances, a problem where newer content in updated PDF documents was sometimes ignored, and a problem where PDFDocEncoding-encoded bookmarks and metadata were not decoded properly. It also adds a .getDestinationName() method to com.snowtide.pdf.Bookmark.

Linux Bot
