PDFTextStream 2.0.2 review


License:	Other/Proprietary License with Free Trial
File size:	5423K
Developer:	Snowtide Informatics Systems, Inc.

PDFTextStream is a PDF text and metadata extraction library available for Java, Python, and .NET.

It supports all versions of the PDF document specification (including v1.6, used by Acrobat 7), extraction of text encoded using double-byte character sets (including Chinese, Japanese, and Korean), decryption of 40-bit and 128-bit encrypted documents, and extraction of all document metadata provided by PDF documents (including form data, bookmarks, and annotations).

Easy integration with Jakarta Lucene is included.

Requirements:
Apache Lucene (optional)

What's New in This Release:
This release adds a com.snowtide.pdf.RegionOutputTarget to support region-specific content extraction.
It adds the ability to derive encoding and spatial metrics of Type3 fonts.
It adds a pdfts.type3.derive system property to disable derivation if necessary.
A problem with com.snowtide.pdf.VisualOutputTarget, where lines would sometimes be inappropriately combined, has been fixed.
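
The pdfts.type3.derive property mentioned above is a JVM system property, so it can be set either on the command line or programmatically. The sketch below uses only the standard java.lang.System API. The property name comes from the release notes; the value "false", the class name Type3DeriveConfig, and the idea that the property should be set before any PDFTextStream class is used are assumptions for illustration, not confirmed by the vendor documentation.

```java
// Hypothetical helper class; not part of PDFTextStream.
class Type3DeriveConfig {
    // Property name as given in the release notes.
    static final String PROPERTY = "pdfts.type3.derive";

    // Assumption: "false" is the value that switches Type3 derivation off.
    // Check the vendor documentation for the accepted values.
    static String disableType3Derivation() {
        System.setProperty(PROPERTY, "false");
        return System.getProperty(PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: call this before any PDFTextStream class is used,
        // in case the library reads the property only once.
        String value = disableType3Derivation();
        System.out.println(PROPERTY + "=" + value);
    }
}
```

Under the same assumption about the value, the equivalent command-line form would be to pass -Dpdfts.type3.derive=false to the java launcher.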



