
libany2uni 1.0.3


Developer:   Romuald Texier and Gwendal Dufresne
Price:  0.00
License:   GPL (GNU General Public License)


libany2uni is a library that extracts raw Unicode text from written documents (office documents).

It should be useful to developers of search engines, text-processing tools, corpus-analysis software, and similar applications.

UTF-8 tool:

The 'utils' directory contains a tool built on libany2uni, called 'any2utf8'. It reads a document and writes its text as UTF-8 to standard output.

To compile it, just type make.

Run it with './any2utf8 <path to the document>'.

You can also get metadata with the -m option.
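
The command-line usage described here can also be scripted. The following is a minimal sketch in Python, assuming 'any2utf8' has already been built in the current directory and writes UTF-8 to standard output as stated. The function names, the 'tool' and 'options' parameters, and the placement of -m before the document path are assumptions made for illustration; they are not taken from the libany2uni documentation.

```python
import subprocess


def run_any2utf8(document, tool="./any2utf8", options=()):
    """Run the any2utf8 tool on one document and return its output as text.

    `tool` and `options` are illustrative parameters of this sketch.
    """
    # Assumed argument order: options first, then the document path.
    command = [str(tool), *options, str(document)]
    # check=True raises CalledProcessError if the tool exits with an error.
    completed = subprocess.run(command, capture_output=True, check=True)
    # The tool is described as writing UTF-8 to standard output.
    return completed.stdout.decode("utf-8", errors="replace")


def extract_text(document, tool="./any2utf8"):
    """Return the raw text that the tool prints for `document`."""
    return run_any2utf8(document, tool)


def extract_metadata(document, tool="./any2utf8"):
    """Return whatever the tool prints when called with the -m option."""
    return run_any2utf8(document, tool, options=("-m",))
```

For example, extract_text("report.doc") would return the document's text as a Python string, where "report.doc" is a hypothetical file name. The format of the metadata output is not described here, so extract_metadata returns it unparsed.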

Download libany2uni 1.0.3


 http://download.berlios.de/libany2uni/libany2uni1.0.3.tgz
 http://download2.berlios.de/libany2uni/libany2uni1.0.3.tgz



Similar software

OpenOffice::OODoc::Intro 2.027 (by Jean-Marie Gouarne)
OpenOffice::OODoc::Intro is a Perl module for an introduction to the Open OpenDocument Connector.

PDFBox 0.7.3 (by Ben Litchfield)
PDFBox is an open source Java PDF library for working with PDF documents

PDFspy 1.0 Beta3 (by Apago, Inc.)
PDFspy is the ultimate "get info" utility for your PDF documents

mod_line_edit 0.9.2 (by WebThing Ltd.)
mod_line_edit is a general-purpose fast text filter Apache module

Document Library 1.2b2 (by Martijn Faassen)
Document Library is a Web application for document management in larger organizations with a lot of documents.

PlainDoc 1.55 (by Sampo Kellomaki)
PlainDoc (pd2tex) document production system allows you to write documents as normal text files

Estraier 1.2.29 (by Mikio Hirabayashi)
Estraier is a full-text search system for personal use

xtranslate 0.2 (by Nir Tzachar)
xtranslate is a tool to convert the text in the X11 selection buffer from ASCII to arbitrary UTF-8 characters.

PDFGambas 1.0.0 (by Daniel Campos Fernánd)
PDFGambas is a viewer for PDF documents written as an example of the gb.gtk.pdf component usage

