libany2uni 1.0.3 review

by rbytes.net on

License: GPL (GNU General Public License)
File size: 0K
Developer: Romuald Texier and Gwendal Dufresne
0 stars award from rbytes.net

libany2uni is a library that extracts raw Unicode text from written documents such as office documents.

It should be useful to developers of search engines, text processing tools, corpus analysis software, and similar applications.

UTF8 tool:

The 'utils' directory contains a tool built on libany2uni, called 'any2utf8'. It reads a document and writes its text as UTF-8 to standard output.

To compile it, just type 'make'.

Run it with './any2utf8 <path to the document>'.

You can also get metadata with the -m option.
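For use from a script, the command-line tool can be wrapped in a few lines. The following is only a minimal Python sketch, not part of libany2uni: the helper names build_command and extract_text are hypothetical, the binary is assumed to sit at './any2utf8', and placing -m before the document path is an assumption, since the listing does not say where the option goes.

```python
import subprocess
from typing import List


def build_command(document: str, metadata: bool = False,
                  binary: str = "./any2utf8") -> List[str]:
    """Build the argument list for one any2utf8 invocation (hypothetical helper)."""
    cmd = [binary]
    if metadata:
        # Assumption: the -m option is given before the document path.
        cmd.append("-m")
    cmd.append(document)
    return cmd


def extract_text(document: str, metadata: bool = False,
                 binary: str = "./any2utf8") -> str:
    """Run any2utf8 and return what it wrote to standard output, decoded as UTF-8."""
    result = subprocess.run(
        build_command(document, metadata, binary),
        stdout=subprocess.PIPE,
        check=True,  # raises CalledProcessError if the tool exits with a non-zero status
    )
    return result.stdout.decode("utf-8")
```

Calling extract_text("report.doc") would then return the extracted text as a Python string, provided the tool has been compiled and the file (here a made-up name) exists.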
