libany2uni 1.0.3 review
libany2uni is a library that extracts raw Unicode text from any written document (office documents, for example).
It should be useful to developers of search engines, text-processing tools, corpus-analysis software, and similar applications.
UTF-8 tool:
The 'utils' directory contains a tool built on libany2uni, called 'any2utf8'. It reads a document and writes its text, encoded as UTF-8, to standard output.
To compile it, just type make.
Run it with './any2utf8 <path and name of the document>'.
You can also get metadata with the -m option.
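For use from a script, the command line described above can be wrapped in a small helper. The following is only a sketch under stated assumptions: the function names build_command and extract_text are hypothetical, the tool is assumed to sit in the current directory after running make, and the -m option is assumed to go before the document path (the description above does not say where it belongs).

```python
import os
import subprocess

# Assumed location of the compiled tool; adjust to wherever `make` placed it.
ANY2UTF8 = "./any2utf8"


def build_command(document, metadata=False, tool=ANY2UTF8):
    """Return the argument list for one any2utf8 invocation (hypothetical helper)."""
    cmd = [tool]
    if metadata:
        # Assumption: -m is passed before the document path.
        cmd.append("-m")
    cmd.append(os.fspath(document))
    return cmd


def extract_text(document, metadata=False, tool=ANY2UTF8):
    """Run any2utf8 and return whatever it wrote to standard output."""
    result = subprocess.run(
        build_command(document, metadata, tool),
        stdout=subprocess.PIPE,
        check=True,  # raises CalledProcessError if the tool exits with a nonzero status
    )
    # The tool is described as emitting UTF-8; undecodable bytes are replaced, not fatal.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("report.doc") would then return the document text as a Python string, and extract_text("report.doc", metadata=True) would return the output of the -m run. The file name report.doc is only a placeholder.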