libany2uni is a library that extracts raw Unicode text from any kind of written document (office documents).
It should be useful to developers of search engines, text-processing tools, corpus-analysis software, and similar applications.
The 'utils' directory contains a tool that uses libany2uni, called 'any2utf8'. It reads a document and writes its text, encoded as UTF-8, to standard output.
To compile it, just type 'make'.
Run it with './any2utf8 <path to the document>'.
You can also get metadata with the -m option.
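The invocation described above can be wrapped for use from another program. The following is only a minimal sketch in Python, not part of libany2uni: the helper names build_command and extract_text are hypothetical, and it assumes that the -m flag goes before the document path and that any2utf8 exits with a non-zero status on failure, neither of which is stated in the description.

```python
import subprocess
from typing import List


def build_command(document: str, metadata: bool = False,
                  binary: str = "./any2utf8") -> List[str]:
    """Build the argument list for any2utf8 (hypothetical helper).

    Assumption: the -m flag is placed before the document path; the
    project description does not specify the argument order.
    """
    cmd = [binary]
    if metadata:
        cmd.append("-m")
    cmd.append(document)
    return cmd


def extract_text(document: str, metadata: bool = False,
                 binary: str = "./any2utf8") -> str:
    """Run any2utf8 and return its standard output decoded as UTF-8."""
    result = subprocess.run(
        build_command(document, metadata, binary),
        stdout=subprocess.PIPE,
        # Assumption: the tool signals failure with a non-zero exit
        # status, in which case CalledProcessError is raised.
        check=True,
    )
    # The tool is documented to emit UTF-8; undecodable bytes are
    # replaced rather than raising an exception.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("report.doc") would then return the extracted text as a string, and extract_text("report.doc", metadata=True) would request the metadata instead; "report.doc" is a placeholder file name.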
Download libany2uni 1.0.3