Sherlock Holmes 3.9 review
Sherlock Holmes is a universal search engine: a system for gathering and indexing textual data (text files, web pages, etc.), both locally and over the network.
Here are some key features of "Sherlock Holmes":
Gathers documents via HTTP or from the local file system.
Parses text files, HTML, PDF, and, using external parsers, several other formats (such as MS Word and PostScript).
The whole system is modular, so adding your own data sources or parsers is just a matter of plugging in the right module (well, usually also writing it).
Works well in mixed-charset environments.
Considers multiple occurrences of the same file (even with minor changes) a single document with multiple URLs.
Everything is highly configurable. You can write filtering rules in a special language that lets you tweak configuration variables depending on the document being processed.
Searches for words, phrases, and boolean expressions, including in file names and link texts.
Proximity search and proximity weighting of regular searches.
Recognition of languages, easy integration of stemmers and synonym dictionaries.
Spelling checker based on word frequencies observed in the indexed data, which hints to users that their query might be misspelled.
Search results include context in each document.
Scales well to tens of millions of documents on normal PC hardware.
User interface (the front-end) is completely separated from the rest of the system, making it easy to modify and also to embed the search engine in existing applications.
Downloaded files and indices are compressed to save space.
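The frequency-based spelling hint listed above can be illustrated with a small sketch. This is not Sherlock's actual implementation, which the feature list does not describe. It is a generic approach (candidates within one edit of the query word, ranked by how often they occur in the indexed text), and the names build_frequencies, edits1, suggest, and the ratio threshold are all hypothetical.

```python
import re
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def build_frequencies(documents):
    """Count how often each lowercase word occurs in the indexed texts."""
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def edits1(word):
    """Return all strings one deletion, swap, replacement, or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    swaps = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def suggest(word, counts, ratio=10):
    """Return a much more frequent near-miss for word, or None.

    ratio is an assumed tuning knob: a hint is offered only when the
    candidate is at least `ratio` times more common than the query word.
    """
    word = word.lower()
    candidates = [w for w in edits1(word) if w in counts and w != word]
    if not candidates:
        return None
    # Prefer the most frequent candidate; break ties alphabetically.
    best = max(candidates, key=lambda w: (counts[w], w))
    if counts[best] >= ratio * max(counts[word], 1):
        return best
    return None


if __name__ == "__main__":
    corpus = ["the search engine indexes documents"] * 20 + ["serch"]
    freq = build_frequencies(corpus)
    print(suggest("serch", freq))   # a rare word close to a common one
    print(suggest("search", freq))  # a common word needs no hint
```

Deriving the dictionary from the indexed data itself, as the feature describes, means the hints follow the vocabulary of the collection rather than a fixed word list, so domain-specific terms that occur often are not flagged.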
What's New in This Release:
Sherlock now contains a new library for analyzing the contents of the documents.
An existing index can now be quickly patched with new cards.
The search server dumps the context of long cards better, and it can serve as a simple database by allowing browsing of all cards.
A faster utility, "shcp", was added for copying the index to different machines.
The configuration mechanism has been improved.
Sherlock now supports the AMD64 architecture.
Most modules have been substantially optimized, cleaned up, and corrected.