Apache Nutch 0.8.1 review
Nutch is Web search software that builds on Lucene Java, adding Web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats.
Interesting files include:
docs/api/index.html
Javadocs for the Nutch software.
CHANGES.txt
Log of changes to Nutch.
For the latest information about Nutch, please visit our website at:
http://lucene.apache.org/nutch/
and our wiki, at:
http://wiki.apache.org/nutch/
To get started using Nutch, read the tutorial:
http://lucene.apache.org/nutch/tutorial.html
What's New in This Release:
A thread blocking issue that negatively impacted crawling performance has been fixed.
Bugs in scoring have been fixed.
Problems with updatedb on Windows/Cygwin have been fixed.
A bug in the generator where the lowest scoring pages were selected instead of highest scoring pages has been fixed.
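To make the generator fix concrete, here is a minimal sketch in Python of what selecting the highest-scoring pages looks like compared with the inverted behavior. This is not Nutch's code (Nutch is written in Java), and the release notes do not say how the bug arose. The function names `select_top_n` and `select_bottom_n`, the example URLs, and the scores are all hypothetical and serve only to illustrate the symptom described.

```python
import heapq

def select_top_n(scored_urls, n):
    """Return the n highest-scoring (url, score) pairs, best first."""
    return heapq.nlargest(n, scored_urls, key=lambda pair: pair[1])

def select_bottom_n(scored_urls, n):
    """Inverted ordering: return the n lowest-scoring pairs, worst first.

    Hypothetical illustration of the symptom in the release note,
    not a reconstruction of the actual defect.
    """
    return heapq.nsmallest(n, scored_urls, key=lambda pair: pair[1])

# Hypothetical candidate pages with made-up scores.
candidates = [
    ("http://example.com/a", 0.2),
    ("http://example.com/b", 1.5),
    ("http://example.com/c", 0.9),
    ("http://example.com/d", 0.1),
]

top = select_top_n(candidates, 2)        # intended behavior
bottom = select_bottom_n(candidates, 2)  # behavior described as the bug

print([url for url, _ in top])
print([url for url, _ in bottom])
```

With a fixed fetch budget, the first selection spends it on pages `b` and `c`, while the inverted one spends it on `d` and `a`, which is why an ordering error of this kind degrades crawl quality.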