Apache Nutch 0.8.1 review

by rbytes.net

License: Apache License 2.0
File size: 67755K
Developer: Sami Siren
0 stars award from rbytes.net

The Nutch project is web search software that builds on Lucene Java, adding web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats.

Interesting files include:


docs/api/index.html
Javadocs for the Nutch software.

CHANGES.txt
Log of changes to Nutch.


For the latest information about Nutch, please visit our website at:

http://lucene.apache.org/nutch/

and our wiki, at:

http://wiki.apache.org/nutch/

To get started using Nutch, read the tutorial:

http://lucene.apache.org/nutch/tutorial.html

What's New in This Release:
A thread blocking issue that negatively impacted crawling performance has been fixed.
Bugs in scoring have been fixed.
Problems with updatedb on Windows/Cygwin have been fixed.
A bug in the generator that caused the lowest-scoring pages to be selected instead of the highest-scoring pages has been fixed.
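
The generator fix is easier to picture with a small example. The following is a minimal sketch of top-N selection by score, not Nutch's actual generator code: the class name TopNSelectionSketch, the method selectTopN, and the example URLs are hypothetical, and the real generator works on the crawl database rather than an in-memory map. It only illustrates how the direction of a single comparison decides whether the best or the worst pages are picked.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not code from Nutch.
final class TopNSelectionSketch {

    // Returns up to topN URLs, ordered from highest score to lowest.
    static List<String> selectTopN(Map<String, Float> scores, int topN) {
        List<Map.Entry<String, Float>> entries = new ArrayList<>(scores.entrySet());
        // Descending order by score. Swapping a and b here would sort
        // ascending and select the lowest-scoring pages instead.
        entries.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));

        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Float> entry : entries) {
            if (selected.size() >= topN) {
                break;
            }
            selected.add(entry.getKey());
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Float> scores = new LinkedHashMap<>();
        scores.put("http://example.com/a", 0.2f);
        scores.put("http://example.com/b", 0.9f);
        scores.put("http://example.com/c", 0.5f);
        // Prints the two highest-scoring URLs.
        System.out.println(selectTopN(scores, 2));
    }
}
```

Sorting the whole list keeps the sketch short; for a large crawl, a bounded priority queue would avoid holding every entry in sorted order.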
