Natural Language Toolkit 0.7b1 review

by rbytes.net

Natural Language Toolkit is a suite of Python libraries and programs for symbolic and statistical natural language processing

License: GPL (GNU General Public License)
Developer: Steven Bird

The Natural Language Toolkit (NLTK) is a suite of Python libraries and programs for symbolic and statistical natural language processing. It includes graphical demonstrations and sample data.

It is accompanied by extensive documentation, including tutorials that explain the underlying concepts behind the language processing tasks supported by the toolkit.

Documentation:

Extensive documentation on how to use NLTK is available from the NLTK home page:

< http://nltk.sourceforge.net >

In particular, the NLTK home page contains three types of documentation:

- Tutorials teach students how to use the toolkit, in the context of performing specific tasks. They are appropriate for anyone who wishes to learn how to use the toolkit.
< http://nltk.sourceforge.net/tutorial/ >

- The toolkit's reference documentation describes every module, interface, class, method, function, and variable in the toolkit. This documentation should be useful to both users and developers.
< http://nltk.sourceforge.net/ref/nltk.html >

- A number of technical reports are available. These reports explain and justify the toolkit's design and implementation. They are used by the developers of the toolkit to guide and document the toolkit's construction. Students can consult these reports if they would like further information about how the toolkit is designed and why it is designed that way.
< http://nltk.sourceforge.net/tech/ >

What's New in This Release:

Code:
- expanded the semantic interpretation package
- added a new high-level chunking interface, with cascaded chunking
- split the chunking code into a new chunk package
- updated the wordnet package to support WordNet 2.1
- prototyped basic WordNet similarity measures (path distance, Wu-Palmer, and Leacock-Chodorow)
- bug fixes (tag.Window, tag.ngram)
- more doctests

Contrib:
- toolbox language settings module

Tutorials:
- rewrote the chunking chapter: switched the main focus from Treebank to CoNLL format, simplified the evaluation framework, and added an n-gram chunking section
- substantial updates throughout (especially the programming and semantics chapters)

Corpora:
- Chat-80 Prolog data files provided as corpora, plus a corpus reader
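
For readers unfamiliar with the three similarity measures named in the release notes, the following is a minimal sketch of their textbook definitions, computed over a tiny hand-made taxonomy. It does not use NLTK and is not taken from the 0.7b1 wordnet package. The taxonomy, the function names, and the choice to count depth in nodes (root = 1) are all assumptions made for illustration, and the toolkit's own prototypes may differ in detail.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); this is NOT WordNet data.
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
}


def ancestors(node):
    """Return the chain from node up to the root, starting with node itself."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (an assumption)."""
    return len(ancestors(node))


def lowest_common_subsumer(a, b):
    """Return the deepest node that is an ancestor of both a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("no common ancestor")


def path_length(a, b):
    """Number of edges on the path from a to b through their common subsumer."""
    lcs_depth = depth(lowest_common_subsumer(a, b))
    return (depth(a) - lcs_depth) + (depth(b) - lcs_depth)


def path_similarity(a, b):
    """Path distance turned into a score in (0, 1]: 1 / (1 + edges)."""
    return 1.0 / (1 + path_length(a, b))


def wu_palmer(a, b):
    """Wu-Palmer: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


def leacock_chodorow(a, b, max_depth):
    """Leacock-Chodorow: -log(p / (2 * D)), with p counted in nodes."""
    return -math.log((path_length(a, b) + 1) / (2.0 * max_depth))


# Maximum depth of the toy taxonomy, needed by Leacock-Chodorow.
MAX_DEPTH = max(depth(n) for n in set(PARENT) | set(PARENT.values()))

if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "bird"), ("dog", "plant")]:
        print(pair,
              round(path_similarity(*pair), 3),
              round(wu_palmer(*pair), 3),
              round(leacock_chodorow(*pair, MAX_DEPTH), 3))
```

All three measures score "dog" and "cat" as more similar than "dog" and "plant", because the first pair shares a deeper common ancestor and is joined by a shorter path.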
