Computational Linguistics Toolset 1.1.4 review

by rbytes.net on

The Computational Linguistics Toolset is a set of tools for computational linguistics.

License: GPL (GNU General Public License)
File size: 116K
Developer: Wybo Wiersma
0 stars award from rbytes.net

The Computational Linguistics Toolset is a set of tools for computational linguistics. The project contains reusable code for cleaning, splitting, refining, and sampling corpora (ICE, Penn, and a native one), for tagging them with the TnT tagger, and for running permutation statistics on N-grams (useful for finding statistically significant syntactic differences between any two sets of tagged texts), along with various examination tools. The tools themselves are well documented.
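To make the idea of permutation statistics on N-grams concrete, here is a minimal sketch in Python. It is not the toolset's own code and does not reflect how NgramPermutator works internally; the function names (`ngrams`, `relative_freq`, `permutation_p_value`) and the choice of test statistic (absolute difference in relative frequency of one tag N-gram) are assumptions made purely for illustration. Each text is assumed to be a list of part-of-speech tags.

```python
import random
from collections import Counter


def ngrams(tags, n):
    """Return all contiguous n-grams of a tag sequence as tuples."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def relative_freq(texts, target, n):
    """Relative frequency of `target` among all n-grams in a group of texts."""
    counts = Counter()
    for tags in texts:
        counts.update(ngrams(tags, n))
    total = sum(counts.values())
    return counts[target] / total if total else 0.0


def permutation_p_value(group_a, group_b, target, rounds=1000, seed=0):
    """Estimate how often a random regrouping of the texts yields a
    frequency difference at least as large as the observed one.

    Hypothetical helper for illustration only, not part of the toolset.
    """
    n = len(target)
    observed = abs(relative_freq(group_a, target, n)
                   - relative_freq(group_b, target, n))

    pooled = list(group_a) + list(group_b)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(rounds):
        # Shuffle whole texts between the two groups, keeping group sizes.
        rng.shuffle(pooled)
        perm_a = pooled[:len(group_a)]
        perm_b = pooled[len(group_a):]
        diff = abs(relative_freq(perm_a, target, n)
                   - relative_freq(perm_b, target, n))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)
```

Calling `permutation_p_value(texts_a, texts_b, ("DT", "NN"))` on two lists of tagged texts returns an estimated p-value; a small value suggests the N-gram is used at genuinely different rates in the two sets. Shuffling whole texts rather than individual N-grams is a design choice in this sketch, made so that dependence between N-grams within a text stays intact.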

The tools are free (licensed under the GNU General Public License). You get the entire tool package (containing the newest version of all fiauimenre tools and the library) in one download. Each tool is documented, and a general readme file is included. You can always e-mail me if something is still unclear (or if you find a bug).

A large number of these tools were used for the Finnish Australian Immigrants Research. The Goall scripts are still configured for use in that research.

The core sensing tools for disambiguation using the WordNet Similarity tools by Ted Pedersen have been completed. They were built for speed (about 10 times faster). Some of the optimizations I made for this have also been included in the WordNet Similarity package (v 0.16).

What's New in This Release:
Compression was made the default for NgramPermutator and the PermutationStatter, and it was removed as an option.
A bug was fixed in the compression of NgramPermutator that prevented the creation of data since version 1.1.2.
