package documentation
The Natural Language Toolkit (NLTK) is an open source Python library for Natural Language Processing. A free online book is available. (If you use the library for academic research, please cite the book.)
Steven Bird, Ewan Klein, and Edward Loper (2009). Natural Language Processing with Python. O'Reilly Media Inc. http://nltk.org/book
Package | app |
Interactive NLTK Applications: |
Module | book |
Undocumented |
Package | ccg |
Combinatory Categorial Grammar. |
Package | chat |
A class for simple chatbots. These perform simple pattern matching on sentences typed by users, and respond with automatically generated sentences. |
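The pattern-matching idea behind this package can be sketched in a few lines. This is a toy illustration only, not the NLTK API (the real `nltk.chat.util.Chat` class adds features such as pronoun reflections); the patterns and responses here are invented for the example.

```python
import re

# Ordered (pattern, response-template) pairs; the first match wins.
PAIRS = [
    (r"my name is (.*)", "Hello {0}, how are you?"),
    (r"(hi|hello)", "Hi there!"),
    (r"(.*)", "Tell me more."),
]

def respond(sentence: str) -> str:
    """Return the response template for the first pattern that matches."""
    for pattern, template in PAIRS:
        m = re.match(pattern, sentence.lower())
        if m:
            # Fill the template with the captured groups, if any.
            return template.format(*m.groups())
    return "I don't understand."
```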
Package | chunk |
Classes and interfaces for identifying non-overlapping linguistic groups (such as base noun phrases) in unrestricted text. This task is called "chunk parsing" or "chunking", and the identified groups are called "chunks"... |
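As a concept sketch (not the NLTK chunker API, which centers on `RegexpParser` over tag patterns), chunking can be illustrated by grouping consecutive noun-phrase-like POS tags into non-overlapping spans; the tag set and sentence below are invented for the example.

```python
def np_chunks(tagged):
    """Group maximal runs of determiner/adjective/noun tags into chunks."""
    np_tags = {"DT", "JJ", "NN", "NNS"}
    chunks, i = [], 0
    while i < len(tagged):
        if tagged[i][1] in np_tags:
            j = i
            # Extend the chunk while the tags stay noun-phrase-like.
            while j < len(tagged) and tagged[j][1] in np_tags:
                j += 1
            chunks.append(tagged[i:j])
            i = j
        else:
            i += 1
    return chunks

sent = [("the", "DT"), ("little", "JJ"), ("dog", "NN"),
        ("barked", "VBD"), ("at", "IN"), ("cats", "NNS")]
```

Running `np_chunks(sent)` yields two non-overlapping chunks, [the little dog] and [cats], leaving the verb and preposition outside any chunk.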
Package | classify |
Classes and interfaces for labeling tokens with category labels (or "class labels"). Typically, labels are represented with strings (such as 'health' or 'sports'). Classifiers can be used to perform a wide range of classification tasks... |
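A minimal sketch of the train/classify workflow with string labels such as 'health' or 'sports'. The classifier below is a deliberately naive word-overlap scorer invented for illustration; NLTK's real classifiers (e.g. `NaiveBayesClassifier`) use proper probabilistic models.

```python
from collections import Counter, defaultdict

class WordOverlapClassifier:
    """Toy classifier: score each label by shared tokens with training data."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies

    def train(self, labeled_docs):
        for words, label in labeled_docs:
            self.word_counts[label].update(words)

    def classify(self, words):
        # Sum, per label, how often the input words were seen under that label.
        scores = {label: sum(counts[w] for w in words)
                  for label, counts in self.word_counts.items()}
        return max(scores, key=scores.get)
```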
Module | cli |
No module docstring; 0/1 constant, 1/2 function documented |
Package | cluster |
This module contains a number of basic clustering algorithms. Clustering describes the task of discovering groups of similar items within a large collection. It is also described as unsupervised machine learning, as the data from which it learns is unannotated with class information, unlike the case for supervised learning... |
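The unsupervised grouping the description refers to can be sketched with k-means in one dimension; this is an illustrative toy, not the package's API (which provides real implementations such as `KMeansClusterer`).

```python
def kmeans_1d(points, centers, iters=10):
    """Toy 1-D k-means: assign points to nearest center, then re-average."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest current center.
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[idx].append(p)
        # Move each center to the mean of its assigned points.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers
```

For example, `kmeans_1d([1, 2, 3, 10, 11, 12], [0.0, 5.0])` converges to centers 2.0 and 11.0, one per natural group.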
Module | collections |
No module docstring; 8/9 classes documented |
Module | collocations |
Tools to identify collocations --- words that often appear consecutively --- within corpora. They may also be used to find other associations between word occurrences. See Manning and Schutze ch. 5 at ... |
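The simplest form of the idea, counting which word pairs appear consecutively most often, can be sketched as below; NLTK's collocation tools go further, scoring pairs with association measures such as PMI rather than raw frequency.

```python
from collections import Counter

def top_bigrams(tokens, n=3):
    """Return the n most frequent consecutive word pairs."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return [bigram for bigram, _ in pairs.most_common(n)]
```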
Module | compat |
Undocumented |
Package | corpus |
NLTK corpus readers. The modules in this package provide functions that can be used to read corpus files in a variety of formats. These functions can be used to read both the corpus files that are distributed in the NLTK corpus package, and corpus files that are part of external corpora. |
Module | data |
Functions to find and load NLTK resource files, such as corpora, grammars, and saved processing objects. Resource files are identified using URLs, such as nltk:corpora/abc/rural.txt or http://nltk.org/sample/toy.cfg... |
Module | decorators |
Decorator module by Michele Simionato <michelesimionato@libero.it> Copyright Michele Simionato, distributed under the terms of the BSD License (see below). http://www.phyast.pitt.edu/~micheles/python/documentation.html... |
Module | downloader |
The NLTK corpus and module downloader. This module defines several interfaces which can be used to download corpora, models, and other data packages that can be used with NLTK. |
Package | draw |
No package docstring; 5/5 modules documented |
Module | featstruct |
Basic data classes for representing feature structures, and for performing basic operations on those feature structures. A feature structure is a mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure... |
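The basic operation on such structures, unification, can be sketched over plain nested dicts; this is only a concept illustration (the real `FeatStruct` machinery additionally handles variables, reentrancy, and more).

```python
def unify(a, b):
    """Merge two feature structures; return None on a value clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, val in b.items():
            if key in out:
                # Recurse into shared features; a clash anywhere fails the whole unification.
                merged = unify(out[key], val)
                if merged is None:
                    return None
                out[key] = merged
            else:
                out[key] = val
        return out
    # Basic values unify only if they are equal.
    return a if a == b else None
```

For example, unifying `{"agr": {"num": "sg"}}` with `{"agr": {"per": 3}}` merges the nested structures, while `{"num": "sg"}` and `{"num": "pl"}` clash and yield None.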
Module | grammar |
Basic data classes for representing context free grammars. A "grammar" specifies which trees can represent the structure of a given text. Each of these trees is called a "parse tree" for the text (or simply a "parse")... |
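A toy recognizer makes the "which trees can represent the structure of a given text" idea concrete. The grammar and symbols below are invented for illustration; the package's own `CFG` class and the parsers in `nltk.parse` are the real interfaces.

```python
# A tiny CFG: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["dog"], ["cat"]],
    "VP": [["barks"], ["sleeps"]],
}

def derives(symbol, tokens):
    """Return True if `symbol` can derive exactly this token sequence."""
    if symbol not in GRAMMAR:              # terminal symbol
        return tokens == [symbol]
    return any(match_seq(rhs, tokens) for rhs in GRAMMAR[symbol])

def match_seq(symbols, tokens):
    """Try every split of `tokens` across the sequence of `symbols`."""
    if not symbols:
        return not tokens
    head, rest = symbols[0], symbols[1:]
    return any(derives(head, tokens[:i]) and match_seq(rest, tokens[i:])
               for i in range(len(tokens) + 1))
```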
Module | help |
Provide structured access to documentation. |
Package | inference |
Classes and interfaces for theorem proving and model building. |
Module | internals |
No module docstring; 0/4 variable, 0/3 constant, 15/22 functions, 1/1 exception, 3/3 classes documented |
Module | jsontags |
Register JSON tags, so the nltk data loader knows what module and class to look for. |
Module | lazyimport |
Helper to enable simple lazy module import. |
Package | lm |
Currently this module covers only ngram language models, but it should be easy to extend to neural models. |
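A maximum-likelihood bigram model, the simplest ngram case, can be sketched as below. This ignores padding, smoothing, and vocabulary handling, all of which the package's real classes (e.g. its MLE model) take care of.

```python
from collections import Counter

def bigram_prob(tokens):
    """Return p(word | given) estimated by raw bigram/unigram counts."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    def p(word, given):
        # MLE estimate: count(given, word) / count(given).
        return bigrams[(given, word)] / unigrams[given] if unigrams[given] else 0.0
    return p
```

On the toy sequence `a b a b a`, the estimate p(b | a) is 2/3 since "a" occurs three times and is followed by "b" twice.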
Package | metrics |
NLTK Metrics |
Package | misc |
No package docstring; 3/5 modules documented |
Package | parse |
NLTK Parsers |
Module | probability |
Classes for representing and processing probabilistic information. |
Package | sem |
NLTK Semantic Interpretation Package |
Package | sentiment |
NLTK Sentiment Analysis Package |
Package | stem |
NLTK Stemmers |
Package | tag |
NLTK Taggers |
Package | tbl |
Transformation Based Learning |
Package | test |
Unit tests for the NLTK modules. These tests are intended to ensure that source code changes don't accidentally introduce bugs. For instructions, please see: |
Module | text |
This module brings together a variety of NLTK functionality for text analysis, and provides simple, interactive interfaces. Functionality includes: concordancing, collocation discovery, regular expression search over tokenized strings, and distributional similarity. |
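Concordancing, the first feature listed, can be sketched as printing each occurrence of a word with a window of surrounding tokens; the real `Text.concordance` method adds display options such as line width and count.

```python
def concordance(tokens, word, window=2):
    """Return one context line per occurrence of `word` in `tokens`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == word:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            lines.append(" ".join(left + [tok] + right))
    return lines
```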
Module | tgrep |
This module supports TGrep2 syntax for matching parts of NLTK Trees. Note that many tgrep operators require the tree being searched to be a ParentedTree. |
Package | tokenize |
NLTK Tokenizer Package |
Module | toolbox |
Module for reading, writing and manipulating Toolbox databases and settings files. |
Package | translate |
Experimental features for machine translation. These interfaces are prone to change. |
Module | tree |
Class for representing hierarchical language structures, such as syntax trees and morphological trees. |
Module | treeprettyprinter |
Pretty-printing of discontinuous trees. Adapted from the disco-dop project, by Andreas van Cranenburgh. https://github.com/andreasvc/disco-dop |
Module | treetransforms |
A collection of methods for tree (grammar) transformations used in parsing natural language. |
Package | twitter |
NLTK Twitter Package |
Module | util |
No module docstring; 26/34 functions, 0/1 class documented |
Module | wsd |
No module docstring; 1/1 function documented |
From __init__.py:
Function | demo |
Undocumented |
Variable | __classifiers__ |
Undocumented |
Variable | __copyright__ |
Undocumented |
Variable | __keywords__ |
Undocumented |
Variable | __license__ |
Undocumented |
Variable | __longdescr__ |
Undocumented |
Variable | __maintainer__ |
Undocumented |
Variable | __maintainer |
Undocumented |
Variable | __url__ |
Undocumented |
Variable | __version__ |
Undocumented |
Variable | version |
Undocumented |
Function | _fake_ |
Undocumented |