nltk
- The Natural Language Toolkit (NLTK) is an open source Python library for Natural Language Processing. A free online book is available. (If you use the library for academic research, please cite the book...
app
- Interactive NLTK Applications:
chartparser_app
- A graphical tool for exploring chart parsing.
chunkparser_app
- A graphical tool for exploring the regular expression based chunk parser nltk.chunk.RegexpChunkParser.
collocations_app
- Undocumented
concordance_app
- Undocumented
nemo_app
- Finding (and Replacing) Nemo
rdparser_app
- A graphical tool for exploring the recursive descent parser.
srparser_app
- A graphical tool for exploring the shift-reduce parser.
wordfreq_app
- Undocumented
wordnet_app
- A WordNet Browser application which launches the default browser (if it is not already running) and opens a new tab with a connection to http://localhost:port/ . It also starts an HTTP server on the specified port and begins serving browser requests...
book
- Undocumented
ccg
- Combinatory Categorial Grammar.
api
- No module docstring; 5/5 classes documented
chart
- The lexicon is constructed by calling lexicon.fromstring(<lexicon string>).
combinator
- CCG Combinators
lexicon
- CCG Lexicons
logic
- Helper functions for CCG semantics computation
chat
- A class for simple chatbots. These perform simple pattern matching on sentences typed by users, and respond with automatically generated sentences.
eliza
- Undocumented
iesha
- This chatbot is a tongue-in-cheek take on the average teen anime junky that frequents YahooMessenger or MSNM. All spelling mistakes and flawed grammar are intentional.
rude
- Undocumented
suntsu
- Tsu bot responds to all queries with a Sun Tsu sayings
util
- Undocumented
zen
- Zen Chatbot talks in gems of Zen wisdom.
chunk
- Classes and interfaces for identifying non-overlapping linguistic groups (such as base noun phrases) in unrestricted text. This task is called "chunk parsing" or "chunking", and the identified groups are called "chunks"...
api
- No module docstring; 1/1 class documented
named_entity
- Named entity chunker
regexp
- No module docstring; 0/1 constant, 3/3 functions, 12/12 classes documented
util
- No module docstring; 0/3 constant, 7/10 functions, 1/1 class documented
classify
- Classes and interfaces for labeling tokens with category labels (or "class labels"). Typically, labels are represented with strings (such as 'health' or 'sports'). Classifiers can be used to perform a wide range of classification tasks...
api
- Interfaces for labeling tokens with category labels (or "class labels").
decisiontree
- A classifier model that decides which label to assign to a token on the basis of a tree structure, where branches correspond to conditions on feature values, and leaves correspond to label assignments.
maxent
- A classifier model based on maximum entropy modeling framework. This framework considers all of the probability distributions that are empirically consistent with the training data; and chooses the distribution with the highest entropy...
megam
- A set of functions used to interface with the external megam maxent optimization package. Before megam can be used, you should tell NLTK where it can find the megam binary, using the config_megam() function...
naivebayes
- A classifier based on the Naive Bayes algorithm. In order to find the probability for a label, this algorithm first uses the Bayes rule to express P(label|features) in terms of P(label) and P(features|label):...
positivenaivebayes
- A variant of the Naive Bayes Classifier that performs binary classification with partially-labeled training sets. In other words, assume we want to build a classifier that assigns each example to one of two complementary classes (e...
rte_classify
- Simple classifier for RTE corpus.
scikitlearn
- scikit-learn (http://scikit-learn.org) is a machine learning library for Python. It supports many classification algorithms, including SVMs, Naive Bayes, logistic regression (MaxEnt) and decision trees.
senna
- A general interface to the SENNA pipeline that supports any of the operations specified in SUPPORTED_OPERATIONS.
svm
- nltk.classify.svm was deprecated. For classification based on support vector machines (SVMs), use nltk.classify.scikitlearn (or scikit-learn directly).
tadm
- No module docstring; 0/1 variable, 3/6 functions documented
textcat
- A module for language identification using the TextCat algorithm. An implementation of the text categorization algorithm presented in Cavnar, W. B. and J. M. Trenkle, "N-Gram-Based Text Categorization".
util
- Utility functions and classes for classifiers.
weka
- Classifiers that make use of the external 'Weka' package.
cli
- No module docstring; 0/1 constant, 1/2 function documented
cluster
- This module contains a number of basic clustering algorithms. Clustering describes the task of discovering groups of similar items within a large collection. It is also described as unsupervised machine learning, as the data from which it learns is unannotated with class information, as is the case for supervised learning...
collections
- No module docstring; 8/9 classes documented
collocations
- Tools to identify collocations --- words that often appear consecutively --- within corpora. They may also be used to find other associations between word occurrences. See Manning and Schutze ch. 5 at ...
compat
- Undocumented
corpus
- NLTK corpus readers. The modules in this package provide functions that can be used to read corpus files in a variety of formats. These functions can be used to read both the corpus files that are distributed in the NLTK corpus package, and corpus files that are part of external corpora.
europarl_raw
- Undocumented
reader
- NLTK corpus readers. The modules in this package provide functions that can be used to read corpus fileids in a variety of formats. These functions can be used to read both the corpus fileids that are distributed in the NLTK corpus package, and corpus fileids that are part of external corpora.
aligned
- No module docstring; 1/1 class documented
api
- API for corpus readers.
bnc
- Corpus reader for the XML version of the British National Corpus.
bracket_parse
- Corpus reader for corpora that consist of parenthesis-delineated parse trees.
categorized_sents
- CorpusReader structured for corpora that contain one instance on each row. This CorpusReader is specifically used for the Subjectivity Dataset and the Sentence Polarity Dataset.
chasen
- No module docstring; 0/2 function, 1/1 class documented
childes
- Corpus reader for the XML version of the CHILDES corpus.
chunked
- A reader for corpora that contain chunked (and optionally tagged) documents.
cmudict
- The Carnegie Mellon Pronouncing Dictionary [cmudict.0.6] ftp://ftp.cs.cmu.edu/project/speech/dict/ Copyright 1998 Carnegie Mellon University
comparative_sents
- CorpusReader for the Comparative Sentence Dataset.
conll
- Read CoNLL-style chunk fileids.
crubadan
- An NLTK interface for the n-gram statistics gathered from the corpora for each language using An Crubadan.
dependency
- Undocumented
framenet
- Corpus reader for the FrameNet 1.7 lexicon and corpus.
ieer
- Corpus reader for the Information Extraction and Entity Recognition Corpus.
indian
- Indian Language POS-Tagged Corpus Collected by A Kumaran, Microsoft Research, India Distributed with permission
ipipan
- Undocumented
knbc
- Undocumented
lin
- Undocumented
mte
- A reader for corpora whose documents are in MTE format.
nkjp
- No module docstring; 1/1 function, 4/5 classes documented
nombank
- No module docstring; 2/5 classes documented
nps_chat
- Undocumented
opinion_lexicon
- CorpusReader for the Opinion Lexicon.
panlex_lite
- CorpusReader for PanLex Lite, a stripped down version of PanLex distributed as an SQLite database. See the README.txt in the panlex_lite corpus directory for more information on PanLex Lite.
panlex_swadesh
- Undocumented
pl196x
- Undocumented
plaintext
- A reader for corpora that consist of plaintext documents.
ppattach
- Read lines from the Prepositional Phrase Attachment Corpus.
propbank
- No module docstring; 2/6 classes documented
pros_cons
- CorpusReader for the Pros and Cons dataset.
reviews
- CorpusReader for reviews corpora (syntax based on Customer Review Corpus).
rte
- Corpus reader for the Recognizing Textual Entailment (RTE) Challenge Corpora.
semcor
- Corpus reader for the SemCor Corpus.
senseval
- Read from the Senseval 2 Corpus.
sentiwordnet
- An NLTK interface for SentiWordNet
sinica_treebank
- Sinica Treebank Corpus Sample
string_category
- Read tuples from a corpus consisting of categorized strings. For example, from the question classification corpus:
switchboard
- No module docstring; 1/1 class documented
tagged
- A reader for corpora whose documents contain part-of-speech-tagged words.
timit
- Read tokens, phonemes and audio data from the NLTK TIMIT Corpus.
toolbox
- Module for reading, writing and manipulating Toolbox databases and settings fileids.
twitter
- A reader for corpora that consist of Tweets. It is assumed that the Tweets have been serialised into line-delimited JSON.
udhr
- UDHR corpus reader. It mostly deals with encodings.
util
- No module docstring; 4/11 functions, 3/3 classes documented
verbnet
- An NLTK interface to the VerbNet verb lexicon
wordlist
- Undocumented
wordnet
- An NLTK interface for WordNet
xmldocs
- Corpus reader for corpora whose documents are xml files.
ycoe
- Corpus reader for the York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE), a 1.5 million word syntactically-annotated corpus of Old English prose texts. The corpus is distributed by the Oxford Text Archive: ...
util
- No module docstring; 0/1 constant, 1/1 function, 1/1 class documented
data
- Functions to find and load NLTK resource files, such as corpora, grammars, and saved processing objects. Resource files are identified using URLs, such as nltk:corpora/abc/rural.txt or http://nltk.org/sample/toy.cfg...
decorators
- Decorator module by Michele Simionato <michelesimionato@libero.it> Copyright Michele Simionato, distributed under the terms of the BSD License (see below). http://www.phyast.pitt.edu/~micheles/python/documentation.html...
downloader
- The NLTK corpus and module downloader. This module defines several interfaces which can be used to download corpora, models, and other data packages that can be used with NLTK.
draw
- No package docstring; 5/5 modules documented
cfg
- Visualization tools for CFGs.
dispersion
- A utility for displaying lexical dispersion.
table
- Tkinter widgets for displaying multi-column listboxes and tables.
tree
- Graphically display a Tree.
util
- Tools for graphically displaying and interacting with the objects and processing classes defined by the Toolkit. These tools are primarily intended to help students visualize the objects that they create.
featstruct
- Basic data classes for representing feature structures, and for performing basic operations on those feature structures. A feature structure is a mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure...
grammar
- Basic data classes for representing context free grammars. A "grammar" specifies which trees can represent the structure of a given text. Each of these trees is called a "parse tree" for the text (or simply a "parse")...
help
- Provide structured access to documentation.
inference
- Classes and interfaces for theorem proving and model building.
api
- Interfaces and base classes for theorem provers and model builders.
discourse
- Module for incrementally developing simple discourses, and checking for semantic ambiguity, consistency and informativeness.
mace
- A model builder that makes use of the external 'Mace4' package.
nonmonotonic
- A module to perform nonmonotonic reasoning. The ideas and demonstrations in this module are based on "Logical Foundations of Artificial Intelligence" by Michael R. Genesereth and Nils J. Nilsson.
prover9
- A theorem prover that makes use of the external 'Prover9' package.
resolution
- Module for a resolution-based First Order theorem prover.
tableau
- Module for a tableau-based First Order theorem prover.
internals
- No module docstring; 0/4 variable, 0/3 constant, 15/22 functions, 1/1 exception, 3/3 classes documented
jsontags
- Register JSON tags, so the nltk data loader knows what module and class to look for.
lazyimport
- Helper to enable simple lazy module import.
lm
- Currently this module covers only ngram language models, but it should be easy to extend to neural models.
api
- Language Model Interface.
counter
- No summary
models
- Language Models
preprocessing
- No module docstring; 1/1 variable, 2/2 functions documented
smoothing
- Smoothing algorithms for language modeling.
util
- Language Model Utilities
vocabulary
- Language Model Vocabulary
metrics
- NLTK Metrics
agreement
- Implementations of inter-annotator agreement coefficients surveyed by Artstein and Poesio (2007), Inter-Coder Agreement for Computational Linguistics.
aline
- ALINE http://webdocs.cs.ualberta.ca/~kondrak/ Copyright 2002 by Grzegorz Kondrak.
association
- Provides scoring functions for a number of association measures through a generic, abstract implementation in NgramAssocMeasures, and n-specific BigramAssocMeasures and TrigramAssocMeasures.
confusionmatrix
- No module docstring; 0/1 function, 1/1 class documented
distance
- Distance Metrics.
paice
- Counts Paice's performance statistics for evaluating stemming algorithms.
scores
- No module docstring; 6/7 functions documented
segmentation
- Text Segmentation Metrics
spearman
- Tools for comparing ranked lists.
misc
- No package docstring; 3/5 modules documented
babelfish
- This module previously provided an interface to Babelfish online translation service; this service is no longer available; this module is kept in NLTK source code in order to provide better error messages for people following the NLTK Book 2...
chomsky
- CHOMSKY is an aid to writing linguistic papers in the style of the great master. It is based on selected phrases taken from actual books and articles written by Noam Chomsky. Upon request, it assembles the phrases in the elegant stylistic patterns that Chomsky is noted for...
minimalset
- No module docstring; 1/1 class documented
sort
- This module provides a variety of list sorting algorithms, to illustrate the many different algorithms (recipes) for solving a problem, and how to analyze algorithms experimentally.
wordfinder
- No module docstring; 1/5 function documented
parse
- NLTK Parsers
api
- No module docstring; 1/1 class documented
bllip
- No module docstring; 1/4 function, 1/1 class documented
chart
- Data classes and parser implementations for "chart parsers", which use dynamic programming to efficiently parse a text. A chart parser derives parse trees for a text by iteratively adding "edges" to a "chart...
corenlp
- No module docstring; 0/1 variable, 0/2 function, 1/1 exception, 3/4 classes documented
dependencygraph
- Tools for reading and writing dependency trees. The input is assumed to be in Malt-TAB format (http://stp.lingfil.uu.se/~nivre/research/MaltXML.html).
earleychart
- Data classes and parser implementations for incremental chart parsers, which use dynamic programming to efficiently parse a text. A "chart parser" derives parse trees for a text by iteratively adding "edges" to a "chart"...
evaluate
- No module docstring; 1/1 class documented
featurechart
- Extension of chart parsing implementation to handle grammars with feature structures as nodes.
generate
- No module docstring; 0/1 variable, 1/4 function documented
malt
- No module docstring; 2/3 functions, 1/1 class documented
nonprojectivedependencyparser
- No module docstring; 0/1 variable, 0/4 function, 4/5 classes documented
pchart
- Classes and interfaces for associating probabilities with tree structures that represent the internal organization of a text. The probabilistic parser module defines BottomUpProbabilisticChartParser.
projectivedependencyparser
- No module docstring; 3/4 functions, 4/4 classes documented
recursivedescent
- No module docstring; 1/1 function, 2/2 classes documented
shiftreduce
- No module docstring; 1/1 function, 2/2 classes documented
stanford
- No module docstring; 0/1 variable, 4/4 classes documented
transitionparser
- No module docstring; 1/1 function, 3/3 classes documented
util
- Utility functions for parsers.
viterbi
- No module docstring; 1/1 function, 1/1 class documented
probability
- Classes for representing and processing probabilistic information.
sem
- NLTK Semantic Interpretation Package
boxer
- An interface to Boxer.
chat80
- Chat-80 was a natural language system which allowed the user to interrogate a Prolog knowledge base in the domain of world geography. It was developed in the early '80s by Warren and Pereira; see http://www.aclweb.org/anthology/J82-3002.pdf...
cooper_storage
- No module docstring; 1/2 function, 1/1 class documented
drt
- No module docstring; 1/5 function, 0/1 exception, 4/20 classes documented
drt_glue_demo
- Undocumented
evaluate
- This module provides data structures for representing first-order models.
glue
- Undocumented
hole
- An implementation of the Hole Semantics model, following Blackburn and Bos, Representation and Inference for Natural Language (CSLI, 2005).
lfg
- Undocumented
linearlogic
- No module docstring; 0/1 variable, 0/1 function, 0/3 exception, 1/9 class documented
logic
- A version of first order predicate logic, built on top of the typed lambda calculus.
relextract
- Code for extracting relational triples from the ieer and conll2002 corpora.
skolemize
- No module docstring; 2/2 functions documented
util
- Utility functions for batch-processing sentences: parsing and extraction of the semantic representation of the root node of the syntax tree, followed by evaluation of the semantic representation in a first-order model.
sentiment
- NLTK Sentiment Analysis Package
sentiment_analyzer
- A SentimentAnalyzer is a tool to implement and facilitate Sentiment Analysis tasks using NLTK features and classifiers, especially for teaching and demonstrative purposes.
util
- Utility methods for Sentiment Analysis.
vader
- If you use the VADER sentiment analysis tools, please cite:
stem
- NLTK Stemmers
api
- No module docstring; 1/1 class documented
arlstem
- ARLSTem Arabic Stemmer The details about the implementation of this algorithm are described in: K. Abainia, S. Ouamour and H. Sayoud, A Novel Robust Arabic Light Stemmer , Journal of Experimental & Theoretical Artificial Intelligence (JETAI'17), Vol...
arlstem2
- ARLSTem2 Arabic Light Stemmer The details about the implementation of this algorithm are described in: K. Abainia and H. Rebbani, Comparing the Effectiveness of the Improved ARLSTem Algorithm with Existing Arabic Light Stemmers, International Conference on Theoretical and Applicative Aspects of Computer Science (ICTAACS'19), Skikda, Algeria, December 15-16, 2019...
cistem
- No module docstring; 1/1 class documented
isri
- ISRI Arabic Stemmer
lancaster
- A word stemmer based on the Lancaster (Paice/Husk) stemming algorithm. Paice, Chris D. "Another Stemmer." ACM SIGIR Forum 24.3 (1990): 56-61.
porter
- Porter Stemmer
regexp
- No module docstring; 1/1 class documented
rslp
- No module docstring; 1/1 class documented
snowball
- Snowball stemmers
util
- No module docstring; 2/2 functions documented
wordnet
- No module docstring; 1/1 class documented
tag
- NLTK Taggers
api
- Interface for tagging each token in a sentence with supplementary information, such as its part of speech.
brill
- No module docstring; 5/5 functions, 3/3 classes documented
brill_trainer
- No module docstring; 1/1 class documented
crf
- A module for POS tagging using CRFSuite
hmm
- Hidden Markov Models (HMMs) are largely used to assign the correct label sequence to sequential data or assess the probability of a given label and data sequence. These models are finite state machines characterised by a number of states, transitions between these states, and output symbols emitted while in each state...
hunpos
- A module for interfacing with the HunPos open-source POS-tagger.
mapping
- Interface for converting POS tags from various treebanks to the universal tagset of Petrov, Das, & McDonald.
perceptron
- No module docstring; 0/1 constant, 0/3 function, 2/2 classes documented
senna
- Senna POS tagger, NER Tagger, Chunk Tagger
sequential
- Classes for tagging sentences sequentially, left to right. The abstract base class SequentialBackoffTagger serves as the base class for all the taggers in this module. Tagging of individual words is performed by the method ...
stanford
- A module for interfacing with the Stanford taggers.
tnt
- Implementation of 'TnT - A Statistical Part of Speech Tagger' by Thorsten Brants
util
- No module docstring; 3/3 functions documented
tbl
- Transformation Based Learning
api
- Undocumented
demo
- No module docstring; 0/2 constant, 13/16 functions documented
erroranalysis
- No module docstring; 1/1 function documented
feature
- No module docstring; 1/1 class documented
rule
- No module docstring; 2/2 classes documented
template
- No module docstring; 2/2 classes documented
test
- Unit tests for the NLTK modules. These tests are intended to ensure that source code changes don't accidentally introduce bugs. For instructions, please see:
all
- Test suite that runs all NLTK tests.
childes_fixt
- Undocumented
classify_fixt
- Undocumented
conftest
- No module docstring; 2/2 functions documented
discourse_fixt
- Undocumented
gensim_fixt
- Undocumented
gluesemantics_malt_fixt
- Undocumented
inference_fixt
- Undocumented
nonmonotonic_fixt
- Undocumented
portuguese_en_fixt
- Undocumented
probability_fixt
- Undocumented
unit
- No package docstring; 14/32 modules, 0/2 package documented
lm
- Undocumented
test_counter
- No module docstring; 1/2 class documented
test_models
- No module docstring; 0/1 function, 6/9 classes documented
test_preprocessing
- Undocumented
test_vocabulary
- No module docstring; 1/1 class documented
test_aline
- Unit tests for nltk.metrics.aline
test_brill
- Tests for Brill tagger.
test_cfd_mutation
- Undocumented
test_cfg2chomsky
- Undocumented
test_chunk
- Undocumented
test_classify
- Unit tests for nltk.classify. See also: nltk/test/classify.doctest
test_collocations
- No module docstring; 0/1 constant, 1/1 function, 0/1 class documented
test_concordance
- No module docstring; 0/1 function, 1/1 class documented
test_corenlp
- Mock test for Stanford CoreNLP wrappers.
test_corpora
- Undocumented
test_corpus_views
- Corpus View Regression Tests
test_data
- Undocumented
test_disagreement
- No module docstring; 1/1 class documented
test_freqdist
- Undocumented
test_hmm
- Undocumented
test_json2csv_corpus
- Regression tests for json2csv() and json2csv_entities() in Twitter package.
test_json_serialization
- Undocumented
test_metrics
- Undocumented
test_naivebayes
- Undocumented
test_nombank
- Unit tests for nltk.corpus.nombank
test_pl196x
- Undocumented
test_pos_tag
- Tests for nltk.pos_tag
test_rte_classify
- Undocumented
test_seekable_unicode_stream_reader
- Undocumented
test_senna
- Unit tests for Senna
test_stem
- Undocumented
test_tag
- Undocumented
test_tgrep
- Unit tests for nltk.tgrep.
test_tokenize
- Unit tests for nltk.tokenize. See also nltk/test/tokenize.doctest
test_twitter_auth
- Tests for static parts of Twitter package
test_util
- Unit tests for nltk.util.
test_wordnet
- Unit tests for nltk.corpus.wordnet See also nltk/test/wordnet.doctest
translate
- No package docstring; 10/11 modules documented
test_bleu
- Tests for BLEU translation evaluation metric
test_gdfa
- Tests GDFA alignments
test_ibm1
- Tests for IBM Model 1 training methods
test_ibm2
- Tests for IBM Model 2 training methods
test_ibm3
- Tests for IBM Model 3 training methods
test_ibm4
- Tests for IBM Model 4 training methods
test_ibm5
- Tests for IBM Model 5 training methods
test_ibm_model
- Tests for common methods of IBM translation models
test_meteor
- Undocumented
test_nist
- Tests for NIST translation evaluation metric
test_stack_decoder
- Tests for stack decoder
text
- This module brings together a variety of NLTK functionality for text analysis, and provides simple, interactive interfaces. Functionality includes: concordancing, collocation discovery, regular expression search over tokenized strings, and distributional similarity.
tgrep
- This module supports TGrep2 syntax for matching parts of NLTK Trees. Note that many tgrep operators require the tree passed to be a ParentedTree.
tokenize
- NLTK Tokenizer Package
api
- Tokenizer Interface
casual
- Twitter-aware tokenizer, designed to be flexible and easy to adapt to new domains and tasks. The basic logic is this:
destructive
- No module docstring; 2/2 classes documented
legality_principle
- The Legality Principle is a language agnostic principle maintaining that syllable onsets and codas (the beginning and ends of syllables not including the vowel) are only legal if they are found as word onsets or codas in the language...
mwe
- Multi-Word Expression Tokenizer
nist
- This is a NLTK port of the tokenizer used in the NIST BLEU evaluation script, https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/mteval-v14.pl#L926 which was also ported into Python in ...
punkt
- Punkt Sentence Tokenizer
regexp
- Regular-Expression Tokenizers
repp
- No module docstring; 1/1 class documented
sexpr
- S-Expression Tokenizer
simple
- Simple Tokenizers
sonority_sequencing
- The Sonority Sequencing Principle (SSP) is a language agnostic algorithm proposed by Otto Jespersen in 1904. The sonorous quality of a phoneme is judged by the openness of the lips. Syllable breaks occur before troughs in sonority...
stanford
- No module docstring; 0/1 variable, 1/1 class documented
stanford_segmenter
- No module docstring; 0/1 variable, 1/1 class documented
texttiling
- No module docstring; 0/4 variable, 0/1 constant, 1/2 function, 3/3 classes documented
toktok
- The tok-tok tokenizer is a simple, general tokenizer, where the input has one sentence per line; thus only final period is tokenized.
treebank
- Penn Treebank Tokenizer
util
- No module docstring; 7/7 functions, 1/1 class documented
toolbox
- Module for reading, writing and manipulating Toolbox databases and settings files.
translate
- Experimental features for machine translation. These interfaces are prone to change.
api
- No module docstring; 0/1 variable, 1/3 function, 3/3 classes documented
bleu_score
- BLEU score implementation.
chrf_score
- ChrF score implementation
gale_church
- A port of the Gale-Church Aligner.
gdfa
- No module docstring; 1/1 function documented
gleu_score
- GLEU score implementation.
ibm1
- Lexical translation model that ignores word order.
ibm2
- Lexical translation model that considers word order.
ibm3
- Translation model that considers how a word can be aligned to multiple words in another language.
ibm4
- Translation model that reorders output words based on their type and distance from other related words in the output sentence.
ibm5
- Translation model that keeps track of vacant positions in the target sentence to decide where to place translated words.
ibm_model
- Common methods and classes for all IBM models. See IBMModel1, IBMModel2, IBMModel3, IBMModel4, and IBMModel5 for specific implementations.
meteor_score
- No module docstring; 12/12 functions documented
metrics
- No module docstring; 1/1 function documented
nist_score
- NIST score implementation.
phrase_based
- No module docstring; 2/2 functions documented
ribes_score
- RIBES score implementation
stack_decoder
- A decoder that uses stacks to implement phrase-based translation.
tree
- Class for representing hierarchical language structures, such as syntax trees and morphological trees.
treeprettyprinter
- Pretty-printing of discontinuous trees. Adapted from the disco-dop project, by Andreas van Cranenburgh. https://github.com/andreasvc/disco-dop
treetransforms
- A collection of methods for tree (grammar) transformations used in parsing natural language.
twitter
- NLTK Twitter Package
api
- This module provides an interface for TweetHandlers, and support for timezone handling.
common
- Utility functions for the :module:`twitterclient` module which do not require the twython library to have been installed.
twitter_demo
- Examples to demo the twitterclient code.
twitterclient
- NLTK Twitter client
util
- Authentication utilities to accompany :module:`twitterclient`.
util
- No module docstring; 26/34 functions, 0/1 class documented
wsd
- No module docstring; 1/1 function documented
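
Several short names in this index (api, util, chart, regexp and others) occur under more than one package, so an entry is only unambiguous once it is joined to its parent packages as a dotted import path. The sketch below is a minimal illustration using only the standard library; the helper names dotted_path and is_importable are hypothetical, and the nesting used in the sample lookups (for example stem containing porter) is an assumption based on NLTK's usual package layout, which this flat listing does not show.

```python
import importlib.util


def dotted_path(*parts):
    """Join index entries, outermost package first, into an import path."""
    return ".".join(parts)


def is_importable(name):
    """Return True if `name` resolves to a module in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # A parent package (for example nltk itself) is missing or failed to import.
        return False


# Hypothetical lookups: the nesting is assumed, not stated by the index.
for parts in [("nltk", "stem", "porter"), ("nltk", "tokenize", "util")]:
    name = dotted_path(*parts)
    print(name, is_importable(name))
```

Note that find_spec imports the parent packages of a dotted name, so the check reports False instead of raising when NLTK is not installed.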