class InterpolatedLanguageModel(LanguageModel):
Known subclasses: nltk.lm.KneserNeyInterpolated, nltk.lm.WittenBellInterpolated
Constructor: InterpolatedLanguageModel(smoothing_cls, order, **kwargs)
Logic common to all interpolated language models.
The idea to abstract this comes from Chen & Goodman 1995. Do not instantiate this class directly!
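The shared logic can be pictured as a recursion in which each n-gram order contributes a weight `alpha` and passes the remaining mass `gamma` down to the next lower order. The following is a minimal pure-Python sketch of that idea, not NLTK's implementation: `train_counts` and `interpolated_score` are hypothetical helpers, and the Witten-Bell-style weights are chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_counts(sentences):
    """Collect unigram counts and, per one-word context, next-word counts."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sent in sentences:
        unigrams.update(sent)
        for prev, word in zip(sent, sent[1:]):
            bigrams[(prev,)][word] += 1
    return unigrams, bigrams


def interpolated_score(word, context, unigrams, bigrams):
    """Recursive interpolation: alpha + gamma * lower-order score."""
    if not context:
        # Base case: plain relative frequency of the unigram.
        return unigrams[word] / sum(unigrams.values())
    seen = bigrams.get(context)
    if not seen:
        # No data for this context: defer entirely to the lower order.
        alpha, gamma = 0.0, 1.0
    else:
        total = sum(seen.values())
        # Witten-Bell-style weights (an assumption made for this sketch).
        gamma = len(seen) / (len(seen) + total)
        alpha = (1.0 - gamma) * seen[word] / total
    return alpha + gamma * interpolated_score(word, context[1:], unigrams, bigrams)


unigrams, bigrams = train_counts([["a", "b", "c"], ["a", "c"]])
print(interpolated_score("b", ("a",), unigrams, bigrams))
```

In this sketch a concrete smoothing scheme only has to decide how `alpha` and `gamma` are computed; the recursion itself stays the same.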
| Method | __init__ |
Creates new LanguageModel. |
| Method | unmasked_score |
Score a word given some optional context. |
| Instance Variable | estimator |
Undocumented |
Inherited from LanguageModel:
| Method | context |
Helper method for retrieving counts for a given context. |
| Method | entropy |
Calculate cross-entropy of model for given evaluation text. |
| Method | fit |
Trains the model on a text. |
| Method | generate |
Generate words from the model. |
| Method | logscore |
Evaluate the log score of this word in this context. |
| Method | perplexity |
Calculates the perplexity of the given text. |
| Method | score |
Masks out of vocab (OOV) words and computes their model score. |
| Instance Variable | counts |
Undocumented |
| Instance Variable | order |
Undocumented |
| Instance Variable | vocab |
Undocumented |
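As a rough illustration of the inherited methods and instance variables listed above, the sketch below trains one of the known subclasses on a toy corpus. The corpus and variable names are made up for the example, and `padded_everygram_pipeline` from `nltk.lm.preprocessing` is assumed to be an acceptable way to prepare the training data.

```python
from nltk.lm import WittenBellInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline

# Toy corpus, made up for this example.
sentences = [["a", "b", "c"], ["a", "c", "b", "a"]]
order = 2

train_ngrams, vocab_text = padded_everygram_pipeline(order, sentences)
model = WittenBellInterpolated(order)
model.fit(train_ngrams, vocab_text)

# order, vocab and counts are set up by the constructor and filled by fit().
print(model.order, len(model.vocab), model.counts.N())

# score and logscore take a word plus an optional context sequence.
p = model.score("b", ["a"])
log_p = model.logscore("b", ["a"])

# entropy and perplexity take an iterable of ngram tuples.
test_ngrams = [("a", "b"), ("b", "c")]
h = model.entropy(test_ngrams)
ppl = model.perplexity(test_ngrams)

# generate samples words; random_seed makes the draw repeatable.
words = model.generate(3, random_seed=0)
print(p, log_p, h, ppl, words)
```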
overrides nltk.lm.api.LanguageModel.__init__
overridden in nltk.lm.KneserNeyInterpolated, nltk.lm.WittenBellInterpolated
Creates new LanguageModel.
| Parameters | |
| smoothing_cls | Undocumented |
| order | Undocumented |
| vocabulary | If provided, this vocabulary will be used instead of creating a new one when training. Type: nltk.lm.Vocabulary or None. |
| counter | If provided, use this object to count ngrams. Type: nltk.lm.NgramCounter or None. |
| ngrams_fn | If given, defines how sentences in training text are turned into ngram sequences. |
| pad_fn | If given, defines how sentences in training text are padded. |
| **kwargs | Undocumented |
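A small sketch of the `vocabulary` argument in use, under the assumption that a subclass such as `WittenBellInterpolated` forwards extra keyword arguments to `LanguageModel.__init__`. The corpus and the vocabulary contents are invented for the example.

```python
from nltk.lm import Vocabulary, WittenBellInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline

sentences = [["a", "b", "c"], ["a", "b"]]
order = 2

# A pre-built vocabulary that deliberately leaves out "c".
vocab = Vocabulary(["a", "b", "<s>", "</s>"])

# The vocabulary is passed by keyword and reused instead of being rebuilt.
model = WittenBellInterpolated(order, vocabulary=vocab)

train_ngrams, _ = padded_everygram_pipeline(order, sentences)
# No vocabulary text is needed here because the vocabulary is already set.
model.fit(train_ngrams)

print(model.vocab is vocab)
print("c" in model.vocab)
```

Because "c" is outside the supplied vocabulary, its occurrences in the training text are counted under the vocabulary's unknown label.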
overrides nltk.lm.api.LanguageModel.unmasked_score
Score a word given some optional context.
Concrete models are expected to provide an implementation.
Note that this method does not mask its arguments with the OOV label.
Use the score method for that.
| Parameters | |
| word | Word for which we want the score. Type: str. |
| context | Context the word is in. If None, compute the unigram score. Type: tuple(str) or None. |
| Returns | float |
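The difference between the masked and unmasked entry points can be sketched as follows, using `WittenBellInterpolated` as the concrete model. The toy corpus and the word "zzz" are made up for the example.

```python
from nltk.lm import WittenBellInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline

sentences = [["a", "b", "c"], ["a", "c", "b", "a"]]
order = 2
train_ngrams, vocab_text = padded_everygram_pipeline(order, sentences)
model = WittenBellInterpolated(order)
model.fit(train_ngrams, vocab_text)

# For in-vocabulary arguments both methods return the same value.
masked = model.score("b", ["a"])
unmasked = model.unmasked_score("b", ("a",))

# score() maps the unseen word "zzz" to the unknown label before scoring,
# so it matches unmasked_score() called with that label directly.
oov_masked = model.score("zzz", ["a"])
unk_unmasked = model.unmasked_score(model.vocab.unk_label, ("a",))

# With no context, unmasked_score() returns the unigram score.
unigram = model.unmasked_score("a")
print(masked, unmasked, oov_masked, unk_unmasked, unigram)
```

Note that `unmasked_score` is given the context as a tuple here, matching the documented parameter type, while `score` also accepts a list because it looks the context up in the vocabulary first.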