class documentation

Kneser-Ney estimate of a probability distribution. This is a version of back-off that estimates how likely an n-gram is, given that its (n-1)-gram context was seen in training. Extends the ProbDistI interface and requires a trigram FreqDist instance to train on. A discount other than the default of 0.75 can optionally be specified.
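The class trains on trigram counts. As a minimal sketch of what such counts look like, the following uses collections.Counter as a stand-in for a FreqDist. The helper name trigram_counts and the toy sentence are illustrative assumptions, not part of the documented API.

```python
from collections import Counter


def trigram_counts(tokens):
    """Count every run of three consecutive tokens (hypothetical helper)."""
    return Counter(zip(tokens, tokens[1:], tokens[2:]))


tokens = "the cat sat on the mat the cat sat down".split()
counts = trigram_counts(tokens)

# Each key is a 3-tuple of words; each value is how often it occurred.
print(counts[("the", "cat", "sat")])
print(sum(counts.values()))
```

A mapping of this shape, keyed by 3-tuples, is what the trigram frequency distribution is assumed to hold.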

Method __init__ Build the estimate from a trigram FreqDist, with an optional bins value and discount.
Method __repr__ Return a string representation of this ProbDist
Method discount Return the value by which counts are discounted. By default set to 0.75.
Method max Return the sample with the greatest probability. If two or more samples have the same probability, return one of them; which sample is returned is undefined.
Method prob Return the probability for a given sample. Probabilities are always real numbers in the range [0, 1].
Method samples Return a list of all samples that have nonzero probabilities. Use prob to find the probability of each sample.
Method set_discount Set the value by which counts are discounted to the value of discount.
Instance Variable _bigrams Undocumented
Instance Variable _bins Undocumented
Instance Variable _cache Undocumented
Instance Variable _D Undocumented
Instance Variable _trigrams Undocumented
Instance Variable _trigrams_contain Undocumented
Instance Variable _wordtypes_after Undocumented
Instance Variable _wordtypes_before Undocumented

Inherited from ProbDistI:

Method generate Return a randomly selected sample from this probability distribution. The probability of returning each sample samp is equal to self.prob(samp).
Method logprob Return the base 2 logarithm of the probability for a given sample.
Constant SUM_TO_ONE True if the probabilities of the samples in this probability distribution will always sum to one.
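The two inherited methods can be pictured with a small stand-alone sketch. It works on a plain dict of probabilities rather than on a ProbDistI object, and the treatment of zero probability (negative infinity) is an assumption made here, not something this page states.

```python
import math
import random


def logprob(p):
    """Base-2 logarithm of a probability; zero maps to -inf (assumption)."""
    return math.log2(p) if p > 0 else float("-inf")


def generate(probs, rng):
    """Draw one sample, each chosen with weight equal to its probability."""
    samples = list(probs)
    weights = [probs[s] for s in samples]
    return rng.choices(samples, weights=weights, k=1)[0]


probs = {"seen": 1.0, "unseen": 0.0}
rng = random.Random(0)  # seeded so the draw is repeatable

print(logprob(0.25))
print(generate(probs, rng))
```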
def __init__(self, freqdist, bins=None, discount=0.75): (source)
Parameters
freqdist: FreqDist - The trigram frequency distribution upon which to base the estimation
bins: int or float - Included for compatibility with nltk.tag.hmm
discount: float (preferred, but can be set to int) - The discount applied when retrieving counts of trigrams
def __repr__(self): (source)

Return a string representation of this ProbDist

Returns
str - Undocumented
def discount(self): (source)

Return the value by which counts are discounted. By default set to 0.75.

Returns
float - Undocumented
def max(self): (source)

Return the sample with the greatest probability. If two or more samples have the same probability, return one of them; which sample is returned is undefined.

Returns
any - Undocumented
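The tie-breaking caveat is easiest to see on a plain dict. This is a minimal sketch, assuming probabilities are already available as a mapping; most_probable is a hypothetical helper, not a method of the class.

```python
def most_probable(probs):
    """Return one key with the highest probability; ties are not ordered."""
    return max(probs, key=probs.get)


probs = {
    ("a", "b", "c"): 0.5625,
    ("a", "b", "e"): 0.375,
    ("a", "b", "d"): 0.0625,
}
best = most_probable(probs)
print(best)
```

When several samples share the top probability, callers should not rely on which one comes back.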
def prob(self, trigram): (source)

Return the probability for a given sample. Probabilities are always real numbers in the range [0, 1].

Parameters
trigram - Undocumented
sample: any - The sample whose probability should be returned (passed as trigram in this class).
Returns
float - Undocumented
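One way to picture a discounted trigram estimate with back-off is the stand-alone sketch below. It is not taken from the library source: the function name kneser_ney_sketch, the helper tables and the exact back-off weighting are assumptions chosen for illustration, and a plain dict of trigram counts stands in for a FreqDist.

```python
from collections import defaultdict


def kneser_ney_sketch(trigram_counts, trigram, discount=0.75):
    """Illustrative discounted trigram probability with one back-off step."""
    w0, w1, w2 = trigram
    bigrams = defaultdict(int)       # total count of each (w0, w1) context
    types_after = defaultdict(int)   # distinct words following (w0, w1)
    types_before = defaultdict(int)  # distinct words preceding (w1, w2)
    contain = defaultdict(int)       # distinct trigrams with w1 in the middle
    for (a, b, c), n in trigram_counts.items():
        bigrams[(a, b)] += n
        types_after[(a, b)] += 1
        types_before[(b, c)] += 1
        contain[b] += 1

    # Seen trigram: subtract the discount from its count.
    if trigram in trigram_counts:
        return (trigram_counts[trigram] - discount) / bigrams[(w0, w1)]

    # Unseen trigram in a seen context: share out the discounted mass.
    if (w0, w1) in bigrams and (w1, w2) in types_before:
        leftover = types_after[(w0, w1)] * discount / bigrams[(w0, w1)]
        beta = types_before[(w1, w2)] / (contain[w1] - types_after[(w0, w1)])
        return leftover * beta

    # Context never seen in training.
    return 0.0


counts = {("a", "b", "c"): 3, ("a", "b", "d"): 1, ("x", "b", "e"): 2}
print(kneser_ney_sketch(counts, ("a", "b", "c")))
print(kneser_ney_sketch(counts, ("a", "b", "e")))
print(kneser_ney_sketch(counts, ("q", "r", "s")))
```

In this toy data the mass removed from the two seen continuations of ("a", "b") is exactly what the unseen continuation "e" receives, so the three values for that context sum to one.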
def samples(self): (source)

Return a list of all samples that have nonzero probabilities. Use prob to find the probability of each sample.

Returns
list - Undocumented
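The pairing of samples with prob can be sketched on a plain mapping. The helper nonzero_samples and the example values are hypothetical; they only illustrate the "list the samples, then look up each probability" pattern described above.

```python
def nonzero_samples(probs):
    """List the keys whose probability is greater than zero."""
    return [sample for sample, p in probs.items() if p > 0]


probs = {
    ("a", "b", "c"): 0.5625,
    ("a", "b", "d"): 0.0625,
    ("q", "r", "s"): 0.0,
}

# Walk the samples and look up the probability of each one.
for sample in nonzero_samples(probs):
    print(sample, probs[sample])
```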
def set_discount(self, discount): (source)

Set the value by which counts are discounted to the value of discount.

Parameters
discount: float (preferred, but int possible) - The new value to discount counts by
Returns
None - Undocumented
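The getter and setter pair can be illustrated with a tiny stand-in class. DiscountHolder is hypothetical and exists only for this sketch; the arithmetic shows how a smaller discount leaves more probability on a trigram that was seen in training, assuming the estimate for a seen trigram has the form (count - discount) / context total.

```python
class DiscountHolder:
    """Hypothetical stand-in exposing discount() and set_discount()."""

    def __init__(self, discount=0.75):
        self._D = discount

    def discount(self):
        return self._D

    def set_discount(self, discount):
        self._D = discount


holder = DiscountHolder()
count, context_total = 3, 4  # toy counts for one seen trigram and its context

before = (count - holder.discount()) / context_total
holder.set_discount(0.5)
after = (count - holder.discount()) / context_total
print(before, after)
```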
_bigrams = (source)

Undocumented

_bins = (source)

Undocumented

_cache: dict = (source)

Undocumented

_D = (source)

Undocumented

_trigrams = (source)

Undocumented

_trigrams_contain = (source)

Undocumented

_wordtypes_after = (source)

Undocumented

_wordtypes_before = (source)

Undocumented