class documentation

An abstract class defining a collection of generic association measures. Each public method returns a score, taking the following arguments:

score_fn(count_of_ngram,
         (count_of_n-1gram_1, ..., count_of_n-1gram_j),
         (count_of_n-2gram_1, ..., count_of_n-2gram_k),
         ...,
         (count_of_1gram_1, ..., count_of_1gram_n),
         count_of_total_words)

See BigramAssocMeasures and TrigramAssocMeasures.
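
As a rough illustration of the argument layout above, the sketch below labels the positional marginals for a bigram and a trigram. The index names `NGRAM`, `UNIGRAMS` and `TOTAL`, the helper `describe_marginals`, and all counts are hypothetical, chosen only for this example.

```python
# Illustrative index names for the positional layout (assumptions for this sketch).
NGRAM = 0       # first argument: count of the full n-gram
UNIGRAMS = -2   # second-to-last argument: tuple of unigram counts
TOTAL = -1      # last argument: total number of words


def describe_marginals(*marginals):
    """Label the parts of a marginals argument list (hypothetical helper)."""
    return {
        "ngram": marginals[NGRAM],
        "unigrams": marginals[UNIGRAMS],
        "total": marginals[TOTAL],
    }


# Bigram: (bigram count, (unigram counts), total words).
bigram_args = (20, (50, 40), 10_000)
# Trigram: (trigram count, (bigram counts), (unigram counts), total words).
trigram_args = (5, (20, 8, 12), (50, 40, 30), 10_000)

print(describe_marginals(*bigram_args))
print(describe_marginals(*trigram_args))
```

Because the n-gram count is always first and the unigram counts and total are always last, the same positional indices work for any n-gram order.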

For all association measures defined here to be usable, inheriting classes should define a property _n and a method _contingency, which calculates contingency values from marginals.
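
The contract described above might look like the following for the bigram case. This is a minimal sketch, not the library's code: the class names `AssocMeasuresSketch` and `BigramSketch` are hypothetical, and the cell order `(n_ii, n_oi, n_io, n_oo)` is an assumption made for this example.

```python
from abc import ABC, abstractmethod


class AssocMeasuresSketch(ABC):
    """Hypothetical stand-in for the abstract base described above."""

    _n = 0  # order of the n-gram; subclasses override this

    @staticmethod
    @abstractmethod
    def _contingency(*marginals):
        """Turn marginal counts into contingency-table cells."""

    @staticmethod
    @abstractmethod
    def _marginals(*contingency):
        """Turn contingency-table cells back into marginal counts."""


class BigramSketch(AssocMeasuresSketch):
    _n = 2

    @staticmethod
    def _contingency(n_ii, n_ix_xi, n_xx):
        n_ix, n_xi = n_ix_xi
        n_oi = n_xi - n_ii                # second word without the first
        n_io = n_ix - n_ii                # first word without the second
        n_oo = n_xx - n_ii - n_oi - n_io  # neither word
        return n_ii, n_oi, n_io, n_oo

    @staticmethod
    def _marginals(n_ii, n_oi, n_io, n_oo):
        # Inverse of _contingency: rebuild (n_ii, (n_ix, n_xi), n_xx).
        return n_ii, (n_io + n_ii, n_oi + n_ii), n_ii + n_oi + n_io + n_oo


cells = BigramSketch._contingency(20, (50, 40), 10_000)
print(cells)
print(BigramSketch._marginals(*cells))
```

In this sketch the two static methods are inverses of each other, so converting marginals to cells and back returns the original arguments.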

Kind            Name              Description
Class Method    chi_sq            Scores ngrams using Pearson's chi-square as in Manning and Schutze 5.3.3.
Class Method    jaccard           Scores ngrams using the Jaccard index.
Class Method    likelihood_ratio  Scores ngrams using likelihood ratios as in Manning and Schutze 5.3.4.
Class Method    pmi               Scores ngrams by pointwise mutual information, as in Manning and Schutze 5.4.
Class Method    poisson_stirling  Scores ngrams using the Poisson-Stirling measure.
Class Method    student_t         Scores ngrams using Student's t test with independence hypothesis for unigrams, as in Manning and Schutze 5.3.1.
Static Method   mi_like           Scores ngrams using a variant of mutual information. The keyword argument power sets an exponent (default 3) for the numerator. No logarithm of the result is calculated.
Static Method   raw_freq          Scores ngrams by their frequency.
Class Method    _expected_values  Calculates expected values for a contingency table.
Static Method   _contingency      Calculates values of a contingency table from marginal values.
Static Method   _marginals        Calculates values of contingency table marginals from its values.
Class Variable  _n                Undocumented
@classmethod
def chi_sq(cls, *marginals):

Scores ngrams using Pearson's chi-square as in Manning and Schutze 5.3.3.
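
For a bigram, Pearson's chi-square can be computed from the 2x2 table with the usual closed form. The function below is a sketch for that case only; the name `chi_sq_bigram` and the sample counts are hypothetical, and it is not presented as the library's implementation.

```python
def chi_sq_bigram(n_ii, n_ix_xi, n_xx):
    """Pearson's chi-square for a bigram via the 2x2 closed form (sketch)."""
    n_ix, n_xi = n_ix_xi
    n_oi = n_xi - n_ii                # second word without the first
    n_io = n_ix - n_ii                # first word without the second
    n_oo = n_xx - n_ii - n_oi - n_io  # neither word
    numerator = n_xx * (n_ii * n_oo - n_io * n_oi) ** 2
    # Product of the four row and column totals. It is zero when a word
    # never occurs or always occurs, which this sketch does not guard against.
    denominator = n_ix * n_xi * (n_xx - n_ix) * (n_xx - n_xi)
    return numerator / denominator


print(chi_sq_bigram(20, (30, 30), 60))    # words that co-occur often
print(chi_sq_bigram(10, (40, 30), 120))   # counts that match independence
```

When the observed counts match what independence predicts, the cross-product difference in the numerator is zero and so is the score.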

@classmethod
def jaccard(cls, *marginals):

Scores ngrams using the Jaccard index.
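
For a bigram, the Jaccard index is the number of joint occurrences divided by the number of positions where at least one of the two words occurs. A minimal sketch, with the hypothetical name `jaccard_bigram` and made-up counts:

```python
def jaccard_bigram(n_ii, n_ix_xi, n_xx):
    """Jaccard index for a bigram (sketch): intersection over union."""
    n_ix, n_xi = n_ix_xi
    # Union = first word present + second word present - both present.
    # n_xx is unused; it is kept so the signature matches the other measures.
    return n_ii / (n_ix + n_xi - n_ii)


print(jaccard_bigram(20, (50, 40), 10_000))
```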

@classmethod
def likelihood_ratio(cls, *marginals):

Scores ngrams using likelihood ratios as in Manning and Schutze 5.3.4.
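
One common way to write the log-likelihood ratio for a 2x2 table is the G-squared form, twice the sum of observed * ln(observed / expected) over the four cells. The sketch below assumes that form for the bigram case; the function name and counts are hypothetical, and the library's handling of zero cells may differ.

```python
import math


def likelihood_ratio_bigram(n_ii, n_ix_xi, n_xx):
    """Log-likelihood ratio (G-squared form) for a bigram (sketch)."""
    n_ix, n_xi = n_ix_xi
    # Observed cells in the order (n_ii, n_oi, n_io, n_oo).
    observed = (n_ii, n_xi - n_ii, n_ix - n_ii, n_xx - n_ix - n_xi + n_ii)
    # Expected cells under independence, in the same order.
    expected = (
        n_ix * n_xi / n_xx,
        (n_xx - n_ix) * n_xi / n_xx,
        n_ix * (n_xx - n_xi) / n_xx,
        (n_xx - n_ix) * (n_xx - n_xi) / n_xx,
    )
    # Empty cells contribute nothing, since x * ln(x) tends to 0 as x -> 0.
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


print(likelihood_ratio_bigram(20, (30, 30), 60))
print(likelihood_ratio_bigram(10, (40, 30), 120))
```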

@classmethod
def pmi(cls, *marginals):

Scores ngrams by pointwise mutual information, as in Manning and Schutze 5.4.
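
For a bigram, pointwise mutual information is the base-2 log of the joint probability over the product of the two word probabilities, which in counts becomes log2(n_ii * n_xx / (n_ix * n_xi)). A minimal sketch with a hypothetical function name and made-up counts:

```python
import math


def pmi_bigram(n_ii, n_ix_xi, n_xx):
    """Pointwise mutual information for a bigram, in bits (sketch)."""
    n_ix, n_xi = n_ix_xi
    # P(w1 w2) / (P(w1) * P(w2)) expressed with raw counts.
    return math.log2(n_ii * n_xx / (n_ix * n_xi))


print(pmi_bigram(20, (50, 40), 10_000))      # co-occurs more than chance
print(pmi_bigram(20, (200, 1000), 10_000))   # co-occurs exactly at chance
```

A score of zero means the bigram occurs exactly as often as independence predicts. The sketch does not handle a bigram count of zero, where the logarithm is undefined.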

@classmethod
def poisson_stirling(cls, *marginals):

Scores ngrams using the Poisson-Stirling measure.
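
The Poisson-Stirling measure is often written as k * (log k - log λ - 1), where k is the observed n-gram count and λ is the count expected under independence. The sketch below assumes that formulation with a base-2 logarithm for the bigram case; the base, the function name, and the counts are assumptions of this example.

```python
import math


def poisson_stirling_bigram(n_ii, n_ix_xi, n_xx):
    """Poisson-Stirling score for a bigram (sketch, base-2 log assumed)."""
    n_ix, n_xi = n_ix_xi
    expected = n_ix * n_xi / n_xx  # count expected if the words were independent
    return n_ii * (math.log2(n_ii / expected) - 1)


print(poisson_stirling_bigram(80, (200, 1000), 10_000))
print(poisson_stirling_bigram(40, (200, 1000), 10_000))
```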

@classmethod
def student_t(cls, *marginals):

Scores ngrams using Student's t test with independence hypothesis for unigrams, as in Manning and Schutze 5.3.1.
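
For a bigram, the t statistic compares the observed bigram probability with the product of the unigram probabilities. Treating each position as a Bernoulli trial, the sample variance is approximately the sample mean, and the statistic simplifies to (n_ii - n_ix * n_xi / n_xx) / sqrt(n_ii). The sketch below assumes that simplification; the function name and counts are illustrative.

```python
import math


def student_t_bigram(n_ii, n_ix_xi, n_xx):
    """Student's t score for a bigram under the independence hypothesis (sketch)."""
    n_ix, n_xi = n_ix_xi
    expected = n_ix * n_xi / n_xx  # count expected if the words were independent
    # Variance is approximated by the observed count, so this needs n_ii > 0.
    return (n_ii - expected) / math.sqrt(n_ii)


print(student_t_bigram(8, (15828, 4675), 14_307_668))
```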

@staticmethod
def mi_like(*marginals, **kwargs):

Scores ngrams using a variant of mutual information. The keyword argument power sets an exponent (default 3) for the numerator. No logarithm of the result is calculated.
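
A rough sketch of the behaviour described above for the bigram case: the n-gram count is raised to `power` and no logarithm is taken. Dividing by the product of the unigram counts is an assumption of this example, as are the function name and the counts.

```python
def mi_like_bigram(n_ii, n_ix_xi, n_xx, power=3):
    """Mutual-information-like score without a logarithm (sketch)."""
    n_ix, n_xi = n_ix_xi
    # Assumed denominator: product of the unigram counts.
    # n_xx is unused; it is kept so the signature matches the other measures.
    return n_ii ** power / (n_ix * n_xi)


print(mi_like_bigram(20, (50, 40), 10_000))
print(mi_like_bigram(20, (50, 40), 10_000, power=2))
```

A larger exponent gives more weight to frequent n-grams than a plain ratio of counts would.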

@staticmethod
def raw_freq(*marginals):

Scores ngrams by their frequency.

@classmethod
def _expected_values(cls, cont):

Calculates expected values for a contingency table.
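
For a 2x2 bigram table, the expected value of each cell under independence is its row total times its column total, divided by the grand total. The sketch below assumes the cell order `(n_ii, n_oi, n_io, n_oo)`; the function name and that ordering are choices made for this example.

```python
def expected_values_2x2(cont):
    """Expected cell counts under independence for a 2x2 table (sketch)."""
    n_ii, n_oi, n_io, n_oo = cont
    n_xx = n_ii + n_oi + n_io + n_oo
    first_present = n_ii + n_io    # positions where the first word occurs
    second_present = n_ii + n_oi   # positions where the second word occurs
    first_absent = n_xx - first_present
    second_absent = n_xx - second_present
    return (
        first_present * second_present / n_xx,
        first_absent * second_present / n_xx,
        first_present * second_absent / n_xx,
        first_absent * second_absent / n_xx,
    )


print(expected_values_2x2((20, 10, 10, 20)))
print(expected_values_2x2((10, 20, 30, 60)))
```

The expected cells always sum to the same total as the observed ones, and they equal the observed cells when the table already matches independence.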

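@staticmethod
@abstractmethod
def _contingency(*marginals):

Calculates values of a contingency table from marginal values.

@staticmethod
@abstractmethod
def _marginals(*contingency):

Calculates values of contingency table marginals from its values.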