class documentation

Undocumented

Class Method train Train a new maxent classifier based on the given corpus of training samples. This classifier will have its weights chosen to maximize entropy while remaining empirically consistent with the training corpus.

Inherited from MaxentClassifier:

Method __init__ Construct a new maxent classifier model. Typically, new classifier models are created using the ``train()`` method.
Method __repr__ Undocumented
Method classify No summary
Method explain Print a table showing the effect of each of the features in the given feature set, and how they combine to determine the probabilities of each label for that featureset.
Method labels No summary
Method most_informative_features Generates the ranked list of informative features from most to least.
Method prob_classify No summary
Method set_weights Set the feature weight vector for this classifier. :param new_weights: The new feature weight vector. :type new_weights: list of float
Method show_most_informative_features :param show: ``all``, ``neg``, or ``pos`` (for negative-only or positive-only) :type show: str :param n: The number of top features to show :type n: int
Method weights :return: The feature weight vector for this classifier. :rtype: list of float
Constant ALGORITHMS Undocumented
Instance Variable _encoding Undocumented
Instance Variable _logarithmic Undocumented
Instance Variable _most_informative_features Undocumented
Instance Variable _weights Undocumented

Inherited from ClassifierI (via MaxentClassifier):

Method classify_many Apply self.classify() to each element of featuresets.
Method prob_classify_many Apply self.prob_classify() to each element of featuresets.
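
The two ``*_many`` methods are described as element-wise applications of their single-item counterparts. A minimal sketch of that relationship follows; ``ToyClassifier`` and its ``length`` feature are hypothetical stand-ins used only for illustration and are not part of the library.

```python
class ToyClassifier:
    """Hypothetical stand-in classifier, used only to illustrate the *_many pattern."""

    def classify(self, featureset):
        # Toy rule: label a featureset by its (assumed) "length" feature.
        return "long" if featureset.get("length", 0) > 5 else "short"

    def classify_many(self, featuresets):
        # Apply classify() to each element, preserving input order.
        return [self.classify(fs) for fs in featuresets]


featuresets = [{"length": 3}, {"length": 8}]
results = ToyClassifier().classify_many(featuresets)
```

The result list has one label per input featureset, in the same order as the input.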
@classmethod
def train(cls, train_toks, **kwargs):

Train a new maxent classifier based on the given corpus of training samples. This classifier will have its weights chosen to maximize entropy while remaining empirically consistent with the training corpus.

:rtype: MaxentClassifier
:return: The new maxent classifier

:type train_toks: list
:param train_toks: Training data, represented as a list of pairs, the first member of which is a featureset, and the second of which is a classification label.

:type algorithm: str
:param algorithm: A case-insensitive string, specifying which algorithm should be used to train the classifier. The following algorithms are currently available.

    - Iterative Scaling Methods: Generalized Iterative Scaling (``'GIS'``), Improved Iterative Scaling (``'IIS'``)
    - External Libraries (requiring megam): LM-BFGS algorithm, with training performed by Megam (``'megam'``)

    The default algorithm is ``'IIS'``.

:type trace: int
:param trace: The level of diagnostic tracing output to produce. Higher values produce more verbose output.

:type encoding: MaxentFeatureEncodingI
:param encoding: A feature encoding, used to convert featuresets into feature vectors. If none is specified, then a ``BinaryMaxentFeatureEncoding`` will be built based on the features that are attested in the training corpus.

:type labels: list(str)
:param labels: The set of possible labels. If none is given, then the set of all labels attested in the training data will be used instead.

:param gaussian_prior_sigma: The sigma value for a gaussian prior on model weights. Currently, this is supported by ``megam``. For other algorithms, its value is ignored.

:param cutoffs: Arguments specifying various conditions under which the training should be halted. (Some of the cutoff conditions are not supported by some algorithms.)

    - ``max_iter=v``: Terminate after ``v`` iterations.
    - ``min_ll=v``: Terminate after the negative average log-likelihood drops under ``v``.
    - ``min_lldelta=v``: Terminate if a single iteration improves log likelihood by less than ``v``.
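
The cutoff conditions can be read as a stopping rule evaluated after each training iteration. The sketch below is a minimal pure-Python illustration of that reading, not the library's implementation: ``should_stop`` and ``ll_history`` are hypothetical names, and the strict comparisons are an assumption based on the wording above ("drops under", "less than").

```python
def should_stop(iteration, ll_history, max_iter=None, min_ll=None, min_lldelta=None):
    """Return True if any configured cutoff is met (illustrative sketch only).

    iteration  -- number of iterations completed so far
    ll_history -- average log-likelihood after each completed iteration
                  (values are <= 0; closer to 0 means a better fit)
    """
    # max_iter=v: terminate after v iterations.
    if max_iter is not None and iteration >= max_iter:
        return True
    if not ll_history:
        return False

    current = ll_history[-1]

    # min_ll=v: terminate once the negative average log-likelihood drops under v.
    if min_ll is not None and -current < min_ll:
        return True

    # min_lldelta=v: terminate if the last iteration improved the
    # log-likelihood by less than v.
    if min_lldelta is not None and len(ll_history) >= 2:
        if current - ll_history[-2] < min_lldelta:
            return True

    return False


# Stops on the iteration limit, even though the fit is still improving.
hit_max_iter = should_stop(3, [-0.9, -0.5, -0.4], max_iter=3)
# Stops because the last improvement (about 0.0005) is smaller than min_lldelta.
hit_min_lldelta = should_stop(2, [-0.5, -0.4995], min_lldelta=0.001)
```

Cutoffs left as ``None`` are simply not checked, which mirrors the note that not every algorithm supports every condition.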