module documentation

A set of functions used to interface with the external megam maxent optimization package. Before megam can be used, you should tell NLTK where it can find the megam binary, using the config_megam() function. Typical usage:

>>> from nltk.classify import megam
>>> megam.config_megam() # pass path to megam if not found in PATH # doctest: +SKIP
[Found megam: ...]

Use with MaxentClassifier. An example is shown below; see the MaxentClassifier documentation for details.

nltk.classify.MaxentClassifier.train(corpus, 'megam')

Function  call_megam             Call the megam binary with the given arguments.
Function  config_megam           Configure NLTK's interface to the megam maxent optimization package.
Function  parse_megam_weights    Given the stdout output generated by megam when training a model, return a numpy array containing the corresponding weight vector. This function does not currently handle bias features.
Function  write_megam_file       Generate an input file for megam based on the given corpus of classified tokens.
Function  _write_megam_features  Undocumented
Variable  _megam_bin             Undocumented

def call_megam(args):

Call the megam binary with the given arguments.

def config_megam(bin=None):

Configure NLTK's interface to the megam maxent optimization package.

Parameters
bin (str): The full path to the megam binary. If not specified, NLTK searches the system for a megam binary and raises a LookupError if none is found.

def parse_megam_weights(s, features_count, explicit=True):

Given the stdout output generated by megam when training a model, return a numpy array containing the corresponding weight vector. This function does not currently handle bias features.
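
The parsing step can be illustrated with a standard-library sketch. It assumes that megam's stdout consists of one "feature_id weight" pair per line; the name parse_weights_sketch and the sample output are hypothetical, and the sketch returns a plain list where parse_megam_weights returns a numpy array. It is not NLTK's implementation.

```python
def parse_weights_sketch(s, features_count):
    """Illustrative stand-in for parse_megam_weights (not NLTK's code).

    Assumes each non-blank line of `s` has the form "<feature_id> <weight>".
    """
    # Features that megam does not mention keep a weight of 0.
    weights = [0.0] * features_count
    for line in s.strip().split("\n"):
        if line.strip():
            fid, weight = line.split()
            weights[int(fid)] = float(weight)
    return weights


# Hypothetical megam output for a model with four features.
sample_stdout = "0 0.5\n1 -1.25\n3 2.0\n"
weights = parse_weights_sketch(sample_stdout, 4)
print(weights)
```

Feature 2 never appears in the sample output, so its weight stays at zero. As with the documented function, bias features are not handled here.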

def write_megam_file(train_toks, encoding, stream, bernoulli=True, explicit=True):

Generate an input file for megam based on the given corpus of classified tokens.

Parameters
train_toks (list(tuple(dict, str))): Training data, represented as a list of pairs, the first member of which is a feature dictionary, and the second of which is a classification label.
encoding (MaxentFeatureEncodingI): A feature encoding, used to convert featuresets into feature vectors. May optionally implement a cost() method in order to assign different costs to different class predictions.
stream (stream): The stream to which the megam input file should be written.
bernoulli: If true, then use the 'bernoulli' format. I.e., all joint features have binary values, and are listed iff they are true. Otherwise, list feature values explicitly. If bernoulli=False, then you must call megam with the -fvals option.
explicit: If true, then use the 'explicit' format. I.e., list the features that would fire for any of the possible labels, for each token. If explicit=True, then you must call megam with the -explicit option.

def _write_megam_features(vector, stream, bernoulli):
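
How the bernoulli and explicit flags shape each output line can be sketched without NLTK. The code below is a rough illustration under stated assumptions, not the library's implementation: the helper names are hypothetical, feature vectors are assumed to be lists of (feature_id, value) pairs, and the " #" separator between per-label blocks is an assumption about megam's explicit input format.

```python
import io


def write_features_sketch(vector, stream, bernoulli):
    """Write one sparse feature vector given as (feature_id, value) pairs."""
    for fid, fval in vector:
        if bernoulli:
            # Bernoulli format: list only the ids of features that are on.
            if fval == 1:
                stream.write(f" {fid}")
            elif fval != 0:
                raise ValueError("bernoulli format requires binary feature values")
        else:
            # Value format (megam -fvals): write each id followed by its value.
            stream.write(f" {fid} {fval}")


def write_line_sketch(label_index, vectors_by_label, stream,
                      bernoulli=True, explicit=True):
    """Write one training instance.

    vectors_by_label[i] is the (hypothetical) encoded vector of the
    instance when paired with label i.
    """
    stream.write(str(label_index))
    if explicit:
        # Explicit format (megam -explicit): one block per candidate label.
        for vector in vectors_by_label:
            stream.write(" #")
            write_features_sketch(vector, stream, bernoulli)
    else:
        # Otherwise, write only the vector for the instance's own label.
        write_features_sketch(vectors_by_label[label_index], stream, bernoulli)
    stream.write("\n")


# Toy instance with two candidate labels; its true label has index 1.
vectors = [[(0, 1), (2, 1)], [(1, 1), (3, 1)]]

explicit_out = io.StringIO()
write_line_sketch(1, vectors, explicit_out, bernoulli=True, explicit=True)

fvals_out = io.StringIO()
write_line_sketch(1, vectors, fvals_out, bernoulli=False, explicit=False)

print(repr(explicit_out.getvalue()))
print(repr(fvals_out.getvalue()))
```

Whatever combination of flags is used to write the file has to be mirrored on the megam command line, as the parameter descriptions note: -fvals when bernoulli=False and -explicit when explicit=True.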

Undocumented

_megam_bin =

Undocumented