class documentation

A tagger that chooses a token's tag based on a leading or trailing substring of its word string. Note that these substrings are not necessarily "true" morphological affixes. In particular, a fixed-length substring of the word is looked up in a table, and the corresponding tag is returned. Affix taggers are typically constructed by training them on a tagged corpus.
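The table lookup described above can be illustrated with a minimal, library-free sketch. The helper names `train_suffix_table` and `tag_by_suffix`, the toy corpus, and the fixed suffix length of 3 are all assumptions made for illustration; they are not part of this class's API.

```python
from collections import Counter, defaultdict


def train_suffix_table(tagged_words, suffix_length=3):
    """Map each fixed-length suffix to the tag seen most often with it."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        if len(word) >= suffix_length:
            counts[word[-suffix_length:]][tag] += 1
    # Keep only the most frequent tag for every suffix.
    return {suffix: tags.most_common(1)[0][0] for suffix, tags in counts.items()}


def tag_by_suffix(word, table, suffix_length=3):
    """Return the tag stored for the word's suffix, or None if it is unknown."""
    if len(word) < suffix_length:
        return None
    return table.get(word[-suffix_length:])


# Hypothetical toy corpus of (word, tag) pairs.
corpus = [
    ("running", "VBG"),
    ("jumping", "VBG"),
    ("king", "NN"),
    ("quickly", "RB"),
    ("slowly", "RB"),
]
table = train_suffix_table(corpus)

print(tag_by_suffix("walking", table))  # VBG ("ing" was seen twice as VBG, once as NN)
print(tag_by_suffix("zebra", table))    # None ("bra" never appeared in training)
```

The word "king" shows why the substrings are not true morphological affixes: its final "ing" is counted together with the verbal "-ing" suffix.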

Construct a new affix tagger.

Parameters
affix_length: The length of the affixes that should be considered during training and tagging. Use negative numbers for suffixes.
min_stem_length: Any words whose length is less than min_stem_length+abs(affix_length) will be assigned a tag of None by this tagger.
Class Method decode_json_obj Undocumented
Method __init__ No summary
Method context No summary
Method encode_json_obj Undocumented
Class Variable json_tag Undocumented
Instance Variable _affix_length Undocumented
Instance Variable _min_word_length Undocumented

Inherited from ContextTagger:

Method __repr__ Undocumented
Method choose_tag Decide which tag should be used for the specified token, and return that tag. If this tagger is unable to determine a tag for the specified token, return None -- do not consult the backoff tagger. This method should be overridden by subclasses of SequentialBackoffTagger.
Method size No summary
Method _train Initialize this ContextTagger's _context_to_tag table based on the given training data. In particular, for each context c in the training data, set _context_to_tag[c] to the most frequent tag for that context...
Instance Variable _context_to_tag Dictionary mapping contexts to tags.

Inherited from SequentialBackoffTagger (via ContextTagger):

Method tag Determine the most appropriate tag sequence for the given token sequence, and return a corresponding list of tagged tokens. A tagged token is encoded as a tuple (token, tag).
Method tag_one Determine an appropriate tag for the specified token, and return that tag. If this tagger is unable to determine a tag for the specified token, then its backoff tagger is consulted.
Property backoff The backoff tagger for this tagger.
Instance Variable _taggers A list of all the taggers that should be tried to tag a token (i.e., self and its backoff taggers).
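The relationship between tag, tag_one, and the list of taggers can be sketched without the library as follows. This is a simplified illustration under stated assumptions: each tagger is modelled as a plain function with the same arguments as choose_tag, and the names `tag_one_with_backoff`, `tag_sequence`, `ing_rule`, and `default_noun` are hypothetical.

```python
def tag_one_with_backoff(taggers, tokens, index, history):
    """Try each tagger in order and return the first tag that is not None."""
    for choose_tag in taggers:
        tag = choose_tag(tokens, index, history)
        if tag is not None:
            return tag
    return None


def tag_sequence(taggers, tokens):
    """Tag every token in turn, returning a list of (token, tag) tuples."""
    history = []  # tags chosen so far, passed on to later decisions
    for index in range(len(tokens)):
        history.append(tag_one_with_backoff(taggers, tokens, index, history))
    return list(zip(tokens, history))


def ing_rule(tokens, index, history):
    """Hypothetical primary tagger: only knows words ending in 'ing'."""
    return "VBG" if tokens[index].endswith("ing") else None


def default_noun(tokens, index, history):
    """Hypothetical backoff tagger: tags everything as a noun."""
    return "NN"


print(tag_sequence([ing_rule, default_noun], ["dogs", "barking"]))
# [('dogs', 'NN'), ('barking', 'VBG')]
```

Returning None from the primary tagger is what hands the token over to the next tagger in the list.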

Inherited from TaggerI (via ContextTagger, SequentialBackoffTagger):

Method evaluate Score the accuracy of the tagger against the gold standard. Strip the tags from the gold standard text, retag it using the tagger, then compute the accuracy score.
Method tag_sents Apply self.tag() to each element of sentences.
Method _check_params Undocumented
@classmethod
def decode_json_obj(cls, obj):

Undocumented

def __init__(self, train=None, model=None, affix_length=-3, min_stem_length=2, backoff=None, cutoff=0, verbose=False):
Parameters
train: Undocumented
model: Undocumented
affix_length: Undocumented
min_stem_length: Undocumented
backoff: The backoff tagger that should be used for this tagger.
cutoff: Undocumented
verbose: Undocumented
context_to_tag: A dictionary mapping contexts to tags.
def context(self, tokens, index, history):
Returns
(hashable) the context that should be used to look up the tag for the specified token; or None if the specified token should not be handled by this tagger.
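As a rough sketch of how such a context could be computed from the two documented parameters, the standalone function below applies the length threshold min_stem_length+abs(affix_length) and the sign convention for affix_length. The function name `affix_context` is hypothetical, the history argument is omitted because the sketch does not need it, and the actual method may differ in detail.

```python
def affix_context(tokens, index, affix_length=-3, min_stem_length=2):
    """Return the affix used as lookup context, or None for short words."""
    token = tokens[index]
    # Words shorter than stem + affix are not handled at all.
    min_word_length = min_stem_length + abs(affix_length)
    if len(token) < min_word_length:
        return None
    if affix_length > 0:
        return token[:affix_length]  # positive length: leading substring (prefix)
    return token[affix_length:]      # negative length: trailing substring (suffix)


words = ["the", "tagger", "unhappily", "agreed"]
print(affix_context(words, 0))                  # None (3 < 2 + 3)
print(affix_context(words, 2))                  # ily
print(affix_context(words, 2, affix_length=2))  # un
```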
def encode_json_obj(self):

Undocumented

json_tag: str

Undocumented

_affix_length

Undocumented

_min_word_length

Undocumented