class AffixTagger(ContextTagger):
Constructor: AffixTagger(train, model, affix_length, min_stem_length, ...)
A tagger that chooses a token's tag based on a leading or trailing substring of its word string. Note that these substrings are not necessarily "true" morphological affixes. A fixed-length substring of the word is looked up in a table, and the corresponding tag is returned. Affix taggers are typically constructed by training them on a tagged corpus.
Construct a new affix tagger.
| Parameters | |
| affix_length | The length of the affixes that should be considered during training and tagging. Use negative numbers for suffixes. |
| min_stem_length | Any word whose length is less than min_stem_length + abs(affix_length) is assigned a tag of None by this tagger. |
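The lookup key described by these two parameters can be sketched in plain Python. This is a minimal illustration under stated assumptions, not NLTK's own code: `affix_context` is a hypothetical helper name, and its default values are chosen only for the example.

```python
def affix_context(word, affix_length=-3, min_stem_length=2):
    """Return the affix used as the lookup key, or None if the word is too short.

    Hypothetical helper for illustration; not part of the documented API.
    """
    # Words shorter than min_stem_length + abs(affix_length) have no context,
    # so a tagger built on this rule would assign them a tag of None.
    if len(word) < min_stem_length + abs(affix_length):
        return None
    if affix_length > 0:
        return word[:affix_length]   # positive length: leading substring
    return word[affix_length:]       # negative length: trailing substring


print(affix_context("running"))                  # 'ing'
print(affix_context("unhappy", affix_length=2))  # 'un'
print(affix_context("sing"))                     # None, since 4 < 2 + 3
```

The minimum-length check keeps very short words from being keyed on a substring that is most of the word itself.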
| Class Method | decode |
Undocumented |
| Method | __init__ |
No summary |
| Method | context |
No summary |
| Method | encode |
Undocumented |
| Class Variable | json |
Undocumented |
| Instance Variable | _affix |
Undocumented |
| Instance Variable | _min |
Undocumented |
Inherited from ContextTagger:
| Method | __repr__ |
Undocumented |
| Method | choose |
Decide which tag should be used for the specified token, and return that tag. If this tagger is unable to determine a tag for the specified token, return None -- do not consult the backoff tagger. This method should be overridden by subclasses of SequentialBackoffTagger. |
| Method | size |
No summary |
| Method | _train |
Initialize this ContextTagger's _context_to_tag table based on the given training data. In particular, for each context c in the training data, set _context_to_tag[c] to the most frequent tag for that context... |
| Instance Variable | _context_to_tag |
Dictionary mapping contexts to tags. |
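The training rule summarized for _train (map each context to its most frequent tag) might look roughly like the sketch below. It is an assumption-laden simplification: `train_context_table` and `context_of` are hypothetical names, and any handling of the cutoff or backoff parameters is deliberately left out.

```python
from collections import Counter, defaultdict


def train_context_table(tagged_sentences, context_of):
    """Map each context to its most frequent tag (hypothetical helper)."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            context = context_of(word)
            if context is not None:      # skip tokens with no usable context
                counts[context][tag] += 1
    # Keep only the single most frequent tag for each context.
    return {c: tags.most_common(1)[0][0] for c, tags in counts.items()}


train = [
    [("running", "VBG"), ("quickly", "RB")],
    [("singing", "VBG"), ("morning", "NN")],
    [("jumping", "VBG"), ("slowly", "RB")],
]
# Context here is the last three characters of words with at least five.
table = train_context_table(train, lambda w: w[-3:] if len(w) >= 5 else None)
print(table)  # {'ing': 'VBG', 'kly': 'RB', 'wly': 'RB'}
```

In this toy corpus "ing" is seen three times as VBG and once as NN, so the table stores VBG for that context.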
Inherited from SequentialBackoffTagger (via ContextTagger):
| Method | tag |
Determine the most appropriate tag sequence for the given token sequence, and return a corresponding list of tagged tokens. A tagged token is encoded as a tuple (token, tag). |
| Method | tag |
Determine an appropriate tag for the specified token, and return that tag. If this tagger is unable to determine a tag for the specified token, then its backoff tagger is consulted. |
| Property | backoff |
The backoff tagger for this tagger. |
| Instance Variable | _taggers |
A list of all the taggers that should be tried to tag a token (i.e., self and its backoff taggers). |
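The backoff behaviour described in this table (try this tagger first, then consult its backoff taggers in order) can be illustrated with plain functions. This is only a sketch of the idea; `tag_with_backoff`, `suffix_tagger`, and `default_tagger` are hypothetical names, not members of the documented classes.

```python
def tag_with_backoff(word, taggers):
    """Return the first non-None tag produced by an ordered list of taggers."""
    for choose in taggers:
        tag = choose(word)
        if tag is not None:
            return tag
    return None  # no tagger in the chain could decide


def suffix_tagger(word):
    # Toy affix lookup: knows a single three-letter suffix.
    return {"ing": "VBG"}.get(word[-3:])


def default_tagger(word):
    # Toy fallback that always answers.
    return "NN"


chain = [suffix_tagger, default_tagger]
print(tag_with_backoff("running", chain))          # VBG
print(tag_with_backoff("table", chain))            # NN
print(tag_with_backoff("table", [suffix_tagger]))  # None
```

Ordering the list from most specific to most general means the fallback is consulted only when the more specific tagger returns None.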
Inherited from TaggerI (via ContextTagger, SequentialBackoffTagger):
| Method | evaluate |
Score the accuracy of the tagger against the gold standard. Strip the tags from the gold standard text, retag it using the tagger, then compute the accuracy score. |
| Method | tag |
Apply self.tag() to each element of sentences. |
| Method | _check |
Undocumented |
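The evaluation procedure described for evaluate (strip the gold tags, retag, compare) can be sketched as follows. This is a hedged illustration rather than the library's implementation; `accuracy` and `toy_tag` are hypothetical names, and the gold data is made up for the example.

```python
def accuracy(tag_sentence, gold):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for gold_sentence in gold:
        words = [word for word, _ in gold_sentence]   # strip the gold tags
        predicted = tag_sentence(words)               # retag the bare words
        for (_, gold_tag), (_, pred_tag) in zip(gold_sentence, predicted):
            correct += gold_tag == pred_tag
            total += 1
    return correct / total if total else 0.0


def toy_tag(words):
    # Toy tagger: anything ending in "ing" is VBG, everything else is NN.
    return [(w, "VBG" if w.endswith("ing") else "NN") for w in words]


gold = [
    [("running", "VBG"), ("dog", "NN")],
    [("morning", "NN"), ("sings", "VBZ")],
]
# Tagging several sentences is just tagging each one in turn.
tagged = [toy_tag([w for w, _ in sent]) for sent in gold]
print(tagged[0])               # [('running', 'VBG'), ('dog', 'NN')]
print(accuracy(toy_tag, gold)) # 0.5
```

Two of the four tokens are tagged correctly here, which is why the score is 0.5.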
| Parameters | |
| train | Undocumented |
| model | Undocumented |
| affix_length | Undocumented |
| min_stem_length | Undocumented |
| backoff | The backoff tagger that should be used for this tagger. |
| cutoff | Undocumented |
| verbose | Undocumented |
| context | A dictionary mapping contexts to tags. |