class ContextTagger(SequentialBackoffTagger):
Known subclasses: nltk.tag.sequential.AffixTagger, nltk.tag.sequential.NgramTagger
Constructor: ContextTagger(context_to_tag, backoff)
An abstract base class for sequential backoff taggers that choose a tag for a token based on the value of its "context". Different subclasses are used to define different contexts.
A ContextTagger chooses the tag for a token by calculating the token's context and looking up the corresponding tag in a table. This table can be constructed manually, or it can be constructed automatically from a training corpus using the _train() factory method.
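The lookup described above can be sketched in plain Python. This is a minimal illustration and not NLTK's implementation: the table contents and the standalone functions `context`, `choose_tag`, and `tag` are hypothetical, and the context is assumed to be the word itself.

```python
from typing import Hashable, List, Optional, Sequence, Tuple

# Hypothetical, manually constructed context-to-tag table.
# In this toy example the "context" of a token is simply the token itself.
context_to_tag = {"the": "DT", "cat": "NN", "sat": "VBD"}


def context(tokens: Sequence[str], index: int,
            history: Sequence[Optional[str]]) -> Optional[Hashable]:
    """Return the lookup key for tokens[index] (here: the word itself)."""
    return tokens[index]


def choose_tag(tokens: Sequence[str], index: int,
               history: Sequence[Optional[str]]) -> Optional[str]:
    """Compute the context and look it up; None means 'no decision'."""
    return context_to_tag.get(context(tokens, index, history))


def tag(tokens: Sequence[str]) -> List[Tuple[str, Optional[str]]]:
    """Tag left to right, recording the tags chosen so far in history."""
    history: List[Optional[str]] = []
    for index in range(len(tokens)):
        history.append(choose_tag(tokens, index, history))
    return list(zip(tokens, history))


result = tag(["the", "cat", "sat", "down"])
print(result)
```

Tokens whose context is missing from the table ("down" here) come back with a tag of None in this sketch, since no backoff tagger is involved.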
| Method | __init__ | No summary |
| Method | __repr__ | Undocumented |
| Method | choose_tag | Decide which tag should be used for the specified token, and return that tag. If this tagger is unable to determine a tag for the specified token, return None -- do not consult the backoff tagger. This method should be overridden by subclasses of SequentialBackoffTagger. |
| Method | context | No summary |
| Method | size | No summary |
| Method | _train | Initialize this ContextTagger's _context_to_tag table based on the given training data. In particular, for each context c in the training data, set _context_to_tag[c] to the most frequent tag for that context... |
| Instance Variable | _context_to_tag | Dictionary mapping contexts to tags. |
Inherited from SequentialBackoffTagger:
| Method | tag | Determine the most appropriate tag sequence for the given token sequence, and return a corresponding list of tagged tokens. A tagged token is encoded as a tuple (token, tag). |
| Method | tag_one | Determine an appropriate tag for the specified token, and return that tag. If this tagger is unable to determine a tag for the specified token, then its backoff tagger is consulted. |
| Property | backoff | The backoff tagger for this tagger. |
| Instance Variable | _taggers | A list of all the taggers that should be tried to tag a token (i.e., self and its backoff taggers). |
Inherited from TaggerI (via SequentialBackoffTagger):
| Method | evaluate | Score the accuracy of the tagger against the gold standard. Strip the tags from the gold standard text, retag it using the tagger, then compute the accuracy score. |
| Method | tag_sents | Apply self.tag() to each element of sentences. |
| Method | _check_params | Undocumented |
Overridden in nltk.tag.sequential.AffixTagger, nltk.tag.sequential.NgramTagger
| Parameters | |
| context_to_tag | A dictionary mapping contexts to tags. |
| backoff | The backoff tagger that should be used for this tagger. |
Decide which tag should be used for the specified token, and return that tag. If this tagger is unable to determine a tag for the specified token, return None -- do not consult the backoff tagger. This method should be overridden by subclasses of SequentialBackoffTagger.
| Parameters | |
| tokens:list | The list of words that are being tagged. |
| index:int | The index of the word whose tag should be returned. |
| history:list(str) | A list of the tags for all words before index. |
| Returns | |
| str | Undocumented |
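The division of labor described above, where the tag-choosing method returns None on its own and a separate step consults the backoff tagger, can be sketched as follows. `TableTagger` is a hypothetical stand-in written for this illustration, not the NLTK class; it assumes the context is the word itself and keeps a list of taggers in the order they should be tried.

```python
class TableTagger:
    """Toy stand-in for a context tagger with a backoff chain (not NLTK)."""

    def __init__(self, context_to_tag, backoff=None):
        self._context_to_tag = context_to_tag
        # self first, followed by every tagger in the backoff chain
        self._taggers = [self]
        if backoff is not None:
            self._taggers += backoff._taggers

    def choose_tag(self, tokens, index, history):
        # Look only in this tagger's own table; never consult the backoff here.
        return self._context_to_tag.get(tokens[index])

    def tag_one(self, tokens, index, history):
        # Try each tagger in order until one of them returns a tag.
        for tagger in self._taggers:
            tag = tagger.choose_tag(tokens, index, history)
            if tag is not None:
                return tag
        return None


fallback = TableTagger({"down": "RP", "the": "XX"})
primary = TableTagger({"the": "DT"}, backoff=fallback)

tokens = ["the", "down"]
own_answer = primary.choose_tag(tokens, 1, ["DT"])   # primary has no entry for "down"
with_backoff = primary.tag_one(tokens, 1, ["DT"])    # falls through to the fallback
first_wins = primary.tag_one(tokens, 0, [])          # primary's own entry takes priority
```

In this sketch the primary tagger's entry for "the" takes priority over the fallback's, because taggers are tried in order and the first non-None answer is returned.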
Overridden in nltk.tag.sequential.AffixTagger, nltk.tag.sequential.NgramTagger
| Returns | |
| (hashable) | The context that should be used to look up the tag for the specified token; or None if the specified token should not be handled by this tagger. |
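To make the idea of a context concrete, here are three possible context functions written as standalone Python. They are illustrative assumptions only: the function names, the `length` and `min_stem_length` parameters, and their defaults are hypothetical, and real subclasses may define their contexts differently. Each returns a hashable key, or None when the token should not be handled.

```python
def unigram_context(tokens, index, history):
    """Context is the current word alone."""
    return tokens[index]


def bigram_context(tokens, index, history):
    """Context is the previous tag (a tuple, empty at sentence start) plus the current word."""
    previous_tags = tuple(history[max(0, index - 1):index])
    return (previous_tags, tokens[index])


def suffix_context(tokens, index, history, length=3, min_stem_length=2):
    """Context is the last `length` characters; None for words that are too short."""
    token = tokens[index]
    if len(token) < length + min_stem_length:
        return None  # this token should not be handled by a suffix-based tagger
    return token[-length:]


tokens = ["the", "dog", "running"]
history = ["DT", "NN"]

c1 = unigram_context(tokens, 1, history[:1])
c2 = bigram_context(tokens, 1, history[:1])
c3 = bigram_context(tokens, 0, [])
c4 = suffix_context(tokens, 2, history)
c5 = suffix_context(tokens, 1, history[:1])
```

All three return values are usable as dictionary keys, which is what allows them to index a context-to-tag table.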
Initialize this ContextTagger's _context_to_tag table based on the given training data. In particular, for each context c in the training data, set _context_to_tag[c] to the most frequent tag for that context. However, exclude any contexts that are already tagged perfectly by the backoff tagger(s).
The old value of self._context_to_tag (if any) is discarded.
| Parameters | |
| tagged_corpus | A tagged corpus. Each item should be a list of (word, tag) tuples. |
| cutoff | If the most likely tag for a context occurs fewer than cutoff times, then exclude it from the context-to-tag table for the new tagger. |
| verbose | Undocumented |
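The training procedure described above can be sketched as a standalone function. This is a simplified illustration, not NLTK's code: `train_table`, `context_fn`, and `backoff_tag` are hypothetical names, the backoff is modeled as a plain callable, and the cutoff test follows the parameter description above (a context is dropped when its most frequent tag occurs fewer than `cutoff` times).

```python
from collections import Counter, defaultdict


def train_table(tagged_corpus, context_fn, backoff_tag=None, cutoff=0):
    """Build a context-to-tag table from sentences of (word, tag) tuples."""
    counts = defaultdict(Counter)  # context -> Counter of observed tags
    useful = set()                 # contexts the backoff gets wrong at least once

    for sentence in tagged_corpus:
        tokens = [word for word, _ in sentence]
        tags = [tag for _, tag in sentence]
        for index, tag in enumerate(tags):
            history = tags[:index]
            ctx = context_fn(tokens, index, history)
            if ctx is None:
                continue
            counts[ctx][tag] += 1
            # With no backoff, the guess is None and every context counts as useful.
            guess = backoff_tag(tokens, index, history) if backoff_tag else None
            if guess != tag:
                useful.add(ctx)

    table = {}
    for ctx in useful:
        best_tag, hits = counts[ctx].most_common(1)[0]
        if hits >= cutoff:  # exclude contexts whose best tag is too rare
            table[ctx] = best_tag
    return table


def word_context(tokens, index, history):
    return tokens[index]


corpus = [
    [("the", "DT"), ("run", "NN")],
    [("the", "DT"), ("run", "VB")],
    [("a", "DT"), ("run", "NN")],
]

plain = train_table(corpus, word_context)
with_cutoff = train_table(corpus, word_context, cutoff=2)
# Hypothetical backoff that tags everything "DT": it already handles "the" and "a".
with_backoff = train_table(corpus, word_context, backoff_tag=lambda t, i, h: "DT")
```

Contexts that the backoff already tags correctly every time ("the" and "a" in the last call) are left out of the table, which keeps it smaller without changing the tags the combined chain would produce.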