class RegexpChunkRule(object):
Known subclasses: nltk.chunk.regexp.ChunkRule, nltk.chunk.regexp.ChunkRuleWithContext, nltk.chunk.regexp.ExpandLeftRule, nltk.chunk.regexp.ExpandRightRule, nltk.chunk.regexp.MergeRule, nltk.chunk.regexp.SplitRule, nltk.chunk.regexp.StripRule, nltk.chunk.regexp.UnChunkRule
Constructor: RegexpChunkRule(regexp, repl, descr)
A rule specifying how to modify the chunking in a ChunkString, using a transformational regular expression. The RegexpChunkRule class itself can be used to implement any transformational rule based on regular expressions. There are also a number of subclasses, which can be used to implement simpler types of rules, based on matching regular expressions.
Each RegexpChunkRule has a regular expression and a replacement expression. When a RegexpChunkRule is "applied" to a ChunkString, it searches the ChunkString for any substring that matches the regular expression, and replaces it using the replacement expression. This search/replace operation has the same semantics as re.sub.
Each RegexpChunkRule also has a description string, which gives a short (typically fewer than 75 characters) description of the purpose of the rule.
The transformation defined by a RegexpChunkRule should only add and remove braces; it should not modify the sequence of angle-bracket-delimited tags. Furthermore, the transformation must not produce nested or mismatched bracketing.
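The search/replace semantics described above can be illustrated with the standard `re` module alone. This is a minimal sketch, not NLTK code: the plain string `encoded` stands in for the text a ChunkString would hold, and the pattern, the replacement, and the helper `tags_only` are hypothetical names chosen for illustration.

```python
import re

# Assumption: a plain str stands in for the encoded form of a ChunkString,
# i.e. a sequence of angle-bracketed tags with braces marking chunks.
encoded = "<DT><NN><VBD><IN><DT><NN>"

# Hypothetical rule: wrap every determiner + noun pair in braces.
regexp = re.compile(r"(<DT><NN>)")
repl = r"{\1}"

# Applying the rule is an ordinary re.sub over the encoded string.
chunked = re.sub(regexp, repl, encoded)
print(chunked)  # {<DT><NN>}<VBD><IN>{<DT><NN>}


def tags_only(s):
    """Strip braces so that only the tag sequence remains."""
    return re.sub(r"[{}]", "", s)


# A well-behaved rule adds or removes braces but leaves the tags untouched.
print(tags_only(chunked) == encoded)  # True
```

Comparing the brace-free strings before and after the substitution is one simple way to confirm that a rule changed only the bracketing.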
| Static Method | fromstring | Create a RegexpChunkRule from a string description. |
| Method | __init__ | Construct a new RegexpChunkRule. |
| Method | __repr__ | Return a string representation of this rule. |
| Method | apply | Apply this rule to the given ChunkString. |
| Method | descr | Return a short description of the purpose and/or effect of this rule. |
| Instance Variable | _descr | Undocumented |
| Instance Variable | _regexp | Undocumented |
| Instance Variable | _repl | Undocumented |
Create a RegexpChunkRule from a string description. Currently, the following formats are supported:
{regexp} # chunk rule
}regexp{ # strip rule
regexp}{regexp # split rule
regexp{}regexp # merge rule
Where regexp is a regular expression for the rule. Any text following the comment marker (#) will be used as the rule's description:
>>> from nltk.chunk.regexp import RegexpChunkRule
>>> RegexpChunkRule.fromstring('{<DT>?<NN.*>+}')
<ChunkRule: '<DT>?<NN.*>+'>
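The four formats listed above differ only in where the braces sit, so the dispatch can be sketched without NLTK. The function `classify_rule` below is a hypothetical helper written for illustration, not the library's implementation; it assumes the description starts at the first `#` and does not handle an escaped `\#`.

```python
def classify_rule(s):
    """Return (kind, parts, description) for a rule string.

    Simplified sketch: everything after the first '#' is the description.
    """
    rule, _, comment = s.partition("#")
    rule, comment = rule.strip(), comment.strip()

    if rule.startswith("{") and rule.endswith("}"):
        return "chunk", (rule[1:-1],), comment
    if rule.startswith("}") and rule.endswith("{"):
        return "strip", (rule[1:-1],), comment
    if "}{" in rule:
        left, right = rule.split("}{", 1)
        return "split", (left, right), comment
    if "{}" in rule:
        left, right = rule.split("{}", 1)
        return "merge", (left, right), comment
    raise ValueError("unsupported rule format: %r" % rule)


print(classify_rule("{<DT>?<NN.*>+} # noun phrase"))
print(classify_rule("<NN>}{<DT>"))
```

The order of the checks matters: the brace-wrapped forms are tested first so that a chunk or strip rule is never mistaken for a split or merge rule.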
Overridden in: nltk.chunk.regexp.ChunkRule, nltk.chunk.regexp.ChunkRuleWithContext, nltk.chunk.regexp.ExpandLeftRule, nltk.chunk.regexp.ExpandRightRule, nltk.chunk.regexp.MergeRule, nltk.chunk.regexp.SplitRule, nltk.chunk.regexp.StripRule, nltk.chunk.regexp.UnChunkRule
Construct a new RegexpChunkRule.
| Parameters | |
| regexp:regexp or str | The regular expression for this RegexpChunkRule. When this rule is applied to a ChunkString, any substring that matches regexp will be replaced using the replacement string repl. Note that this must be a normal regular expression, not a tag pattern. |
| repl:str | The replacement expression for this RegexpChunkRule. When this rule is applied to a ChunkString, any substring that matches regexp will be replaced using repl. |
| descr:str | A short description of the purpose and/or effect of this rule. |
Overridden in: nltk.chunk.regexp.ChunkRule, nltk.chunk.regexp.ChunkRuleWithContext, nltk.chunk.regexp.ExpandLeftRule, nltk.chunk.regexp.ExpandRightRule, nltk.chunk.regexp.MergeRule, nltk.chunk.regexp.SplitRule, nltk.chunk.regexp.StripRule, nltk.chunk.regexp.UnChunkRule
Return a string representation of this rule. It has the form:
<RegexpChunkRule: '{<IN|VB.*>}'->'<IN>'>
Note that this representation does not include the description string; that string can be accessed separately with the descr() method.
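The constructor parameters and the representation described above can be mirrored in a small stand-in class. `SketchRule` is a hypothetical name used only for illustration and is not part of NLTK; the sketch assumes that a string pattern is compiled on construction and that the representation shows only the pattern and the replacement.

```python
import re


class SketchRule:
    """Hypothetical stand-in for illustration; not NLTK's class."""

    def __init__(self, regexp, repl, descr):
        # Accept either a pattern string or an already compiled pattern.
        if isinstance(regexp, str):
            regexp = re.compile(regexp)
        self._regexp = regexp
        self._repl = repl
        self._descr = descr

    def descr(self):
        # The description is exposed separately from the representation.
        return self._descr

    def __repr__(self):
        return "<SketchRule: %r->%r>" % (self._regexp.pattern, self._repl)


rule = SketchRule(r"{<IN|VB.*>}", "<IN>", "hypothetical description")
print(repr(rule))    # <SketchRule: '{<IN|VB.*>}'->'<IN>'>
print(rule.descr())  # hypothetical description
```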
| Returns | |
| str | Undocumented |
Apply this rule to the given ChunkString. See the class reference documentation for a description of what it means to apply a rule.
| Parameters | |
| chunkstr:ChunkString | The chunkstring to which this rule is applied. |
| Returns | |
| None | Undocumented |
| Raises | |
| ValueError | If this transformation generated an invalid chunkstring. |
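The validity condition behind the ValueError can be sketched with plain strings. The function `apply_rule` below is a hypothetical helper, not NLTK's `apply`: it assumes validation runs immediately after the substitution, and it returns the new string rather than updating a ChunkString object.

```python
import re


def apply_rule(regexp, repl, encoded):
    """Apply a search/replace rule and reject invalid results.

    Sketch only: `encoded` is a plain str of angle-bracketed tags with
    braces marking chunks.
    """
    result = re.sub(regexp, repl, encoded)

    # The tag sequence must be unchanged once braces are ignored.
    if re.sub(r"[{}]", "", result) != re.sub(r"[{}]", "", encoded):
        raise ValueError("rule modified the tag sequence: %r" % result)

    # Braces must alternate strictly as '{' then '}', so chunks are
    # balanced and never nested.
    brackets = re.sub(r"[^{}]", "", result)
    if re.fullmatch(r"(\{\})*", brackets) is None:
        raise ValueError("nested or mismatched bracketing: %r" % result)

    return result


print(apply_rule(r"(<DT><NN>)", r"{\1}", "<DT><NN><VBD>"))  # {<DT><NN>}<VBD>

try:
    # Opening a second chunk inside an existing one is rejected.
    apply_rule(r"<NN>", r"{<NN>", "{<DT><NN>}")
except ValueError as err:
    print("rejected:", err)
```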