class documentation

A rule specifying how to modify the chunking in a ChunkString, using a transformational regular expression. The RegexpChunkRule class itself can be used to implement any transformational rule based on regular expressions. There are also a number of subclasses, which can be used to implement simpler types of rules, based on matching regular expressions.

Each RegexpChunkRule has a regular expression and a replacement expression. When a RegexpChunkRule is "applied" to a ChunkString, it searches the ChunkString for any substring that matches the regular expression, and replaces it using the replacement expression. This search/replace operation has the same semantics as re.sub.
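The search-and-replace step can be illustrated with re.sub alone. This is a minimal sketch rather than NLTK's implementation: a plain str stands in for a ChunkString, and the chunk string, pattern, and replacement are made-up values chosen only for illustration.

```python
import re

# Plain-string stand-in for a ChunkString (an assumption for this sketch):
# one angle-bracketed tag per token, with no chunks marked yet.
chunk_string = "<DT><JJ><NN><VBD><IN><DT><NN>"

# Illustrative regular expression: a determiner, any adjectives, then a noun.
regexp = re.compile(r"(<DT>(?:<JJ>)*<NN>)")

# Illustrative replacement expression: wrap the matched tags in braces.
repl = r"{\1}"

# Every non-overlapping match is replaced, exactly as re.sub does.
result = regexp.sub(repl, chunk_string)
print(result)
```

Both noun-phrase-like tag runs end up wrapped in braces, while the tags themselves stay in their original order.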

Each RegexpChunkRule also has a description string, which gives a short (typically fewer than 75 characters) description of the purpose of the rule.

The transformation defined by a RegexpChunkRule should only add and remove braces; it should not modify the sequence of angle-bracket delimited tags. Furthermore, the transformation may not result in nested or mismatched bracketing.
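The two constraints can be sketched as a pair of checks on the string before and after a transformation. The helper names below are hypothetical and are not part of NLTK, which performs its own validation when a rule is applied; plain strings again stand in for a ChunkString.

```python
import re


def tags_only(s):
    """Return the string with all braces removed, leaving only the tags."""
    return re.sub(r"[{}]", "", s)


def braces_are_flat(s):
    """Return True if braces are balanced and never nested."""
    depth = 0
    for ch in s:
        if ch == "{":
            depth += 1
            if depth > 1:  # a chunk opened inside another chunk
                return False
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a close brace with no matching open brace
                return False
    return depth == 0


def is_valid_transformation(before, after):
    """Check that only braces changed and that the bracketing is well formed."""
    return tags_only(before) == tags_only(after) and braces_are_flat(after)


print(is_valid_transformation("<DT><NN><VBD>", "{<DT><NN>}<VBD>"))
print(is_valid_transformation("<DT><NN><VBD>", "{<DT>{<NN>}}<VBD>"))
```

The first call describes an acceptable transformation; the second is rejected because it nests one chunk inside another.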

Static Method fromstring Create a RegexpChunkRule from a string description.
Method __init__ Construct a new RegexpChunkRule.
Method __repr__ Return a string representation of this rule.
Method apply Apply this rule to the given ChunkString. See the class reference documentation for a description of what it means to apply a rule.
Method descr Return a short description of the purpose and/or effect of this rule.
Instance Variable _descr Undocumented
Instance Variable _regexp Undocumented
Instance Variable _repl Undocumented
@staticmethod
def fromstring(s):

Create a RegexpChunkRule from a string description. Currently, the following formats are supported:

{regexp}         # chunk rule
}regexp{         # strip rule
regexp}{regexp   # split rule
regexp{}regexp   # merge rule

Where regexp is a regular expression for the rule. Any text following the comment marker (#) will be used as the rule's description:

>>> from nltk.chunk.regexp import RegexpChunkRule
>>> RegexpChunkRule.fromstring('{<DT>?<NN.*>+}')
<ChunkRule: '<DT>?<NN.*>+'>
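To make the four formats concrete, the following sketch classifies a rule string by the position of its braces and separates out the trailing description. It is a simplified illustration, not the parser NLTK uses: classify_rule is a hypothetical helper, and it assumes the pattern itself contains no '#' characters or brace quantifiers.

```python
def classify_rule(s):
    """Return (kind, pattern, description) for a rule string.

    Hypothetical helper for illustration only; it assumes '#' appears
    solely as the comment marker and that braces mark only the rule type.
    """
    rule, _, comment = s.partition("#")
    rule = rule.strip()
    descr = comment.strip()

    if rule.startswith("{") and rule.endswith("}"):
        return ("chunk", rule[1:-1], descr)
    if rule.startswith("}") and rule.endswith("{"):
        return ("strip", rule[1:-1], descr)
    if "}{" in rule:
        left, right = rule.split("}{", 1)
        return ("split", (left, right), descr)
    if "{}" in rule:
        left, right = rule.split("{}", 1)
        return ("merge", (left, right), descr)
    raise ValueError("unrecognized rule format: %r" % s)


for text in [
    "{<DT>?<NN.*>+}   # chunk determiners and nouns",
    "}<VB.*>{         # strip verbs",
    "<NN>}{<DT>       # split between a noun and a determiner",
    "<NN>{}<NN>       # merge adjacent nouns",
]:
    print(classify_rule(text))
```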
def __init__(self, regexp, repl, descr):

Construct a new RegexpChunkRule.

Parameters
regexp (regexp or str): The regular expression for this RegexpChunkRule. When this rule is applied to a ChunkString, any substring that matches regexp will be replaced using the replacement string repl. Note that this must be a normal regular expression, not a tag pattern.
repl (str): The replacement expression for this RegexpChunkRule. When this rule is applied to a ChunkString, any substring that matches regexp will be replaced using repl.
descr (str): A short description of the purpose and/or effect of this rule.
def __repr__(self):

Return a string representation of this rule. It has the form:

<RegexpChunkRule: '{<IN|VB.*>}'->'<IN>'>

Note that this representation does not include the description string; that string can be accessed separately with the descr() method.

Returns
str: Undocumented
def apply(self, chunkstr):

Apply this rule to the given ChunkString. See the class reference documentation for a description of what it means to apply a rule.

Parameters
chunkstr (ChunkString): The ChunkString to which this rule is applied.
Returns
None: Undocumented
Raises
ValueError: If this transformation generated an invalid ChunkString.
def descr(self):

Return a short description of the purpose and/or effect of this rule.

Returns
str: Undocumented
