A finite state machine specialized for regular-expression-based text filters, this module defines the following classes:
- StateMachine, a state machine
- State, a state superclass
- StateMachineWS, a whitespace-sensitive version of StateMachine
- StateWS, a state superclass for use with StateMachineWS
- SearchStateMachine, uses re.search() instead of re.match()
- SearchStateMachineWS, uses re.search() instead of re.match()
- ViewList, extends standard Python lists.
- StringList, string-specific ViewList.
Exception classes:
- StateMachineError
- UnknownStateError
- DuplicateStateError
- UnknownTransitionError
- DuplicateTransitionError
- TransitionPatternNotFound
- TransitionMethodNotFound
- UnexpectedIndentationError
- TransitionCorrection: Raised to switch to another transition.
- StateCorrection: Raised to switch to another state & transition.
Functions:
- string2lines(): split a multi-line string into a list of one-line strings
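The only difference named above between the Search* variants and the base classes is the `re` function applied to each line. As a small illustration using only the standard library (the pattern and input line are made up for this example):

```python
import re

pattern = re.compile(r"\d+")
line = "item 42"

# re.match() only succeeds if the pattern matches at the start of the line.
at_start = pattern.match(line)

# re.search() scans the whole line and returns the first match anywhere.
anywhere = pattern.search(line)

matched_text = anywhere.group() if anywhere else None
```

Here `at_start` is `None` because the line does not begin with a digit, while `anywhere` finds the number later in the line.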
How To Use This Module
(See the individual classes, methods, and attributes for details.)
1. Import it: import statemachine or from statemachine import .... You will also need to import re.

2. Derive a subclass of State (or StateWS) for each state in your state machine:

       class MyState(statemachine.State):

   Within the state's class definition:

   a) Include a pattern for each transition, in State.patterns:

          patterns = {'atransition': r'pattern', ...}

   b) Include a list of initial transitions to be set up automatically, in State.initial_transitions:

          initial_transitions = ['atransition', ...]

   c) Define a method for each transition, with the same name as the transition pattern:

          def atransition(self, match, context, next_state):
              # do something
              result = [...]  # a list
              return context, next_state, result
              # context, next_state may be altered

      Transition methods may raise an EOFError to cut processing short.

   d) You may wish to override the State.bof() and/or State.eof() implicit transition methods, which handle the beginning and end of the file.

   e) In order to handle nested processing, you may wish to override the attributes State.nested_sm and/or State.nested_sm_kwargs.

      If you are using StateWS as a base class, in order to handle nested indented blocks, you may wish to:

      - override the attributes StateWS.indent_sm, StateWS.indent_sm_kwargs, StateWS.known_indent_sm, and/or StateWS.known_indent_sm_kwargs;
      - override the StateWS.blank() method; and/or
      - override or extend the StateWS.indent(), StateWS.known_indent(), and/or StateWS.first_known_indent() methods.

3. Create a state machine object:

       sm = StateMachine(state_classes=[MyState, ...], initial_state='MyState')

4. Obtain the input text, which needs to be converted into a tab-free list of one-line strings. For example, to read text from a file called 'inputfile':

       with open('inputfile') as f:
           input_string = f.read()
       input_lines = statemachine.string2lines(input_string)

5. Run the state machine on the input text and collect the results, a list:

       results = sm.run(input_lines)

6. Remove any lingering circular references:

       sm.unlink()
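To make the transition mechanics in the steps above concrete, here is a minimal, self-contained sketch of the idea. It deliberately does not import or reproduce this module: `CountingState` and `run_states` are hypothetical names invented for the illustration, and the sketch assumes a simplified driver with no `bof()`/`eof()` handling, no nested machines, and no whitespace handling.

```python
import re


class CountingState:
    """Hypothetical stand-in for a State subclass (not the module's class)."""

    # Transition name -> regular expression, in the spirit of State.patterns.
    patterns = {"number": r"\d+", "word": r"[A-Za-z]+"}
    # Order in which transitions are tried, like State.initial_transitions.
    initial_transitions = ["number", "word"]

    def number(self, match, context, next_state):
        context["numbers"] += 1
        return context, next_state, [int(match.group())]

    def word(self, match, context, next_state):
        return context, next_state, [match.group().upper()]


def run_states(state_classes, initial_state, lines):
    """Toy driver (an assumption): try each transition on each line, in order."""
    states = {cls.__name__: cls() for cls in state_classes}
    compiled = {
        name: {t: re.compile(p) for t, p in state.patterns.items()}
        for name, state in states.items()
    }
    current, context, results = initial_state, {"numbers": 0}, []
    for line in lines:
        state = states[current]
        for transition in state.initial_transitions:
            match = compiled[current][transition].match(line)
            if match:
                # The method shares its name with the transition pattern.
                method = getattr(state, transition)
                context, current, result = method(match, context, current)
                results.extend(result)
                break  # the first matching transition wins
    return results, context


results, context = run_states(
    [CountingState], "CountingState", ["12 apples", "pears", "", "7"]
)
```

Each transition method receives the match object, the current context, and the name of the next state, and hands back a possibly altered context and state name together with a list of results, which the driver concatenates.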
| Kind | Name | Summary |
|---|---|---|
| Class | SearchStateMachine | StateMachine which uses re.search() instead of re.match(). |
| Class | SearchStateMachineWS | StateMachineWS which uses re.search() instead of re.match(). |
| Class | State | State superclass. Contains a list of transitions, and transition methods. |
| Class | StateMachine | A finite state machine for text filters using regular expressions. |
| Class | StateMachineWS | StateMachine subclass specialized for whitespace recognition. |
| Class | StateWS | State superclass specialized for whitespace (blank lines & indents). |
| Class | StringList | A ViewList with string-specific methods. |
| Class | ViewList | List with extended functionality: slices of ViewList objects are child lists, linked to their parents. Changes made to a child list also affect the parent list. A child list is effectively a "view" (in the SQL sense) of the parent list... |
| Exception | DuplicateStateError | Undocumented |
| Exception | DuplicateTransitionError | Undocumented |
| Exception | StateCorrection | Raise from within a transition method to switch to another state. |
| Exception | StateMachineError | Undocumented |
| Exception | TransitionCorrection | Raise from within a transition method to switch to another transition. |
| Exception | TransitionMethodNotFound | Undocumented |
| Exception | TransitionPatternNotFound | Undocumented |
| Exception | UnexpectedIndentationError | Undocumented |
| Exception | UnknownStateError | Undocumented |
| Exception | UnknownTransitionError | Undocumented |
| Function | string2lines | Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped. |
| Class | _ | Mix-in class to override StateMachine regular expression behavior. |
| Function | _exception | Return exception information: |
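The parent/child linkage described for ViewList above can be pictured with a small sketch. This is a hypothetical miniature written for illustration only: `LinkedList` and its attributes are assumed names, not ViewList's actual API. It covers only contiguous slices and item assignment.

```python
class LinkedList:
    """Hypothetical miniature of the 'view' idea; not ViewList's real API."""

    def __init__(self, items, parent=None, parent_offset=0):
        self.data = list(items)
        self.parent = parent
        self.parent_offset = parent_offset

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self.data))
            if step != 1:
                raise ValueError("only contiguous slices can be views")
            # The child holds its own items but remembers where they came from.
            return LinkedList(self.data[start:stop], parent=self,
                              parent_offset=start)
        return self.data[index]

    def __setitem__(self, index, value):
        if index < 0:
            index += len(self.data)  # normalise negative indexes
        self.data[index] = value
        if self.parent is not None:
            # Propagate the change upward, shifted by the child's offset.
            self.parent[index + self.parent_offset] = value


parent = LinkedList(["a", "b", "c", "d"])
child = parent[1:3]          # view of items 1 and 2
child[-1] = "C"              # also updates parent[2]
grandchild = child[0:1]      # a view of a view
grandchild[0] = "B"          # updates child[0] and parent[1]
```

Because every child records its parent and offset, a change made two levels down still reaches the top-level list.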
Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped.
Each tab is expanded with between 1 and tab_width spaces, so that the
next character's index becomes a multiple of tab_width (8 by default).
Parameters:
- astring: a multi-line string.
- tab_width: the number of columns between tab stops.
- convert_whitespace: convert form feeds and vertical tabs to spaces?
- whitespace: pattern object with the to-be-converted whitespace characters (default [\v\f]).
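The documented behaviour can be approximated with standard string methods. The following is a sketch under assumptions, not the module's source: `split_tab_free_lines` is a hypothetical name, and the parameter defaults simply mirror the description above.

```python
import re


def split_tab_free_lines(astring, tab_width=8, convert_whitespace=False,
                         whitespace=re.compile("[\v\f]")):
    """Hypothetical re-creation of the described behaviour."""
    if convert_whitespace:
        # Replace form feeds and vertical tabs with spaces before splitting.
        astring = whitespace.sub(" ", astring)
    # splitlines() drops the newlines; expandtabs() pads each tab to the
    # next multiple of tab_width; rstrip() removes trailing whitespace.
    return [line.expandtabs(tab_width).rstrip()
            for line in astring.splitlines()]


lines = split_tab_free_lines("a\tb\nxy\tz  \n", tab_width=4)
converted = split_tab_free_lines("a\fb", convert_whitespace=True)
```

With `tab_width=4`, the tab after `a` becomes three spaces and the tab after `xy` becomes two, so the character following each tab lands on column 4.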