A finite state machine specialized for regular-expression-based text filters, this module defines the following classes:
- StateMachine, a state machine
- State, a state superclass
- StateMachineWS, a whitespace-sensitive version of StateMachine
- StateWS, a state superclass for use with StateMachineWS
- SearchStateMachine, uses re.search() instead of re.match()
- SearchStateMachineWS, uses re.search() instead of re.match()
- ViewList, extends standard Python lists
- StringList, string-specific ViewList
Exception classes:
- StateMachineError
- UnknownStateError
- DuplicateStateError
- UnknownTransitionError
- DuplicateTransitionError
- TransitionPatternNotFound
- TransitionMethodNotFound
- UnexpectedIndentationError
- TransitionCorrection: Raised to switch to another transition.
- StateCorrection: Raised to switch to another state & transition.
Functions:
- string2lines(): split a multi-line string into a list of one-line strings
How To Use This Module
(See the individual classes, methods, and attributes for details.)
Import it: import statemachine or from statemachine import .... You will also need to import re.
Derive a subclass of State (or StateWS) for each state in your state machine:
    class MyState(statemachine.State):
Within the state's class definition:
- Include a pattern for each transition, in State.patterns:
      patterns = {'atransition': r'pattern', ...}
- Include a list of initial transitions to be set up automatically, in State.initial_transitions:
      initial_transitions = ['atransition', ...]
- Define a method for each transition, with the same name as the transition pattern:
      def atransition(self, match, context, next_state):
          # do something
          result = [...]  # a list
          return context, next_state, result
          # context, next_state may be altered
Transition methods may raise an EOFError to cut processing short.
- You may wish to override the State.bof() and/or State.eof() implicit transition methods, which handle the beginning- and end-of-file.
- In order to handle nested processing, you may wish to override the attributes State.nested_sm and/or State.nested_sm_kwargs.
- If you are using StateWS as a base class, in order to handle nested indented blocks, you may wish to:
  - override the attributes StateWS.indent_sm, StateWS.indent_sm_kwargs, StateWS.known_indent_sm, and/or StateWS.known_indent_sm_kwargs;
  - override the StateWS.blank() method; and/or
  - override or extend the StateWS.indent(), StateWS.known_indent(), and/or StateWS.firstknown_indent() methods.
Create a state machine object:
    sm = StateMachine(state_classes=[MyState, ...], initial_state='MyState')
Obtain the input text, which needs to be converted into a tab-free list of one-line strings. For example, to read text from a file called 'inputfile':
    input_string = open('inputfile').read()
    input_lines = statemachine.string2lines(input_string)
Run the state machine on the input text and collect the results, a list:
    results = sm.run(input_lines)
Remove any lingering circular references:
    sm.unlink()
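The steps above can be put together in one short script. This is a minimal sketch, not a canonical usage: it assumes the module is importable as docutils.statemachine (the text above writes it as plain `statemachine`), and the state name `Body` and the transition names `comment` and `text` are hypothetical, chosen only for illustration.

```python
# Assumption: the module is available as part of an installed Docutils.
from docutils import statemachine


class Body(statemachine.State):
    """Hypothetical state: drop '#' comment lines, upper-case the rest."""

    # One regular expression per transition name.
    patterns = {'comment': r'#', 'text': r''}
    # Transitions are tried in this order; r'' matches any line.
    initial_transitions = ['comment', 'text']

    def comment(self, match, context, next_state):
        # Comment lines contribute nothing to the results.
        return context, next_state, []

    def text(self, match, context, next_state):
        # match.string is the full input line that was matched.
        return context, next_state, [match.string.upper()]


sm = statemachine.StateMachine(state_classes=[Body], initial_state='Body')
input_lines = statemachine.string2lines(
    "# a comment\nfirst line\nsecond line\n")
results = sm.run(input_lines)   # list built from each transition's result
sm.unlink()                     # break circular references
print(results)
```

Because both transitions return `next_state` unchanged, the machine stays in `Body` for the whole input; a multi-state filter would return a different state name from some of its transition methods.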
Class | SearchStateMachine | StateMachine which uses re.search() instead of re.match(). |
Class | SearchStateMachineWS | StateMachineWS which uses re.search() instead of re.match(). |
Class | State | State superclass. Contains a list of transitions, and transition methods. |
Class | StateMachine | A finite state machine for text filters using regular expressions. |
Class | StateMachineWS | StateMachine subclass specialized for whitespace recognition. |
Class | StateWS | State superclass specialized for whitespace (blank lines & indents). |
Class | StringList | A ViewList with string-specific methods. |
Class | ViewList | List with extended functionality: slices of ViewList objects are child lists, linked to their parents. Changes made to a child list also affect the parent list. A child list is effectively a "view" (in the SQL sense) of the parent list... |
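Exception | DuplicateStateError | Undocumented |
Exception | DuplicateTransitionError | Undocumented |
Exception | StateCorrection | Raise from within a transition method to switch to another state. |
Exception | StateMachineError | Undocumented |
Exception | TransitionCorrection | Raise from within a transition method to switch to another transition. |
Exception | TransitionMethodNotFound | Undocumented |
Exception | TransitionPatternNotFound | Undocumented |
Exception | UnexpectedIndentationError | Undocumented |
Exception | UnknownStateError | Undocumented |
Exception | UnknownTransitionError | Undocumented |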
Function | string2lines | Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped. |
Class | _ | Mix-in class to override StateMachine regular expression behavior. |
Function | _exception | Return exception information: |
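The "view" behaviour described for ViewList in the table above can be illustrated with a short sketch. It assumes the class is importable from docutils.statemachine and that the underlying list is exposed as the `data` attribute; the sample values and the source label 'demo' are made up for illustration.

```python
# Assumption: ViewList is importable from an installed Docutils.
from docutils.statemachine import ViewList

# 'demo' is an arbitrary source label attached to every item.
parent = ViewList(['a', 'b', 'c', 'd'], source='demo')

# Slicing yields a child list that remains linked to its parent.
child = parent[1:3]

# Assigning through the child also updates the matching parent slot.
child[0] = 'B'

print(child.data)
print(parent.data)
```

A plain Python list slice would be an independent copy, so the same assignment would leave the original list untouched.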
Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped.
Each tab is expanded with between 1 and tab_width spaces, so that the next character's index becomes a multiple of tab_width (8 by default).
Parameters:
- astring: a multi-line string.
- tab_width: the number of columns between tab stops.
- convert_whitespace: convert form feeds and vertical tabs to spaces?
- whitespace: pattern object with the to-be-converted whitespace characters (default [\v\f]).
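The tab-stop rule described above is the same one that Python's built-in str.expandtabs() follows, so it can be illustrated with the standard library alone. The helper below is a hypothetical approximation for illustration, not the module's own function, and it leaves out the convert_whitespace and whitespace parameters.

```python
def sketch_string2lines(astring, tab_width=8):
    """Hypothetical stand-in: expand tabs, split lines, strip trailing space."""
    return [line.expandtabs(tab_width).rstrip()
            for line in astring.splitlines()]


lines = sketch_string2lines("a\tb  \n\tindented\nabcdefgh\tx\n")

# A tab after one character expands to 7 spaces (next index 8).
# A tab at index 0 expands to 8 spaces.
# A tab at index 8 is already on a tab stop, so it expands to a full 8 spaces.
for line in lines:
    print(repr(line))
```

In every case the character following the tab lands on an index that is a multiple of tab_width, which is why the number of inserted spaces varies between 1 and tab_width.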