module documentation
This is the docutils.parsers.rst.states module, the core of the reStructuredText parser. The classes, exceptions, functions, and attributes it defines are summarized after the parser overview.
Parser Overview
The reStructuredText parser is implemented as a recursive state machine,
examining its input one line at a time. To understand how the parser works,
please first become familiar with the docutils.statemachine module. In the
description below, references are made to classes defined in this module;
please see the individual classes for details.
Parsing proceeds as follows:
1. The state machine examines each line of input, checking each of the
   transition patterns of the state Body, in order, looking for a match.
   The implicit transitions (blank lines and indentation) are checked before
   any others. The 'text' transition is a catch-all (matches anything).

2. The method associated with the matched transition pattern is called.

   A. Some transition methods are self-contained, appending elements to the
      document tree (Body.doctest parses a doctest block). The parser's
      current line index is advanced to the end of the element, and parsing
      continues with step 1.

   B. Other transition methods trigger the creation of a nested state machine,
      whose job is to parse a compound construct ('indent' does a block quote,
      'bullet' does a bullet list, 'overline' does a section [first checking
      for a valid section header], etc.).

      - In the case of lists and explicit markup, a one-off state machine is
        created and run to parse contents of the first item.

      - A new state machine is created and its initial state is set to the
        appropriate specialized state (BulletList in the case of the
        'bullet' transition; see SpecializedBody for more detail). This
        state machine is run to parse the compound element (or series of
        explicit markup elements), and returns as soon as a non-member element
        is encountered. For example, the BulletList state machine ends as
        soon as it encounters an element which is not a list item of that
        bullet list. The optional omission of inter-element blank lines is
        enabled by this nested state machine.

      - The current line index is advanced to the end of the elements parsed,
        and parsing continues with step 1.

   C. The result of the 'text' transition depends on the next line of text.
      The current state is changed to Text, under which the second line is
      examined. If the second line is:

      - Indented: The element is a definition list item, and parsing proceeds
        similarly to step 2.B, using the DefinitionList state.

      - A line of uniform punctuation characters: The element is a section
        header; again, parsing proceeds as in step 2.B, and Body is still
        used.

      - Anything else: The element is a paragraph, which is examined for
        inline markup and appended to the parent element. Processing
        continues with step 1.
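The steps above can be sketched with a toy classifier. This is a minimal illustration under stated assumptions, not the docutils implementation: TRANSITIONS, first_match, classify_text, and parse_bullet_list are hypothetical names, and the regular expressions are simplified stand-ins for the real transition patterns.

```python
import re

# Ordered (name, pattern) pairs: a simplified, hypothetical subset of the
# transitions a first-line classifier checks. The implicit transitions
# (blank, indent) come first; 'text' is the catch-all and comes last.
TRANSITIONS = [
    ("blank", re.compile(r"\s*$")),
    ("indent", re.compile(r" +")),
    ("bullet", re.compile(r"[-+*]( +|$)")),
    ("doctest", re.compile(r">>>( +|$)")),
    ("text", re.compile(r"")),
]


def first_match(line):
    """Step 1: return the name of the first transition that matches."""
    for name, pattern in TRANSITIONS:
        if pattern.match(line):
            return name
    raise AssertionError("unreachable: 'text' matches anything")


def classify_text(lines, index):
    """Step 2.C: classify a 'text' line by examining the line after it."""
    nxt = lines[index + 1] if index + 1 < len(lines) else ""
    if nxt.strip() and nxt[0] == " ":
        return "definition_list_item"
    stripped = nxt.rstrip()
    uniform = stripped == stripped[:1] * len(stripped)
    if stripped and not stripped[0].isalnum() and uniform:
        return "section_title"
    return "paragraph"


def parse_bullet_list(lines, index):
    """Step 2.B: consume list items that share the first item's bullet.

    Returns (items, new_index) and stops at the first non-member line,
    mimicking how a nested state machine hands control back.
    """
    bullet = lines[index][0]
    items = []
    while index < len(lines) and lines[index].startswith(bullet + " "):
        items.append(lines[index][2:])
        index += 1
    return items, index
```

In this sketch a change of bullet character ends the list, which mirrors the statement that the BulletList state machine returns as soon as it meets an element that is not an item of that bullet list.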
| Class | | Generic classifier of the first line of a block. |
| Class | | Second and subsequent bullet_list list_items. |
| Class | | Second line of potential definition_list_item. |
| Class | | Second and subsequent definition_list_items. |
| Class | | Second and subsequent enumerated_list list_items. |
| Class | | Second and subsequent explicit markup construct. |
| Class | | Parse field_list fields for extension options. |
| Class | | Second and subsequent field_list fields. |
| Class | | Parse inline markup; call the parse() method. |
| Class | | Second line of over- & underlined section title or transition marker. |
| Class | | Second and subsequent lines of a line_block. |
| Class | | StateMachine run from within other StateMachine runs, to parse nested document structures. |
| Class | | Second and subsequent option_list option_list_items. |
| Class | | Nested parse handler for quoted (unindented) literal blocks. |
| Class | | RFC2822 headers are only valid as the first constructs in documents. As soon as anything else appears, the Body state should take over. |
| Class | | Second and subsequent RFC2822-style field_list fields. |
| Class | | reStructuredText State superclass. |
| Class | | reStructuredText's master StateMachine. |
| Class | | Superclass for second and subsequent compound element members. Compound elements are lists and list-like constructs. |
| Class | | Superclass for second and subsequent lines of Text-variants. |
| Class | | Stores data attributes for dotted-attribute access. |
| Class | | Parser for the contents of a substitution_definition element. |
| Class | | Classifier of second line of a text block. |
| Exception | | Undocumented |
| Exception | | Undocumented |
| Exception | | Undocumented |
| Exception | | Undocumented |
| Exception | | Undocumented |
| Function | build | Build, compile and return a regular expression based on definition. |
| Variable | state | Standard set of State classes used to start RSTStateMachine. |
| Function | _loweralpha | Undocumented |
| Function | _lowerroman | Undocumented |
| Function | _upperalpha | Undocumented |
Build, compile and return a regular expression based on definition.
Parameter:
    definition: a 4-tuple (group name, prefix, suffix, parts), where "parts" is a list of regular expressions and/or regular expression definitions to be joined into an or-group.
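One way to read the parameter description is sketched below. This is an illustrative re-creation based only on the text above, not the docutils source: the name build_regexp_sketch, the compile_pattern flag, and the sample definition are all assumptions. It treats a tuple inside "parts" as a nested definition and expands it recursively.

```python
import re


def build_regexp_sketch(definition, compile_pattern=True):
    """Join the parts of a 4-tuple definition into a named or-group.

    Hypothetical helper for illustration; the real function may differ.
    """
    name, prefix, suffix, parts = definition
    alternatives = []
    for part in parts:
        if isinstance(part, tuple):
            # Nested definition: expand it to a pattern string first.
            alternatives.append(build_regexp_sketch(part, compile_pattern=False))
        else:
            # Plain regular expression string: use it as is.
            alternatives.append(part)
    or_group = "|".join(alternatives)
    pattern = f"{prefix}(?P<{name}>{or_group}){suffix}"
    return re.compile(pattern) if compile_pattern else pattern


# Hypothetical definition: openers for emphasis-like inline markup,
# with a nested definition for backquote-based openers.
sample_definition = (
    "start",
    r"(?<!\w)",
    r"(?=\S)",
    [r"\*\*", r"\*", ("backquote", "", "", ["``", "`"])],
)
sample_regexp = build_regexp_sketch(sample_definition)
```

Longer alternatives are listed before shorter ones in the sample because a regular expression or-group tries its alternatives from left to right, so placing the single asterisk first would prevent the double asterisk from ever matching as a unit.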