class documentation
class StandardFormat(object):
Known subclasses: nltk.toolbox.ToolboxData, nltk.toolbox.ToolboxSettings
Constructor: StandardFormat(filename, encoding)
Class for reading and processing standard format marker files and strings.
| Method | __init__ | Undocumented |
| Method | close | Close a previously opened standard format marker file or string. |
| Method | fields | Return an iterator that yields each field as a (marker, value) tuple. Marker and value are unicode strings if an encoding was passed to fields(); otherwise they are non-unicode strings. |
| Method | open | Open a standard format marker file for sequential reading. |
| Method | open_string | Open a standard format marker string for sequential reading. |
| Method | raw_fields | Return an iterator that yields each field as a (marker, value) tuple. Line breaks and trailing whitespace are preserved, except for the final newline in each field. |
| Instance Variable | line_num | Undocumented |
| Instance Variable | _encoding | Undocumented |
| Instance Variable | _file | Undocumented |
def fields(self, strip=True, unwrap=True, encoding=None, errors='strict', unicode_fields=None):
Return an iterator that yields each field as a (marker, value) tuple. Marker and value are unicode strings if an encoding was passed to fields(); otherwise they are non-unicode strings.
| Parameters | |
| strip:bool | Strip trailing whitespace from the last line of each field. |
| unwrap:bool | Convert newlines in a field to spaces. |
| encoding:str or None | Name of the encoding to use. If specified, fields() returns unicode strings rather than non-unicode strings. |
| errors:str | Error handling scheme for the codec. Takes the same values as the errors argument of the built-in decode() string method. |
| unicode_fields | Set of marker names whose values are UTF-8 encoded. Ignored if encoding is None. If the whole file is UTF-8 encoded, set encoding='utf8' and leave unicode_fields at its default value of None. |
| Returns | |
| iter(tuple(str, str)) | Undocumented |
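The effect of strip and unwrap is easier to see on a small input. The following is a minimal pure-Python sketch of the behaviour described above, not NLTK's implementation: split_fields and tidy are hypothetical helpers introduced only for illustration, and the sample text is made up.

```python
import re


def split_fields(text):
    """Yield (marker, value) pairs from standard format marker text.

    Hypothetical helper for illustration only, not part of NLTK.
    A field starts at a line beginning with a backslash marker and
    runs until the next such line.
    """
    marker, lines = None, []
    for line in text.splitlines():
        match = re.match(r"\\(\S+)\s*(.*)$", line)
        if match:
            if marker is not None or lines:
                yield marker, "\n".join(lines)
            marker, lines = match.group(1), [match.group(2)]
        else:
            # Continuation line: belongs to the current field.
            lines.append(line)
    if marker is not None or lines:
        yield marker, "\n".join(lines)


def tidy(value, strip=True, unwrap=True):
    """Apply the unwrap and strip options described above (assumed semantics)."""
    if unwrap:
        value = re.sub(r"\n+", " ", value)  # newlines become spaces
    if strip:
        value = value.rstrip()  # drop trailing whitespace at the end of the field
    return value


sample = "\\lx kaa\n\\ge first line\nsecond line   \n"
raw = list(split_fields(sample))
cooked = [(marker, tidy(value)) for marker, value in raw]
print(raw)
print(cooked)
```

With unwrap=False the line break inside the second field would be kept, and with strip=False its trailing spaces would survive.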
def open(self, sfm_file):
Open a standard format marker file for sequential reading.
| Parameters | |
| sfm_file:str | Name of the standard format marker input file. |
def open_string(self, s):
Open a standard format marker string for sequential reading.
| Parameters | |
| s:str | string to parse as a standard format marker input file |
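A typical round trip is to open a string, iterate over its fields, then close the reader. The sketch below assumes NLTK is installed and that the methods behave as documented on this page. The import is guarded so the snippet still runs without NLTK, and read_sfm_string is a hypothetical wrapper, not part of the library.

```python
# Guarded import: assumes NLTK is installed; otherwise the demo is skipped.
try:
    from nltk.toolbox import StandardFormat
except ImportError:
    StandardFormat = None


def read_sfm_string(text):
    """Return the (marker, value) fields parsed from an SFM string,
    or None when NLTK is unavailable. Hypothetical wrapper for illustration."""
    if StandardFormat is None:
        return None
    reader = StandardFormat()
    reader.open_string(text)
    try:
        # Default options: strip=True, unwrap=True, no decoding.
        return list(reader.fields())
    finally:
        reader.close()


parsed = read_sfm_string("\\lx a value\n\\lx another value\n")
print(parsed)
```

Reading from disk follows the same pattern, with open() and a file name in place of open_string().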