class documentation
class PunktParameters(object):
Stores data used to perform sentence boundary detection with Punkt.
Method | __init__ |
Undocumented |
Method | add_ortho_context |
Undocumented |
Method | clear_abbrevs |
Undocumented |
Method | clear_collocations |
Undocumented |
Method | clear_ortho_context |
Undocumented |
Method | clear_sent_starters |
Undocumented |
Instance Variable | abbrev_types |
A set of word types for known abbreviations. |
Instance Variable | collocations |
A set of word type tuples for known common collocations where the first word ends in a period. E.g., ('S.', 'Bach') is a common collocation in a text that discusses 'Johann S. Bach'. These count as negative evidence for sentence boundaries. |
Instance Variable | ortho_context |
A dictionary mapping word types to the set of orthographic contexts that word type appears in. Contexts are represented by adding orthographic context flags: ... |
Instance Variable | sent_starters |
A set of word types for words that often appear at the beginning of sentences. |
Method | _debug_ortho_context |
Undocumented |
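
The four instance variables above amount to three sets plus one dictionary of combined flags. The following is a minimal sketch of that shape, not the library's implementation: the class name `ParamsSketch`, its attribute and method names, and the flag constants with their values are all hypothetical and chosen only for illustration. It assumes that "adding" a flag means combining it into a per-word-type integer with bitwise OR.

```python
from collections import defaultdict

# Hypothetical flag values, invented for this sketch only.
FLAG_SENT_INITIAL_UPPER = 1 << 0   # capitalized at the start of a sentence
FLAG_SENT_INTERNAL_UPPER = 1 << 1  # capitalized inside a sentence
FLAG_SENT_INITIAL_LOWER = 1 << 2   # lowercase at the start of a sentence
FLAG_SENT_INTERNAL_LOWER = 1 << 3  # lowercase inside a sentence


class ParamsSketch:
    """Illustrative stand-in for a Punkt-style parameter store."""

    def __init__(self):
        self.abbreviations = set()           # word types of known abbreviations
        self.collocations = set()            # (first, second) word type tuples
        self.sentence_starters = set()       # word types that often begin sentences
        self.ortho_flags = defaultdict(int)  # word type -> OR-ed context flags

    def add_flag(self, word_type, flag):
        # Combine the new flag with whatever was already recorded.
        self.ortho_flags[word_type] |= flag


params = ParamsSketch()
params.abbreviations.add("dr")
params.collocations.add(("S.", "Bach"))  # example tuple from the table above
params.sentence_starters.add("however")

# Record two different contexts in which the word type "bach" was seen.
params.add_flag("bach", FLAG_SENT_INTERNAL_UPPER)
params.add_flag("bach", FLAG_SENT_INITIAL_UPPER)

# Query: was "bach" ever seen in lowercase?
lower_mask = FLAG_SENT_INITIAL_LOWER | FLAG_SENT_INTERNAL_LOWER
seen_lower = bool(params.ortho_flags["bach"] & lower_mask)
```

Storing the contexts as one integer per word type keeps the dictionary compact and makes a question such as "was this word ever seen in lowercase?" a single mask test. Using `defaultdict(int)` means a word type that was never recorded reads as 0, that is, no contexts observed.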