class documentation

class PunktParameters(object):

Stores data used to perform sentence boundary detection with Punkt.

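Method __init__ Initializes all parameter collections to empty.
Method add_ortho_context Records an orthographic context flag for a word type.
Method clear_abbrevs Resets abbrev_types to an empty set.
Method clear_collocations Resets collocations to an empty set.
Method clear_ortho_context Resets ortho_context to an empty dictionary.
Method clear_sent_starters Resets sent_starters to an empty set.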
Instance Variable abbrev_types A set of word types for known abbreviations.
Instance Variable collocations A set of word type tuples for known common collocations where the first word ends in a period. E.g., ('S.', 'Bach') is a common collocation in a text that discusses 'Johann S. Bach'. These count as negative evidence for sentence boundaries.
Instance Variable ortho_context A dictionary mapping word types to the set of orthographic contexts that word type appears in. Contexts are represented by adding orthographic context flags: ...
Instance Variable sent_starters A set of word types for words that often appear at the beginning of sentences.
Method _debug_ortho_context Debugging helper that yields a label for each orthographic context flag recorded for a word type.
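def __init__(self):

Initializes all parameter collections to empty: abbrev_types, collocations, and sent_starters start as empty sets, and ortho_context starts as an empty dictionary.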

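def add_ortho_context(self, typ, flag):

Records the orthographic context flag for the word type typ, combining it with any flags already stored for that type in ortho_context.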

def clear_abbrevs(self):

Resets abbrev_types to an empty set.

def clear_collocations(self):

Resets collocations to an empty set.

def clear_ortho_context(self):

Resets ortho_context to an empty dictionary.

def clear_sent_starters(self):

Resets sent_starters to an empty set.

abbrev_types

A set of word types for known abbreviations.

collocations

A set of word type tuples for known common collocations where the first word ends in a period. E.g., ('S.', 'Bach') is a common collocation in a text that discusses 'Johann S. Bach'. These count as negative evidence for sentence boundaries.
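
To illustrate what "negative evidence" means here, the following is a toy sketch only, not Punkt's actual decision procedure. It treats every period-final token as a boundary candidate and vetoes the boundary when the token pair is a known collocation. The function name may_break_after and the sample data are hypothetical; the tuple ('S.', 'Bach') is taken from the description above.

```python
# Illustrative data only; a trained model would populate this set.
collocations = {("S.", "Bach")}


def may_break_after(first, second):
    """Toy heuristic: a period-final token is a boundary candidate
    unless (first, second) is a known collocation."""
    if not first.endswith("."):
        return False  # no period, so not a candidate at all
    if (first, second) in collocations:
        return False  # the collocation argues against a boundary
    return True


print(may_break_after("S.", "Bach"))      # collocation: no boundary
print(may_break_after("ended.", "Bach"))  # ordinary period: candidate
```

Storing the pairs in a set keeps the membership check constant-time, which matters because the check runs once per period-final token.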

ortho_context

A dictionary mapping word types to the set of orthographic contexts that word type appears in. Contexts are represented by adding orthographic context flags: ...
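
The flag representation can be sketched with the standard library alone. The example below assumes that each flag is a distinct bit and that "adding" a flag means combining it with bitwise OR, so one integer per word type can represent a set of contexts. The flag names and values are hypothetical placeholders (the real list is elided above), and seen_in is a helper invented for this illustration.

```python
from collections import defaultdict

# Hypothetical flags for illustration; each occupies its own bit.
ORTHO_BEG_UC = 1 << 0  # uppercase at the start of a sentence
ORTHO_MID_UC = 1 << 1  # uppercase in the middle of a sentence
ORTHO_BEG_LC = 1 << 2  # lowercase at the start of a sentence
ORTHO_MID_LC = 1 << 3  # lowercase in the middle of a sentence

# Missing word types default to 0, i.e. "no contexts seen yet".
ortho_context = defaultdict(int)


def add_ortho_context(typ, flag):
    """Merge one flag into the contexts already recorded for typ."""
    ortho_context[typ] |= flag


def seen_in(typ, flag):
    """Check a flag without creating an entry for unseen types."""
    return bool(ortho_context.get(typ, 0) & flag)


add_ortho_context("bach", ORTHO_MID_UC)
add_ortho_context("bach", ORTHO_BEG_UC)
add_ortho_context("the", ORTHO_BEG_UC)
add_ortho_context("the", ORTHO_MID_LC)

print(seen_in("the", ORTHO_MID_LC))   # "the" occurred lowercase mid-sentence
print(seen_in("bach", ORTHO_MID_LC))  # "bach" never occurred lowercase
```

Because OR is idempotent, recording the same context twice leaves the stored value unchanged, so the mapping behaves like a set of contexts per word type.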

sent_starters

A set of word types for words that often appear at the beginning of sentences.

def _debug_ortho_context(self, typ):

Debugging helper that yields a short label for each orthographic context flag recorded for the word type typ.