module documentation
Tokenizer Interface
Class |
|
A tokenizer that divides a string into substrings by splitting on a fixed delimiter string, which each subclass defines. |
Class |
|
A processing interface for tokenizing a string. Subclasses must define tokenize() or tokenize_sents() (or both). |
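The relationship between the two entries above can be sketched as follows. This is a minimal illustration, not the module's actual source: the class names `TokenizerSketch`, `DelimiterTokenizerSketch`, and `SpaceTokenizerSketch`, the `delimiter` attribute, and the way each method falls back on the other are all assumptions made for the example. Only the method names `tokenize()` and `tokenize_sents()` come from the descriptions above.

```python
from typing import List


class TokenizerSketch:
    """Hypothetical stand-in for the tokenizing interface.

    A subclass is expected to override tokenize(), tokenize_sents(), or both;
    whichever one is missing falls back on the other.
    """

    def tokenize(self, s: str) -> List[str]:
        # If the subclass overrode tokenize_sents(), reuse it for one string.
        if type(self).tokenize_sents is not TokenizerSketch.tokenize_sents:
            return self.tokenize_sents([s])[0]
        raise NotImplementedError(
            "subclasses must define tokenize() or tokenize_sents()"
        )

    def tokenize_sents(self, strings: List[str]) -> List[List[str]]:
        # Default: apply tokenize() to each string in turn.
        return [self.tokenize(s) for s in strings]


class DelimiterTokenizerSketch(TokenizerSketch):
    """Hypothetical stand-in for the string-splitting tokenizer."""

    # Assumed attribute name; concrete subclasses set the delimiter.
    delimiter: str

    def tokenize(self, s: str) -> List[str]:
        return s.split(self.delimiter)


class SpaceTokenizerSketch(DelimiterTokenizerSketch):
    """Example subclass that splits on a single space."""

    delimiter = " "


if __name__ == "__main__":
    tok = SpaceTokenizerSketch()
    print(tok.tokenize("a b c"))              # ['a', 'b', 'c']
    print(tok.tokenize_sents(["a b", "c"]))   # [['a', 'b'], ['c']]
```

In this sketch a subclass only has to supply the delimiter, and the batch method comes for free from the base class. Because the split uses an explicit delimiter, consecutive delimiters produce empty strings in the result, which is standard `str.split` behaviour when a separator is given.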