module documentation
Tokenizer Interface
| Class | | A tokenizer that divides a string into substrings by splitting on the specified string (defined in subclasses). |
| Class | | A processing interface for tokenizing a string. Subclasses must define tokenize() or tokenize_sents() (or both). |
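
The relationship between the two classes summarized above can be illustrated with a minimal, self-contained sketch. It does not reproduce the module's actual source: the class names `SketchTokenizer`, `SketchStringTokenizer`, and `TabSplitter`, and the `separator` attribute, are hypothetical and chosen only for illustration. The sketch assumes that each of `tokenize()` and `tokenize_sents()` can fall back on the other, so a subclass only needs to define one of them.

```python
from typing import List


class SketchTokenizer:
    """Hypothetical stand-in for the tokenizing interface (not the real class)."""

    def tokenize(self, s: str) -> List[str]:
        # Fall back on tokenize_sents() only if a subclass has overridden it;
        # otherwise neither method is defined and we must fail.
        if type(self).tokenize_sents is not SketchTokenizer.tokenize_sents:
            return self.tokenize_sents([s])[0]
        raise NotImplementedError("define tokenize() or tokenize_sents()")

    def tokenize_sents(self, strings: List[str]) -> List[List[str]]:
        # Default: apply tokenize() to each string in turn.
        return [self.tokenize(s) for s in strings]


class SketchStringTokenizer(SketchTokenizer):
    """Hypothetical splitter; the separator is supplied by subclasses."""

    separator: str = ""  # assumption: subclasses override this attribute

    def tokenize(self, s: str) -> List[str]:
        # Split on the subclass-defined separator string.
        return s.split(self.separator)


class TabSplitter(SketchStringTokenizer):
    """Example subclass that splits on tab characters."""

    separator = "\t"
```

With these definitions, `TabSplitter().tokenize("a\tb c\td")` splits only on tabs, and `tokenize_sents()` works on a list of strings without being defined explicitly, because it inherits the default that calls `tokenize()` per string.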