module documentation
Undocumented
| Function | padded | Default preprocessing for a sequence of sentences. |
| Function | padded | Helper with some useful defaults. |
| Variable | pad | Pads both ends of a sentence to length specified by ngram order. |
Default preprocessing for a sequence of sentences.
Creates two iterators:
- sentences padded and turned into sequences of nltk.util.everygrams
- sentences padded as above and chained together for a flat stream of words
| Parameters | |
| order | Largest ngram length produced by everygrams. |
| text | Text to iterate over. Expected to be an iterable of sentences: Iterable[Iterable[str]]. |
| Returns | A pair of iterators: one over the text as ngrams, one over the text as vocabulary data. |
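
The two iterators described above can be illustrated with a small standard-library sketch. This is not the library's implementation: the names `pad_both_ends_sketch`, `everygrams_sketch`, and `pipeline_sketch` are hypothetical, the boundary symbols `<s>` and `</s>` are assumed defaults, and the sketch materialises the padded sentences in a list rather than streaming them lazily.

```python
from itertools import chain


def pad_both_ends_sketch(sentence, n, left="<s>", right="</s>"):
    """Add n-1 boundary symbols to each end of a sentence (assumed symbols)."""
    return [left] * (n - 1) + list(sentence) + [right] * (n - 1)


def everygrams_sketch(tokens, max_len):
    """Yield every ngram of length 1..max_len as a tuple, ordered by start position."""
    tokens = list(tokens)
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            if start + length <= len(tokens):
                yield tuple(tokens[start:start + length])


def pipeline_sketch(order, text):
    """Return (iterator of per-sentence ngram iterators, flat iterator of padded words)."""
    # Materialise the padded sentences so both results can be consumed independently.
    padded = [pad_both_ends_sketch(sentence, order) for sentence in text]
    ngram_stream = (everygrams_sketch(sentence, order) for sentence in padded)
    vocab_stream = chain.from_iterable(padded)
    return ngram_stream, vocab_stream


ngrams, vocab = pipeline_sketch(2, [["a", "b"], ["c"]])
print(list(vocab))         # flat stream of padded words
print(list(next(ngrams)))  # unigrams and bigrams of the first padded sentence
```

With `order` set to 2, each sentence gains one boundary symbol on each side, so the flat word stream contains the boundary symbols as well as the original tokens. A vocabulary built from that stream would therefore include them.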