module documentation
Undocumented
Function | padded | Default preprocessing for a sequence of sentences. |
Function | padded | Helper with some useful defaults. |
Variable | pad | Pads both ends of a sentence to length specified by ngram order. |
Default preprocessing for a sequence of sentences.
Creates two iterators:
- sentences padded and turned into sequences of nltk.util.everygrams
- sentences padded as above and chained together for a flat stream of words
Parameters | |
order | Largest ngram length produced by everygrams. |
text | Text to iterate over. Expected to be an iterable of sentences: Iterable[Iterable[str]]. |
Returns | |
A pair of iterators: one over the text as ngrams, one over the text as vocabulary data. |
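The two iterators described above can be illustrated with a standard-library-only sketch. This is not the module's implementation: `pad_sentence`, `all_ngrams`, and `sketch_pipeline` are hypothetical names, the `<s>` and `</s>` padding symbols are assumed, and padding is assumed to add `order - 1` symbols on each side, as `nltk.util.pad_sequence` does for an ngram of size `order`.

```python
from itertools import chain


def pad_sentence(sentence, order, left="<s>", right="</s>"):
    """Hypothetical helper: add order - 1 padding symbols to each end."""
    pad = order - 1
    return [left] * pad + list(sentence) + [right] * pad


def all_ngrams(tokens, max_len):
    """Hypothetical helper: yield every ngram of length 1..max_len as a tuple."""
    for start in range(len(tokens)):
        for n in range(1, max_len + 1):
            if start + n <= len(tokens):
                yield tuple(tokens[start:start + n])


def sketch_pipeline(order, text):
    """Return (per-sentence ngram iterators, flat stream of padded words)."""
    sentences = [list(s) for s in text]
    # First iterator: one ngram generator per padded sentence.
    ngram_stream = (all_ngrams(pad_sentence(s, order), order) for s in sentences)
    # Second iterator: padded sentences chained into a single word stream.
    vocab_stream = chain.from_iterable(pad_sentence(s, order) for s in sentences)
    return ngram_stream, vocab_stream


text = [["a", "b"], ["c"]]
ngrams, vocab = sketch_pipeline(2, text)
vocab_list = list(vocab)
ngram_lists = [list(sentence_ngrams) for sentence_ngrams in ngrams]
```

With `order=2`, `vocab_list` holds the padded words of both sentences in a single flat list, while `ngram_lists` holds one list of unigrams and bigrams per sentence. The ngram order within a sentence may differ from what `nltk.util.everygrams` produces; only the set of ngrams is meant to match.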