module documentation

Undocumented

Class TextTilingTokenizer Tokenize a document into topical sections using the TextTiling algorithm. This algorithm detects subtopic shifts based on the analysis of lexical co-occurrence patterns.
Class TokenSequence A token list with its original length and its index
Class TokenTableField A field in the token table holding parameters for each token, used later in the process
Function demo Undocumented
Function smooth Smooth the data using a window of the requested size.
Constant DEFAULT_SMOOTHING Undocumented
Variable BLOCK_COMPARISON Undocumented
Variable HC Undocumented
Variable LC Undocumented
Variable VOCABULARY_INTRODUCTION Undocumented
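
As a rough illustration of what "lexical co-occurrence patterns" means for TextTilingTokenizer, the sketch below scores each gap in a token list by the cosine similarity of the word counts in the blocks on either side of it. A low score suggests a possible subtopic shift. The function name gap_scores and the block_size parameter are hypothetical and are not part of this module; the real tokenizer involves further steps that this simplification leaves out.

```python
import math
from collections import Counter


def gap_scores(tokens, block_size=3):
    """Cosine similarity between the token blocks left and right of each gap.

    Hypothetical helper for illustration only; not part of this module.
    """
    scores = []
    for gap in range(block_size, len(tokens) - block_size + 1):
        left = Counter(tokens[gap - block_size:gap])
        right = Counter(tokens[gap:gap + block_size])
        # Dot product over shared vocabulary (a Counter returns 0 for missing keys).
        dot = sum(count * right[tok] for tok, count in left.items())
        norm = math.sqrt(
            sum(c * c for c in left.values()) * sum(c * c for c in right.values())
        )
        scores.append(dot / norm if norm else 0.0)
    return scores


tokens = ["cat", "cat", "dog", "cat", "dog", "cat",
          "tax", "law", "tax", "law", "tax", "tax"]
scores = gap_scores(tokens)
# The least similar pair of blocks marks the most likely boundary.
boundary = scores.index(min(scores)) + 3  # offset by block_size to get a token index
```

In this toy input the vocabulary changes completely at token index 6, so the similarity at that gap drops to zero.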
def demo(text=None): (source)

Undocumented

def smooth(x, window_len=11, window='flat'): (source)

Smooth the data using a window of the requested size.

This method convolves a scaled window with the signal. The signal is first extended at both ends with reflected copies of itself (each the size of the window), so that transients are minimized at the beginning and end of the output signal.
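
The procedure described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the module's actual source: the name smooth_sketch is hypothetical, the padding uses half a window of mirrored samples on each side (via numpy.pad) so that the output has the same length as the input, and an odd window_len is enforced for that reason.

```python
import numpy as np

WINDOWS = ("flat", "hanning", "hamming", "bartlett", "blackman")


def smooth_sketch(x, window_len=11, window="flat"):
    """Illustrative smoothing by convolution with a normalized window.

    Hypothetical re-implementation for explanation only.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 1:
        raise ValueError("only 1-dimensional signals are supported")
    if window not in WINDOWS:
        raise ValueError(f"window must be one of {WINDOWS}")
    if window_len < 3:
        return x  # a window this small does no smoothing
    if window_len % 2 == 0:
        raise ValueError("this sketch assumes an odd window_len")
    if x.size < window_len:
        raise ValueError("signal must be at least as long as the window")

    # Extend both ends with mirrored samples to reduce edge transients.
    half = window_len // 2
    padded = np.pad(x, half, mode="reflect")

    # A flat window is a plain moving average; the others come from NumPy.
    if window == "flat":
        w = np.ones(window_len)
    else:
        w = getattr(np, window)(window_len)

    # Scale the window to unit sum so the signal level is preserved.
    return np.convolve(w / w.sum(), padded, mode="valid")
```

Because the window is scaled to sum to one, a constant signal passes through unchanged, and with the flat window the interior of a linear ramp is preserved as well.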

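Example:

from numpy import arange, sin
from numpy.random import randn

t = arange(-2, 2, 0.1)  # arange takes a step; linspace's third argument is a sample count
x = sin(t) + randn(len(t)) * 0.1
y = smooth(x)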

TODO: the window parameter could accept the window itself as an array, instead of a string.

Parameters
x: the input signal
window_len: the dimension of the smoothing window; should be an odd integer
window: the type of window, one of 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'. A flat window produces a moving-average smoothing.
Returns
the smoothed signal
See Also
numpy.hanning, numpy.hamming, numpy.bartlett, numpy.blackman, numpy.convolve, scipy.signal.lfilter
DEFAULT_SMOOTHING: list[int] = (source)

Undocumented

Value
[0]
BLOCK_COMPARISON = (source)

Undocumented

HC = (source)

Undocumented

LC = (source)

Undocumented


VOCABULARY_INTRODUCTION = (source)

Undocumented