class documentation

Abstract clusterer which takes tokens and maps them into a vector space. Optionally performs singular value decomposition to reduce the dimensionality.

Method __init__ No summary
Method classify Classifies the token into a cluster, setting the token's CLUSTER parameter to that cluster identifier.
Method classify_vectorspace Returns the index of the appropriate cluster for the vector.
Method cluster Assigns the vectors to clusters, learning the clustering parameters from the data. Returns a cluster identifier for each vector.
Method cluster_vectorspace Finds the clusters using the given set of vectors.
Method likelihood Returns the likelihood (a float) of the token having the corresponding cluster.
Method likelihood_vectorspace Returns the likelihood of the vector belonging to the cluster.
Method vector Returns the vector after normalisation and dimensionality reduction.
Method _normalise Normalises the vector to unit length.
Instance Variable _should_normalise Undocumented
Instance Variable _svd_dimensions Undocumented
Instance Variable _Tt Undocumented

Inherited from ClusterI:

Method classification_probdist Classifies the token into a cluster, returning a probability distribution over the cluster identifiers.
Method cluster_name Returns the name of the cluster at index.
Method cluster_names Returns the names of the clusters, as a list.
Method num_clusters Returns the number of clusters.
def __init__(self, normalise=False, svd_dimensions=None): (source)
Parameters
normalise (boolean): whether vectors should be normalised to length 1
svd_dimensions (int): number of dimensions to keep when reducing vector dimensionality with SVD
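
To illustrate what the svd_dimensions parameter asks for, here is a minimal sketch of SVD-based dimensionality reduction using NumPy. The function name reduce_with_svd and the sample data are hypothetical, and this is one plausible way to do the reduction rather than a copy of the class's own code.

```python
import numpy as np

def reduce_with_svd(vectors, svd_dimensions):
    """Project row vectors onto their top `svd_dimensions` left-singular directions."""
    data = np.asarray(vectors, dtype=float).T  # columns are the input vectors
    u, d, vt = np.linalg.svd(data, full_matrices=False)
    projection = u[:, :svd_dimensions].T       # shape: (svd_dimensions, original_dim)
    reduced = (projection @ data).T            # shape: (n_vectors, svd_dimensions)
    return reduced, projection

# Hypothetical sample data: four vectors in three dimensions, reduced to two.
vectors = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
reduced, projection = reduce_with_svd(vectors, 2)
```

Keeping the projection matrix around matters because vectors seen later, at classification time, have to be mapped into the same reduced space as the training vectors.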
def classify(self, vector): (source)

Classifies the token into a cluster, setting the token's CLUSTER parameter to that cluster identifier.

@abstractmethod
def classify_vectorspace(self, vector): (source)

Returns the index of the appropriate cluster for the vector.
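
Because this method is abstract, each concrete clusterer decides how the index is chosen. As a hedged sketch only, a centroid-based subclass might return the index of the nearest cluster centre. The function nearest_cluster_index and the centroids list below are hypothetical names, not part of the documented class.

```python
import math

def nearest_cluster_index(vector, centroids):
    """Return the index of the centroid closest to `vector` (Euclidean distance)."""
    distances = [math.dist(vector, centroid) for centroid in centroids]
    return distances.index(min(distances))

# Hypothetical cluster centres learned earlier by some clustering step.
centroids = [[0.0, 0.0], [10.0, 10.0]]
index = nearest_cluster_index([9.0, 8.5], centroids)
```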

def cluster(self, vectors, assign_clusters=False, trace=False): (source)

Assigns the vectors to clusters, learning the clustering parameters from the data. Returns a cluster identifier for each vector.

@abstractmethod
def cluster_vectorspace(self, vectors, trace): (source)

Finds the clusters using the given set of vectors.

def likelihood(self, vector, label): (source)

Returns the likelihood (a float) of the token having the corresponding cluster.

def likelihood_vectorspace(self, vector, cluster): (source)

Returns the likelihood of the vector belonging to the cluster.
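
One simple way to obtain such a likelihood from a hard classifier is to return 1.0 when the classifier picks the given cluster and 0.0 otherwise. The sketch below assumes that approach. hard_likelihood and sign_classifier are hypothetical helpers for illustration, and a probabilistic clusterer would normally return graded values instead.

```python
def hard_likelihood(vector, cluster, classify):
    """Return 1.0 if `classify(vector)` picks `cluster`, otherwise 0.0."""
    return 1.0 if classify(vector) == cluster else 0.0

# Hypothetical classifier: cluster 0 for a negative first coordinate, else cluster 1.
def sign_classifier(vector):
    return 0 if vector[0] < 0 else 1

matching = hard_likelihood([2.0, 1.0], 1, sign_classifier)
non_matching = hard_likelihood([2.0, 1.0], 0, sign_classifier)
```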

def vector(self, vector): (source)

Returns the vector after normalisation and dimensionality reduction.

def _normalise(self, vector): (source)

Normalises the vector to unit length.
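
Normalising to unit length means dividing every component by the vector's Euclidean norm. A minimal pure-Python sketch follows. The standalone function normalise and the explicit zero-length check are assumptions made for this example, not a description of the method's actual behaviour on a zero vector.

```python
import math

def normalise(vector):
    """Scale `vector` so that its Euclidean length is 1."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0.0:
        # Assumption for this sketch: reject vectors that cannot be scaled.
        raise ValueError("cannot normalise a zero-length vector")
    return [x / length for x in vector]

unit = normalise([3.0, 4.0])
```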

_should_normalise = (source)

Undocumented

_svd_dimensions = (source)

Undocumented

_Tt = (source)

Undocumented