class documentation

The confusion matrix between a list of reference values and a corresponding list of test values. Entry [r,t] of this matrix is a count of the number of times that the reference value r corresponds to the test value t. E.g.:

>>> from nltk.metrics import ConfusionMatrix
>>> ref  = 'DET NN VB DET JJ NN NN IN DET NN'.split()
>>> test = 'DET VB VB DET NN NN NN IN DET NN'.split()
>>> cm = ConfusionMatrix(ref, test)
>>> print(cm['NN', 'NN'])
3

Note that the diagonal entries (where the reference value r equals the test value t) of this matrix correspond to correct values, and the off-diagonal entries correspond to incorrect values.
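
To make the diagonal/off-diagonal distinction concrete, the following is a minimal standard-library sketch that counts (reference, test) pairs for the same data as the example above. It is an illustration of the idea only, not NLTK's implementation; the names pair_counts, correct, and errors are assumptions introduced here.

```python
from collections import Counter

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
test = 'DET VB VB DET NN NN NN IN DET NN'.split()

# Count each (reference, test) pair; entry [r,t] of the matrix holds this count.
pair_counts = Counter(zip(ref, test))

# Diagonal entries: the reference and test values agree.
correct = sum(n for (r, t), n in pair_counts.items() if r == t)

# Off-diagonal entries: the reference and test values disagree.
errors = {pair: n for pair, n in pair_counts.items() if pair[0] != pair[1]}

print(pair_counts[('NN', 'NN')])  # same count as cm['NN', 'NN'] above: 3
print(correct, 'of', len(ref), 'correct')
print(errors)
```

Here the two off-diagonal pairs are ('NN', 'VB') and ('JJ', 'NN'), each occurring once.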

Method __getitem__ Return the number of times that value li was expected and value lj was given.
Method __init__ Construct a new confusion matrix from a list of reference values and a corresponding list of test values.
Method __repr__ Undocumented
Method __str__ Undocumented
Method key Undocumented
Method pretty_format Return a multi-line string representation of this confusion matrix. @todo: add marginals?
Instance Variable _confusion Undocumented
Instance Variable _correct Undocumented
Instance Variable _indices Undocumented
Instance Variable _max_conf Undocumented
Instance Variable _total Undocumented
Instance Variable _values Undocumented
def __getitem__(self, li_lj_tuple): (source)

Returns
int: The number of times that value li was expected and value lj was given.
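
The signature takes a single li_lj_tuple argument because Python passes cm[li, lj] to __getitem__ as one tuple. The sketch below shows that mechanism with a hypothetical stand-in class, TinyConfusion; its attributes (indices, matrix) are assumptions made for illustration and are not claimed to match NLTK's internals.

```python
class TinyConfusion:
    """Hypothetical stand-in used only to illustrate tuple indexing."""

    def __init__(self, reference, test):
        labels = sorted(set(reference) | set(test))
        # Map each label to a row/column position.
        self.indices = {label: i for i, label in enumerate(labels)}
        self.matrix = [[0] * len(labels) for _ in labels]
        for r, t in zip(reference, test):
            self.matrix[self.indices[r]][self.indices[t]] += 1

    def __getitem__(self, li_lj_tuple):
        # tiny['NN', 'VB'] arrives here as the single tuple ('NN', 'VB').
        li, lj = li_lj_tuple
        return self.matrix[self.indices[li]][self.indices[lj]]


tiny = TinyConfusion(['NN', 'NN', 'VB'], ['NN', 'VB', 'VB'])
print(tiny['NN', 'VB'])  # li='NN' was expected, lj='VB' was given
```

The first element of the key selects the row (reference value) and the second selects the column (test value), so tiny['NN', 'VB'] and tiny['VB', 'NN'] are different entries.
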
def __init__(self, reference, test, sort_by_count=False): (source)

Construct a new confusion matrix from a list of reference values and a corresponding list of test values.

Parameters
reference: list. An ordered list of reference values.
test: list. A list of values to compare against the corresponding reference values.
sort_by_count: Undocumented
Raises
ValueError: If reference and test do not have the same length.
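
The length requirement exists because the two lists are compared position by position. The following sketch shows that precondition with a hypothetical helper, check_same_length; the helper name and the error message text are assumptions for illustration, not NLTK's code.

```python
def check_same_length(reference, test):
    """Hypothetical helper illustrating the documented precondition."""
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    # With equal lengths, the values can be paired position by position.
    return list(zip(reference, test))


message = None
try:
    check_same_length(['DET', 'NN'], ['DET'])
except ValueError as err:
    message = str(err)

print(message)
```
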
def __repr__(self): (source)

Undocumented

def __str__(self): (source)

Undocumented

def key(self): (source)

Undocumented

def pretty_format(self, show_percents=False, values_in_chart=True, truncate=None, sort_by_count=False): (source)

@todo: add marginals?

Parameters
show_percents: Undocumented
values_in_chart: Undocumented
truncate: int. If specified, then only show the specified number of values. Any sorting (e.g., sort_by_count) will be performed before truncation.
sort_by_count: If true, then sort by the count of each label in the reference data. I.e., labels that occur more frequently in the reference data will be towards the left edge of the matrix, and labels that occur less frequently will be towards the right edge.
Returns
A multi-line string representation of this confusion matrix.
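
The interaction between sort_by_count and truncate can be sketched as follows: labels are first ordered by how often they occur in the reference data, and only then cut down to the requested number. The helper column_order is hypothetical, and the alphabetical default ordering is an assumption; this is not the library's own code.

```python
from collections import Counter


def column_order(reference, test, truncate=None, sort_by_count=False):
    """Hypothetical helper: pick which labels to show, and in what order."""
    labels = sorted(set(reference) | set(test))  # assumed default: alphabetical
    if sort_by_count:
        ref_counts = Counter(reference)
        # Stable sort: most frequent reference labels first, ties stay alphabetical.
        labels = sorted(labels, key=lambda label: -ref_counts[label])
    if truncate is not None:
        # Truncation is applied only after any sorting.
        labels = labels[:truncate]
    return labels


ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
test = 'DET VB VB DET NN NN NN IN DET NN'.split()

print(column_order(ref, test, truncate=2, sort_by_count=True))
print(column_order(ref, test, truncate=2))
```

With sorting enabled, the two most frequent reference labels (NN, then DET) are kept; without it, truncation simply keeps the first two labels in the default order.
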
_confusion = (source)

Undocumented

_correct = (source)

Undocumented

_indices = (source)

Undocumented

_max_conf = (source)

Undocumented

_total = (source)

Undocumented

_values = (source)

Undocumented