class documentation

class AnnotationTask(object): (source)

Constructor: AnnotationTask(data, distance)

Represents an annotation task, in which coders (people) assign labels to items.

The notation follows Artstein and Poesio (2007) where possible.

In general, coders and items can be represented as any hashable object. Integers, for example, are fine, though strings are more readable. Labels must support the distance function applied to them: a string edit distance makes no sense if your labels are integers, whereas interval distance needs numeric values. A notable case is the MASI metric, which requires Python sets.
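
As a minimal sketch of how a label type and a distance function have to fit together, the three functions below are hypothetical illustrations, not the distances shipped with the library. The set-based one is a plain Jaccard distance used as a simpler stand-in for MASI, and `frozenset` is used on the assumption that labels may also need to be hashable.

```python
def binary_dist(a, b):
    """0.0 if the labels are equal, 1.0 otherwise; works for any comparable labels."""
    return 0.0 if a == b else 1.0

def interval_dist(a, b):
    """Squared difference; only meaningful for numeric labels."""
    return float((a - b) ** 2)

def jaccard_dist(a, b):
    """Set-overlap distance (a stand-in for MASI); labels must be sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

# Each distance is paired with a label type it can handle.
print(binary_dist("noun", "verb"))
print(interval_dist(2, 5))
print(jaccard_dist(frozenset("ab"), frozenset("bc")))

# A mismatch between label type and distance fails immediately.
try:
    interval_dist("noun", "verb")
except TypeError as err:
    print("interval distance rejects string labels:", err)
```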

Method __init__ Initialize an annotation task.
Method __str__ Undocumented
Method Ae_kappa Undocumented
Method agr Agreement between two coders on a given item
Method alpha Krippendorff 1980
Method Ao Observed agreement between two coders on all items.
Method avg_Ao Average observed agreement across all coders and items.
Method Disagreement Undocumented
Method Do_Kw Observed disagreement for the weighted kappa coefficient, averaged over all labelers.
Method Do_Kw_pairwise The observed disagreement for the weighted kappa coefficient.
Method kappa Cohen 1960. Averages naively over kappas for each coder pair.
Method kappa_pairwise No summary
Method load_array Load a sequence of annotation results, appending to any data already loaded.
Method multi_kappa Davies and Fleiss 1982. Averages over observed and expected agreements for each coder pair.
Method N Implements the "n-notation" used in Artstein and Poesio (2007)
Method Nck Undocumented
Method Nik Undocumented
Method Nk Undocumented
Method pi Scott 1955; here, multi-pi. Equivalent to K from Siegel and Castellan (1988).
Method S Bennett, Albert and Goldstein 1954
Method weighted_kappa Cohen 1968
Method weighted_kappa_pairwise Cohen 1968
Instance Variable C Undocumented
Instance Variable data Undocumented
Instance Variable distance Undocumented
Instance Variable I Undocumented
Instance Variable K Undocumented
Method _grouped_data Undocumented
Method _pairwise_average Calculates the average of function results for each coder pair

def __init__(self, data=None, distance=binary_distance): (source)

Initialize an annotation task.

The data argument can be None (to create an empty annotation task) or a sequence of 3-tuples, each representing a coder's labeling of an item:

(coder,item,label)

The distance argument is a function taking two arguments (labels) and producing a numerical distance. The distance from a label to itself should be zero:

distance(l,l) = 0
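
A minimal sketch of input that satisfies both requirements. The coder names, item names, labels, and both helper functions are hypothetical and exist only for illustration.

```python
# Hypothetical annotation data: two coders label three items.
triples = [
    ("c1", "item1", "pos"), ("c2", "item1", "pos"),
    ("c1", "item2", "neg"), ("c2", "item2", "pos"),
    ("c1", "item3", "neg"), ("c2", "item3", "neg"),
]

def label_distance(a, b):
    """Hypothetical distance: 0.0 for identical labels, 1.0 otherwise."""
    return 0.0 if a == b else 1.0

def check_identity(distance, data):
    """Return True if distance(l, l) == 0 for every label that occurs in data."""
    labels = {label for _coder, _item, label in data}
    return all(distance(l, l) == 0 for l in labels)

# Every entry is a (coder, item, label) 3-tuple, and the distance passes the check.
assert all(len(t) == 3 for t in triples)
print(check_identity(label_distance, triples))
```

Data and a distance of this shape would then be passed as AnnotationTask(data=triples, distance=label_distance).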
def __str__(self): (source)

Undocumented

def Ae_kappa(self, cA, cB): (source)

Undocumented

def agr(self, cA, cB, i, data=None): (source)

Agreement between two coders on a given item

def alpha(self): (source)

Krippendorff 1980
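
For orientation, Krippendorff's alpha is commonly written as 1 - Do/De, where Do is the observed and De the expected disagreement. The sketch below is an independent illustration of that formula, not the library's implementation. It assumes every item is labeled by the same number of coders (at least two) and that at least two distinct labels occur; `krippendorff_alpha` and `binary` are hypothetical names.

```python
from itertools import permutations

def krippendorff_alpha(triples, distance):
    """alpha = 1 - Do/De for (coder, item, label) triples with complete data."""
    by_item = {}
    for _coder, item, label in triples:
        by_item.setdefault(item, []).append(label)
    all_labels = [label for labels in by_item.values() for label in labels]
    n = len(all_labels)

    # Observed disagreement: mean distance over ordered label pairs within an item.
    within = [(a, b) for labels in by_item.values()
              for a, b in permutations(labels, 2)]
    d_o = sum(distance(a, b) for a, b in within) / len(within)

    # Expected disagreement: mean distance over ordered pairs of all labels pooled.
    # This is zero (and the division below fails) if only one label ever occurs.
    d_e = sum(distance(a, b) for a, b in permutations(all_labels, 2)) / (n * (n - 1))
    return 1.0 - d_o / d_e

def binary(a, b):
    return 0.0 if a == b else 1.0

triples = [
    ("c1", "i1", "pos"), ("c2", "i1", "pos"),
    ("c1", "i2", "neg"), ("c2", "i2", "pos"),
    ("c1", "i3", "neg"), ("c2", "i3", "neg"),
]
print(krippendorff_alpha(triples, binary))
```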

def Ao(self, cA, cB): (source)

Observed agreement between two coders on all items.
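
A minimal sketch of how per-item agreement and observed agreement relate, assuming agreement on an item is taken as one minus the label distance. The function names and the dict-per-coder layout are hypothetical and do not reflect the class internals.

```python
def agreement_on_item(labels_a, labels_b, item, distance):
    """Agreement on one item, taken here as 1 - distance between the two labels."""
    return 1.0 - float(distance(labels_a[item], labels_b[item]))

def observed_agreement(labels_a, labels_b, distance):
    """Mean per-item agreement over the items both coders labeled."""
    items = sorted(set(labels_a) & set(labels_b))
    total = sum(agreement_on_item(labels_a, labels_b, i, distance) for i in items)
    return total / len(items)

def binary(a, b):
    return 0.0 if a == b else 1.0

coder_a = {"i1": "pos", "i2": "neg", "i3": "neg", "i4": "pos"}
coder_b = {"i1": "pos", "i2": "pos", "i3": "neg", "i4": "pos"}

# The coders agree on three of the four items.
print(observed_agreement(coder_a, coder_b, binary))
```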

def avg_Ao(self): (source)

Average observed agreement across all coders and items.

def Disagreement(self, label_freqs): (source)

Undocumented

def Do_Kw(self, max_distance=1.0): (source)

Observed disagreement for the weighted kappa coefficient, averaged over all labelers.

def Do_Kw_pairwise(self, cA, cB, max_distance=1.0): (source)

The observed disagreement for the weighted kappa coefficient.
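
One plausible reading, sketched below, is the mean label distance between the two coders, scaled by max_distance so the result falls in [0, 1]. Treating max_distance as a normaliser is an assumption here, and `weighted_disagreement` and `abs_diff` are hypothetical names.

```python
def weighted_disagreement(labels_a, labels_b, distance, max_distance=1.0):
    """Mean label distance over shared items, scaled by max_distance."""
    items = set(labels_a) & set(labels_b)
    total = sum(distance(labels_a[i], labels_b[i]) for i in items)
    return total / (len(items) * max_distance)

def abs_diff(a, b):
    """Hypothetical distance for ratings on a numeric scale."""
    return abs(a - b)

# Ratings on a 1-5 scale, so the largest possible distance is 4.
ratings_a = {"i1": 1, "i2": 3, "i3": 5}
ratings_b = {"i1": 1, "i2": 4, "i3": 3}
print(weighted_disagreement(ratings_a, ratings_b, abs_diff, max_distance=4.0))
```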

def kappa(self): (source)

Cohen 1960. Averages naively over kappas for each coder pair.
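
For a single coder pair, Cohen's kappa is (Ao - Ae) / (1 - Ae), where the expected agreement Ae is built from each coder's own label distribution. The sketch below is a standalone illustration for nominal labels, not the library code; `cohen_kappa` and the dict-per-coder layout are hypothetical.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders, given dicts mapping item -> label."""
    items = sorted(set(labels_a) & set(labels_b))
    n = len(items)
    # Observed agreement: share of items with identical labels.
    a_o = sum(labels_a[i] == labels_b[i] for i in items) / n
    # Expected agreement: product of the two coders' label proportions.
    freq_a = Counter(labels_a[i] for i in items)
    freq_b = Counter(labels_b[i] for i in items)
    a_e = sum((freq_a[k] / n) * (freq_b[k] / n) for k in freq_a)
    return (a_o - a_e) / (1.0 - a_e)

coder_a = {"i1": "pos", "i2": "neg", "i3": "neg", "i4": "pos"}
coder_b = {"i1": "pos", "i2": "pos", "i3": "neg", "i4": "pos"}
print(cohen_kappa(coder_a, coder_b))
```

With more than two coders, the value documented above would be the plain mean of such pairwise kappas.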

def kappa_pairwise(self, cA, cB): (source)

def load_array(self, array): (source)

Load a sequence of annotation results, appending to any data already loaded.

The argument is a sequence of 3-tuples, each representing a coder's labeling of an item:

(coder,item,label)

def multi_kappa(self): (source)

Davies and Fleiss 1982. Averages over observed and expected agreements for each coder pair.
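
The difference from averaging per-pair kappas can be sketched as follows: either compute a kappa per coder pair and average those, or average the observed and expected agreements first and compute a single kappa from the means. All names below are hypothetical, and the code illustrates the two averaging orders rather than the library implementation.

```python
from collections import Counter
from itertools import combinations

def pair_agreements(labels_a, labels_b):
    """Return (observed, expected) agreement for one coder pair."""
    items = sorted(set(labels_a) & set(labels_b))
    n = len(items)
    observed = sum(labels_a[i] == labels_b[i] for i in items) / n
    freq_a = Counter(labels_a[i] for i in items)
    freq_b = Counter(labels_b[i] for i in items)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return observed, expected

def mean_of_pair_kappas(coders):
    """Compute kappa for each coder pair, then average the kappas."""
    stats = [pair_agreements(coders[a], coders[b])
             for a, b in combinations(sorted(coders), 2)]
    return sum((o - e) / (1.0 - e) for o, e in stats) / len(stats)

def kappa_of_pair_means(coders):
    """Average observed and expected agreement over pairs, then compute one kappa."""
    stats = [pair_agreements(coders[a], coders[b])
             for a, b in combinations(sorted(coders), 2)]
    mean_o = sum(o for o, _ in stats) / len(stats)
    mean_e = sum(e for _, e in stats) / len(stats)
    return (mean_o - mean_e) / (1.0 - mean_e)

coders = {
    "c1": {"i1": "pos", "i2": "neg", "i3": "neg", "i4": "pos"},
    "c2": {"i1": "pos", "i2": "pos", "i3": "neg", "i4": "pos"},
    "c3": {"i1": "pos", "i2": "neg", "i3": "pos", "i4": "pos"},
}
# The two averaging orders generally give different values.
print(mean_of_pair_kappas(coders), kappa_of_pair_means(coders))
```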

@deprecated('Use Nk, Nik or Nck instead')
def N(self, k=None, i=None, c=None): (source)

Implements the "n-notation" used in Artstein and Poesio (2007)

def Nck(self, c, k): (source)

Undocumented

def Nik(self, i, k): (source)

Undocumented

def Nk(self, k): (source)

Undocumented

def pi(self): (source)

Scott 1955; here, multi-pi. Equivalent to K from Siegel and Castellan (1988).
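
Unlike kappa, pi derives the expected agreement from a single label distribution pooled over all coders. The sketch below illustrates this for nominal labels, assuming every item has at least two labels; `multi_pi` is a hypothetical name and the code is not the library implementation.

```python
from collections import Counter
from itertools import combinations

def multi_pi(triples):
    """pi = (Ao - Ae) / (1 - Ae) for (coder, item, label) triples."""
    by_item = {}
    for _coder, item, label in triples:
        by_item.setdefault(item, []).append(label)

    # Observed agreement: share of agreeing label pairs per item, averaged over items.
    per_item = []
    for labels in by_item.values():
        pairs = list(combinations(labels, 2))
        per_item.append(sum(a == b for a, b in pairs) / len(pairs))
    a_o = sum(per_item) / len(per_item)

    # Expected agreement: sum of squared label proportions, pooled over all coders.
    pooled = Counter(label for _coder, _item, label in triples)
    total = sum(pooled.values())
    a_e = sum((count / total) ** 2 for count in pooled.values())
    return (a_o - a_e) / (1.0 - a_e)

triples = [
    ("c1", "i1", "pos"), ("c2", "i1", "pos"),
    ("c1", "i2", "neg"), ("c2", "i2", "pos"),
    ("c1", "i3", "neg"), ("c2", "i3", "neg"),
    ("c1", "i4", "pos"), ("c2", "i4", "pos"),
]
print(multi_pi(triples))
```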

def S(self): (source)

Bennett, Albert and Goldstein 1954
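
The S coefficient assumes that all labels are equally likely by chance, so the expected agreement is one over the number of labels. A minimal sketch of that formula, with `s_coefficient` as a hypothetical helper that takes an already computed observed agreement:

```python
def s_coefficient(observed_agreement, num_labels):
    """S = (Ao - 1/k) / (1 - 1/k), where k is the number of distinct labels."""
    expected = 1.0 / num_labels
    return (observed_agreement - expected) / (1.0 - expected)

# Observed agreement of 0.75 with two possible labels.
print(s_coefficient(0.75, 2))
```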

def weighted_kappa(self, max_distance=1.0): (source)

Cohen 1968

def weighted_kappa_pairwise(self, cA, cB, max_distance=1.0): (source)

Cohen 1968

C = (source)

Undocumented

data: list = (source)

Undocumented

distance = (source)

Undocumented

I = (source)

Undocumented

K = (source)

Undocumented

def _grouped_data(self, field, data=None): (source)

Undocumented

def _pairwise_average(self, function): (source)

Calculates the average of function results for each coder pair
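
A sketch of this kind of averaging, assuming each unordered coder pair is counted once. `pairwise_average` and the score table are hypothetical and only illustrate the idea.

```python
from itertools import combinations

def pairwise_average(coders, function):
    """Average function(cA, cB) over every unordered pair of distinct coders."""
    pairs = list(combinations(sorted(coders), 2))
    return sum(function(a, b) for a, b in pairs) / len(pairs)

# Hypothetical per-pair scores, keyed by the unordered coder pair.
scores = {
    frozenset(("c1", "c2")): 0.5,
    frozenset(("c1", "c3")): 1.0,
    frozenset(("c2", "c3")): 0.0,
}
average = pairwise_average({"c1", "c2", "c3"}, lambda a, b: scores[frozenset((a, b))])
print(average)
```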