class TadmEventMaxentFeatureEncoding(BinaryMaxentFeatureEncoding):
Constructor: TadmEventMaxentFeatureEncoding(labels, mapping, unseen_features, alwayson_features)
Undocumented
Class Method | train |
Construct and return a new feature encoding, based on a given training corpus ``train_toks``. See the class description ``BinaryMaxentFeatureEncoding`` for a description of the joint-features that will be included in this encoding. |
Method | __init__ |
:param labels: A list of the "known labels" for this encoding. |
Method | describe |
:return: A string describing the value of the joint-feature whose index in the generated feature vectors is ``fid``. :rtype: str |
Method | encode |
Given a (featureset, label) pair, return the corresponding vector of joint-feature values. This vector is represented as a list of ``(index, value)`` tuples, specifying the value of each non-zero joint-feature. |
Method | labels |
:return: A list of the "known labels" -- i.e., all labels ``l`` such that ``self.encode(fs,l)`` can be a nonzero joint-feature vector for some value of ``fs``. :rtype: list |
Method | length |
:return: The size of the fixed-length joint-feature vectors that are generated by this encoding. :rtype: int |
Instance Variable | _label |
Undocumented |
Instance Variable | _mapping |
dict mapping from (fname,fval,label) -> fid |
Inherited from BinaryMaxentFeatureEncoding:
Instance Variable | _alwayson |
dict mapping from label -> fid |
Instance Variable | _inv |
Undocumented |
Instance Variable | _labels |
A list of attested labels. |
Instance Variable | _length |
The length of generated joint feature vectors. |
Instance Variable | _unseen |
dict mapping from fname -> fid |
Construct and return a new feature encoding, based on a given training corpus ``train_toks``. See the class description ``BinaryMaxentFeatureEncoding`` for a description of the joint-features that will be included in this encoding.
:type train_toks: list(tuple(dict, str))
:param train_toks: Training data, represented as a list of pairs, the first member of which is a feature dictionary, and the second of which is a classification label.
:type count_cutoff: int
:param count_cutoff: A cutoff value that is used to discard rare joint-features. If a joint-feature has the value 1 fewer than ``count_cutoff`` times in the training corpus, then that joint-feature is not included in the generated encoding.
:type labels: list
:param labels: A list of labels that should be used by the classifier. If not specified, then the set of labels attested in ``train_toks`` will be used.
:param options: Extra parameters for the constructor, such as ``unseen_features`` and ``alwayson_features``.
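The cutoff rule described above can be sketched in plain Python. This is a minimal illustration that follows the docstring's wording, not NLTK's source; the helper name ``build_mapping`` and the toy corpus are assumptions made for the example.

```python
from collections import defaultdict


def build_mapping(train_toks, count_cutoff=0, labels=None):
    """Hypothetical helper: map (fname, fval, label) -> joint-feature index."""
    seen_labels = set()
    counts = defaultdict(int)
    mapping = {}
    for featureset, label in train_toks:
        if labels and label not in labels:
            raise ValueError(f"Unexpected label {label!r}")
        seen_labels.add(label)
        for fname, fval in featureset.items():
            key = (fname, fval, label)
            counts[key] += 1
            # Keep a joint-feature once it has been seen often enough.
            if counts[key] >= count_cutoff and key not in mapping:
                mapping[key] = len(mapping)
    if labels is None:
        # No labels given: fall back to the labels attested in the corpus.
        labels = sorted(seen_labels)
    return labels, mapping


train_toks = [
    ({"word": "run", "suffix": "un"}, "VERB"),
    ({"word": "run", "suffix": "un"}, "VERB"),
    ({"word": "sun", "suffix": "un"}, "NOUN"),
]
labels, mapping = build_mapping(train_toks, count_cutoff=2)
```

With ``count_cutoff=2``, only the two ``VERB`` joint-features survive, because each ``NOUN`` joint-feature occurs once.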
:param labels: A list of the "known labels" for this encoding.
:param mapping: A dictionary mapping from ``(fname,fval,label)`` tuples to corresponding joint-feature indexes. These indexes must be exactly the integers from 0 to ``len(mapping)-1``. If ``mapping[fname,fval,label]=id``, then ``self.encode(..., fname:fval, ..., label)[id]`` is 1; otherwise, it is 0.
:param unseen_features: If true, then include unseen value features in the generated joint-feature vectors.
:param alwayson_features: If true, then include always-on features in the generated joint-feature vectors.
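The index constraint on ``mapping`` can be checked before a mapping is passed to the constructor. The sketch below assumes the constraint means the indexes are exactly 0 through ``len(mapping)-1`` with no gaps or repeats; ``check_mapping`` is a hypothetical helper, not part of the class.

```python
def check_mapping(mapping):
    """Hypothetical validator: indexes must be exactly 0..len(mapping)-1."""
    expected = set(range(len(mapping)))
    if set(mapping.values()) != expected:
        raise ValueError(
            "Mapping values must be exactly the integers 0..len(mapping)-1"
        )


good = {("word", "run", "VERB"): 0, ("suffix", "un", "VERB"): 1}
bad = {("word", "run", "VERB"): 0, ("suffix", "un", "VERB"): 2}  # index 1 is missing

check_mapping(good)  # passes silently

try:
    check_mapping(bad)
    bad_rejected = False
except ValueError:
    bad_rejected = True
```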
:return: A string describing the value of the joint-feature whose index in the generated feature vectors is ``fid``. :rtype: str
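One way to produce such a description is to search the mapping for the key that owns ``fid``. This is a minimal sketch over a hand-written mapping; the free function ``describe`` and the wording of the returned string are assumptions for illustration, and the class's own output may differ.

```python
mapping = {("word", "run", "VERB"): 0, ("suffix", "un", "VERB"): 1}


def describe(fid, mapping):
    """Hypothetical stand-in: describe the joint-feature stored at index fid."""
    for (fname, fval, label), index in mapping.items():
        if index == fid:
            return f"{fname}=={fval!r} and label is {label!r}"
    raise ValueError(f"Bad feature id {fid}")


description = describe(1, mapping)
```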
Given a (featureset, label) pair, return the corresponding vector of joint-feature values. This vector is represented as a list of ``(index, value)`` tuples, specifying the value of each non-zero joint-feature.
:type featureset: dict :rtype: list(tuple(int, int))
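The sparse ``(index, value)`` representation can be illustrated with a small stand-alone function. It follows the binary scheme in the inherited docstring, where a matched joint-feature gets the value 1; the TADM subclass may assign values differently, and the function name and mapping here are assumptions for the example.

```python
mapping = {
    ("word", "run", "VERB"): 0,
    ("suffix", "un", "VERB"): 1,
    ("suffix", "un", "NOUN"): 2,
}


def encode(featureset, label, mapping):
    """Hypothetical stand-in: list the non-zero joint-features as (index, 1)."""
    encoding = []
    for fname, fval in featureset.items():
        fid = mapping.get((fname, fval, label))
        if fid is not None:
            # Only joint-features present in the mapping are emitted.
            encoding.append((fid, 1))
    return encoding


verb_vector = encode({"word": "run", "suffix": "un"}, "VERB", mapping)
noun_vector = encode({"word": "sun", "suffix": "un"}, "NOUN", mapping)
unknown_vector = encode({"word": "run"}, "ADJ", mapping)
```

Joint-features that are zero never appear in the list, so an unknown label such as ``"ADJ"`` yields an empty encoding.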
:return: A list of the "known labels" -- i.e., all labels ``l`` such that ``self.encode(fs,l)`` can be a nonzero joint-feature vector for some value of ``fs``. :rtype: list
:return: The size of the fixed-length joint-feature vectors that are generated by this encoding. :rtype: int
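The length matters when a sparse encoding has to be expanded into a dense vector. A minimal sketch, assuming a sparse list in the ``(index, value)`` format and a known vector length; ``to_dense`` is a hypothetical helper, not a method of the class.

```python
def to_dense(sparse, length):
    """Hypothetical helper: expand (index, value) pairs into a dense list."""
    dense = [0] * length
    for index, value in sparse:
        dense[index] = value
    return dense


sparse = [(0, 1), (3, 1)]
dense = to_dense(sparse, 5)
```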