module documentation

Basic data classes for representing feature structures, and for performing basic operations on those feature structures. A feature structure is a mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure. There are two types of feature structure, implemented by two subclasses of FeatStruct:

  • feature dictionaries, implemented by FeatDict, act like Python dictionaries. Feature identifiers may be strings or instances of the Feature class.
  • feature lists, implemented by FeatList, act like Python lists. Feature identifiers are integers.

Feature structures are typically used to represent partial information about objects. A feature identifier that is not mapped to a value stands for a feature whose value is unknown (not a feature without a value). Two feature structures that represent (potentially overlapping) information about the same object can be combined by unification. When two inconsistent feature structures are unified, the unification fails and returns None.

Features can be specified using "feature paths", or tuples of feature identifiers that specify a path through the nested feature structures to a value. Feature structures may contain reentrant feature values. A "reentrant feature value" is a single feature value that can be accessed via multiple feature paths. Unification preserves the reentrance relations imposed by both of the unified feature structures. In the feature structure resulting from unification, any modifications to a reentrant feature value will be visible using any of its feature paths.
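
As a rough illustration of feature paths and reentrance, the sketch below uses plain nested Python dicts rather than FeatStruct objects. The helper get_path is hypothetical and is not part of nltk.featstruct.

```python
def get_path(fstruct, path):
    """Follow a tuple of feature identifiers through nested dicts or lists."""
    value = fstruct
    for feature in path:
        value = value[feature]
    return value

# One agreement structure, reachable via two different feature paths.
agr = {"number": "sg"}
sentence = {"subj": {"agr": agr}, "verb": {"agr": agr}}

# A modification made through one path is visible through the other.
get_path(sentence, ("subj", "agr"))["person"] = 3
seen_from_verb = get_path(sentence, ("verb", "agr"))
```

Here both paths lead to the same object, so seen_from_verb contains the person feature that was added through the subj path.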

Feature structure variables are encoded using the nltk.sem.Variable class. The variables' values are tracked using a bindings dictionary, which maps variables to their values. When two feature structures are unified, a fresh bindings dictionary is created to track their values; and before unification completes, all bound variables are replaced by their values. Thus, the bindings dictionaries are usually strictly internal to the unification process. However, it is possible to track the bindings of variables if you choose to, by supplying your own initial bindings dictionary to the unify() function.

When unbound variables are unified with one another, they become aliased. This is encoded by binding one variable to the other.
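
The aliasing idea can be sketched with an ordinary dict as the bindings dictionary. In this sketch variables are modelled as strings starting with "?", which is an assumption made for illustration; the module itself uses nltk.sem.Variable objects. The helpers is_var and resolve are hypothetical.

```python
def is_var(value):
    """Variables are modelled here as strings starting with '?' (an assumption)."""
    return isinstance(value, str) and value.startswith("?")

def resolve(value, bindings):
    """Follow bindings until reaching a non-variable value or an unbound variable."""
    seen = set()
    while is_var(value) and value in bindings and value not in seen:
        seen.add(value)
        value = bindings[value]
    return value

# ?x and ?y are aliased by binding one variable to the other.
bindings = {"?y": "?x"}
before = resolve("?y", bindings)  # still unbound: resolves to the other variable

# Binding ?x afterwards gives both variables the same value.
bindings["?x"] = "sg"
after_x = resolve("?x", bindings)
after_y = resolve("?y", bindings)
```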

Lightweight Feature Structures

Many of the functions defined by nltk.featstruct can be applied directly to simple Python dictionaries and lists, rather than to full-fledged FeatDict and FeatList objects. In other words, Python dicts and lists can be used as "lightweight" feature structures.

>>> from nltk.featstruct import unify
>>> unify(dict(x=1, y=dict()), dict(a='a', y=dict(b='b')))  # doctest: +SKIP
{'y': {'b': 'b'}, 'x': 1, 'a': 'a'}

However, you should keep in mind the following caveats:

  • Python dictionaries and lists ignore reentrance when checking for equality between values. But two FeatStructs with different reentrances are considered unequal, even if all their base values are equal.
  • FeatStructs can be easily frozen, allowing them to be used as keys in hash tables. Python dictionaries and lists cannot.
  • FeatStructs display reentrance in their string representations; Python dictionaries and lists do not.
  • FeatStructs may not be mixed with Python dictionaries and lists (e.g., when performing unification).
  • FeatStructs provide a number of useful methods, such as walk() and cyclic(), which are not available for Python dicts and lists.

In general, if your feature structures will contain any reentrances, or if you plan to use them as dictionary keys, it is strongly recommended that you use full-fledged FeatStruct objects.
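
The first two caveats can be observed with plain Python objects alone. The following sketch uses only built-in dicts; the variable names are chosen for illustration.

```python
shared = {"number": "sg"}
reentrant = {"subj": shared, "obj": shared}  # one value, two paths
separate = {"subj": {"number": "sg"}, "obj": {"number": "sg"}}  # two equal copies

# Dict equality compares contents only, so the sharing is invisible to ==.
same_by_equality = reentrant == separate
reentrant_shares = reentrant["subj"] is reentrant["obj"]
separate_shares = separate["subj"] is separate["obj"]

# Plain dicts are unhashable, so they cannot serve as dictionary keys.
try:
    {reentrant: "value"}
    usable_as_key = True
except TypeError:
    usable_as_key = False
```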

Class CustomFeatureValue An abstract base class for base values that define a custom unification method. The custom unification method of CustomFeatureValue will be used during unification if:
Class FeatDict A feature structure that acts like a Python dictionary. I.e., a mapping from feature identifiers to feature values, where a feature identifier can be a string or a Feature; and where a feature value can be either a basic value (such as a string or an integer), or a nested feature structure...
Class FeatList A list of feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure.
Class FeatStruct A mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure. There are two types of feature structure:...
Class FeatStructReader No class docstring; 0/7 instance variable, 0/12 constant, 5/20 methods documented
Class Feature A feature identifier that's specialized to put additional constraints, default values, etc.
Class FeatureValueConcat A base feature value that represents the concatenation of two or more FeatureValueTuple or Variable.
Class FeatureValueSet A base feature value that is a set of other base feature values. FeatureValueSet implements SubstituteBindingsI, so any variable substitutions will be propagated to the elements contained by the set...
Class FeatureValueTuple A base feature value that is a tuple of other base feature values. FeatureValueTuple implements SubstituteBindingsI, so any variable substitutions will be propagated to the elements contained by the tuple...
Class FeatureValueUnion A base feature value that represents the union of two or more FeatureValueSet or Variable.
Class RangeFeature Undocumented
Class SlashFeature Undocumented
Class SubstituteBindingsSequence A mixin class for sequence classes that distributes variables() and substitute_bindings() over the object's elements.
Function conflicts Return a list of the feature paths of all features which are assigned incompatible values by fstruct1 and fstruct2.
Function demo Just for testing
Function display_unification Undocumented
Function find_variables No summary
Function interactive_demo Undocumented
Function remove_variables No summary
Function rename_variables Return the feature structure that is obtained by replacing any of this feature structure's variables that are in vars with new variables. The names for these new variables will be names that are not used by any variable in ...
Function retract_bindings Return the feature structure that is obtained by replacing each feature structure value that is bound by bindings with the variable that binds it. A feature structure value must be identical to a bound value (i...
Function substitute_bindings Return the feature structure that is obtained by replacing each variable bound by bindings with its binding. If a variable is aliased to a bound variable, then it will be replaced by that variable's value...
Function subsumes Return True if fstruct1 subsumes fstruct2. I.e., return true if unifying fstruct1 with fstruct2 would result in a feature structure equal to fstruct2.
Function unify Unify fstruct1 with fstruct2, and return the resulting feature structure. This unified feature structure is the minimal feature structure that contains all feature value assignments from both fstruct1...
Constant SLASH Undocumented
Constant TYPE Undocumented
Variable UnificationFailure A unique value used to indicate unification failure. It can be returned by Feature.unify_base_values() or by custom fail() functions to indicate that unification should fail.
Class _UnificationFailure Undocumented
Exception _UnificationFailureError An exception that is used by _destructively_unify to abort unification when a failure is encountered.
Function _apply_forwards Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).
Function _apply_forwards_to_bindings Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).
Function _check_frozen Given a method function, return a new method function that first checks if self._frozen is true; and if so, raises ValueError with an appropriate message. Otherwise, call the method and return its result.
Function _default_fs_class Undocumented
Function _destructively_unify Attempt to unify fstruct1 and fstruct2 by modifying them in-place. If the unification succeeds, then fstruct1 will contain the unified value, the value of fstruct2 is undefined, and forward[id(fstruct2)] is set to fstruct1...
Function _flatten Helper function -- return a copy of lst, with all elements of type cls spliced in rather than appended in.
Function _is_mapping Undocumented
Function _is_sequence Undocumented
Function _remove_variables Undocumented
Function _rename_variable Undocumented
Function _rename_variables Undocumented
Function _resolve_aliases Replace any bound aliased vars with their binding; and replace any unbound aliased vars with their representative var.
Function _retract_bindings Undocumented
Function _substitute_bindings Undocumented
Function _trace_bindings Undocumented
Function _trace_unify_fail Undocumented
Function _trace_unify_identity Undocumented
Function _trace_unify_start Undocumented
Function _trace_unify_succeed Undocumented
Function _trace_valrepr Undocumented
Function _unify_feature_values Attempt to unify fval1 and fval2, and return the resulting unified value. The method of unification will depend on the types of fval1 and fval2:
Function _variables Undocumented
Constant _FROZEN_ERROR Undocumented
Constant _FROZEN_NOTICE Undocumented
def conflicts(fstruct1, fstruct2, trace=0): (source)

Return a list of the feature paths of all features which are assigned incompatible values by fstruct1 and fstruct2.

Returns
list(tuple) Undocumented
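
To make the notion of conflicting feature paths concrete, here is a minimal sketch over plain nested dicts. It ignores variables, reentrance and tracing, and conflict_paths is a hypothetical helper rather than the library's implementation.

```python
def conflict_paths(fs1, fs2, path=()):
    """Return the feature paths where two nested dicts assign incompatible values."""
    found = []
    for name in fs1.keys() & fs2.keys():
        v1, v2 = fs1[name], fs2[name]
        if isinstance(v1, dict) and isinstance(v2, dict):
            # Both values are nested structures: look for conflicts inside them.
            found.extend(conflict_paths(v1, v2, path + (name,)))
        elif v1 != v2:
            found.append(path + (name,))
    return sorted(found)

a = {"agr": {"number": "sg", "person": 3}, "cat": "NP"}
b = {"agr": {"number": "pl", "person": 3}, "cat": "VP"}
paths = conflict_paths(a, b)
```
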
def demo(trace=False): (source)

Just for testing

def display_unification(fs1, fs2, indent=' '): (source)

Undocumented

def find_variables(fstruct, fs_class='default'): (source)
Returns
set(Variable) The set of variables used by this feature structure.
def interactive_demo(trace=False): (source)

Undocumented

def remove_variables(fstruct, fs_class='default'): (source)
Returns
FeatStruct The feature structure that is obtained by deleting all features whose values are Variables.
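
The effect of deleting variable-valued features can be sketched on plain nested dicts. Variables are modelled here as strings starting with "?", which is an assumption for illustration only, and strip_variables is a hypothetical helper.

```python
def is_var(value):
    """Variables are modelled here as strings starting with '?' (an assumption)."""
    return isinstance(value, str) and value.startswith("?")

def strip_variables(fstruct):
    """Return a copy of a nested dict without features whose values are variables."""
    result = {}
    for name, value in fstruct.items():
        if is_var(value):
            continue  # drop the feature entirely
        result[name] = strip_variables(value) if isinstance(value, dict) else value
    return result

fs = {"cat": "NP", "agr": {"number": "?n", "person": 3}, "case": "?c"}
stripped = strip_variables(fs)
```
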
def rename_variables(fstruct, vars=None, used_vars=(), new_vars=None, fs_class='default'): (source)

Return the feature structure that is obtained by replacing any of this feature structure's variables that are in vars with new variables. The names for these new variables will be names that are not used by any variable in vars, or in used_vars, or in this feature structure.

To consistently rename the variables in a set of feature structures, simply apply rename_variables to each one, using the same dictionary:

>>> from nltk.featstruct import FeatStruct
>>> fstruct1 = FeatStruct('[subj=[agr=[gender=?y]], obj=[agr=[gender=?y]]]')
>>> fstruct2 = FeatStruct('[subj=[agr=[number=?z,gender=?y]], obj=[agr=[number=?z,gender=?y]]]')
>>> new_vars = {}  # Maps old vars to alpha-renamed vars
>>> fstruct1.rename_variables(new_vars=new_vars)
[obj=[agr=[gender=?y2]], subj=[agr=[gender=?y2]]]
>>> fstruct2.rename_variables(new_vars=new_vars)
[obj=[agr=[gender=?y2, number=?z2]], subj=[agr=[gender=?y2, number=?z2]]]

If new_vars is not specified, then an empty dictionary is used.

Parameters
fstruct Undocumented
vars:set The set of variables that should be renamed. If not specified, find_variables(fstruct) is used; i.e., all variables will be given new names.
used_vars:set A set of variables whose names should not be used by the new variables.
new_vars:dict(Variable -> Variable)

A dictionary that is used to hold the mapping from old variables to new variables. For each variable v in this feature structure:

  • If new_vars maps v to v', then v will be replaced by v'.
  • If new_vars does not contain v, but vars does contain v, then a new entry will be added to new_vars, mapping v to the new variable that is used to replace it.
fs_class Undocumented
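
The role of the shared new_vars dictionary can be sketched with variable names as plain strings. The numbering scheme in fresh_name is an assumption chosen to match the ?y2 and ?z2 names in the example above; both helpers are hypothetical and are not the library's code.

```python
import re

def fresh_name(name, used_names):
    """Pick an unused name by replacing any trailing digits with a counter."""
    stem = re.sub(r"\d+$", "", name)
    n = 2
    while f"{stem}{n}" in used_names:
        n += 1
    return f"{stem}{n}"

def rename_all(names, new_names, used_names):
    """Rename each name, reusing entries already recorded in new_names."""
    renamed = []
    for name in names:
        if name not in new_names:
            taken = used_names | set(new_names.values())
            new_names[name] = fresh_name(name, taken)
        renamed.append(new_names[name])
    return renamed

mapping = {}  # shared across calls, like new_vars
first = rename_all(["?y"], mapping, {"?y", "?z"})
second = rename_all(["?z", "?y"], mapping, {"?y", "?z"})
```

Because the second call reuses mapping, ?y receives the same replacement in both calls, which is what makes the renaming consistent across structures.
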
def retract_bindings(fstruct, bindings, fs_class='default'): (source)

Return the feature structure that is obtained by replacing each feature structure value that is bound by bindings with the variable that binds it. A feature structure value must be identical to a bound value (i.e., have equal id) to be replaced.

bindings is modified to point to this new feature structure, rather than the original feature structure. Feature structure values in bindings may be modified if they are contained in fstruct.
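
The identity requirement can be illustrated with plain dicts. In this sketch variables are plain strings and retract is a hypothetical helper; unlike the function described above, it does not update bindings.

```python
def retract(fstruct, bindings):
    """Replace values that are the very same object as a bound value with its variable."""
    by_identity = {id(value): var for var, value in bindings.items()}
    result = {}
    for name, value in fstruct.items():
        if id(value) in by_identity:
            result[name] = by_identity[id(value)]
        elif isinstance(value, dict):
            result[name] = retract(value, bindings)
        else:
            result[name] = value
    return result

bound = {"number": "sg"}
bindings = {"?x": bound}
# "subj" holds the bound object itself; "obj" holds an equal but distinct copy.
fs = {"subj": bound, "obj": {"number": "sg"}}
retracted = retract(fs, bindings)
```

Only the subj value is replaced, because the obj value is equal to the bound value but is not the same object.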

def substitute_bindings(fstruct, bindings, fs_class='default'): (source)

Return the feature structure that is obtained by replacing each variable bound by bindings with its binding. If a variable is aliased to a bound variable, then it will be replaced by that variable's value. If a variable is aliased to an unbound variable, then it will be replaced by that variable.

Parameters
fstruct Undocumented
bindings:dict(Variable -> any) A dictionary mapping from variables to values.
fs_class Undocumented
def subsumes(fstruct1, fstruct2): (source)

Return True if fstruct1 subsumes fstruct2. I.e., return true if unifying fstruct1 with fstruct2 would result in a feature structure equal to fstruct2.

Returns
bool Undocumented
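
For variable-free structures without reentrance, subsumption amounts to checking that every feature value in the more general structure also appears in the more specific one. A minimal sketch over plain nested dicts, with subsumes_dicts as a hypothetical helper:

```python
def subsumes_dicts(general, specific):
    """True if every feature value in `general` is also present in `specific`."""
    for name, gval in general.items():
        if name not in specific:
            return False
        sval = specific[name]
        if isinstance(gval, dict) and isinstance(sval, dict):
            if not subsumes_dicts(gval, sval):
                return False
        elif gval != sval:
            return False
    return True

general = {"agr": {"number": "sg"}}
specific = {"agr": {"number": "sg", "person": 3}, "cat": "NP"}
forward_holds = subsumes_dicts(general, specific)
backward_holds = subsumes_dicts(specific, general)
```
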
def unify(fstruct1, fstruct2, bindings=None, trace=False, fail=None, rename_vars=True, fs_class='default'): (source)

Unify fstruct1 with fstruct2, and return the resulting feature structure. This unified feature structure is the minimal feature structure that contains all feature value assignments from both fstruct1 and fstruct2, and that preserves all reentrancies.

If no such feature structure exists (because fstruct1 and fstruct2 specify incompatible values for some feature), then unification fails, and unify returns None.

Bound variables are replaced by their values. Aliased variables are replaced by their representative variable (if unbound) or the value of their representative variable (if bound). I.e., if variable v is in bindings, then v is replaced by bindings[v]. This will be repeated until the variable is replaced by an unbound variable or a non-variable value.

Unbound variables are bound when they are unified with values; and aliased when they are unified with variables. I.e., if variable v is not in bindings, and is unified with a variable or value x, then bindings[v] is set to x.

If bindings is unspecified, then all variables are assumed to be unbound. I.e., bindings defaults to an empty dict.

>>> from nltk.featstruct import FeatStruct
>>> FeatStruct('[a=?x]').unify(FeatStruct('[b=?x]'))
[a=?x, b=?x2]

Parameters
fstruct1 Undocumented
fstruct2 Undocumented
bindings:dict(Variable -> any) A set of variable bindings to be used and updated during unification.
trace:bool If true, generate trace output.
fail Undocumented
rename_vars:bool If True, then rename any variables in fstruct2 that are also used in fstruct1, in order to avoid collisions on variable names.
fs_class Undocumented
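
The core idea of unification, leaving aside variables, reentrance and custom fail handlers, can be sketched on plain nested dicts. unify_dicts is a hypothetical, non-destructive helper written for illustration; it is not how the library implements unify.

```python
def unify_dicts(fs1, fs2):
    """Return the merged feature structure, or None if some values clash."""
    result = dict(fs1)
    for name, v2 in fs2.items():
        if name not in result:
            result[name] = v2
            continue
        v1 = result[name]
        if isinstance(v1, dict) and isinstance(v2, dict):
            merged = unify_dicts(v1, v2)
            if merged is None:
                return None
            result[name] = merged
        elif v1 != v2:
            # Incompatible values for the same feature: unification fails.
            return None
    return result

merged = unify_dicts({"x": 1, "y": {}}, {"a": "a", "y": {"b": "b"}})
failed = unify_dicts({"number": "sg"}, {"number": "pl"})
```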

SLASH = (source)

Undocumented

Value
SlashFeature('slash',
             default=False, display='slash')

TYPE = (source)

Undocumented

Value
Feature('type',
        display='prefix')
UnificationFailure = (source)

A unique value used to indicate unification failure. It can be returned by Feature.unify_base_values() or by custom fail() functions to indicate that unification should fail.

def _apply_forwards(fstruct, forward, fs_class, visited): (source)

Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).

def _apply_forwards_to_bindings(forward, bindings): (source)

Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).

def _check_frozen(method, indent=''): (source)

Given a method function, return a new method function that first checks if self._frozen is true; and if so, raises ValueError with an appropriate message. Otherwise, call the method and return its result.
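
The wrapping pattern described here can be sketched as an ordinary decorator. The names check_frozen and Box are hypothetical, the indent argument is omitted, and the error message is taken from the _FROZEN_ERROR constant documented on this page.

```python
import functools

def check_frozen(method):
    """Wrap `method` so that it raises ValueError when self._frozen is true."""
    @functools.wraps(method)
    def wrapped(self, *args, **kwargs):
        if self._frozen:
            raise ValueError("Frozen FeatStructs may not be modified.")
        return method(self, *args, **kwargs)
    return wrapped

class Box:
    """Hypothetical container used only to exercise the wrapper."""
    def __init__(self):
        self._frozen = False
        self.items = {}

    @check_frozen
    def set(self, name, value):
        self.items[name] = value
        return value

box = Box()
box.set("cat", "NP")      # allowed while not frozen
box._frozen = True
try:
    box.set("cat", "VP")  # rejected once frozen
    blocked = False
except ValueError:
    blocked = True
```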

def _default_fs_class(obj): (source)

Undocumented

def _destructively_unify(fstruct1, fstruct2, bindings, forward, trace, fail, fs_class, path): (source)

Attempt to unify fstruct1 and fstruct2 by modifying them in-place. If the unification succeeds, then fstruct1 will contain the unified value, the value of fstruct2 is undefined, and forward[id(fstruct2)] is set to fstruct1. If the unification fails, then a _UnificationFailureError is raised, and the values of fstruct1 and fstruct2 are undefined.

Parameters
fstruct1 Undocumented
fstruct2 Undocumented
bindings A dictionary mapping variables to values.
forward A dictionary mapping feature structure ids to replacement structures. When two feature structures are merged, a mapping from one to the other will be added to the forward dictionary; and changes will be made only to the target of the forward dictionary. _destructively_unify will always 'follow' any links in the forward dictionary for fstruct1 and fstruct2 before actually unifying them.
trace If true, generate trace output.
fail Undocumented
fs_class Undocumented
path The feature path that led us to this unification step. Used for trace output.
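
The forward dictionary can be pictured as a table of redirects keyed by object id. The sketch below uses plain dicts and a hypothetical follow helper; merging is reduced to dict.update and does not check for conflicts.

```python
def follow(fstruct, forward):
    """Follow forward pointers from `fstruct` to the structure that replaced it."""
    while id(fstruct) in forward:
        fstruct = forward[id(fstruct)]
    return fstruct

a, b, c = {"x": 1}, {"y": 2}, {"z": 3}
forward = {}

# Merge b into a: a receives the features and b forwards to a.
a.update(b)
forward[id(b)] = a

# Merge c into b: follow b's link first, so the change lands on a.
target = follow(b, forward)
target.update(c)
forward[id(c)] = target

final = follow(c, forward)
```
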
def _flatten(lst, cls): (source)

Helper function -- return a copy of lst, with all elements of type cls spliced in rather than appended in.
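
A minimal sketch of the splicing behaviour described above, assuming that only one level is flattened; flatten is an illustrative stand-in for the private helper.

```python
def flatten(lst, cls):
    """Copy lst, splicing in the contents of any element that is an instance of cls."""
    result = []
    for elt in lst:
        if isinstance(elt, cls):
            result.extend(elt)  # splice the element's items in place
        else:
            result.append(elt)  # keep other elements as they are
    return result

flat = flatten([1, (2, 3), 4, [5]], tuple)
```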

def _is_mapping(v): (source)

Undocumented

def _is_sequence(v): (source)

Undocumented

def _remove_variables(fstruct, fs_class, visited): (source)

Undocumented

def _rename_variable(var, used_vars): (source)

Undocumented

def _rename_variables(fstruct, vars, used_vars, new_vars, fs_class, visited): (source)

Undocumented

def _resolve_aliases(bindings): (source)

Replace any bound aliased vars with their binding; and replace any unbound aliased vars with their representative var.

def _retract_bindings(fstruct, inv_bindings, fs_class, visited): (source)

Undocumented

def _substitute_bindings(fstruct, bindings, fs_class, visited): (source)

Undocumented

def _trace_bindings(path, bindings): (source)

Undocumented

def _trace_unify_fail(path, result): (source)

Undocumented

def _trace_unify_identity(path, fval1): (source)

Undocumented

def _trace_unify_start(path, fval1, fval2): (source)

Undocumented

def _trace_unify_succeed(path, fval1): (source)

Undocumented

def _trace_valrepr(val): (source)

Undocumented

def _unify_feature_values(fname, fval1, fval2, bindings, forward, trace, fail, fs_class, fpath): (source)

Attempt to unify fval1 and fval2, and return the resulting unified value. The method of unification will depend on the types of fval1 and fval2:

  1. If they're both feature structures, then destructively unify them (see _destructively_unify()).
  2. If they're both unbound variables, then alias one variable to the other (by setting bindings[v2]=v1).
  3. If one is an unbound variable, and the other is a value, then bind the unbound variable to the value.
  4. If one is a feature structure, and the other is a base value, then fail.
  5. If they're both base values, then unify them. By default, this will succeed if they are equal, and fail otherwise.
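
The five cases above can be sketched as a single dispatch function. This is an illustration under several assumptions: variables are modelled as strings starting with "?", feature structures are plain dicts, merging copies rather than modifying in place, and FAIL, is_var, resolve and unify_values are hypothetical names.

```python
FAIL = object()  # hypothetical failure marker for this sketch

def is_var(value):
    return isinstance(value, str) and value.startswith("?")

def resolve(value, bindings):
    """Replace a bound variable by its value, following alias links."""
    while is_var(value) and value in bindings:
        value = bindings[value]
    return value

def unify_values(v1, v2, bindings):
    v1, v2 = resolve(v1, bindings), resolve(v2, bindings)
    # 1. Two feature structures: unify them feature by feature.
    if isinstance(v1, dict) and isinstance(v2, dict):
        merged = dict(v1)
        for name, value in v2.items():
            if name in merged:
                sub = unify_values(merged[name], value, bindings)
                if sub is FAIL:
                    return FAIL
                merged[name] = sub
            else:
                merged[name] = value
        return merged
    # 2. Two unbound variables: alias one to the other.
    if is_var(v1) and is_var(v2):
        if v1 != v2:
            bindings[v2] = v1
        return v1
    # 3. One unbound variable and one value: bind the variable.
    if is_var(v1):
        bindings[v1] = v2
        return v2
    if is_var(v2):
        bindings[v2] = v1
        return v1
    # 4. A feature structure and a base value: fail.
    if isinstance(v1, dict) or isinstance(v2, dict):
        return FAIL
    # 5. Two base values: succeed only if they are equal.
    return v1 if v1 == v2 else FAIL

bindings = {}
merged = unify_values({"a": "?x", "n": 1}, {"a": 3, "n": 1}, bindings)

alias_bindings = {}
aliased = unify_values("?x", "?y", alias_bindings)

mixed = unify_values({"a": 1}, 1, {})
clash = unify_values(1, 2, {})
```
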
def _variables(fstruct, vars, fs_class, visited): (source)

Undocumented

_FROZEN_ERROR: str = (source)

Undocumented

Value
'Frozen FeatStructs may not be modified.'
_FROZEN_NOTICE: str = (source)

Undocumented

Value
'''
%sIf self is frozen, raise ValueError.'''