Basic data classes for representing feature structures, and for performing basic operations on those feature structures. A feature structure is a mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure. There are two types of feature structure, implemented by two subclasses of FeatStruct:
- feature dictionaries, implemented by FeatDict, act like Python dictionaries. Feature identifiers may be strings or instances of the Feature class.
- feature lists, implemented by FeatList, act like Python lists. Feature identifiers are integers.
Feature structures are typically used to represent partial information about objects. A feature identifier that is not mapped to a value stands for a feature whose value is unknown (not a feature without a value). Two feature structures that represent (potentially overlapping) information about the same object can be combined by unification. When two inconsistent feature structures are unified, the unification fails and returns None.
Features can be specified using "feature paths", or tuples of feature identifiers that specify a path through the nested feature structures to a value. Feature structures may contain reentrant feature values. A "reentrant feature value" is a single feature value that can be accessed via multiple feature paths. Unification preserves the reentrance relations imposed by both of the unified feature structures. In the feature structure resulting from unification, any modifications to a reentrant feature value will be visible using any of its feature paths.
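The idea of feature paths and reentrance can be illustrated with plain nested dictionaries. This is only a sketch: `get_path` is a hypothetical helper written for this example, not part of nltk.featstruct, and plain dicts stand in for feature structures.

```python
def get_path(fstruct, path):
    """Follow a feature path (a tuple of feature identifiers) to its value."""
    value = fstruct
    for feature in path:
        value = value[feature]
    return value


# One agreement structure shared by two features: a reentrant value.
agr = {"number": "sg"}
sentence = {"subj": {"agr": agr}, "verb": {"agr": agr}}

# Two different feature paths reach the same object.
same_object = get_path(sentence, ("subj", "agr")) is get_path(sentence, ("verb", "agr"))

# A change made through one path is visible through the other.
get_path(sentence, ("subj", "agr"))["person"] = 3
person_via_verb = get_path(sentence, ("verb", "agr", "person"))
```

Because both paths lead to one shared dictionary, `person_via_verb` picks up the value that was written through the `subj` path.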
Feature structure variables are encoded using the nltk.sem.Variable class. The variables' values are tracked using a bindings dictionary, which maps variables to their values. When two feature structures are unified, a fresh bindings dictionary is created to track their values; and before unification completes, all bound variables are replaced by their values. Thus, the bindings dictionaries are usually strictly internal to the unification process. However, it is possible to track the bindings of variables if you choose to, by supplying your own initial bindings dictionary to the unify() function.
When unbound variables are unified with one another, they become aliased. This is encoded by binding one variable to the other.
Lightweight Feature Structures
Many of the functions defined by nltk.featstruct can be applied directly to simple Python dictionaries and lists, rather than to full-fledged FeatDict and FeatList objects. In other words, Python dicts and lists can be used as "light-weight" feature structures.
    >>> from nltk.featstruct import unify
    >>> unify(dict(x=1, y=dict()), dict(a='a', y=dict(b='b')))  # doctest: +SKIP
    {'y': {'b': 'b'}, 'x': 1, 'a': 'a'}
However, you should keep in mind the following caveats:
- Python dictionaries & lists ignore reentrance when checking for equality between values. But two FeatStructs with different reentrances are considered nonequal, even if all their base values are equal.
- FeatStructs can be easily frozen, allowing them to be used as keys in hash tables. Python dictionaries and lists cannot.
- FeatStructs display reentrance in their string representations; Python dictionaries and lists do not.
- FeatStructs may not be mixed with Python dictionaries and lists (e.g., when performing unification).
- FeatStructs provide a number of useful methods, such as walk() and cyclic(), which are not available for Python dicts and lists.
In general, if your feature structures will contain any reentrances, or if you plan to use them as dictionary keys, it is strongly recommended that you use full-fledged FeatStruct objects.
Class |
|
An abstract base class for base values that define a custom unification method. The custom unification method of CustomFeatureValue will be used during unification if: |
Class |
|
A feature structure that acts like a Python dictionary. I.e., a mapping from feature identifiers to feature values, where a feature identifier can be a string or a Feature; and where a feature value can be either a basic value (such as a string or an integer), or a nested feature structure... |
Class |
|
A list of feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure. |
Class |
|
A mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure. There are two types of feature structure:... |
Class |
|
No class docstring; 0/7 instance variable, 0/12 constant, 5/20 methods documented |
Class |
|
A feature identifier that's specialized to put additional constraints, default values, etc. |
Class |
|
A base feature value that represents the concatenation of two or more FeatureValueTuple or Variable. |
Class |
|
A base feature value that is a set of other base feature values. FeatureValueSet implements SubstituteBindingsI, so any variable substitutions will be propagated to the elements contained by the set... |
Class |
|
A base feature value that is a tuple of other base feature values. FeatureValueTuple implements SubstituteBindingsI, so any variable substitutions will be propagated to the elements contained by the tuple... |
Class |
|
A base feature value that represents the union of two or more FeatureValueSet or Variable. |
Class |
|
Undocumented |
Class |
|
Undocumented |
Class |
|
A mixin class for sequence classes that distributes variables() and substitute_bindings() over the object's elements. |
Function | conflicts |
Return a list of the feature paths of all features which are assigned incompatible values by fstruct1 and fstruct2. |
Function | demo |
Just for testing |
Function | display |
Undocumented |
Function | find |
No summary |
Function | interactive |
Undocumented |
Function | remove |
No summary |
Function | rename |
Return the feature structure that is obtained by replacing any of this feature structure's variables that are in vars with new variables. The names for these new variables will be names that are not used by any variable in ... |
Function | retract |
Return the feature structure that is obtained by replacing each feature structure value that is bound by bindings with the variable that binds it. A feature structure value must be identical to a bound value (i... |
Function | substitute |
Return the feature structure that is obtained by replacing each variable bound by bindings with its binding. If a variable is aliased to a bound variable, then it will be replaced by that variable's value... |
Function | subsumes |
Return True if fstruct1 subsumes fstruct2. I.e., return true if unifying fstruct1 with fstruct2 would result in a feature structure equal to fstruct2. |
Function | unify |
Unify fstruct1 with fstruct2, and return the resulting feature structure. This unified feature structure is the minimal feature structure that contains all feature value assignments from both fstruct1... |
Constant | SLASH |
Undocumented |
Constant | TYPE |
Undocumented |
Variable |
|
A unique value used to indicate unification failure. It can be returned by Feature.unify_base_values() or by custom fail() functions to indicate that unification should fail. |
Class | _ |
Undocumented |
Exception | _ |
An exception that is used by _destructively_unify to abort unification when a failure is encountered. |
Function | _apply |
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy). |
Function | _apply |
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy). |
Function | _check |
Given a method function, return a new method function that first checks if self._frozen is true; and if so, raises ValueError with an appropriate message. Otherwise, call the method and return its result. |
Function | _default |
Undocumented |
Function | _destructively |
Attempt to unify fstruct1 and fstruct2 by modifying them in-place. If the unification succeeds, then fstruct1 will contain the unified value, the value of fstruct2 is undefined, and forward[id(fstruct2)] is set to fstruct1... |
Function | _flatten |
Helper function -- return a copy of list, with all elements of type cls spliced in rather than appended in. |
Function | _is |
Undocumented |
Function | _is |
Undocumented |
Function | _remove |
Undocumented |
Function | _rename |
Undocumented |
Function | _rename |
Undocumented |
Function | _resolve |
Replace any bound aliased vars with their binding; and replace any unbound aliased vars with their representative var. |
Function | _retract |
Undocumented |
Function | _substitute |
Undocumented |
Function | _trace |
Undocumented |
Function | _trace |
Undocumented |
Function | _trace |
Undocumented |
Function | _trace |
Undocumented |
Function | _trace |
Undocumented |
Function | _trace |
Undocumented |
Function | _unify |
Attempt to unify fval1 and fval2, and return the resulting unified value. The method of unification will depend on the types of fval1 and fval2: |
Function | _variables |
Undocumented |
Constant | _FROZEN |
Undocumented |
Constant | _FROZEN |
Undocumented |
Return a list of the feature paths of all features which are assigned incompatible values by fstruct1 and fstruct2.
Returns | |
list(tuple) | Undocumented |
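To make the shape of the returned list concrete, here is a minimal sketch over plain nested dictionaries. `conflict_paths` is a hypothetical function written for illustration; it ignores variables and reentrance, and it is not how the library computes conflicts.

```python
def conflict_paths(fs1, fs2, path=()):
    """Return the feature paths at which two nested dicts disagree."""
    paths = []
    for feature in fs1:
        if feature not in fs2:
            continue  # a missing feature is unknown, not a conflict
        v1, v2 = fs1[feature], fs2[feature]
        here = path + (feature,)
        if isinstance(v1, dict) and isinstance(v2, dict):
            paths.extend(conflict_paths(v1, v2, here))
        elif v1 != v2:
            paths.append(here)
    return paths


fs1 = {"agr": {"number": "sg", "person": 3}, "tense": "past"}
fs2 = {"agr": {"number": "pl", "person": 3}, "tense": "past", "mood": "ind"}
found = conflict_paths(fs1, fs2)  # [('agr', 'number')]
```

Each entry is a feature path (a tuple of feature identifiers) that leads to a pair of incompatible values.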
Returns | |
FeatStruct | The feature structure that is obtained by deleting all features whose values are Variables. |
Return the feature structure that is obtained by replacing any of this feature structure's variables that are in vars with new variables. The names for these new variables will be names that are not used by any variable in vars, or in used_vars, or in this feature structure.
To consistently rename the variables in a set of feature structures, simply apply rename_variables to each one, using the same dictionary:
    >>> from nltk.featstruct import FeatStruct
    >>> fstruct1 = FeatStruct('[subj=[agr=[gender=?y]], obj=[agr=[gender=?y]]]')
    >>> fstruct2 = FeatStruct('[subj=[agr=[number=?z,gender=?y]], obj=[agr=[number=?z,gender=?y]]]')
    >>> new_vars = {}  # Maps old vars to alpha-renamed vars
    >>> fstruct1.rename_variables(new_vars=new_vars)
    [obj=[agr=[gender=?y2]], subj=[agr=[gender=?y2]]]
    >>> fstruct2.rename_variables(new_vars=new_vars)
    [obj=[agr=[gender=?y2, number=?z2]], subj=[agr=[gender=?y2, number=?z2]]]
If new_vars is not specified, then an empty dictionary is used.
Parameters | |
fstruct | Undocumented |
vars:set | The set of variables that should be renamed. If not specified, find_variables(fstruct) is used; i.e., all variables will be given new names. |
used | A set of variables whose names should not be used by the new variables. |
new | A dictionary that is used to hold the mapping from old variables to new variables. For each variable v in this feature structure:
|
fs | Undocumented |
Return the feature structure that is obtained by replacing each feature structure value that is bound by bindings with the variable that binds it. A feature structure value must be identical to a bound value (i.e., have equal id) to be replaced.
bindings is modified to point to this new feature structure, rather than the original feature structure. Feature structure values in bindings may be modified if they are contained in fstruct.
Return the feature structure that is obtained by replacing each variable bound by bindings with its binding. If a variable is aliased to a bound variable, then it will be replaced by that variable's value. If a variable is aliased to an unbound variable, then it will be replaced by that variable.
Parameters | |
fstruct | Undocumented |
bindings:dict(Variable -> any) | A dictionary mapping from variables to values. |
fs | Undocumented |
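The substitution rule can be sketched on plain nested dictionaries as follows. Everything here is an assumption made for illustration: `Var` is a hypothetical stand-in for a variable class, and `resolve` and `substitute` are not the library's functions. The sketch copies the structure, so it does not preserve reentrance, and it does not guard against cyclic bindings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical stand-in for a feature structure variable."""
    name: str


def resolve(value, bindings):
    """Follow bindings until reaching a non-variable or an unbound variable."""
    while isinstance(value, Var) and value in bindings:
        value = bindings[value]
    return value


def substitute(fstruct, bindings):
    """Return a copy of a nested dict with bound variables replaced."""
    result = {}
    for feature, value in fstruct.items():
        if isinstance(value, dict):
            result[feature] = substitute(value, bindings)
        else:
            result[feature] = resolve(value, bindings)
    return result


x, y, z, w = Var("x"), Var("y"), Var("z"), Var("w")
# x is bound; y is aliased to the bound x; z is aliased to the unbound w.
bindings = {x: "sg", y: x, z: w}
fs = {"a": x, "b": {"c": y}, "d": z, "e": w}
substituted = substitute(fs, bindings)
```

In `substituted`, features `a` and `b.c` both hold `"sg"`, while `d` and `e` both hold the unbound variable `w`.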
Return True if fstruct1 subsumes fstruct2. I.e., return true if unifying fstruct1 with fstruct2 would result in a feature structure equal to fstruct2.
Returns | |
bool | Undocumented |
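For feature structures without variables or reentrance, subsumption amounts to checking that every piece of information in the first structure is also present in the second. The sketch below tests that directly on plain nested dictionaries. `subsumes_sketch` is a hypothetical name, and this is an equivalent formulation for that restricted case rather than the unification-based definition given above.

```python
def subsumes_sketch(general, specific):
    """True if every feature value in `general` also appears in `specific`."""
    for feature, value in general.items():
        if feature not in specific:
            return False
        other = specific[feature]
        if isinstance(value, dict) and isinstance(other, dict):
            if not subsumes_sketch(value, other):
                return False
        elif value != other:
            return False
    return True


general = {"agr": {"number": "sg"}}
specific = {"agr": {"number": "sg", "person": 3}, "tense": "past"}
forward_check = subsumes_sketch(general, specific)   # True
backward_check = subsumes_sketch(specific, general)  # False
```

An empty structure carries no information, so under this sketch it subsumes every structure.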
Unify fstruct1 with fstruct2, and return the resulting feature structure. This unified feature structure is the minimal feature structure that contains all feature value assignments from both fstruct1 and fstruct2, and that preserves all reentrancies.
If no such feature structure exists (because fstruct1 and fstruct2 specify incompatible values for some feature), then unification fails, and unify returns None.
Bound variables are replaced by their values. Aliased variables are replaced by their representative variable (if unbound) or the value of their representative variable (if bound). I.e., if variable v is in bindings, then v is replaced by bindings[v]. This will be repeated until the variable is replaced by an unbound variable or a non-variable value.
Unbound variables are bound when they are unified with values; and aliased when they are unified with variables. I.e., if variable v is not in bindings, and is unified with a variable or value x, then bindings[v] is set to x.
If bindings is unspecified, then all variables are assumed to be unbound. I.e., bindings defaults to an empty dict.
    >>> from nltk.featstruct import FeatStruct
    >>> FeatStruct('[a=?x]').unify(FeatStruct('[b=?x]'))
    [a=?x, b=?x2]
Parameters | |
fstruct1 | Undocumented |
fstruct2 | Undocumented |
bindings:dict(Variable -> any) | A set of variable bindings to be used and updated during unification. |
trace:bool | If true, generate trace output. |
fail | Undocumented |
rename | If True, then rename any variables in fstruct2 that are also used in fstruct1, in order to avoid collisions on variable names. |
fs | Undocumented |
A unique value used to indicate unification failure. It can be returned by Feature.unify_base_values() or by custom fail() functions to indicate that unification should fail.
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).
Given a method function, return a new method function that first checks if self._frozen is true; and if so, raises ValueError with an appropriate message. Otherwise, call the method and return its result.
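A wrapper of this kind is commonly written as a decorator. The following is a minimal sketch of the pattern, assuming only what the description states (a `_frozen` attribute and a ValueError); `check_frozen`, `Box`, and the error message are hypothetical and chosen for this example.

```python
import functools


def check_frozen(method):
    """Wrap a mutating method so that it raises ValueError on frozen objects."""
    @functools.wraps(method)
    def wrapped(self, *args, **kwargs):
        if self._frozen:
            raise ValueError("Frozen objects cannot be modified.")
        return method(self, *args, **kwargs)
    return wrapped


class Box:
    """Hypothetical container used only to demonstrate the guard."""

    def __init__(self):
        self._frozen = False
        self.items = []

    def freeze(self):
        self._frozen = True

    @check_frozen
    def add(self, item):
        self.items.append(item)
        return len(self.items)


box = Box()
box.add("a")   # allowed: the box is not frozen yet
box.freeze()
try:
    box.add("b")
except ValueError as error:
    message = str(error)
```

After `freeze()` the guarded method refuses to run, so `box.items` still contains only `"a"`.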
Attempt to unify fstruct1 and fstruct2 by modifying them in-place. If the unification succeeds, then fstruct1 will contain the unified value, the value of fstruct2 is undefined, and forward[id(fstruct2)] is set to fstruct1. If the unification fails, then a _UnificationFailureError is raised, and the values of fstruct1 and fstruct2 are undefined.
Parameters | |
fstruct1 | Undocumented |
fstruct2 | Undocumented |
bindings | A dictionary mapping variables to values. |
forward | A dictionary mapping feature structures ids to replacement structures. When two feature structures are merged, a mapping from one to the other will be added to the forward dictionary; and changes will be made only to the target of the forward dictionary. _destructively_unify will always 'follow' any links in the forward dictionary for fstruct1 and fstruct2 before actually unifying them. |
trace | If true, generate trace output |
fail | Undocumented |
fs | Undocumented |
path | The feature path that led us to this unification step. Used for trace output. |
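The role of the forward dictionary can be sketched with plain dictionaries. This is an illustration under assumptions, not the library's code: `follow_forward` is a hypothetical helper, and the merges use `dict.update` in place of real unification.

```python
def follow_forward(fstruct, forward):
    """Follow forward pointers from fstruct to the structure that replaced it."""
    while id(fstruct) in forward:
        fstruct = forward[id(fstruct)]
    return fstruct


a = {"x": 1}
b = {"y": 2}
c = {"z": 3}
forward = {}

# Merge b into a: a receives b's features, and b is forwarded to a.
a.update(b)
forward[id(b)] = a

# Later, merge a into c in the same way.
c.update(a)
forward[id(a)] = c

# Any reference to b (or a) now resolves to c, the surviving structure.
target = follow_forward(b, forward)
```

Because the dictionary is keyed by `id()`, the merged objects must stay alive for as long as the forward dictionary is in use.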
Helper function -- return a copy of list, with all elements of type cls spliced in rather than appended in.
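A helper with this behaviour might look like the sketch below. The name `flatten` and the example values are assumptions for illustration; only one level of nesting is spliced.

```python
def flatten(items, cls):
    """Return a copy of items with elements of type cls spliced in place."""
    result = []
    for item in items:
        if isinstance(item, cls):
            result.extend(item)   # splice the element's contents
        else:
            result.append(item)   # keep any other element as it is
    return result


flat = flatten([1, (2, 3), [4, 5], 6], tuple)  # [1, 2, 3, [4, 5], 6]
```

Only the tuple is spliced here; the inner list is left intact because it is not an instance of `cls`.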
Replace any bound aliased vars with their binding; and replace any unbound aliased vars with their representative var.
Attempt to unify fval1 and fval2, and return the resulting unified value. The method of unification will depend on the types of fval1 and fval2:
- If they're both feature structures, then destructively unify them (see _destructively_unify()).
- If they're both unbound variables, then alias one variable to the other (by setting bindings[v2]=v1).
- If one is an unbound variable, and the other is a value, then bind the unbound variable to the value.
- If one is a feature structure, and the other is a base value, then fail.
- If they're both base values, then unify them. By default, this will succeed if they are equal, and fail otherwise.
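The case analysis above can be sketched as a small dispatcher over plain dictionaries. All names here (`Var`, `FAIL`, `resolve`, `unify_values`, `unify_dicts`) are hypothetical. Unlike the function described, this sketch merges non-destructively, uses no forward pointers, ignores reentrance, and omits the final pass that replaces bound variables with their values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical stand-in for a feature structure variable."""
    name: str


FAIL = object()  # hypothetical sentinel signalling unification failure


def resolve(value, bindings):
    """Follow bindings until reaching a non-variable or an unbound variable."""
    while isinstance(value, Var) and value in bindings:
        value = bindings[value]
    return value


def unify_values(v1, v2, bindings):
    v1, v2 = resolve(v1, bindings), resolve(v2, bindings)
    if isinstance(v1, dict) and isinstance(v2, dict):
        return unify_dicts(v1, v2, bindings)      # both are structures
    if isinstance(v1, Var) and isinstance(v2, Var):
        if v1 != v2:
            bindings[v2] = v1                     # alias one variable to the other
        return v1
    if isinstance(v1, Var):
        bindings[v1] = v2                         # bind the variable to the value
        return v2
    if isinstance(v2, Var):
        bindings[v2] = v1
        return v1
    if isinstance(v1, dict) or isinstance(v2, dict):
        return FAIL                               # structure versus base value
    return v1 if v1 == v2 else FAIL               # two base values


def unify_dicts(fs1, fs2, bindings):
    """Merge two nested dicts into a new dict; return FAIL on conflict."""
    result = dict(fs1)
    for feature, value in fs2.items():
        if feature in result:
            merged = unify_values(result[feature], value, bindings)
            if merged is FAIL:
                return FAIL
            result[feature] = merged
        else:
            result[feature] = value
    return result


x, y = Var("x"), Var("y")
bindings = {}
unified = unify_dicts(
    {"agr": {"number": x}, "tense": "past"},
    {"agr": {"number": "sg", "person": y}},
    bindings,
)
```

After the call, `bindings` maps `x` to `"sg"`, and `unified` combines the features of both inputs.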