class ProbabilisticNonprojectiveParser(object): (source)
Constructor: ProbabilisticNonprojectiveParser()
A probabilistic non-projective dependency parser.
Nonprojective dependencies allows for "crossing branches" in the parse tree which is necessary for representing particular linguistic phenomena, or even typical parses in some languages. This parser follows the MST parsing algorithm, outlined in McDonald(2005), which likens the search for the best non-projective parse to finding the maximum spanning tree in a weighted directed graph.
>>> class Scorer(DependencyScorerI): ... def train(self, graphs): ... pass ... ... def score(self, graph): ... return [ ... [[], [5], [1], [1]], ... [[], [], [11], [4]], ... [[], [10], [], [5]], ... [[], [8], [8], []], ... ]
>>> npp = ProbabilisticNonprojectiveParser() >>> npp.train([], Scorer())
>>> parses = npp.parse(['v1', 'v2', 'v3'], [None, None, None]) >>> len(list(parses)) 1
Rule based example
>>> from nltk.grammar import DependencyGrammar
>>> grammar = DependencyGrammar.fromstring(''' ... 'taught' -> 'play' | 'man' ... 'man' -> 'the' | 'in' ... 'in' -> 'corner' ... 'corner' -> 'the' ... 'play' -> 'golf' | 'dachshund' | 'to' ... 'dachshund' -> 'his' ... ''')
>>> ndp = NonprojectiveDependencyParser(grammar) >>> parses = ndp.parse(['the', 'man', 'in', 'the', 'corner', 'taught', 'his', 'dachshund', 'to', 'play', 'golf']) >>> len(list(parses)) 4
Method | __init__ |
Creates a new non-projective parser. |
Method | best |
Returns the source of the best incoming arc to the node with address: node_index |
Method | collapse |
Takes a list of nodes that have been identified to belong to a cycle, and collapses them into on larger node. The arcs of all nodes in the graph must be updated to account for this. |
Method | compute |
When updating scores the score of the highest-weighted incoming arc is subtracted upon collapse. This returns the correct amount to subtract from that edge. |
Method | compute |
As nodes are collapsed into others, they are replaced by the new node in the graph, but it's still necessary to keep track of what these original nodes were. This takes a list of node addresses and replaces any collapsed node addresses with their original addresses. |
Method | initialize |
Assigns a score to every edge in the DependencyGraph graph. These scores are generated via the parser's scorer which was assigned during the training process. |
Method | original |
Undocumented |
Method | parse |
Parses a list of tokens in accordance to the MST parsing algorithm for non-projective dependency parses. Assumes that the tokens to be parsed have already been tagged and those tags are provided. Various scoring methods can be used by implementing the ... |
Method | train |
Trains a DependencyScorerI from a set of DependencyGraph objects, and establishes this as the parser's scorer. This is used to initialize the scores on a DependencyGraph during the parsing procedure. |
Method | update |
Updates the edge scores to reflect a collapse operation into new_node. |
Instance Variable | inner |
Undocumented |
Instance Variable | scores |
Undocumented |
Instance Variable | _scorer |
Undocumented |
Returns the source of the best incoming arc to the node with address: node_index
the node that is arced to.
Parameters | |
node | The address of the 'destination' node, |
Takes a list of nodes that have been identified to belong to a cycle, and collapses them into on larger node. The arcs of all nodes in the graph must be updated to account for this.
Parameters | |
new | A Node (Dictionary) to collapse the cycle nodes into. |
cycle | A list of node addresses, each of which is in the cycle. |
g | Undocumented |
b | Undocumented |
c | Undocumented |
g | Graphs which need to be updated. |
When updating scores the score of the highest-weighted incoming arc is subtracted upon collapse. This returns the correct amount to subtract from that edge.
to a particular node being updated :type cycle_indexes: A list of integers. :param cycle_indexes: Only arcs from cycle nodes are considered. This is a list of such nodes addresses.
Parameters | |
column | A index representing the column of incoming arcs |
cycle | Undocumented |
As nodes are collapsed into others, they are replaced by the new node in the graph, but it's still necessary to keep track of what these original nodes were. This takes a list of node addresses and replaces any collapsed node addresses with their original addresses.
subsumed nodes.
Parameters | |
new | A list of node addresses to check for |
Assigns a score to every edge in the DependencyGraph graph. These scores are generated via the parser's scorer which was assigned during the training process.
Parameters | |
graph:DependencyGraph | A dependency graph to assign scores to. |
Parses a list of tokens in accordance to the MST parsing algorithm for non-projective dependency parses. Assumes that the tokens to be parsed have already been tagged and those tags are provided. Various scoring methods can be used by implementing the DependencyScorerI interface and passing it to the training algorithm.
Parameters | |
tokens:list(str) | A list of words or punctuation to be parsed. |
tags:list(str) | A list of tags corresponding by index to the words in the tokens list. |
Returns | |
iter(DependencyGraph) | An iterator of non-projective parses. |
Trains a DependencyScorerI from a set of DependencyGraph objects, and establishes this as the parser's scorer. This is used to initialize the scores on a DependencyGraph during the parsing procedure.
Parameters | |
graphs:list(DependencyGraph) | A list of dependency graphs to train the scorer. |
dependency | A scorer which implements the DependencyScorerI interface. |