class documentation
class CoreNLPDependencyParser(GenericCoreNLPParser):
Constructor: CoreNLPDependencyParser(url, encoding, tagtype)
Dependency parser.
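The examples below assume a Stanford CoreNLP server is already listening on http://localhost:9000. One way to start such a server from Python is the CoreNLPServer helper in nltk.parse.corenlp; the sketch below is illustrative rather than part of the documented doctests, and the jar paths are placeholders for wherever CoreNLP is installed on your machine.

>>> # Sketch only: start a local CoreNLP server before constructing the parser.
>>> # The jar locations below are hypothetical; point them at your own download.
>>> from nltk.parse.corenlp import CoreNLPServer                  # doctest: +SKIP
>>> server = CoreNLPServer(                                       # doctest: +SKIP
...     path_to_jar='/opt/corenlp/stanford-corenlp.jar',
...     path_to_models_jar='/opt/corenlp/stanford-corenlp-models.jar',
... )
>>> server.start()                                                # doctest: +SKIP
>>> # ... use CoreNLPDependencyParser(url='http://localhost:9000') ...
>>> server.stop()                                                 # doctest: +SKIP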
>>> dep_parser = CoreNLPDependencyParser(url='http://localhost:9000')
>>> parse, = dep_parser.raw_parse(
...     'The quick brown fox jumps over the lazy dog.'
... )
>>> print(parse.to_conll(4))  # doctest: +NORMALIZE_WHITESPACE
The      DT    4    det
quick    JJ    4    amod
brown    JJ    4    amod
fox      NN    5    nsubj
jumps    VBZ   0    ROOT
over     IN    9    case
the      DT    9    det
lazy     JJ    9    amod
dog      NN    5    nmod
.        .     5    punct

>>> print(parse.tree())  # doctest: +NORMALIZE_WHITESPACE
(jumps (fox The quick brown) (dog over the lazy) .)

>>> for governor, dep, dependent in parse.triples():
...     print(governor, dep, dependent)  # doctest: +NORMALIZE_WHITESPACE
('jumps', 'VBZ') nsubj ('fox', 'NN')
('fox', 'NN') det ('The', 'DT')
('fox', 'NN') amod ('quick', 'JJ')
('fox', 'NN') amod ('brown', 'JJ')
('jumps', 'VBZ') nmod ('dog', 'NN')
('dog', 'NN') case ('over', 'IN')
('dog', 'NN') det ('the', 'DT')
('dog', 'NN') amod ('lazy', 'JJ')
('jumps', 'VBZ') punct ('.', '.')

>>> (parse_fox, ), (parse_dog, ) = dep_parser.raw_parse_sents(
...     [
...         'The quick brown fox jumps over the lazy dog.',
...         'The quick grey wolf jumps over the lazy fox.',
...     ]
... )
>>> print(parse_fox.to_conll(4))  # doctest: +NORMALIZE_WHITESPACE
The      DT    4    det
quick    JJ    4    amod
brown    JJ    4    amod
fox      NN    5    nsubj
jumps    VBZ   0    ROOT
over     IN    9    case
the      DT    9    det
lazy     JJ    9    amod
dog      NN    5    nmod
.        .     5    punct

>>> print(parse_dog.to_conll(4))  # doctest: +NORMALIZE_WHITESPACE
The      DT    4    det
quick    JJ    4    amod
grey     JJ    4    amod
wolf     NN    5    nsubj
jumps    VBZ   0    ROOT
over     IN    9    case
the      DT    9    det
lazy     JJ    9    amod
fox      NN    5    nmod
.        .     5    punct

>>> (parse_dog, ), (parse_friends, ) = dep_parser.parse_sents(
...     [
...         "I 'm a dog".split(),
...         "This is my friends ' cat ( the tabby )".split(),
...     ]
... )
>>> print(parse_dog.to_conll(4))  # doctest: +NORMALIZE_WHITESPACE
I     PRP   4    nsubj
'm    VBP   4    cop
a     DT    4    det
dog   NN    0    ROOT

>>> print(parse_friends.to_conll(4))  # doctest: +NORMALIZE_WHITESPACE
This      DT      6    nsubj
is        VBZ     6    cop
my        PRP$    4    nmod:poss
friends   NNS     6    nmod:poss
'         POS     4    case
cat       NN      0    ROOT
-LRB-     -LRB-   9    punct
the       DT      9    det
tabby     NN      6    appos
-RRB-     -RRB-   9    punct

>>> parse_john, parse_mary, = dep_parser.parse_text(
...     'John loves Mary. Mary walks.'
... )

>>> print(parse_john.to_conll(4))  # doctest: +NORMALIZE_WHITESPACE
John     NNP   2    nsubj
loves    VBZ   0    ROOT
Mary     NNP   2    dobj
.        .     2    punct

>>> print(parse_mary.to_conll(4))  # doctest: +NORMALIZE_WHITESPACE
Mary     NNP   2    nsubj
walks    VBZ   0    ROOT
.        .     2    punct
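Because the class also mixes in the tagging interface inherited from GenericCoreNLPParser, the same object can be used as a part-of-speech tagger via the tag method. The following is an illustrative sketch; the output shown is what a default English model would typically produce, not a verified doctest.

>>> dep_parser.tag('The quick brown fox jumps over the lazy dog.'.split())  # doctest: +SKIP
[('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'), ('jumps', 'VBZ'),
 ('over', 'IN'), ('the', 'DT'), ('lazy', 'JJ'), ('dog', 'NN'), ('.', '.')]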
Special cases
Non-breaking space inside a token.
>>> len(
...     next(
...         dep_parser.raw_parse(
...             'Anhalt said children typically treat a 20-ounce soda bottle as one '
...             'serving, while it actually contains 2 1/2 servings.'
...         )
...     ).nodes
... )
21
Phone numbers.
>>> len(
...     next(
...         dep_parser.raw_parse('This is not going to crash: 01 111 555.')
...     ).nodes
... )
10
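The lengths above count entries in the parse's nodes mapping, which is keyed by token address and includes an artificial root node at address 0, so a sentence of n tokens yields n + 1 nodes. A small sketch of the same idea on the earlier example sentence (illustrative, not a verified doctest; the variable name is ours):

>>> fox_parse, = dep_parser.raw_parse(                       # doctest: +SKIP
...     'The quick brown fox jumps over the lazy dog.'
... )
>>> sorted(fox_parse.nodes)  # 10 tokens plus the root at address 0
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]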
>>> print(
...     next(
...         dep_parser.raw_parse('The underscore _ should not simply disappear.')
...     ).to_conll(4)
... )  # doctest: +NORMALIZE_WHITESPACE
The          DT    3    det
underscore   VBP   3    amod
_            NN    7    nsubj
should       MD    7    aux
not          RB    7    neg
simply       RB    7    advmod
disappear    VB    0    ROOT
.            .     7    punct

>>> print(
...     '\n'.join(
...         next(
...             dep_parser.raw_parse(
...                 'for all of its insights into the dream world of teen life , and its electronic expression through '
...                 'cyber culture , the film gives no quarter to anyone seeking to pull a cohesive story out of its 2 '
...                 '1/2-hour running time .'
...             )
...         ).to_conll(4).split('\n')[-8:]
...     )
... )
its       PRP$   40    nmod:poss
2 1/2     CD     40    nummod
-         :      40    punct
hour      NN     31    nmod
running   VBG    42    amod
time      NN     40    dep
.         .      24    punct
<BLANKLINE>
Method | make_tree | Undocumented
Class Variable | parser_annotator | Undocumented
Constant | _OUTPUT_FORMAT | Undocumented
Inherited from GenericCoreNLPParser:

Method | __init__ | Undocumented
Method | api_call | Undocumented
Method | parse_sents | Parse multiple sentences.
Method | parse_text | Parse a piece of text.
Method | raw_parse | Parse a sentence.
Method | raw_parse_sents | Parse multiple sentences.
Method | raw_tag_sents | Tag multiple sentences.
Method | tag | Tag a list of tokens.
Method | tag_sents | Tag multiple sentences.
Method | tokenize | Tokenize a string of text.
Instance Variable | encoding | Undocumented
Instance Variable | session | Undocumented
Instance Variable | tagtype | Undocumented
Instance Variable | url | Undocumented
Inherited from ParserI (via GenericCoreNLPParser):

Method | grammar | No summary
Method | parse | When possible this list is sorted from most likely to least likely.
Method | parse_all | No summary
Method | parse_one | No summary
Inherited from TokenizerI (via GenericCoreNLPParser, ParserI):

Method | span_tokenize | Identify the tokens using integer offsets (start_i, end_i), where s[start_i:end_i] is the corresponding token.
Method | span_tokenize_sents | Apply self.span_tokenize() to each element of strings. I.e.:
Method | tokenize_sents | Apply self.tokenize() to each element of strings. I.e.:
Inherited from TaggerI (via GenericCoreNLPParser, ParserI, TokenizerI):

Method | evaluate | Score the accuracy of the tagger against the gold standard. Strip the tags from the gold standard text, retag it using the tagger, then compute the accuracy score.
Method | _check_params | Undocumented