class VerbnetCorpusReader(XMLCorpusReader):
Constructor: VerbnetCorpusReader(root, fileids, wrap_etree)
An NLTK interface to the VerbNet verb lexicon.
From the VerbNet site: "VerbNet (VN) (Kipper-Schuler 2006) is the largest on-line verb lexicon currently available for English. It is a hierarchical domain-independent, broad-coverage verb lexicon with mappings to other lexical resources such as WordNet (Miller, 1990; Fellbaum, 1998), XTAG (XTAG Research Group, 2001), and FrameNet (Baker et al., 1998)."
For details about VerbNet see: https://verbs.colorado.edu/~mpalmer/projects/verbnet.html
Method | __init__ |
No summary |
Method | classids |
Return a list of the VerbNet class identifiers. If a file identifier is specified, then return only the VerbNet class identifiers for classes (and subclasses) defined by that file. If a lemma is specified, then return only VerbNet class identifiers for classes that contain that lemma as a member... |
Method | fileids |
Return a list of fileids that make up this corpus. If vnclass_ids is specified, then return the fileids that make up the specified VerbNet class(es). |
Method | frames |
Given a VerbNet class, this method returns VerbNet frames |
Method | lemmas |
Return a list of all verb lemmas that appear in any class, or in the classid if specified. |
Method | longid |
Returns longid of a VerbNet class |
Method | pprint |
Returns pretty printed version of a VerbNet class |
Method | pprint_frames |
Returns pretty printed version of all frames in a VerbNet class |
Method | pprint_members |
Returns pretty printed version of members in a VerbNet class |
Method | pprint_subclasses |
Returns pretty printed version of subclasses of a VerbNet class |
Method | pprint_themroles |
Returns pretty printed version of thematic roles in a VerbNet class |
Method | shortid |
Returns shortid of a VerbNet class |
Method | subclasses |
Returns subclass ids, if any exist |
Method | themroles |
Returns thematic roles participating in a VerbNet class |
Method | vnclass |
Returns VerbNet class ElementTree |
Method | wordnetids |
Return a list of all wordnet identifiers that appear in any class, or in classid if specified. |
Method | _get_description_within_frame |
Returns member description within frame |
Method | _get_example_within_frame |
Returns example within a frame |
Method | _get_semantics_within_frame |
Returns semantics within a single frame |
Method | _get_syntactic_list_within_frame |
Returns syntax within a frame |
Method | _index |
Initialize the indexes _lemma_to_class, _wordnet_to_class, and _class_to_fileid by scanning through the corpus fileids. This is fast if ElementTree uses the C implementation (<0.1 secs), but quite slow (>10 secs) if only the Python implementation is available. |
Method | _index_helper |
Helper for _index() |
Method | _pprint_description_within_frame |
Returns pretty printed version of a VerbNet frame description |
Method | _pprint_example_within_frame |
Returns pretty printed version of example within frame in a VerbNet class |
Method | _pprint_semantics_within_frame |
Returns a pretty printed version of semantics within frame in a VerbNet class |
Method | _pprint_single_frame |
Returns pretty printed version of a single frame in a VerbNet class |
Method | _pprint_syntax_within_frame |
Returns pretty printed version of syntax within a frame in a VerbNet class |
Method | _quick_index |
Initialize the indexes _lemma_to_class, _wordnet_to_class, and _class_to_fileid by scanning through the corpus fileids. This doesn't do proper XML parsing, but is good enough to find everything in the standard VerbNet corpus, and it runs about 30 times faster than XML parsing with the Python ElementTree (only 2-3 times faster if ElementTree uses the C implementation). |
Constant | _INDEX_RE |
Regular expression used by _index() to quickly scan the corpus for basic information. |
Constant | _LONGID_RE |
Regular expression that matches (and decomposes) longids |
Constant | _SHORTID_RE |
Regular expression that matches shortids |
Instance Variable | _class_to_fileid |
A dictionary mapping from class identifiers to corresponding file identifiers. The keys of this dictionary provide a complete list of all classes and subclasses. |
Instance Variable | _lemma_to_class |
A dictionary mapping from verb lemma strings to lists of VerbNet class identifiers. |
Instance Variable | _shortid_to_longid |
Undocumented |
Instance Variable | _wordnet_to_class |
A dictionary mapping from wordnet identifier strings to lists of VerbNet class identifiers. |
Inherited from XMLCorpusReader:
Method | raw |
Undocumented |
Method | words |
Returns all of the words and punctuation symbols in the specified file that were in text nodes, i.e., tags are ignored. Like the xml() method, fileid can only specify one file. |
Method | xml |
Undocumented |
Instance Variable | _wrap_etree |
Undocumented |
Inherited from CorpusReader (via XMLCorpusReader):
Method | __repr__ |
Undocumented |
Method | abspath |
Return the absolute path for the given file. |
Method | abspaths |
Return a list of the absolute paths for all fileids in this corpus; or for the given list of fileids, if specified. |
Method | citation |
Return the contents of the corpus citation.bib file, if it exists. |
Method | encoding |
Return the unicode encoding for the given corpus file, if known. If the encoding is unknown, or if the given file should be processed using byte strings (str), then return None. |
Method | ensure_loaded |
Load this corpus (if it has not already been loaded). This is used by LazyCorpusLoader as a simple method that can be used to make sure a corpus is loaded -- e.g., in case a user wants to do help(some_corpus). |
Method | license |
Return the contents of the corpus LICENSE file, if it exists. |
Method | open |
Return an open stream that can be used to read the given file. If the file's encoding is not None, then the stream will automatically decode the file's contents into unicode. |
Method | readme |
Return the contents of the corpus README file, if it exists. |
Class Variable | root |
Undocumented |
Method | _get_root |
Undocumented |
Instance Variable | _encoding |
The default unicode encoding for the fileids that make up this corpus. If encoding is None, then the file contents are processed using byte strings. |
Instance Variable | _fileids |
A list of the relative paths for the fileids that make up this corpus. |
Instance Variable | _root |
The root directory for this corpus. |
Instance Variable | _tagset |
Undocumented |
Parameters | |
root:PathPointer or str | A path pointer identifying the root directory for this corpus. If a string is specified, then it will be converted to a PathPointer automatically. |
fileids | A list of the files that make up this corpus. This list can either be specified explicitly, as a list of strings; or implicitly, as a regular expression over file paths. The absolute path for each file will be constructed by joining the reader's root to each file name. |
wrap_etree | Undocumented |
encoding | The default unicode encoding for the files that make up the corpus. The value of encoding can be any of the following: (1) a string: encoding is the encoding name for all files; (2) a dictionary: encoding[file_id] is the encoding name for the file whose identifier is file_id. If file_id is not in encoding, then the file contents will be processed using non-unicode byte strings. |
tagset | The name of the tagset used by this corpus, to be used for normalizing or converting the POS tags returned by the tagged_...() methods. |
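The two ways of giving `fileids` (explicit list or regular expression) and the two documented forms of `encoding` (string or dictionary) can be modelled roughly as below. This is only a sketch: `select_fileids` and `resolve_encoding` are hypothetical helpers, not NLTK functions, and the sketch assumes the regular expression must match the whole relative path.

```python
import re

def select_fileids(available, fileids):
    """Return the files named by `fileids` (explicit list or regex string)."""
    if isinstance(fileids, str):
        # Implicit form: a regular expression over relative file paths.
        # Assumption: the pattern has to match the entire path.
        pattern = re.compile(fileids)
        return [name for name in available if pattern.fullmatch(name)]
    # Explicit form: a list of file names.
    return [name for name in fileids if name in available]

def resolve_encoding(encoding, file_id):
    """Return the encoding name for `file_id`, or None for byte strings."""
    if isinstance(encoding, str):
        # One encoding name shared by every file.
        return encoding
    if isinstance(encoding, dict):
        # Per-file encoding; a missing key means "process as bytes".
        return encoding.get(file_id)
    return None

available = ["put-9.1.xml", "confess-37.10.xml", "README"]
xml_files = select_fileids(available, r".*\.xml")
```

With these inputs `xml_files` contains the two `.xml` names and leaves out `README`.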
Return a list of the VerbNet class identifiers. If a file identifier is specified, then return only the VerbNet class identifiers for classes (and subclasses) defined by that file. If a lemma is specified, then return only VerbNet class identifiers for classes that contain that lemma as a member. If a wordnetid is specified, then return only identifiers for classes that contain that wordnetid as a member. If a classid is specified, then return only identifiers for subclasses of the specified VerbNet class. If nothing is specified, return all classids within VerbNet.
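The filtering rules for class identifiers can be pictured as lookups in a few dictionaries. The sketch below is not the reader's implementation: the index contents are invented for illustration, `subclass_ids` is a hypothetical stand-in for inspecting the class XML, and unknown keys return an empty list here, which may differ from the real behaviour.

```python
# Hypothetical index contents; a real reader would build these from the corpus.
class_to_fileid = {
    "put-9.1": "put-9.1.xml",
    "put-9.1-1": "put-9.1.xml",
    "confess-37.10": "confess-37.10.xml",
}
lemma_to_class = {"put": ["put-9.1"], "confess": ["confess-37.10"]}
wordnet_to_class = {"put%2:35:00": ["put-9.1"]}
subclass_ids = {"put-9.1": ["put-9.1-1"]}  # stands in for a lookup in the class XML

def classids(lemma=None, wordnetid=None, fileid=None, classid=None):
    """Return class identifiers, narrowed by at most one filter."""
    if fileid is not None:
        # Classes and subclasses defined by the given file.
        return [c for c, f in class_to_fileid.items() if f == fileid]
    if lemma is not None:
        return list(lemma_to_class.get(lemma, []))
    if wordnetid is not None:
        return list(wordnet_to_class.get(wordnetid, []))
    if classid is not None:
        # Only the subclasses of the given class.
        return list(subclass_ids.get(classid, []))
    # No filter: every known class and subclass.
    return sorted(class_to_fileid)
```

Calling `classids()` with no arguments lists every key of the class index, while `classids(lemma="put")` narrows the result to the classes that list "put" as a member.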
Overrides nltk.corpus.reader.CorpusReader.fileids
Return a list of fileids that make up this corpus. If vnclass_ids is specified, then return the fileids that make up the specified VerbNet class(es).
Given a VerbNet class, this method returns VerbNet frames
The members returned are: 1) Example 2) Description 3) Syntax 4) Semantics
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
Returns | |
frames - a list of frame dictionaries |
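To make the shape of a frame dictionary concrete, the sketch below pulls the four members (example, description, syntax, semantics) out of a small hand-written XML fragment using the standard library. The fragment is a simplified guess at the VerbNet file layout, and the function and dictionary key names are assumptions made for illustration, not the reader's own output format.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; real VerbNet files carry more attributes.
SAMPLE = """
<VNCLASS ID="put-9.1">
  <FRAMES>
    <FRAME>
      <DESCRIPTION primary="NP V NP PP.destination" secondary="NP-PP"/>
      <EXAMPLES><EXAMPLE>I put the book on the table.</EXAMPLE></EXAMPLES>
      <SYNTAX>
        <NP value="Agent"><SYNRESTRS/></NP>
        <VERB/>
        <NP value="Theme"><SYNRESTRS/></NP>
      </SYNTAX>
      <SEMANTICS>
        <PRED value="motion">
          <ARGS>
            <ARG type="Event" value="during(E)"/>
            <ARG type="ThemRole" value="Theme"/>
          </ARGS>
        </PRED>
      </SEMANTICS>
    </FRAME>
  </FRAMES>
</VNCLASS>
"""

def extract_frames(vnclass):
    """Return one dictionary per FRAME element of a class tree."""
    result = []
    for frame in vnclass.findall("FRAMES/FRAME"):
        description = frame.find("DESCRIPTION")
        result.append({
            "example": frame.find("EXAMPLES/EXAMPLE").text,
            "description": {
                "primary": description.get("primary"),
                "secondary": description.get("secondary", ""),
            },
            # One entry per child of SYNTAX, in document order.
            "syntax": [
                {"pos_tag": el.tag, "modifiers": {"value": el.get("value", "")}}
                for el in frame.find("SYNTAX")
            ],
            # One entry per predicate, each with its argument list.
            "semantics": [
                {
                    "predicate_value": pred.get("value"),
                    "arguments": [
                        {"type": arg.get("type"), "value": arg.get("value")}
                        for arg in pred.findall("ARGS/ARG")
                    ],
                }
                for pred in frame.findall("SEMANTICS/PRED")
            ],
        })
    return result

put_frames = extract_frames(ET.fromstring(SAMPLE))
```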
Returns longid of a VerbNet class
Given a short VerbNet class identifier (e.g. '37.10'), map it to a long id (e.g. 'confess-37.10'). If shortid is already a long id, then return it as-is.
Returns pretty printed version of a VerbNet class
Return a string containing a pretty-printed representation of the given VerbNet class.
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
Returns pretty printed version of all frames in a VerbNet class
Return a string containing a pretty-printed representation of the list of frames within the VerbNet class.
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
indent | Undocumented |
Returns pretty printed version of members in a VerbNet class
Return a string containing a pretty-printed representation of the given VerbNet class's member verbs.
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
indent | Undocumented |
Returns pretty printed version of subclasses of VerbNet class
Return a string containing a pretty-printed representation of the given VerbNet class's subclasses.
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
indent | Undocumented |
Returns pretty printed version of thematic roles in a VerbNet class
Return a string containing a pretty-printed representation of the given VerbNet class's thematic roles.
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
indent | Undocumented |
Returns shortid of a VerbNet class
Given a long VerbNet class identifier (e.g. 'confess-37.10'), map it to a short id (e.g. '37.10'). If longid is already a short id, then return it as-is.
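The relationship between short and long identifiers can be sketched with two regular expressions and a lookup table. The patterns and the `short_to_long` table below are illustrative assumptions and not necessarily the values the reader uses; the list of known long ids is invented.

```python
import re

# Illustrative patterns (assumptions, not the reader's own constants).
LONGID = re.compile(r"([^\-.]*)-([\d.\-]+)$")   # e.g. 'confess-37.10'
SHORTID = re.compile(r"[\d.\-]+$")              # e.g. '37.10'

# Hypothetical set of known classes; short ids are derived from the long ones.
known_longids = ["confess-37.10", "put-9.1", "put-9.1-1"]
short_to_long = {LONGID.match(lid).group(2): lid for lid in known_longids}

def shortid(identifier):
    """Map a long id to its short id; short ids pass through unchanged."""
    if SHORTID.match(identifier):
        return identifier
    match = LONGID.match(identifier)
    if match is None:
        raise ValueError(f"not a VerbNet class identifier: {identifier!r}")
    return match.group(2)

def longid(identifier):
    """Map a short id to its long id; long ids pass through unchanged."""
    if LONGID.match(identifier):
        return identifier
    if not SHORTID.match(identifier):
        raise ValueError(f"not a VerbNet class identifier: {identifier!r}")
    # The verb prefix cannot be computed from the number, so it is looked up.
    return short_to_long[identifier]
```

Going from long to short only needs the pattern, whereas going from short to long needs an index of known classes, which is presumably why the reader keeps a short-to-long mapping.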
Returns subclass ids, if any exist
Given a VerbNet class, this method returns subclass ids (if they exist) in a list of strings.
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
Returns | |
list of subclasses |
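A minimal sketch of collecting subclass ids from a class tree follows. It assumes that subclasses appear as `VNSUBCLASS` elements with an `ID` attribute under a `SUBCLASSES` element; the sample XML is hand-written and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample with two direct subclasses, one of them nested deeper.
SAMPLE = """
<VNCLASS ID="put-9.1">
  <SUBCLASSES>
    <VNSUBCLASS ID="put-9.1-1">
      <SUBCLASSES>
        <VNSUBCLASS ID="put-9.1-1-1"><SUBCLASSES/></VNSUBCLASS>
      </SUBCLASSES>
    </VNSUBCLASS>
    <VNSUBCLASS ID="put-9.1-2"><SUBCLASSES/></VNSUBCLASS>
  </SUBCLASSES>
</VNCLASS>
"""

def direct_subclass_ids(vnclass):
    """Return the ids of the immediate subclasses as a list of strings."""
    # The path is relative to the given element, so deeper levels are skipped.
    return [sub.get("ID") for sub in vnclass.findall("SUBCLASSES/VNSUBCLASS")]

root = ET.fromstring(SAMPLE)
top_level = direct_subclass_ids(root)
nested = direct_subclass_ids(root.find("SUBCLASSES/VNSUBCLASS"))
```

In this sketch a class without subclasses yields an empty list rather than an error.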
Returns thematic roles participating in a VerbNet class
The members returned as part of each role are: 1) Type 2) Modifiers
Parameters | |
vnclass | A VerbNet class identifier; or an ElementTree containing the xml contents of a VerbNet class. |
Returns | |
themroles: A list of thematic roles in the VerbNet class |
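The "Type" and "Modifiers" members of a role might look like the dictionaries built below. This is a sketch under assumptions: the XML fragment is hand-written, the element and attribute names (`THEMROLE`, `SELRESTR`, `Value`, `type`) are taken to match the VerbNet files, and the output keys are chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample: one role with a selectional restriction, one without.
SAMPLE = """
<VNCLASS ID="put-9.1">
  <THEMROLES>
    <THEMROLE type="Agent">
      <SELRESTRS><SELRESTR Value="+" type="animate"/></SELRESTRS>
    </THEMROLE>
    <THEMROLE type="Theme"><SELRESTRS/></THEMROLE>
  </THEMROLES>
</VNCLASS>
"""

def extract_themroles(vnclass):
    """Return one dictionary per thematic role of a class tree."""
    roles = []
    for role in vnclass.findall("THEMROLES/THEMROLE"):
        # Each selectional restriction becomes one modifier entry.
        modifiers = [
            {"value": restr.get("Value"), "type": restr.get("type")}
            for restr in role.findall("SELRESTRS/SELRESTR")
        ]
        roles.append({"type": role.get("type"), "modifiers": modifiers})
    return roles

roles = extract_themroles(ET.fromstring(SAMPLE))
```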
Returns VerbNet class ElementTree
Return an ElementTree containing the xml for the specified VerbNet class.
Parameters | |
fileid | An identifier specifying which class should be returned. Can be a file identifier (such as 'put-9.1.xml'), or a VerbNet class identifier (such as 'put-9.1') or a short VerbNet class identifier (such as '9.1'). |
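The three accepted identifier forms (file identifier, long class id, short class id) can be normalised to a file identifier along these lines. The helper `resolve_fileid` and the index contents are hypothetical; the sketch only models the lookup order and does not load or return any XML.

```python
# Hypothetical index contents for illustration.
class_to_fileid = {
    "put-9.1": "put-9.1.xml",
    "put-9.1-1": "put-9.1.xml",
    "confess-37.10": "confess-37.10.xml",
}
shortid_to_longid = {"9.1": "put-9.1", "9.1-1": "put-9.1-1", "37.10": "confess-37.10"}

def resolve_fileid(identifier):
    """Return (fileid, classid) for any of the three identifier forms."""
    if identifier in class_to_fileid.values():
        # Already a file identifier; no particular class is selected.
        return identifier, None
    # Short ids are first expanded to long ids.
    classid = shortid_to_longid.get(identifier, identifier)
    if classid in class_to_fileid:
        return class_to_fileid[classid], classid
    raise ValueError(f"Unknown identifier {identifier!r}")
```

A subclass id resolves to the file of its parent class, so a caller would still need to search that file's tree for the element whose id matches.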
Returns member description within frame
A utility function to retrieve a description of participating members within a frame in VerbNet.
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
Returns | |
description: a description dictionary with members - primary and secondary |
Returns example within a frame
A utility function to retrieve an example within a frame in VerbNet.
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
Returns | |
example_text: The example sentence for this particular frame |
Returns semantics within a single frame
A utility function to retrieve semantics within a frame in VerbNet. Members of the semantics dictionary: 1) Predicate value 2) Arguments
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
Returns | |
semantics: semantics dictionary |
Returns syntax within a frame
A utility function to retrieve the syntax within a frame in VerbNet. Members of the syntactic dictionary: 1) POS Tag 2) Modifiers
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
Returns | |
syntax_within_single_frame |
Initialize the indexes _lemma_to_class, _wordnet_to_class, and _class_to_fileid by scanning through the corpus fileids. This is fast if ElementTree uses the C implementation (<0.1 secs), but quite slow (>10 secs) if only the Python implementation is available.
Returns pretty printed version of a VerbNet frame description
Return a string containing a pretty-printed representation of the given VerbNet frame description.
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
indent | Undocumented |
Returns pretty printed version of example within frame in a VerbNet class
Return a string containing a pretty-printed representation of the given VerbNet frame example.
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
indent | Undocumented |
Returns a pretty printed version of semantics within frame in a VerbNet class
Return a string containing a pretty-printed representation of the given VerbNet frame semantics.
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
indent | Undocumented |
Returns pretty printed version of a single frame in a VerbNet class
Return a string containing a pretty-printed representation of the given frame.
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
indent | Undocumented |
Returns pretty printed version of syntax within a frame in a VerbNet class
Return a string containing a pretty-printed representation of the given VerbNet frame syntax.
Parameters | |
vnframe | An ElementTree containing the xml contents of a VerbNet frame. |
indent | Undocumented |
Initialize the indexes _lemma_to_class, _wordnet_to_class, and _class_to_fileid by scanning through the corpus fileids. This doesn't do proper XML parsing, but is good enough to find everything in the standard VerbNet corpus, and it runs about 30 times faster than XML parsing with the Python ElementTree (only 2-3 times faster if ElementTree uses the C implementation).
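The idea of indexing by regular-expression scan instead of XML parsing can be sketched as follows. The pattern `SCAN`, the function `scan_index`, and the sample file are all hypothetical and simplified. In particular, the sketch attributes each member to the most recently opened class, which is only correct if a class lists its members before its nested subclasses.

```python
import re
from collections import defaultdict

# Simplified, hypothetical pattern: either a MEMBER element (name, wn)
# or the opening tag of a class or subclass (ID).
SCAN = re.compile(
    r'<MEMBER\s+name="([^"]+)"\s+wn="([^"]*)"[^>]*>'
    r'|<VN(?:SUB)?CLASS\s+ID="([^"]+)"'
)

# Hand-written sample standing in for the text of one corpus file.
files = {
    "put-9.1.xml": """
<VNCLASS ID="put-9.1">
  <MEMBERS>
    <MEMBER name="put" wn="put%2:35:00"/>
    <MEMBER name="place" wn="place%2:35:00 place%2:35:01"/>
  </MEMBERS>
  <SUBCLASSES>
    <VNSUBCLASS ID="put-9.1-1">
      <MEMBERS><MEMBER name="stash" wn=""/></MEMBERS>
    </VNSUBCLASS>
  </SUBCLASSES>
</VNCLASS>
"""
}

def scan_index(files):
    """Build the three indexes from raw file text without parsing XML."""
    lemma_to_class = defaultdict(list)
    wordnet_to_class = defaultdict(list)
    class_to_fileid = {}
    for fileid, text in files.items():
        current = None  # id of the most recently opened class
        for match in SCAN.finditer(text):
            name, wn, classid = match.groups()
            if classid is not None:
                current = classid
                class_to_fileid[classid] = fileid
            else:
                lemma_to_class[name].append(current)
                # The wn attribute may hold several space-separated ids.
                for wordnet_id in wn.split():
                    wordnet_to_class[wordnet_id].append(current)
    return dict(lemma_to_class), dict(wordnet_to_class), class_to_fileid

lemma_to_class, wordnet_to_class, class_to_fileid = scan_index(files)
```

The speed advantage described above would come from skipping tree construction entirely; the cost is that anything outside the pattern's narrow expectations is silently missed.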
Regular expression used by _index() to quickly scan the corpus for basic information.
Value |
|
Regular expression that matches (and decomposes) longids
Value |
|
A dictionary mapping from class identifiers to corresponding file identifiers. The keys of this dictionary provide a complete list of all classes and subclasses.