class documentation
class RTEFeatureExtractor(object):
Constructor: RTEFeatureExtractor(rtepair, stop, use_lemmatize)
This builds a bag of words for both the text and the hypothesis, discarding stopwords when stop is True, then calculates the overlap and the difference between the two.
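The bag-of-words comparison described above can be sketched with plain Python sets. This is a minimal illustration, not the class's actual code: the `bag_of_words` helper, the `\w+` tokenizer pattern, and the abbreviated `STOPWORDS` set are all assumptions made for the example.

```python
import re

# Assumption: a tiny illustrative stopword list, not the one stored in `stopwords`.
STOPWORDS = {"a", "the", "is", "in", "of", "was", "to"}

def bag_of_words(sentence, stop=True):
    """Hypothetical helper: the set of word types in `sentence`, minus stopwords if `stop`."""
    words = set(re.findall(r"\w+", sentence))
    return words - STOPWORDS if stop else words

text = bag_of_words("John Smith bought the old house in Boston")
hyp = bag_of_words("John Smith owns a house")

overlap = hyp & text     # words shared by hypothesis and text
hyp_only = hyp - text    # extraneous material in the hypothesis
text_only = text - hyp   # material that appears only in the text
```

Here `overlap` holds the words the two sentences share, while `hyp_only` holds hypothesis words with no support in the text, which is the kind of signal an entailment classifier can use.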
| Method | __init__ | No summary |
| Method | hyp_extra | Compute the extraneous material in the hypothesis. |
| Method | overlap | Compute the overlap between text and hypothesis. |
| Instance Variable | hyp_tokens | Undocumented |
| Instance Variable | hyp_words | Undocumented |
| Instance Variable | negwords | Undocumented |
| Instance Variable | stop | Undocumented |
| Instance Variable | stopwords | Undocumented |
| Instance Variable | text_tokens | Undocumented |
| Instance Variable | text_words | Undocumented |
| Static Method | _lemmatize | Use morphy from WordNet to find the base form of verbs. |
| Static Method | _ne | This just assumes that words in all caps or title case are named entities. |
| Instance Variable | _hyp_extra | Undocumented |
| Instance Variable | _overlap | Undocumented |
| Instance Variable | _txt_extra | Undocumented |
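The named-entity heuristic summarized for `_ne` can be approximated with two built-in string tests. The sketch below is an assumption about how such a check might look, and `looks_like_named_entity` is a hypothetical name rather than part of the class.

```python
def looks_like_named_entity(token):
    """Hypothetical helper: True if `token` is title-cased or written in all caps."""
    return token.istitle() or token.isupper()

tokens = ["Boston", "NATO", "house", "The", "iPhone"]
flags = {tok: looks_like_named_entity(tok) for tok in tokens}
```

A check of this kind is deliberately crude: a sentence-initial word such as "The" is flagged as a named entity, and a mixed-case name such as "iPhone" is missed.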
| Parameters | |
| rtepair | an RTEPair from which features should be extracted |
| stop:bool | if True, stopwords are thrown away. |
| use_lemmatize | Undocumented |
Compute the extraneous material in the hypothesis.
| Parameters | |
| toktype:'ne' or 'word' | distinguish Named Entities from ordinary words |
| debug | Undocumented |
Compute the overlap between text and hypothesis.
| Parameters | |
| toktype:'ne' or 'word' | distinguish Named Entities from ordinary words |
| debug | Undocumented |
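Both methods take a `toktype` of 'ne' or 'word' to separate named entities from ordinary words. One plausible way to apply that switch to a set of tokens is sketched below. The `select_by_toktype` helper and its error message are assumptions made for illustration, and the capitalization test stands in for whatever named-entity check the class applies.

```python
def select_by_toktype(tokens, toktype):
    """Hypothetical helper: keep named entities ('ne') or ordinary words ('word')."""
    # Assumption: a token counts as a named entity if it is title-cased or all caps.
    named = {tok for tok in tokens if tok.istitle() or tok.isupper()}
    if toktype == "ne":
        return named
    if toktype == "word":
        return set(tokens) - named
    raise ValueError(f"Type not recognized: {toktype!r}")

shared = {"John", "Smith", "house"}
ne_shared = select_by_toktype(shared, "ne")
word_shared = select_by_toktype(shared, "word")
```

Splitting the result this way lets a classifier weight shared named entities differently from shared ordinary words.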