Undocumented
Function | allign |
Aligns words in the hypothesis to the reference by sequentially applying exact match, stemmed match and WordNet-based synonym match. If there are multiple matches, the one with the fewest crossings is chosen. |
Function | exact |
Matches exact words in the hypothesis and reference and returns a word mapping between them based on the enumerated word ids. |
Function | meteor |
Calculates the METEOR score for a hypothesis with multiple references, as described in "Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments" by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL... |
Function | single |
Calculates the METEOR score for a single hypothesis and reference, as per "Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments" by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL... |
Function | stem |
Stems each word in the hypothesis and reference, matches the stems, and returns a word mapping between the hypothesis and reference. |
Function | wordnetsyn |
Matches each word in the reference to a word in the hypothesis if any synonym of the hypothesis word exactly matches the reference word. |
Function | _count |
Counts the fewest possible number of chunks such that the matched unigrams of each chunk are adjacent to each other. This is used to calculate the fragmentation part of the metric. |
Function | _enum |
Aligns words in the hypothesis to the reference by sequentially applying exact match, stemmed match and WordNet-based synonym match. If there are multiple matches, the one with the fewest crossings is chosen... |
Function | _enum |
Stems each word in the hypothesis and reference, matches the stems, and returns a word mapping between enum_hypothesis_list and enum_reference_list based on the enumerated word id. The function also returns an enumerated list of unmatched words for the hypothesis and reference. |
Function | _enum |
Matches each word in the reference to a word in the hypothesis if any synonym of the hypothesis word exactly matches the reference word. |
Function | _generate |
Takes string inputs for the hypothesis and reference and returns an enumerated word list for each of them. |
Function | _match |
Matches exact words in the hypothesis and reference and returns a word mapping between enum_hypothesis_list and enum_reference_list based on the enumerated word id. |
Aligns words in the hypothesis to the reference by sequentially applying exact match, stemmed match and WordNet-based synonym match. If there are multiple matches, the one with the fewest crossings is chosen.
Parameters | |
hypothesis | hypothesis string |
reference | reference string |
stemmer:nltk.stem.api.StemmerI or any class that implements a stem method | nltk.stem.api.StemmerI object (default PorterStemmer()) |
wordnet:WordNetCorpusReader | a wordnet corpus reader object (default nltk.corpus.wordnet) |
Returns | |
list of tuples, list of tuples, list of tuples | sorted list of matched tuples, unmatched hypothesis list, unmatched reference list |
Matches exact words in the hypothesis and reference and returns a word mapping between them based on the enumerated word ids.
Parameters | |
hypothesis:str | hypothesis string |
reference:str | reference string |
Returns | |
list of 2D tuples, list of 2D tuples, list of 2D tuples | enumerated matched tuples, enumerated unmatched hypothesis tuples, enumerated unmatched reference tuples |
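The return shape described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name `exact_match_sketch`, the whitespace tokenization, the lowercasing and the left-to-right traversal order are all assumptions made for illustration.

```python
def exact_match_sketch(hypothesis, reference):
    """Pair identical words; return (matches, unmatched_hyp, unmatched_ref).

    Hypothetical helper: tokenization and traversal order are assumptions.
    """
    # Enumerate the lowercased, whitespace-split words of each sentence.
    hyp = list(enumerate(hypothesis.lower().split()))
    ref = list(enumerate(reference.lower().split()))

    matches = []
    for h_idx, h_word in hyp:
        for r_idx, r_word in ref:
            # Each reference word may be used by at most one match.
            if h_word == r_word and all(r_idx != m[1] for m in matches):
                matches.append((h_idx, r_idx))
                break

    matched_h = {h for h, _ in matches}
    matched_r = {r for _, r in matches}
    unmatched_hyp = [pair for pair in hyp if pair[0] not in matched_h]
    unmatched_ref = [pair for pair in ref if pair[0] not in matched_r]
    return matches, unmatched_hyp, unmatched_ref


matches, left_hyp, left_ref = exact_match_sketch("the cat sat", "the dog sat")
print(matches)   # [(0, 0), (2, 2)]
print(left_hyp)  # [(1, 'cat')]
print(left_ref)  # [(1, 'dog')]
```

The unmatched lists keep their original word positions, which is what allows a later matching stage (stems, synonyms) to continue from where this one stopped.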
Calculates the METEOR score for a hypothesis with multiple references, as described in "Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments" by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL. http://www.cs.cmu.edu/~alavie/METEOR/pdf/Lavie-Agarwal-2007-METEOR.pdf
When there are multiple references, the best score is chosen. This method calls single_meteor_score for each reference and picks the best-scoring pair among all the references for the given hypothesis.
>>> hypothesis1 = 'It is a guide to action which ensures that the military always obeys the commands of the party'
>>> hypothesis2 = 'It is to insure the troops forever hearing the activity guidebook that party direct'

>>> reference1 = 'It is a guide to action that ensures that the military will forever heed Party commands'
>>> reference2 = 'It is the guiding principle which guarantees the military forces always being under the command of the Party'
>>> reference3 = 'It is the practical guide for the army always to heed the directions of the party'

>>> round(meteor_score([reference1, reference2, reference3], hypothesis1),4)
0.7398
If no words match during the alignment, the method returns a score of 0. Returning zero instead of raising a division-by-zero error is safe because no match usually implies a bad translation.
>>> round(meteor_score(['this is a cat'], 'non matching hypothesis'),4)
0.0
Parameters | |
references:list(str) | reference sentences |
hypothesis:str | a hypothesis sentence |
preprocess:method | preprocessing function (default str.lower) |
stemmer:nltk.stem.api.StemmerI or any class that implements a stem method | nltk.stem.api.StemmerI object (default PorterStemmer()) |
wordnet:WordNetCorpusReader | a wordnet corpus reader object (default nltk.corpus.wordnet) |
alpha:float | parameter for controlling relative weights of precision and recall. |
beta:float | parameter for controlling the shape of the penalty as a function of fragmentation. |
gamma:float | relative weight assigned to the fragmentation penalty. |
Returns | |
float | The sentence-level METEOR score. |
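The "best score among all references" rule can be sketched as a maximum over per-reference scores. The names `best_reference_score` and `overlap` below are hypothetical, and `overlap` is only a toy stand-in for the real per-pair scorer, not METEOR itself.

```python
def best_reference_score(references, hypothesis, score_fn):
    """Score the hypothesis against every reference and keep the best."""
    return max(score_fn(reference, hypothesis) for reference in references)


def overlap(reference, hypothesis):
    """Toy scorer (an assumption): fraction of hypothesis words in the reference."""
    ref_words = set(reference.lower().split())
    hyp_words = hypothesis.lower().split()
    return sum(word in ref_words for word in hyp_words) / len(hyp_words)


refs = ["a cat sat", "the dog ran"]
best = best_reference_score(refs, "the cat sat", overlap)
print(best)  # the first reference wins with 2 of 3 hypothesis words matched
```

In the documented API the role of `score_fn` is played by single_meteor_score, called with the same preprocessing, stemmer, wordnet and weighting arguments for every reference.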
Calculates the METEOR score for a single hypothesis and reference, as per "Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments" by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL. http://www.cs.cmu.edu/~alavie/METEOR/pdf/Lavie-Agarwal-2007-METEOR.pdf
>>> hypothesis1 = 'It is a guide to action which ensures that the military always obeys the commands of the party'
>>> reference1 = 'It is a guide to action that ensures that the military will forever heed Party commands'
>>> round(single_meteor_score(reference1, hypothesis1),4)
0.7398
If no words match during the alignment, the method returns a score of 0. Returning zero instead of raising a division-by-zero error is safe because no match usually implies a bad translation.
>>> round(single_meteor_score('this is a cat', 'non matching hypothesis'),4)
0.0
Parameters | |
reference:str | a reference sentence |
hypothesis:str | a hypothesis sentence |
preprocess:method | preprocessing function (default str.lower) |
stemmer:nltk.stem.api.StemmerI or any class that implements a stem method | nltk.stem.api.StemmerI object (default PorterStemmer()) |
wordnet:WordNetCorpusReader | a wordnet corpus reader object (default nltk.corpus.wordnet) |
alpha:float | parameter for controlling relative weights of precision and recall. |
beta:float | parameter for controlling the shape of the penalty as a function of fragmentation. |
gamma:float | relative weight assigned to the fragmentation penalty. |
Returns | |
float | The sentence-level METEOR score. |
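How alpha, beta and gamma interact can be sketched from the parameter descriptions above and the parameterized scoring in the cited paper. The function name `meteor_from_counts` is hypothetical, and the default values 0.9, 3.0 and 0.5 are assumptions, since this page does not state the defaults.

```python
def meteor_from_counts(matches, hyp_len, ref_len, chunks,
                       alpha=0.9, beta=3.0, gamma=0.5):
    """Combine match statistics into a sentence-level score (sketch only).

    matches: number of aligned unigrams
    hyp_len, ref_len: number of words in hypothesis and reference
    chunks: number of contiguous chunks in the alignment
    alpha, beta, gamma: default values here are assumptions
    """
    if matches == 0:
        # No aligned words: return 0 rather than dividing by zero.
        return 0.0
    precision = matches / hyp_len
    recall = matches / ref_len
    # alpha sets the relative weight of precision and recall.
    fmean = precision * recall / (alpha * precision + (1 - alpha) * recall)
    # Fragmentation: more chunks per matched word means a less ordered match.
    fragmentation = chunks / matches
    # beta shapes the penalty curve; gamma scales its overall weight.
    penalty = gamma * fragmentation ** beta
    return (1 - penalty) * fmean


score = meteor_from_counts(matches=3, hyp_len=4, ref_len=4, chunks=1)
print(round(score, 4))
```

With a single chunk the penalty is small, so the score stays close to the weighted F-mean; as the matched words scatter into more chunks, the penalty grows toward gamma.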
Stems each word in the hypothesis and reference, matches the stems, and returns a word mapping between the hypothesis and reference.
Parameters | |
hypothesis: | hypothesis string |
reference: | reference string |
stemmer:nltk.stem.api.StemmerI or any class that implements a stem method | nltk.stem.api.StemmerI object (default PorterStemmer()) |
Returns | |
list of 2D tuples, list of 2D tuples, list of 2D tuples | enumerated matched tuples, enumerated unmatched hypothesis tuples, enumerated unmatched reference tuples |
Matches each word in the reference to a word in the hypothesis if any synonym of the hypothesis word exactly matches the reference word.
Parameters | |
hypothesis | hypothesis string |
reference | reference string |
wordnet:WordNetCorpusReader | a wordnet corpus reader object (default nltk.corpus.wordnet) |
Returns | |
list of tuples | list of mapped tuples |
Counts the fewest possible number of chunks such that the matched unigrams of each chunk are adjacent to each other. This is used to calculate the fragmentation part of the metric.
Parameters | |
matches | list containing a mapping of matched words (output of allign_words) |
Returns | |
int | number of chunks the sentence is divided into after alignment |
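Chunk counting can be sketched as follows, assuming `matches` is a list of `(hypothesis_index, reference_index)` pairs. The function name `count_chunks_sketch` is hypothetical, and the library's own implementation may differ in detail.

```python
def count_chunks_sketch(matches):
    """Count runs of matches that are adjacent in both sentences."""
    if not matches:
        return 0
    # Sort by hypothesis position so neighbours can be compared in order.
    ordered = sorted(matches)
    chunks = 1
    for (h_prev, r_prev), (h_cur, r_cur) in zip(ordered, ordered[1:]):
        # A new chunk starts whenever either side skips or reorders a word.
        if h_cur != h_prev + 1 or r_cur != r_prev + 1:
            chunks += 1
    return chunks


print(count_chunks_sketch([(0, 0), (1, 1), (2, 2)]))  # 1
print(count_chunks_sketch([(0, 0), (1, 1), (3, 2)]))  # 2
```

A perfectly ordered, gap-free alignment yields one chunk, while every gap or crossing adds another, which is what drives the fragmentation penalty.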
Aligns words in the hypothesis to the reference by sequentially applying exact match, stemmed match and WordNet-based synonym match. If there are multiple matches, the one with the fewest crossings is chosen. Takes enumerated lists as input instead of strings.
Parameters | |
enum | enumerated hypothesis list |
enum | enumerated reference list |
stemmer:nltk.stem.api.StemmerI or any class that implements a stem method | nltk.stem.api.StemmerI object (default PorterStemmer()) |
wordnet:WordNetCorpusReader | a wordnet corpus reader object (default nltk.corpus.wordnet) |
Returns | |
list of tuples, list of tuples, list of tuples | sorted list of matched tuples, unmatched hypothesis list, unmatched reference list |
Stems each word in the hypothesis and reference, matches the stems, and returns a word mapping between enum_hypothesis_list and enum_reference_list based on the enumerated word id. The function also returns an enumerated list of unmatched words for the hypothesis and reference.
Parameters | |
enum | enumerated hypothesis list |
enum | enumerated reference list |
stemmer:nltk.stem.api.StemmerI or any class that implements a stem method | nltk.stem.api.StemmerI object (default PorterStemmer()) |
Returns | |
list of 2D tuples, list of 2D tuples, list of 2D tuples | enumerated matched tuples, enumerated unmatched hypothesis tuples, enumerated unmatched reference tuples |
Matches each word in the reference to a word in the hypothesis if any synonym of the hypothesis word exactly matches the reference word.
Parameters | |
enum | enumerated hypothesis list |
enum | enumerated reference list |
wordnet:WordNetCorpusReader | a wordnet corpus reader object (default nltk.corpus.wordnet) |
Returns | |
list of tuples, list of tuples, list of tuples | list of matched tuples, unmatched hypothesis list, unmatched reference list |
Takes string inputs for the hypothesis and reference and returns an enumerated word list for each of them.
Parameters | |
hypothesis:str | hypothesis string |
reference:str | reference string |
preprocess:method | preprocessing method (default str.lower) |
Returns | |
list of 2D tuples, list of 2D tuples | enumerated words list |
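The enumeration step can be sketched as below. The function name `generate_enums_sketch` is hypothetical, and whitespace splitting is an assumption about how the strings are tokenized.

```python
def generate_enums_sketch(hypothesis, reference, preprocess=str.lower):
    """Return (index, word) lists for the hypothesis and the reference."""
    # Apply the preprocessing function, split on whitespace, then number words.
    hypothesis_list = list(enumerate(preprocess(hypothesis).split()))
    reference_list = list(enumerate(preprocess(reference).split()))
    return hypothesis_list, reference_list


hyp, ref = generate_enums_sketch("The cat", "A Cat sat")
print(hyp)  # [(0, 'the'), (1, 'cat')]
print(ref)  # [(0, 'a'), (1, 'cat'), (2, 'sat')]
```

Carrying the index alongside each word lets the matching functions report alignments as pairs of positions, even when the same word occurs more than once.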
Matches exact words in the hypothesis and reference and returns a word mapping between enum_hypothesis_list and enum_reference_list based on the enumerated word id.
Parameters | |
enum | enumerated hypothesis list |
enum | enumerated reference list |
Returns | |
list of 2D tuples, list of 2D tuples, list of 2D tuples | enumerated matched tuples, enumerated unmatched hypothesis tuples, enumerated unmatched reference tuples |