class SmoothingFunction:
Constructor: SmoothingFunction(epsilon, alpha, k)
This is an implementation of the smoothing techniques for segment-level BLEU scores that were presented in Boxing Chen and Colin Cherry (2014) A Systematic Comparison of Smoothing Techniques for Sentence-Level BLEU. In WMT14. http://acl2014.org/acl2014/W14-33/pdf/W14-3346.pdf
Method | __init__ | This will initialize the parameters required for the various smoothing techniques; the default values are set to the numbers used in the experiments from Chen and Cherry (2014).
Method | method0 | No smoothing.
Method | method1 | Smoothing method 1: Add epsilon counts to precision with 0 counts (a brief sketch of this follows the table).
Method | method2 | Smoothing method 2: Add 1 to both numerator and denominator, from Chin-Yew Lin and Franz Josef Och (2004) ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation. In COLING 2004.
Method | method3 | Smoothing method 3: NIST geometric sequence smoothing. The smoothing is computed by taking 1 / ( 2^k ), instead of 0, for each precision score whose matching n-gram count is null. k is 1 for the first 'n' value for which the n-gram match count is null. For example, if the text contains:...
Method | method4 | Smoothing method 4: Shorter translations may have inflated precision values due to having smaller denominators; therefore, we give them proportionally smaller smoothed counts. Instead of scaling to 1/(2^k), Chen and Cherry suggest dividing by 1/ln(len(T)), where T is the translation.
Method | method5 | Smoothing method 5: The matched counts for similar values of n should be similar. To calculate the n-gram matched count, it averages the n−1, n and n+1 gram matched counts.
Method | method6 | Smoothing method 6: Interpolates the maximum likelihood estimate of the precision p_n with a prior estimate pi_0. The prior is estimated by assuming that the ratio between p_n and p_n−1 will be the same as that between p_n−1 and p_n−2; from Gao and He (2013) Training MRF-Based Phrase Translation Models using Gradient Ascent...
Method | method7 | Smoothing method 7: Interpolates methods 4 and 5.
Instance Variable | alpha | the alpha value used in method 6 (set in __init__)
Instance Variable | epsilon | the epsilon value used in method 1 (set in __init__)
Instance Variable | k | the k value used in method 4 (set in __init__)
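Since method1 has no detail section below, here is a minimal doctest-style sketch of the idea; it works on plain (matches, total) pairs rather than the library's internal representation, and the helper name, counts and epsilon value are purely illustrative.

>>> def method1_sketch(precisions, epsilon=0.1):
...     # add epsilon to the numerator of every precision whose match count is 0
...     return [(m + epsilon) / t if m == 0 else m / t for m, t in precisions]
>>> method1_sketch([(5, 8), (2, 7), (0, 6), (0, 5)])  # doctest: +ELLIPSIS
[0.625, 0.285..., 0.016..., 0.02]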
This will initialize the parameters required for the various smoothing techniques; the default values are set to the numbers used in the experiments from Chen and Cherry (2014).
>>> hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures',
...                'that', 'the', 'military', 'always', 'obeys', 'the',
...                'commands', 'of', 'the', 'party']
>>> reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures',
...               'that', 'the', 'military', 'will', 'forever', 'heed',
...               'Party', 'commands']

>>> chencherry = SmoothingFunction()
>>> print(sentence_bleu([reference1], hypothesis1)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method0)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method1)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method2)) # doctest: +ELLIPSIS
0.4489...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method3)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method4)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method5)) # doctest: +ELLIPSIS
0.4905...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method6)) # doctest: +ELLIPSIS
0.4135...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method7)) # doctest: +ELLIPSIS
0.4905...
Parameters
epsilon:float | the epsilon value used in method 1
alpha:int | the alpha value used in method 6
k:int | the k value used in method 4
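As a small usage sketch of these three parameters (the sentences and the values passed to the constructor below are arbitrary examples, not defaults taken from the library):

>>> from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
>>> reference = ['the', 'cat', 'sat', 'on', 'the', 'mat']
>>> hypothesis = ['the', 'cat', 'is', 'on', 'the', 'mat']
>>> custom = SmoothingFunction(epsilon=0.2, alpha=5, k=3)  # epsilon -> method1, alpha -> method6, k -> method4
>>> score1 = sentence_bleu([reference], hypothesis, smoothing_function=custom.method1)
>>> score4 = sentence_bleu([reference], hypothesis, smoothing_function=custom.method4)
>>> score6 = sentence_bleu([reference], hypothesis, smoothing_function=custom.method6)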
Smoothing method 2: Add 1 to both numerator and denominator, from Chin-Yew Lin and Franz Josef Och (2004) ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation. In COLING 2004.
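A one-line sketch of the add-1 idea, following the description above and using illustrative (matches, total) pairs rather than the library's internals:

>>> def method2_sketch(precisions):
...     # add 1 to both the matched count and the total count of every precision
...     return [(m + 1) / (t + 1) for m, t in precisions]
>>> method2_sketch([(5, 8), (2, 7), (0, 6), (0, 5)])  # doctest: +ELLIPSIS
[0.666..., 0.375, 0.142..., 0.166...]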
Smoothing method 3: NIST geometric sequence smoothing. The smoothing is computed by taking 1 / ( 2^k ), instead of 0, for each precision score whose matching n-gram count is null. k is 1 for the first 'n' value for which the n-gram match count is null (a short sketch reproducing the example follows the list). For example, if the text contains:
- one 2-gram match
- and (consequently) two 1-gram matches
the n-gram count for each individual precision score would be:
- n=1 => prec_count = 2 (two unigrams)
- n=2 => prec_count = 1 (one bigram)
- n=3 => prec_count = 1/2 (no trigram, taking 'smoothed' value of 1 / ( 2^k ), with k=1)
- n=4 => prec_count = 1/4 (no fourgram, taking 'smoothed' value of 1 / ( 2^k ), with k=2)
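The counts in the example above can be reproduced with a short sketch that operates directly on raw match counts; the actual implementation works on the precision fractions, so treat this purely as an illustration of the geometric backoff:

>>> def method3_counts_sketch(match_counts):
...     # each zero count becomes 1 / 2**k; k advances only when a zero count is seen
...     k, out = 0, []
...     for c in match_counts:
...         if c == 0:
...             k += 1
...             out.append(1 / 2 ** k)
...         else:
...             out.append(c)
...     return out
>>> method3_counts_sketch([2, 1, 0, 0])  # the counts from the example above
[2, 1, 0.5, 0.25]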
Smoothing method 4: Shorter translations may have inflated precision values due to having smaller denominators; therefore, we give them proportionally smaller smoothed counts. Instead of scaling to 1/(2^k), Chen and Cherry suggest dividing by 1/ln(len(T)), where T is the translation.
Smoothing method 5: The matched counts for similar values of n should be similar. To calculate the n-gram matched count, it averages the n−1, n and n+1 gram matched counts.
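A rough sketch of this averaging follows; the stand-in values below order 1 (taken here as p_1 + 1) and above the top order (the last value repeated), and the carrying forward of each smoothed value into the next average, are assumptions made only to keep the sketch self-contained, not a statement of the library's exact code:

>>> def method5_sketch(p):
...     prev = p[0] + 1                  # assumed stand-in for "order 0"
...     padded = list(p) + [p[-1]]       # assumed stand-in above the top order
...     out = []
...     for i in range(len(p)):
...         avg = (prev + padded[i] + padded[i + 1]) / 3
...         out.append(avg)
...         prev = avg                   # carry the smoothed value into the next average
...     return out
>>> method5_sketch([0.625, 0.25, 0.0, 0.0])  # doctest: +ELLIPSIS
[0.833..., 0.361..., 0.120..., 0.040...]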
Smoothing method 6: Interpolates the maximum likelihood estimate of the precision p_n with a prior estimate pi_0. The prior is estimated by assuming that the ratio between p_n and p_n−1 will be the same as that between p_n−1 and p_n−2; from Gao and He (2013) Training MRF-Based Phrase Translation Models using Gradient Ascent. In NAACL.
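A sketch of the interpolation for orders n >= 3, on illustrative (matches, total) pairs: the prior pi_0 = p_(n-1)^2 / p_(n-2) follows directly from the equal-ratio assumption above, but the interpolation form (matches + alpha * pi_0) / (total + alpha) and the use of the already-smoothed lower orders are this sketch's reading of the cited paper, not the library's verified code:

>>> def method6_sketch(precisions, alpha=5):
...     p = [m / t for m, t in precisions]
...     out = list(p[:2])                       # p_1 and p_2 are assumed non-zero and left untouched
...     for n in range(2, len(precisions)):
...         m, t = precisions[n]
...         pi0 = out[n - 1] ** 2 / out[n - 2]  # prior from the two previous (smoothed) orders
...         out.append((m + alpha * pi0) / (t + alpha))
...     return out
>>> method6_sketch([(5, 8), (2, 7), (0, 6), (0, 5)])  # doctest: +ELLIPSIS
[0.625, 0.285..., 0.059..., 0.006...]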