class documentation

class SmoothingFunction: (source)

Constructor: SmoothingFunction(epsilon, alpha, k)


This is an implementation of the smoothing techniques for segment-level BLEU scores that were presented in Boxing Chen and Colin Cherry (2014) A Systematic Comparison of Smoothing Techniques for Sentence-Level BLEU. In WMT14. http://acl2014.org/acl2014/W14-33/pdf/W14-3346.pdf
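Smoothing matters whenever a hypothesis has no matches at some n-gram order: the geometric mean of the modified precisions then collapses to zero. A minimal illustration with hypothetical toy sentences (the exact unsmoothed value depends on how the implementation handles zero counts; NLTK emits a warning and returns a value near zero):

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [['the', 'cat', 'sat', 'on', 'the', 'mat']]
hypothesis = ['the', 'cat', 'the', 'cat']  # no 3-gram or 4-gram matches

chencherry = SmoothingFunction()
# Unsmoothed: the zero 3-gram/4-gram precisions drive BLEU to (near) zero.
print(sentence_bleu(reference, hypothesis))
# Smoothed with method 1: small but nonzero.
print(sentence_bleu(reference, hypothesis, smoothing_function=chencherry.method1))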

Method __init__ This initializes the parameters required for the various smoothing techniques; the default values are set to the numbers used in the experiments from Chen and Cherry (2014).
Method method0 No smoothing.
Method method1 Smoothing method 1: Add epsilon counts to precision with 0 counts.
Method method2 Smoothing method 2: Add 1 to both numerator and denominator from Chin-Yew Lin and Franz Josef Och (2004) ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation. In COLING 2004.
Method method3 Smoothing method 3: NIST geometric sequence smoothing. The smoothing is computed by taking 1 / 2^k, instead of 0, for each precision score whose matching n-gram count is null. k is 1 for the first 'n' value for which the n-gram match count is null. For example, if the text contains:...
Method method4 Smoothing method 4: Shorter translations may have inflated precision values due to having smaller denominators; therefore, we give them proportionally smaller smoothed counts. Instead of scaling to 1/(2^k), Chen and Cherry (2014) suggest dividing by 1/ln(len(T)), where len(T) is the length of the translation T.
Method method5 Smoothing method 5: The matched counts for similar values of n should be similar. To calculate the n-gram matched count, it averages the (n−1)-, n- and (n+1)-gram matched counts.
Method method6 Smoothing method 6: Interpolates the maximum likelihood estimate of the precision p_n with a prior estimate pi0. The prior is estimated by assuming that the ratio between p_n and p_(n−1) will be the same as that between p_(n−1) and p_(n−2); from Gao and He (2013) Training MRF-Based Phrase Translation Models using Gradient Ascent...
Method method7 Smoothing method 7: Interpolates methods 4 and 5.
Instance Variable alpha The alpha value used in method 6.
Instance Variable epsilon The epsilon value used in method 1.
Instance Variable k The k value used in method 4.
def __init__(self, epsilon=0.1, alpha=5, k=5): (source)

This initializes the parameters required for the various smoothing techniques; the default values are set to the numbers used in the experiments from Chen and Cherry (2014).

>>> hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures',
...                 'that', 'the', 'military', 'always', 'obeys', 'the',
...                 'commands', 'of', 'the', 'party']
>>> reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures',
...               'that', 'the', 'military', 'will', 'forever', 'heed',
...               'Party', 'commands']
>>> chencherry = SmoothingFunction()
>>> print(sentence_bleu([reference1], hypothesis1)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method0)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method1)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method2)) # doctest: +ELLIPSIS
0.4489...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method3)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method4)) # doctest: +ELLIPSIS
0.4118...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method5)) # doctest: +ELLIPSIS
0.4905...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method6)) # doctest: +ELLIPSIS
0.4135...
>>> print(sentence_bleu([reference1], hypothesis1, smoothing_function=chencherry.method7)) # doctest: +ELLIPSIS
0.4905...
Parameters
epsilon : float
    the epsilon value used in method 1
alpha : int
    the alpha value used in method 6
k : int
    the k value used in method 4
def method0(self, p_n, *args, **kwargs): (source)

No smoothing.

def method1(self, p_n, *args, **kwargs): (source)

Smoothing method 1: Add epsilon counts to precision with 0 counts.
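A minimal sketch of the idea, not the library source: each precision whose match count is zero is replaced by epsilon over its denominator, and nonzero precisions pass through unchanged. It assumes p_n is a list of fractions.Fraction precision scores, as in the doctests above:

def method1_sketch(p_n, epsilon=0.1):
    # Zero match count -> epsilon / denominator; otherwise keep the score.
    return [p_i if p_i.numerator != 0 else epsilon / p_i.denominator
            for p_i in p_n]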

def method2(self, p_n, *args, **kwargs): (source)

Smoothing method 2: Add 1 to both numerator and denominator from Chin-Yew Lin and Franz Josef Och (2004) ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation. In COLING 2004.
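A sketch of the add-one scheme under the same assumption (p_n as a list of fractions.Fraction scores); note that some implementations apply it only for n > 1, leaving the unigram precision unsmoothed:

from fractions import Fraction

def method2_sketch(p_n):
    # Add 1 to both the matched count (numerator) and the total count
    # (denominator) of every precision score (Lin & Och, 2004).
    return [Fraction(p_i.numerator + 1, p_i.denominator + 1) for p_i in p_n]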

def method3(self, p_n, *args, **kwargs): (source)

Smoothing method 3: NIST geometric sequence smoothing. The smoothing is computed by taking 1 / 2^k, instead of 0, for each precision score whose matching n-gram count is null. k is 1 for the first 'n' value for which the n-gram match count is null. For example, if the text contains:

  • one 2-gram match
  • and (consequently) two 1-gram matches
the n-gram count for each individual precision score would be:
  • n=1 => prec_count = 2 (two unigrams)
  • n=2 => prec_count = 1 (one bigram)
  • n=3 => prec_count = 1/2 (no trigram match; smoothed value 1 / 2^k with k=1)
  • n=4 => prec_count = 1/4 (no fourgram match; smoothed value 1 / 2^k with k=2)
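A sketch of this counting scheme, again assuming Fraction precision scores; it follows the description above literally (the library version may scale differently):

def method3_sketch(p_n):
    # k counts only the precisions with a zero match count, starting at 1;
    # each such precision becomes 1 / 2^k. Nonzero scores are untouched.
    k = 1
    smoothed = []
    for p_i in p_n:
        if p_i.numerator == 0:
            smoothed.append(1 / 2**k)
            k += 1
        else:
            smoothed.append(p_i)
    return smoothed
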
def method4(self, p_n, references, hypothesis, hyp_len=None, *args, **kwargs): (source)

Smoothing method 4: Shorter translations may have inflated precision values due to having smaller denominators; therefore, we give them proportionally smaller smoothed counts. Instead of scaling to 1/(2^k), Chen and Cherry (2014) suggest dividing by 1/ln(len(T)), where len(T) is the length of the translation T.
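One reading of this description, as a sketch rather than the library source: the 1/2^k back-off of method 3 is scaled by ln(len(T))/k, so that short hypotheses (small ln(len(T))) receive proportionally smaller smoothed counts. Here hyp_len stands for len(T) and k for the constructor's k parameter:

import math

def method4_sketch(p_n, hyp_len, k=5):
    # Zero match count -> a back-off value that grows with hypothesis length.
    j = 1
    smoothed = []
    for p_i in p_n:
        if p_i.numerator == 0 and hyp_len > 1:
            smoothed.append(math.log(hyp_len) / (2**j * k))
            j += 1
        else:
            smoothed.append(p_i)
    return smoothed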

def method5(self, p_n, references, hypothesis, hyp_len=None, *args, **kwargs): (source)

Smoothing method 5: The matched counts for similar values of n should be similar. To calculate the n-gram matched count, it averages the (n−1)-, n- and (n+1)-gram matched counts.
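A self-contained sketch of the averaging: each smoothed score is the mean of the previous smoothed score, the current raw score, and the next raw score. The left neighbour of p_1 is seeded with p_1 + 1; as a simplification, the last score reuses itself as its right neighbour (an implementation with access to the texts can compute a real (n+1)-gram precision there instead):

def method5_sketch(p_n):
    smoothed = []
    prev = p_n[0] + 1  # seed for the left neighbour of p_1
    for i, p_i in enumerate(p_n):
        nxt = p_n[i + 1] if i + 1 < len(p_n) else p_i
        prev = (prev + p_i + nxt) / 3
        smoothed.append(prev)
    return smoothed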

def method6(self, p_n, references, hypothesis, hyp_len=None, *args, **kwargs): (source)

Smoothing method 6: Interpolates the maximum likelihood estimate of the precision p_n with a prior estimate pi0. The prior is estimated by assuming that the ratio between p_n and p_(n−1) will be the same as that between p_(n−1) and p_(n−2); from Gao and He (2013) Training MRF-Based Phrase Translation Models using Gradient Ascent. In NAACL.
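A sketch of the interpolation for n >= 3 (the prior needs two lower-order precisions, so p_1 and p_2 are left untouched). Here hyp_ngram_counts is a hypothetical helper argument: hyp_ngram_counts[i] is assumed to hold the total number of (i+1)-grams in the hypothesis, and the precisions are assumed to be fractions.Fraction scores:

def method6_sketch(p_n, hyp_ngram_counts, alpha=5):
    smoothed = list(p_n)
    for i in range(2, len(p_n)):
        # Prior pi0 assumes p_n / p_(n-1) == p_(n-1) / p_(n-2).
        pi0 = 0 if smoothed[i - 2] == 0 else smoothed[i - 1] ** 2 / smoothed[i - 2]
        m = p_n[i].numerator          # matched n-gram count
        l = hyp_ngram_counts[i]       # total n-grams in the hypothesis
        # Interpolate the ML estimate m/l with the prior, weighted by alpha.
        smoothed[i] = (m + alpha * pi0) / (l + alpha)
    return smoothed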

def method7(self, p_n, references, hypothesis, hyp_len=None, *args, **kwargs): (source)

Smoothing method 7: Interpolates methods 4 and 5.
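In practice the "interpolation" can be read as a chain: apply the length-sensitive back-off of method 4 first, then the neighbour averaging of method 5. Reusing the sketches above:

def method7_sketch(p_n, hyp_len, k=5):
    # Method 4 fills in zero match counts; method 5 then smooths across n.
    return method5_sketch(method4_sketch(p_n, hyp_len, k))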
