Common methods and classes for all IBM models. See IBMModel1, IBMModel2, IBMModel3, IBMModel4, and IBMModel5 for specific implementations.
The IBM models are a series of generative models that learn lexical translation probabilities, p(target language word|source language word), given a sentence-aligned parallel corpus.
The models increase in sophistication from model 1 to 5. Typically, the output of lower models is used to seed the higher models. All models use the Expectation-Maximization (EM) algorithm to learn various probability tables.
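For illustration, here is a minimal sketch of that workflow with Model 1 (the toy bitext and the quoted probability mirror NLTK's own IBMModel1 doctest; they are examples, not fixed API behavior). After EM training, translation_table[t][s] holds the learned p(t|s):

>>> from nltk.translate import AlignedSent, IBMModel1
>>> bitext = [
...     AlignedSent(['klein', 'ist', 'das', 'haus'], ['the', 'house', 'is', 'small']),
...     AlignedSent(['das', 'haus', 'ist', 'ja', 'groß'], ['the', 'house', 'is', 'big']),
...     AlignedSent(['das', 'buch', 'ist', 'ja', 'klein'], ['the', 'book', 'is', 'small']),
...     AlignedSent(['das', 'haus'], ['the', 'house']),
...     AlignedSent(['das', 'buch'], ['the', 'book']),
...     AlignedSent(['ein', 'buch'], ['a', 'book']),
... ]
>>> ibm1 = IBMModel1(bitext, 5)  # 5 iterations of EM
>>> print(round(ibm1.translation_table['buch']['book'], 3))  # p('buch' | 'book')
0.889

Seeding works in the same spirit: NLTK's IBMModel2, for instance, first runs Model 1 EM internally to initialize its translation table before estimating alignment probabilities.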
Words in a sentence are one-indexed. The first word of a sentence has position 1, not 0. Index 0 is reserved in the source sentence for the NULL token. The concept of position does not apply to NULL, but it is indexed at 0 by convention.
Each target word is aligned to exactly one source word or the NULL token.
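As a concrete sketch of these conventions (the sentences, alignment, and cepts below are hand-built and purely illustrative), the AlignmentInfo helper listed below stores an alignment in exactly this one-indexed form; alignment[j] gives the single source position aligned to target position j:

>>> from nltk.translate.ibm_model import AlignmentInfo
>>> src_sentence = (None, 'das', 'Haus')   # index 0 reserved for NULL
>>> trg_sentence = (None, 'the', 'house')  # dummy at index 0; real words start at 1
>>> alignment = (0, 1, 2)   # alignment[j]: source position for target position j; [0] unused
>>> cepts = [[], [1], [2]]  # cepts[i]: target positions aligned to source position i
>>> info = AlignmentInfo(alignment, src_sentence, trg_sentence, cepts)
>>> info.fertility_of_i(1)  # 'das' generates exactly one target word
1
>>> info.zero_indexed_alignment()
[(0, 0), (1, 1)]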
References:
Philipp Koehn. 2010. Statistical Machine Translation. Cambridge University Press, New York.
Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2), 263-311.
Module contents:

AlignmentInfo (class): Helper data object for training IBM Models 3 and up.
Counts (class): Data object to store counts of various parameters during training.
IBMModel (class): Abstract base class for all IBM models.
longest_target_sentence_length (function): Returns the number of words in the longest target language sentence of a sentence-aligned corpus.
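As a small illustrative example (toy sentences assumed), longest_target_sentence_length measures the target side, i.e. the words attribute of each AlignedSent:

>>> from nltk.translate import AlignedSent
>>> from nltk.translate.ibm_model import longest_target_sentence_length
>>> bitext = [
...     AlignedSent(['das', 'haus'], ['the', 'house']),
...     AlignedSent(['das', 'haus', 'ist', 'klein'], ['the', 'house', 'is', 'small']),
... ]
>>> longest_target_sentence_length(bitext)
4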