class RussianStemmer(_LanguageSpecificStemmer):
Constructor: RussianStemmer(ignore_stopwords)
The Russian Snowball stemmer.
Note | |
A detailed description of the Russian stemming algorithm can be found at http://snowball.tartarus.org/algorithms/russian/stemmer.html |
Method | stem |
Stem a Russian word and return the stemmed form. |
Method | __cyrillic |
Transliterate a Russian word into the Roman alphabet. |
Method | __regions |
Return the regions RV and R2 which are used by the Russian stemmer. |
Method | __roman |
Transliterate a Russian word back into the Cyrillic alphabet. |
Class Variable | __adjectival |
Suffixes to be deleted. |
Class Variable | __derivational |
Suffixes to be deleted. |
Class Variable | __noun |
Suffixes to be deleted. |
Class Variable | __perfective |
Suffixes to be deleted. |
Class Variable | __reflexive |
Suffixes to be deleted. |
Class Variable | __superlative |
Suffixes to be deleted. |
Class Variable | __verb |
Suffixes to be deleted. |
Inherited from _LanguageSpecificStemmer:
Method | __init__ |
Undocumented |
Method | __repr__ |
Print out the string representation of the respective class. |
Instance Variable | stopwords |
Undocumented |
Overrides nltk.stem.api.StemmerI.stem
Stem a Russian word and return the stemmed form.
Parameters | |
word:str or unicode | The word that is stemmed. |
Returns | |
unicode | The stemmed form. |
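A minimal usage sketch for the public stem method, assuming NLTK is installed and the class is importable from nltk.stem.snowball. The helper name stem_russian_words is hypothetical and exists only for this illustration; the constructor argument ignore_stopwords is taken from the signature shown above.

```python
def stem_russian_words(words, ignore_stopwords=False):
    """Stem each word with NLTK's Russian Snowball stemmer.

    Hypothetical convenience wrapper, not part of NLTK.
    """
    # Imported lazily so this helper can be defined without NLTK installed.
    from nltk.stem.snowball import RussianStemmer

    stemmer = RussianStemmer(ignore_stopwords=ignore_stopwords)
    # Only the public stem() method is called; the transliteration and
    # region helpers are internal and are invoked by stem() itself.
    return [stemmer.stem(word) for word in words]
```

Calling stem_russian_words(["книги", "книга"]) returns a list with one stem per input word. Setting ignore_stopwords=True is expected to require the NLTK stopwords corpus to be available locally.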
Transliterate a Russian word into the Roman alphabet.
A Russian word written in the Cyrillic alphabet is transliterated into the Roman alphabet in order to ease the subsequent stemming process.
Parameters | |
word:unicode | The word that is transliterated. |
Returns | |
unicode | The transliterated word. |
Note | |
This helper method is invoked by the stem method of the subclass RussianStemmer. It is not to be invoked directly! |
Return the regions RV and R2 which are used by the Russian stemmer.
In any word, RV is the region after the first vowel, or the end of the word if it contains no vowel.
R1 is the region after the first non-vowel following a vowel, or the end of the word if there is no such non-vowel.
R2 is the region after the first non-vowel following a vowel in R1, or the end of the word if there is no such non-vowel.
Parameters | |
word:str or unicode | The Russian word whose regions RV and R2 are determined. |
Returns | |
tuple | The regions RV and R2 for the given Russian word. |
Note | |
This helper method is invoked by the stem method of the subclass RussianStemmer. It is not to be invoked directly! |
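The region definitions above can be sketched in a few lines of standalone Python. This is an illustration of the definitions only, not the library's internal helper: it works directly on lowercase Cyrillic input rather than on a transliterated word, the function names are hypothetical, and the vowel set (а, е, и, о, у, ы, э, ю, я) is assumed from the Snowball description of Russian.

```python
# Assumed vowel set, following the Snowball description of Russian.
RUSSIAN_VOWELS = set("аеиоуыэюя")


def region_after_first_vowel(word):
    """Return the part of the word after its first vowel ('' if none)."""
    for i, ch in enumerate(word):
        if ch in RUSSIAN_VOWELS:
            return word[i + 1:]
    return ""


def region_after_vowel_then_non_vowel(word):
    """Return the part after the first non-vowel that follows a vowel."""
    for i in range(1, len(word)):
        if word[i] not in RUSSIAN_VOWELS and word[i - 1] in RUSSIAN_VOWELS:
            return word[i + 1:]
    return ""


def regions(word):
    """Return (RV, R2) for a lowercase Cyrillic word."""
    rv = region_after_first_vowel(word)
    r1 = region_after_vowel_then_non_vowel(word)
    # R2 applies the same rule as R1, but inside R1.
    r2 = region_after_vowel_then_non_vowel(r1)
    return rv, r2


print(regions("противоестественном"))
# ('тивоестественном', 'оестественном')
```

An empty string stands for "the end of the word" when a region does not exist, so a word without vowels yields two empty regions.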
Transliterate a Russian word back into the Cyrillic alphabet.
A Russian word that was previously transliterated into the Roman alphabet to ease the stemming process is transliterated back into the Cyrillic alphabet, its original form.
Parameters | |
word:str or unicode | The word that is transliterated. |
Returns | |
unicode | The transliterated word. |
Note | |
This helper method is invoked by the stem method of the subclass RussianStemmer. It is not to be invoked directly! |
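The round trip performed by the two transliteration helpers can be illustrated with a small standalone sketch. The mapping below is hypothetical and chosen only so that every Cyrillic letter maps to exactly one ASCII character; it is not the table used by the library, and the function names to_roman and to_cyrillic are likewise made up for this example.

```python
# Hypothetical one-to-one mapping (not NLTK's table): each Cyrillic letter
# gets a single, unique ASCII character so the conversion is reversible.
CYRILLIC = "абвгдежзийклмнопрстуфхцчшщъыьэюя"
ROMAN = "abvgdeZzijklmnoprstufxcCSWQyqEUA"

_TO_ROMAN = str.maketrans(CYRILLIC, ROMAN)
_TO_CYRILLIC = str.maketrans(ROMAN, CYRILLIC)


def to_roman(word):
    """Transliterate a lowercase Cyrillic word into ASCII stand-ins."""
    return word.translate(_TO_ROMAN)


def to_cyrillic(word):
    """Transliterate a word produced by to_roman() back into Cyrillic."""
    return word.translate(_TO_CYRILLIC)


romanized = to_roman("книга")
print(romanized)               # kniga
print(to_cyrillic(romanized))  # книга
```

Because the mapping is a bijection on single characters, converting back always restores the original word. A scheme that uses multi-character romanizations would need extra care to stay unambiguous when converting back.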