class documentation

The Russian Snowball stemmer.

Note
A detailed description of the Russian stemming algorithm can be found at http://snowball.tartarus.org/algorithms/russian/stemmer.html
Method stem Stem a Russian word and return the stemmed form.
Method __cyrillic_to_roman Transliterate a Russian word into the Roman alphabet.
Method __regions_russian Return the regions RV and R2 which are used by the Russian stemmer.
Method __roman_to_cyrillic Transliterate a Russian word back into the Cyrillic alphabet.
Class Variable __adjectival_suffixes Suffixes to be deleted.
Class Variable __derivational_suffixes Suffixes to be deleted.
Class Variable __noun_suffixes Suffixes to be deleted.
Class Variable __perfective_gerund_suffixes Suffixes to be deleted.
Class Variable __reflexive_suffixes Suffixes to be deleted.
Class Variable __superlative_suffixes Suffixes to be deleted.
Class Variable __verb_suffixes Suffixes to be deleted.

Inherited from _LanguageSpecificStemmer:

Method __init__ Undocumented
Method __repr__ Print out the string representation of the respective class.
Instance Variable stopwords Undocumented
def stem(self, word): (source)

Stem a Russian word and return the stemmed form.

Parameters
word (str or unicode): The word that is stemmed.
Returns
unicode: The stemmed form.
def __cyrillic_to_roman(self, word): (source)

Transliterate a Russian word into the Roman alphabet.

A Russian word written in the Cyrillic alphabet is transliterated into the Roman alphabet in order to ease the subsequent stemming process.

Parameters
word (unicode): The word that is transliterated.
Returns
unicode: The transliterated word.
Note
This helper method is invoked by the stem method of the subclass RussianStemmer. It is not to be invoked directly!
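As a rough illustration of what such a transliteration step does, the sketch below maps each Cyrillic letter to a Roman string. The mapping table is a small, hypothetical subset chosen for this example and is not the table the class uses; the names `CYRILLIC_TO_ROMAN` and `cyrillic_to_roman` are likewise assumptions.

```python
# Hypothetical, partial mapping for illustration only.
# It is NOT the transliteration table used by the stemmer class.
CYRILLIC_TO_ROMAN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t",
    "у": "u", "ш": "sh",
}


def cyrillic_to_roman(word):
    """Transliterate letter by letter; unmapped characters pass through unchanged."""
    return "".join(CYRILLIC_TO_ROMAN.get(ch, ch) for ch in word.lower())


print(cyrillic_to_roman("книга"))  # kniga
print(cyrillic_to_roman("жук"))    # zhuk
```

Some letters map to more than one Roman character (here "ж" and "ш"), which is why the reverse direction needs more care than a per-character lookup.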
def __regions_russian(self, word): (source)

Return the regions RV and R2 which are used by the Russian stemmer.

In any word, RV is the region after the first vowel, or the end of the word if it contains no vowel.

R1 is the region after the first non-vowel following a vowel, or the end of the word if there is no such non-vowel.

R2 is the region after the first non-vowel following a vowel in R1, or the end of the word if there is no such non-vowel.

Parameters
word (str or unicode): The Russian word whose regions RV and R2 are determined.
Returns
tuple: The regions RV and R2 for the respective Russian word.
Note
This helper method is invoked by the stem method of the subclass RussianStemmer. It is not to be invoked directly!
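The region definitions above can be sketched in a few lines. This is a minimal illustration rather than the class's own implementation: it works directly on Cyrillic text (an assumption made for readability), and the names `VOWELS`, `after_first_vowel_consonant`, and `regions` are hypothetical. The vowel set is the one listed in the Snowball description of Russian.

```python
# Vowels as listed in the Snowball description of the Russian algorithm.
VOWELS = "аеиоуыэюя"


def after_first_vowel_consonant(text):
    """Return the part of text after the first non-vowel that follows a vowel."""
    for i in range(1, len(text)):
        if text[i] not in VOWELS and text[i - 1] in VOWELS:
            return text[i + 1:]
    return ""  # no such non-vowel: the region is the empty end of the word


def regions(word):
    """Return (RV, R2) for a lowercase Cyrillic word."""
    rv = ""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            rv = word[i + 1:]  # everything after the first vowel
            break
    r1 = after_first_vowel_consonant(word)
    r2 = after_first_vowel_consonant(r1)  # same rule, applied inside R1
    return rv, r2


print(regions("противник"))  # ('тивник', 'ник')
```

For "противник", R1 is "ивник" (the text after "т", the first non-vowel preceded by a vowel), and applying the same rule inside R1 gives R2 = "ник".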
def __roman_to_cyrillic(self, word): (source)

Transliterate a Russian word back into the Cyrillic alphabet.

A Russian word that was previously transliterated into the Roman alphabet to ease the stemming process is transliterated back into the Cyrillic alphabet, its original form.

Parameters
word (str or unicode): The word that is transliterated.
Returns
unicode: The transliterated word.
Note
This helper method is invoked by the stem method of the subclass RussianStemmer. It is not to be invoked directly!
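Converting back is slightly trickier than a per-character lookup, because some Roman sequences stand for a single Cyrillic letter. The sketch below, built on a small hypothetical mapping (not the table used by the class), tries longer Roman sequences first so that "zh" is not split into "z" plus a stray "h". The names `ROMAN_TO_CYRILLIC` and `roman_to_cyrillic` are assumptions for this example.

```python
import re

# Hypothetical, partial mapping for illustration only.
ROMAN_TO_CYRILLIC = {
    "zh": "ж", "sh": "ш",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у",
}

# Longest keys first, so multi-character sequences win over their prefixes.
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(ROMAN_TO_CYRILLIC, key=len, reverse=True))
)


def roman_to_cyrillic(word):
    """Replace each known Roman sequence; anything unmatched is left as-is."""
    return _PATTERN.sub(lambda match: ROMAN_TO_CYRILLIC[match.group(0)], word)


print(roman_to_cyrillic("zhuk"))   # жук
print(roman_to_cyrillic("kniga"))  # книга
```

A single regular-expression pass avoids the pitfall of chained `str.replace` calls, where an earlier replacement can change the input seen by a later one.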
__adjectival_suffixes: tuple = (source)

Suffixes to be deleted.

__derivational_suffixes: tuple = (source)

Suffixes to be deleted.

__noun_suffixes: tuple = (source)

Suffixes to be deleted.

__perfective_gerund_suffixes: tuple = (source)

Suffixes to be deleted.

__reflexive_suffixes: tuple = (source)

Suffixes to be deleted.

__superlative_suffixes: tuple = (source)

Suffixes to be deleted.

__verb_suffixes: tuple = (source)

Suffixes to be deleted.
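To show how a suffix tuple of this kind might be consumed, here is a minimal sketch that deletes the longest listed suffix found inside the RV region. It is an assumption about usage, not the class's code: `REFLEXIVE_SUFFIXES` and `strip_suffix` are hypothetical names, and the tuple holds the two Cyrillic reflexive endings from the Snowball description rather than the actual contents of the class variable.

```python
# Reflexive endings as given in the Snowball description of Russian.
# Illustrative only; the class variable may store them in another form.
REFLEXIVE_SUFFIXES = ("ся", "сь")


def strip_suffix(word, rv, suffixes):
    """Delete the longest suffix that lies entirely inside RV.

    Returns the shortened (word, rv) pair, or the inputs unchanged
    when no suffix matches.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if rv.endswith(suffix):
            return word[: -len(suffix)], rv[: -len(suffix)]
    return word, rv


# RV of "умываться" is everything after the first vowel "у".
print(strip_suffix("умываться", "мываться", REFLEXIVE_SUFFIXES))
# ('умывать', 'мывать')
```

Checking the suffix against RV rather than the whole word keeps very short words from being stripped down to nothing.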