class documentation

The Hungarian Snowball stemmer.

Note
A detailed description of the Hungarian stemming algorithm can be found at http://snowball.tartarus.org/algorithms/hungarian/stemmer.html
Method stem Stem a Hungarian word and return the stemmed form.
Method __r1_hungarian Return the region R1 that is used by the Hungarian stemmer.
Class Variable __digraphs The Hungarian digraphs.
Class Variable __double_consonants The Hungarian double consonants.
Class Variable __step1_suffixes Suffixes to be deleted in step 1 of the algorithm.
Class Variable __step2_suffixes Suffixes to be deleted in step 2 of the algorithm.
Class Variable __step3_suffixes Suffixes to be deleted in step 3 of the algorithm.
Class Variable __step4_suffixes Suffixes to be deleted in step 4 of the algorithm.
Class Variable __step5_suffixes Suffixes to be deleted in step 5 of the algorithm.
Class Variable __step6_suffixes Suffixes to be deleted in step 6 of the algorithm.
Class Variable __step7_suffixes Suffixes to be deleted in step 7 of the algorithm.
Class Variable __step8_suffixes Suffixes to be deleted in step 8 of the algorithm.
Class Variable __step9_suffixes Suffixes to be deleted in step 9 of the algorithm.
Class Variable __vowels The Hungarian vowels.

Inherited from _LanguageSpecificStemmer:

Method __init__ Undocumented
Method __repr__ Print out the string representation of the respective class.
Instance Variable stopwords Undocumented
def stem(self, word):

Stem a Hungarian word and return the stemmed form.

Parameters
word: str or unicode - The word to be stemmed.
Returns
unicode - The stemmed form.
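
As a usage illustration, the following minimal sketch calls stem on a few words. It assumes that NLTK is installed and that the class is importable as nltk.stem.snowball.HungarianStemmer; the sample words are chosen only for illustration, and no particular stems are claimed here.

```python
# Minimal usage sketch; assumes NLTK is installed.
from nltk.stem.snowball import HungarianStemmer

stemmer = HungarianStemmer()

# Illustrative Hungarian words (an assumption, not taken from this page).
words = ["házak", "asztalok", "könyvekben"]

# stem() takes one word and returns its stemmed form as a string.
stems = [stemmer.stem(word) for word in words]

for word, stemmed in zip(words, stems):
    print(word, "->", stemmed)
```

The stemmer is created once and reused, since stem only reads the class-level suffix tables.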
def __r1_hungarian(self, word, vowels, digraphs):

Return the region R1 that is used by the Hungarian stemmer.

If the word begins with a vowel, R1 is the region after the first consonant or digraph (two letters that stand for one phoneme) in the word. If the word begins with a consonant, R1 is the region after the first vowel in the word. If the word does not contain both a vowel and a consonant, R1 is the null region at the end of the word.

Parameters
word: str or unicode - The Hungarian word whose region R1 is determined.
vowels: unicode - The Hungarian vowels that are used to determine the region R1.
digraphs: tuple - The digraphs that are used to determine the region R1.
Returns
unicode - The region R1 for the respective word.
Note
This helper method is invoked by the stem method of the subclass HungarianStemmer. It should not be invoked directly.
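
The R1 definition given for __r1_hungarian can be sketched as a standalone function. This is a minimal illustration written from the prose definition only, not the library's own implementation. The name r1_region and the VOWELS and DIGRAPHS values are assumptions made for the example and may differ from the class variables __vowels and __digraphs.

```python
# Sketch of the R1 rule described above. All names and constants here are
# hypothetical; they are not the private attributes of HungarianStemmer.
VOWELS = "aeiouöüáéíóőúű"
# Longest entry first, so that "dzs" is tried before "dz".
DIGRAPHS = ("dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs")


def r1_region(word, vowels=VOWELS, digraphs=DIGRAPHS):
    """Return the region R1 of word following the prose definition."""
    if not word:
        return ""

    if word[0] in vowels:
        # Word starts with a vowel: R1 begins after the first consonant,
        # or after the whole digraph if that consonant starts one.
        for i in range(1, len(word)):
            if word[i] not in vowels:
                for digraph in digraphs:
                    if word.startswith(digraph, i):
                        return word[i + len(digraph):]
                return word[i + 1:]
        return ""  # no consonant found: null region

    # Word starts with a consonant: R1 begins after the first vowel.
    for i in range(1, len(word)):
        if word[i] in vowels:
            return word[i + 1:]
    return ""  # no vowel found: null region


print(r1_region("asztal"))  # tal  ("sz" is treated as one unit)
print(r1_region("ház"))     # z
```

Treating the digraph as a single unit keeps R1 from starting in the middle of a phoneme, which is why "asztal" yields "tal" rather than "ztal".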
__digraphs: tuple

The Hungarian digraphs.

__double_consonants: tuple

The Hungarian double consonants.

__step1_suffixes: tuple

Suffixes to be deleted in step 1 of the algorithm.

__step2_suffixes: tuple

Suffixes to be deleted in step 2 of the algorithm.

__step3_suffixes: tuple

Suffixes to be deleted in step 3 of the algorithm.

__step4_suffixes: tuple

Suffixes to be deleted in step 4 of the algorithm.

__step5_suffixes: tuple

Suffixes to be deleted in step 5 of the algorithm.

__step6_suffixes: tuple

Suffixes to be deleted in step 6 of the algorithm.

__step7_suffixes: tuple

Suffixes to be deleted in step 7 of the algorithm.

__step8_suffixes: tuple

Suffixes to be deleted in step 8 of the algorithm.

__step9_suffixes: tuple

Suffixes to be deleted in step 9 of the algorithm.

__vowels: unicode

The Hungarian vowels.