class documentation
class HungarianStemmer(_LanguageSpecificStemmer):
Constructor: HungarianStemmer(ignore_stopwords)
The Hungarian Snowball stemmer.
Note | |
A detailed description of the Hungarian stemming algorithm is available at http://snowball.tartarus.org/algorithms/hungarian/stemmer.html |
Method | stem |
Stem a Hungarian word and return the stemmed form. |
Method | __r1 |
Return the region R1 that is used by the Hungarian stemmer. |
Class Variable | __digraphs |
The Hungarian digraphs. |
Class Variable | __double |
The Hungarian double consonants. |
Class Variable | __step1 |
Suffixes to be deleted in step 1 of the algorithm. |
Class Variable | __step2 |
Suffixes to be deleted in step 2 of the algorithm. |
Class Variable | __step3 |
Suffixes to be deleted in step 3 of the algorithm. |
Class Variable | __step4 |
Suffixes to be deleted in step 4 of the algorithm. |
Class Variable | __step5 |
Suffixes to be deleted in step 5 of the algorithm. |
Class Variable | __step6 |
Suffixes to be deleted in step 6 of the algorithm. |
Class Variable | __step7 |
Suffixes to be deleted in step 7 of the algorithm. |
Class Variable | __step8 |
Suffixes to be deleted in step 8 of the algorithm. |
Class Variable | __step9 |
Suffixes to be deleted in step 9 of the algorithm. |
Class Variable | __vowels |
The Hungarian vowels. |
Inherited from _LanguageSpecificStemmer:
Method | __init__ |
Undocumented |
Method | __repr__ |
Print out the string representation of the respective class. |
Instance Variable | stopwords |
Undocumented |
overrides nltk.stem.api.StemmerI.stem
Stem a Hungarian word and return the stemmed form.
Parameters | |
word:str or unicode | The word that is stemmed. |
Returns | |
unicode | The stemmed form. |
Return the region R1 that is used by the Hungarian stemmer.
If the word begins with a vowel, R1 is the region after the first consonant or digraph (two letters that stand for one phoneme) in the word. If the word begins with a consonant, R1 is the region after the first vowel in the word. If the word does not contain both a vowel and a consonant, R1 is the null region at the end of the word.
Parameters | |
word:str or unicode | The Hungarian word whose region R1 is determined. |
vowels:unicode | The Hungarian vowels that are used to determine the region R1. |
digraphs:tuple | The digraphs that are used to determine the region R1. |
Returns | |
unicode | The region R1 for the respective word. |
Note | |
This helper method is invoked by the stem method of the subclass HungarianStemmer. It is not to be invoked directly! |
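The R1 rule described above can be illustrated with a small standalone sketch. This is not the library's private helper: the function name `r1_region` and the `VOWELS` and `DIGRAPHS` values are assumptions made for illustration, and the class keeps its own `__vowels` and `__digraphs` tables, which may differ.

```python
# Assumed letter tables, for illustration only. The real class stores its own
# values in the private class variables __vowels and __digraphs.
VOWELS = "aeiouöüáéíóúőű"
DIGRAPHS = ("cs", "dz", "dzs", "gy", "ly", "ny", "sz", "ty", "zs")


def r1_region(word, vowels=VOWELS, digraphs=DIGRAPHS):
    """Return R1 for a word, following the prose definition above."""
    if not word:
        return ""

    if word[0] in vowels:
        # Word begins with a vowel: find the first consonant.
        for i in range(1, len(word)):
            if word[i] not in vowels:
                # A multi-letter consonant counts as a single unit, so try
                # the longest candidates first ("dzs" before "dz").
                for graph in sorted(digraphs, key=len, reverse=True):
                    if word.startswith(graph, i):
                        return word[i + len(graph):]
                return word[i + 1:]
        return ""  # no consonant found: null region

    # Word begins with a consonant: R1 starts after the first vowel.
    for i in range(1, len(word)):
        if word[i] in vowels:
            return word[i + 1:]
    return ""  # no vowel found: null region


print(r1_region("tóban"))   # ban
print(r1_region("ablak"))   # lak
print(r1_region("acsony"))  # ony
print(repr(r1_region("cvs")))  # ''
```

Trying longer letter groups first is a design choice of this sketch so that a trigraph is not cut in half by a shorter digraph match.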