class _ScandinavianStemmer(_LanguageSpecificStemmer):
Known subclasses: nltk.stem.snowball.DanishStemmer, nltk.stem.snowball.NorwegianStemmer, nltk.stem.snowball.SwedishStemmer
Constructor: _ScandinavianStemmer(ignore_stopwords)
This subclass encapsulates a method for defining the string region R1. It is used by the Danish, Norwegian, and Swedish stemmers.
Method | _r1 |
Return the region R1 that is used by the Scandinavian stemmers. |
Inherited from _LanguageSpecificStemmer:
Method | __init__ |
Undocumented |
Method | __repr__ |
Print out the string representation of the respective class. |
Instance Variable | stopwords |
Undocumented |
Inherited from StemmerI (via _LanguageSpecificStemmer):
Method | stem |
Strip affixes from the token and return the stem. |
Return the region R1 that is used by the Scandinavian stemmers.
R1 is the region after the first non-vowel following a vowel, or the null region at the end of the word if there is no such non-vowel. R1 is then adjusted so that the region before it contains at least three letters.
Parameters | |
word:str or unicode | The word whose region R1 is determined. |
vowels:unicode | The vowels of the respective language that are used to determine the region R1. |
Returns | |
unicode | the region R1 for the respective word. |
Note | |
This helper method is invoked by the respective stem method of the subclasses DanishStemmer, NorwegianStemmer, and SwedishStemmer. It is not to be invoked directly! |
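The R1 rule described above can be illustrated with a small standalone sketch. This is not the library's own code: the function name scandinavian_r1 and the vowel string SWEDISH_VOWELS are hypothetical and chosen for illustration only. The sketch assumes that "adjusted so that the region before it contains at least three letters" means R1 never starts before index 3.

```python
# Illustrative vowel set (an assumption for this sketch, not read from the library).
SWEDISH_VOWELS = "aeiouyäåö"


def scandinavian_r1(word: str, vowels: str) -> str:
    """Return a sketch of the Scandinavian R1 region of `word`.

    R1 begins after the first non-vowel that follows a vowel. The start is
    then pushed right, if needed, so that at least three letters precede it.
    If no such non-vowel exists, R1 is the empty (null) region.
    """
    for i in range(1, len(word)):
        if word[i - 1] in vowels and word[i] not in vowels:
            # R1 would start at i + 1; enforce a minimum prefix of 3 letters.
            start = max(i + 1, 3)
            return word[start:]
    # No vowel followed by a non-vowel: null region at the end of the word.
    return ""


if __name__ == "__main__":
    for w in ("jumpar", "undervisning", "bo", "at"):
        print(repr(w), "->", repr(scandinavian_r1(w, SWEDISH_VOWELS)))
```

In "undervisning" the first vowel/non-vowel pair is "un", which would place the start of R1 at index 2; the three-letter adjustment moves it to index 3. As the note above says, real code should call the stem method of the public stemmer classes rather than compute R1 directly.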