nltk.stem.lancaster.LancasterStemmer

class documentation

class LancasterStemmer(StemmerI):

Constructor: LancasterStemmer(rule_tuple, strip_prefix_flag)

A word stemmer based on the Lancaster (Paice/Husk) stemming algorithm.

>>> from nltk.stem.lancaster import LancasterStemmer
>>> st = LancasterStemmer()
>>> st.stem('maximum')     # Remove "-um" when word is intact
'maxim'
>>> st.stem('presumably')  # Don't remove "-um" when word is not intact
'presum'
>>> st.stem('multiply')    # No action taken if word ends with "-ply"
'multiply'
>>> st.stem('provision')   # Replace "-sion" with "-j" to trigger "j" set of rules
'provid'
>>> st.stem('owed')        # Word starting with vowel must contain at least 2 letters
'ow'
>>> st.stem('ear')         # ditto
'ear'
>>> st.stem('saying')      # Words starting with consonant must contain at least 3
'say'
>>> st.stem('crying')      #     letters and one of those letters must be a vowel
'cry'
>>> st.stem('string')      # ditto
'string'
>>> st.stem('meant')       # ditto
'meant'
>>> st.stem('cement')      # ditto
'cem'
>>> st_pre = LancasterStemmer(strip_prefix_flag=True)
>>> st_pre.stem('kilometer') # Test Prefix
'met'
>>> st_custom = LancasterStemmer(rule_tuple=("ssen4>", "s1t."))
>>> st_custom.stem("ness") # Change s to t
'nest'
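The comments in the examples above state when a stem counts as acceptable. A minimal sketch of that check follows. The helper name `is_acceptable` is hypothetical, and treating "y" as a vowel is an assumption inferred from the 'crying' example; the library's private `__isAcceptable` method may differ in detail.

```python
# Assumption: "y" counts as a vowel, as the 'crying' -> 'cry' example suggests.
VOWELS = "aeiouy"


def is_acceptable(stem):
    """Return True if `stem` meets the length and vowel conditions (hypothetical helper)."""
    if not stem:
        return False
    if stem[0] in VOWELS:
        # A stem starting with a vowel needs at least 2 letters.
        return len(stem) >= 2
    # A stem starting with a consonant needs at least 3 letters,
    # at least one of which is a vowel.
    return len(stem) >= 3 and any(ch in VOWELS for ch in stem)


for candidate in ("ow", "o", "say", "cry", "str"):
    print(candidate, is_acceptable(candidate))
```

Under these assumptions "ow", "say" and "cry" pass, while "o" (too short) and "str" (no vowel) are rejected, which matches 'string' being left unchanged in the examples.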

Method	`__init__`	Create an instance of the Lancaster stemmer.
Method	`__repr__`	Undocumented
Method	`parseRules`	Validate the set of rules used in this stemmer.
Method	`stem`	Stem a word using the Lancaster stemmer.
Class Variable	`default_rule_tuple`	Undocumented
Instance Variable	`rule_dictionary`	Undocumented
Method	`__applyRule`	Apply the stemming rule to the word
Method	`__doStemming`	Perform the actual word stemming
Method	`__getLastLetter`	Get the zero-based index of the last alphabetic character in this string
Method	`__isAcceptable`	Determine if the word is acceptable for stemming.
Method	`__stripPrefix`	Remove prefix from a word.
Instance Variable	`_rule_tuple`	Undocumented
Instance Variable	`_strip_prefix`	Undocumented

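def __init__(self, rule_tuple=None, strip_prefix_flag=False):

Create an instance of the Lancaster stemmer. If rule_tuple is omitted, default_rule_tuple is used. Set strip_prefix_flag to True to remove a recognized prefix (for example, "kilo" in 'kilometer') before stemming.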

def __repr__(self):

Undocumented

def parseRules(self, rule_tuple=None):

Validate the set of rules used in this stemmer.

When this method is called directly, outside of stem, the rule_tuple argument is compiled into self.rule_dictionary. When it is called from within stem, self._rule_tuple is used instead.
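To illustrate what validating and compiling a rule tuple could look like, here is a self-contained sketch that does not import NLTK. The helper `group_rules`, the regular expression, and the choice to key the dictionary by each rule's first character are all assumptions, inferred from rules such as "ssen4>" and "s1t." in the examples; they are not taken from the library source.

```python
import re

# Assumed rule syntax: reversed ending, optional "*" (apply to intact words only),
# one digit (number of letters to remove), optional letters to append,
# then ">" (continue stemming) or "." (stop).
RULE_PATTERN = re.compile(r"^[a-z]+\*?\d[a-z]*[>.]?$")


def group_rules(rule_tuple):
    """Validate each rule and group the rules by first character (hypothetical helper)."""
    rule_dictionary = {}
    for rule in rule_tuple:
        if not RULE_PATTERN.match(rule):
            raise ValueError(f"The rule {rule} is invalid")
        # Endings are stored reversed, so the first character of a rule
        # is the last letter of the words it can apply to.
        rule_dictionary.setdefault(rule[0], []).append(rule)
    return rule_dictionary


print(group_rules(("ssen4>", "s1t.")))  # {'s': ['ssen4>', 's1t.']}
```

Grouping by the final letter of the word means a lookup only has to scan the rules that could possibly match, rather than the whole tuple.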

def stem(self, word):

overrides nltk.stem.api.StemmerI.stem

Stem a word using the Lancaster stemmer.

default_rule_tuple: tuple[str, ...]

Undocumented

rule_dictionary: dict

Undocumented

def __applyRule(self, word, remove_total, append_string):

Apply the stemming rule to the word
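Judging by the parameter names, applying a rule amounts to dropping a fixed number of trailing characters and appending a replacement string. The sketch below is a standalone illustration under that assumption; `apply_rule` is a hypothetical free function, not the private method itself.

```python
def apply_rule(word, remove_total, append_string):
    """Drop the last `remove_total` characters of `word`, then append `append_string`."""
    stem = word[: len(word) - remove_total]
    return stem + append_string


# The custom rule "s1t." from the examples: remove 1 letter, append "t".
print(apply_rule("ness", 1, "t"))    # nest
# Removing "-um" from an intact word, with nothing appended.
print(apply_rule("maximum", 2, ""))  # maxim
```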

def __doStemming(self, word, intact_word):

Perform the actual word stemming

def __getLastLetter(self, word):