class documentation
class LancasterStemmer(StemmerI):
Constructor: LancasterStemmer(rule_tuple, strip_prefix_flag)
Lancaster Stemmer
>>> from nltk.stem.lancaster import LancasterStemmer
>>> st = LancasterStemmer()
>>> st.stem('maximum')     # Remove "-um" when word is intact
'maxim'
>>> st.stem('presumably')  # Don't remove "-um" when word is not intact
'presum'
>>> st.stem('multiply')    # No action taken if word ends with "-ply"
'multiply'
>>> st.stem('provision')   # Replace "-sion" with "-j" to trigger "j" set of rules
'provid'
>>> st.stem('owed')        # Word starting with vowel must contain at least 2 letters
'ow'
>>> st.stem('ear')         # ditto
'ear'
>>> st.stem('saying')      # Words starting with consonant must contain at least 3
'say'
>>> st.stem('crying')      # letters and one of those letters must be a vowel
'cry'
>>> st.stem('string')      # ditto
'string'
>>> st.stem('meant')       # ditto
'meant'
>>> st.stem('cement')      # ditto
'cem'
>>> st_pre = LancasterStemmer(strip_prefix_flag=True)
>>> st_pre.stem('kilometer')  # Test Prefix
'met'
>>> st_custom = LancasterStemmer(rule_tuple=("ssen4>", "s1t."))
>>> st_custom.stem("ness")    # Change s to t
'nest'
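The acceptability conditions stated in the doctest comments above (at least 2 letters for a stem starting with a vowel; at least 3 letters including a vowel for a stem starting with a consonant) can be sketched as a standalone check. This is a minimal illustration, not the library's implementation: the function name `is_acceptable` is hypothetical, the vowel test is applied to the whole candidate stem, and treating "y" as a vowel is an assumption inferred from the 'crying' → 'cry' example.

```python
# Assumption: "y" counts as a vowel, inferred from 'crying' -> 'cry' above.
VOWELS = set("aeiouy")


def is_acceptable(stem):
    """Return True if a candidate stem satisfies the length/vowel conditions.

    Hypothetical helper for illustration only.
    """
    if not stem:
        return False
    if stem[0] in VOWELS:
        # Stems starting with a vowel need at least 2 letters.
        return len(stem) >= 2
    # Stems starting with a consonant need at least 3 letters,
    # at least one of which is a vowel.
    return len(stem) >= 3 and any(ch in VOWELS for ch in stem)


for candidate in ("ow", "e", "say", "cry", "str", "cem"):
    print(candidate, is_acceptable(candidate))
```

Under these assumptions, 'str' is rejected, which is consistent with 'string' being left unchanged in the examples above.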
Method | __init__ | Create an instance of the Lancaster stemmer.
Method | __repr__ | Undocumented
Method | parse | Validate the set of rules used in this stemmer.
Method | stem | Stem a word using the Lancaster stemmer.
Class Variable | default | Undocumented
Instance Variable | rule | Undocumented
Method | __apply | Apply the stemming rule to the word.
Method | __do | Perform the actual word stemming.
Method | __get | Get the zero-based index of the last alphabetic character in this string.
Method | __is | Determine if the word is acceptable for stemming.
Method | __strip | Remove prefix from a word.
Instance Variable | _rule | Undocumented
Instance Variable | _strip | Undocumented
Validate the set of rules used in this stemmer.
If this method is called on its own, without going through the stem method, the rule_tuple argument is compiled into self.rule_dictionary. If it is called from within stem, self._rule_tuple is used instead.
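The validate-and-compile step described above can be sketched independently of the library. This is an illustrative sketch only: the function name `compile_rules`, the regular expression, and the choice to key the dictionary by each rule's first letter are all assumptions, not taken from the text above. The rule syntax is likewise assumed: a reversed word ending, an optional "*" (apply only to intact words), a digit (number of letters to remove), an optional replacement string, and a terminator (">" to continue, "." to stop).

```python
import re

# Assumed rule syntax; the text above does not spell it out.
RULE_PATTERN = re.compile(r"^[a-z]+\*?\d[a-z]*[>.]?$")


def compile_rules(rule_tuple):
    """Validate each rule and group the rules by their first letter.

    Hypothetical stand-in for the compilation step described above.
    """
    rule_dictionary = {}
    for rule in rule_tuple:
        if not RULE_PATTERN.match(rule):
            raise ValueError(f"The rule {rule!r} is invalid")
        # Rules store the ending reversed, so the first letter of a rule
        # is the last letter of the words it can apply to.
        rule_dictionary.setdefault(rule[0], []).append(rule)
    return rule_dictionary


print(compile_rules(("ssen4>", "s1t.")))
```

With the custom rules from the class example, both rules land under the key 's', so a word ending in "s" such as "ness" would be checked against them in order.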