class documentation
A stemmer that uses a regular expression to identify morphological affixes. Every substring that matches the regular expression is removed.
>>> from nltk.stem import RegexpStemmer
>>> st = RegexpStemmer('ing$|s$|e$|able$', min=4)
>>> st.stem('cars')
'car'
>>> st.stem('mass')
'mas'
>>> st.stem('was')
'was'
>>> st.stem('bee')
'bee'
>>> st.stem('compute')
'comput'
>>> st.stem('advisable')
'advis'
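Because every matching substring is removed, the `$` anchors in the pattern matter. The following is a minimal sketch using only the standard `re` module, on the assumption that removal behaves like `re.sub(pattern, "", word)`; the helper `strip_matches` is hypothetical and not part of NLTK.

```python
import re


def strip_matches(pattern, word):
    """Hypothetical helper (not NLTK code): delete every match of pattern from word."""
    return re.sub(pattern, "", word)


# Anchored with "$": only an "ing" at the very end of the word is removed.
anchored = strip_matches(r"ing$", "singing")

# Unanchored: every "ing" in the word is removed, not just the suffix.
unanchored = strip_matches(r"ing", "singing")

print(anchored)    # sing
print(unanchored)  # s
```

If only suffixes should be stripped, each alternative in the pattern needs its own `$`, as in the example pattern `'ing$|s$|e$|able$'`.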
Parameters | |
regexp | The regular expression that should be used to identify morphological affixes. |
min | The minimum length a string must have to be stemmed; shorter strings are returned unchanged. |
Method | __init__ | Undocumented |
Method | __repr__ | Undocumented |
Method | stem | Strip affixes from the token and return the stem. |
Instance Variable | _min | Undocumented |
Instance Variable | _regexp | Undocumented |
overrides nltk.stem.api.StemmerI.stem
Strip affixes from the token and return the stem.
Parameters | |
word | Undocumented |
token:str | The token that should be stemmed. |
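The documented behaviour of `stem` can be approximated in a few lines. This is a sketch under stated assumptions, not NLTK's source: it assumes that tokens shorter than `min` are returned untouched (consistent with `'was'` and `'bee'` in the class example) and that matches are removed with `re.sub`. The names `regexp_stem_sketch` and `min_length` are made up for illustration.

```python
import re


def regexp_stem_sketch(token, regexp, min_length=0):
    """Hypothetical stand-in for RegexpStemmer.stem (assumed behaviour, not NLTK code)."""
    # Assumption: tokens shorter than the minimum length are left as they are.
    if len(token) < min_length:
        return token
    # Assumption: every substring matching the pattern is deleted.
    return re.sub(regexp, "", token)


PATTERN = r"ing$|s$|e$|able$"
WORDS = ["cars", "mass", "was", "bee", "compute", "advisable"]

# Reproduce the inputs from the class example with min=4.
results = {w: regexp_stem_sketch(w, PATTERN, min_length=4) for w in WORDS}
print(results)
```

Under these assumptions the sketch gives the same stems as the class example: `'car'`, `'mas'`, `'was'`, `'bee'`, `'comput'` and `'advis'`.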