class documentation

ARLSTem stemmer: a light Arabic stemming algorithm that does not use a dictionary. Developed at the Department of Telecommunication & Information Processing, USTHB University, Algiers, Algeria. ARLSTem.stem(token) returns the Arabic stem of the input token. The stemmer requires all tokens to be Unicode strings.
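A minimal usage sketch follows. It assumes NLTK is installed and exposes the class at the import path nltk.stem.arlstem; the helper stem_tokens is a hypothetical name introduced here for illustration and is not part of the class. The import is guarded so the helper still works with any object that provides stem(token).

```python
def stem_tokens(stemmer, tokens):
    """Apply stemmer.stem to each token (hypothetical helper, not part of ARLSTem)."""
    return [stemmer.stem(token) for token in tokens]


try:
    # Assumed import path; adjust it if your NLTK version differs.
    from nltk.stem.arlstem import ARLSTem
except ImportError:
    ARLSTem = None  # NLTK is not available in this environment

if ARLSTem is not None:
    stemmer = ARLSTem()
    # Tokens must be Unicode strings; a Python 3 str already is.
    words = [
        "\u064a\u0639\u0645\u0644\u0648\u0646",  # a verb form
        "\u0627\u0644\u0643\u062a\u0627\u0628",  # a noun with the definite article
    ]
    print(stem_tokens(stemmer, words))
```

No expected output is shown because the exact stems depend on the installed NLTK version.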

Method __init__ Undocumented
Method fem2masc Transform the word from the feminine form to the masculine form.
Method norm Normalize the word by removing diacritics, replacing hamzated Alif with Alif, replacing AlifMaqsura with Yaa, and removing Waaw at the beginning.
Method plur2sing Transform the word from the plural form to the singular form.
Method pref Remove prefixes from the word's beginning.
Method stem Call this function to get the word's stem using the ARLSTem algorithm.
Method suff Remove suffixes from the word's end.
Method verb Strip the verb's prefixes, suffixes, or both.
Method verb_t1 Strip the present-tense prefixes and suffixes.
Method verb_t2 Strip the future-tense prefixes and suffixes.
Method verb_t3 Strip the present-tense suffixes.
Method verb_t4 Strip the present-tense prefixes.
Method verb_t5 Strip the future-tense prefixes.
Method verb_t6 Strip the order (imperative) prefixes.
Instance Variable pl_si2 Undocumented
Instance Variable pl_si3 Undocumented
Instance Variable pr2 Undocumented
Instance Variable pr3 Undocumented
Instance Variable pr32 Undocumented
Instance Variable pr4 Undocumented
Instance Variable re_alifMaqsura Undocumented
Instance Variable re_diacritics Undocumented
Instance Variable re_hamzated_alif Undocumented
Instance Variable su2 Undocumented
Instance Variable su22 Undocumented
Instance Variable su3 Undocumented
Instance Variable su32 Undocumented
Instance Variable verb_pr2 Undocumented
Instance Variable verb_pr22 Undocumented
Instance Variable verb_pr33 Undocumented
Instance Variable verb_su2 Undocumented
Instance Variable verb_suf1 Undocumented
Instance Variable verb_suf2 Undocumented
Instance Variable verb_suf3 Undocumented
def __init__(self):

Undocumented

def fem2masc(self, token):

Transform the word from the feminine form to the masculine form.

def norm(self, token):

Normalize the word by removing diacritics, replacing hamzated Alif with Alif, replacing AlifMaqsura with Yaa, and removing Waaw at the beginning.
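The normalization steps described for norm can be sketched with three regular expressions, mirroring the instance variables re_diacritics, re_hamzated_alif and re_alifMaqsura. This is an illustration only: the character ranges, the function name normalize, and the length guard on the leading Waaw are assumptions, not taken from the class itself.

```python
import re

# Assumed character classes; the class's own patterns may differ.
RE_DIACRITICS = re.compile(r"[\u064B-\u065F]")         # short vowels, tanween, shadda, sukun
RE_HAMZATED_ALIF = re.compile(r"[\u0622\u0623\u0625]")  # Alif with madda or hamza
RE_ALIF_MAQSURA = re.compile(r"\u0649")

ALIF, YAA, WAAW = "\u0627", "\u064A", "\u0648"


def normalize(token: str) -> str:
    """Sketch of the four normalization steps, applied in the documented order."""
    token = RE_DIACRITICS.sub("", token)        # 1. remove diacritics
    token = RE_HAMZATED_ALIF.sub(ALIF, token)   # 2. hamzated Alif -> bare Alif
    token = RE_ALIF_MAQSURA.sub(YAA, token)     # 3. AlifMaqsura -> Yaa
    # 4. Drop a leading Waaw. Assumption: only when more than three letters
    #    are present, so short words that begin with Waaw stay intact.
    if token.startswith(WAAW) and len(token) > 3:
        token = token[1:]
    return token
```

The length guard in step 4 is a design choice in this sketch: without it, a three-letter word whose first root letter is Waaw would lose part of its root.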

def plur2sing(self, token):

Transform the word from the plural form to the singular form.

def pref(self, token):

Remove prefixes from the word's beginning.
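Prefix removal of this kind can be sketched as a lookup against a list of prefix strings, in the spirit of the list[str] instance variables such as pr2 and pr3. The function name strip_prefix, the example prefix list, the min_stem_len threshold, and the choice to return None when nothing matches are all assumptions made for illustration; the class's own rules may differ.

```python
from typing import Optional, Sequence

# Hypothetical prefix list for illustration: Alif-Laam and Laam-Laam.
EXAMPLE_PREFIXES = ["\u0627\u0644", "\u0644\u0644"]


def strip_prefix(
    token: str, prefixes: Sequence[str], min_stem_len: int = 3
) -> Optional[str]:
    """Return the token without a matching prefix, or None if no prefix applies."""
    # Try longer prefixes first so a short prefix does not shadow a longer one.
    for prefix in sorted(prefixes, key=len, reverse=True):
        # Only strip when enough letters remain to form a plausible stem.
        if token.startswith(prefix) and len(token) - len(prefix) >= min_stem_len:
            return token[len(prefix):]
    return None
```

Returning None rather than the unchanged token lets a caller tell "no prefix was stripped" apart from "a prefix was stripped", which matters if later steps depend on that distinction.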

def stem(self, token):

Call this function to get the word's stem using the ARLSTem algorithm.

def suff(self, token):

Remove suffixes from the word's end.

def verb(self, token):

Strip the verb's prefixes, suffixes, or both.

def verb_t1(self, token):

Strip the present-tense prefixes and suffixes.

def verb_t2(self, token):

Strip the future-tense prefixes and suffixes.

def verb_t3(self, token):

Strip the present-tense suffixes.

def verb_t4(self, token):

Strip the present-tense prefixes.

def verb_t5(self, token):

Strip the future-tense prefixes.

def verb_t6(self, token):

Strip the order (imperative) prefixes.

pl_si2: list[str]

Undocumented

pl_si3: list[str]

Undocumented

pr2: list[str]

Undocumented

pr3: list[str]

Undocumented

pr32: list[str]

Undocumented

pr4: list[str]

Undocumented

re_alifMaqsura

Undocumented

re_diacritics

Undocumented

re_hamzated_alif

Undocumented

su2: list[str]

Undocumented

su22: list[str]

Undocumented

su3: list[str]

Undocumented

su32: list[str]

Undocumented

verb_pr2: list[str]

Undocumented

verb_pr22: list[str]

Undocumented

verb_pr33: list[str]

Undocumented

verb_su2: list[str]

Undocumented

verb_suf1: list[str]

Undocumented

verb_suf2: list[str]

Undocumented

verb_suf3: list[str]

Undocumented