class documentation

class CJKChars(object):

An object that enumerates the code points of the CJK characters listed at http://en.wikipedia.org/wiki/Basic_Multilingual_Plane#Basic_Multilingual_Plane

This is a Python port of the CJK code point enumerations in the Moses tokenizer: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/detokenizer.perl#L309
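
As a rough illustration of how such an enumeration might be consumed, the sketch below tests whether a character falls inside any listed range. It makes several assumptions not confirmed by this page: that each class variable is an inclusive `(first, last)` pair of code points, and that `ranges` collects those pairs. `CJKCharsSketch` and `is_cjk` are hypothetical names used only here, and the boundaries are the standard Unicode block limits, not values read from the class.

```python
# Hypothetical stand-in for illustration only; the real class defines its own values.
class CJKCharsSketch:
    # Assumption: each attribute is an inclusive (first, last) code point pair.
    Hangul_Jamo = (0x1100, 0x11FF)                   # Unicode block U+1100..U+11FF
    Hangul_Syllables = (0xAC00, 0xD7AF)              # Unicode block U+AC00..U+D7AF
    CJK_Compatibility_Ideographs = (0xF900, 0xFAFF)  # Unicode block U+F900..U+FAFF

    # Assumption: `ranges` gathers the individual pairs into one sequence.
    ranges = [Hangul_Jamo, Hangul_Syllables, CJK_Compatibility_Ideographs]


def is_cjk(char, ranges=CJKCharsSketch.ranges):
    """Return True if the code point of `char` lies inside any (first, last) pair."""
    code_point = ord(char)
    return any(first <= code_point <= last for first, last in ranges)


print(is_cjk("한"))  # Hangul syllable U+D55C
print(is_cjk("a"))   # Latin letter U+0061
```

With the real class, the same check would presumably take `CJKChars.ranges` as the `ranges` argument, provided the attributes have the shape assumed above.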

Class Variable CJK_Compatibility_Forms Undocumented
Class Variable CJK_Compatibility_Ideographs Undocumented
Class Variable CJK_Radicals Undocumented
Class Variable Hangul_Jamo Undocumented
Class Variable Hangul_Syllables Undocumented
Class Variable Katakana_Hangul_Halfwidth Undocumented
Class Variable Phags_Pa Undocumented
Class Variable ranges Undocumented
Class Variable Supplementary_Ideographic_Plane Undocumented

CJK_Compatibility_Forms: tuple[int, ...]

Undocumented

CJK_Compatibility_Ideographs: tuple[int, ...]

Undocumented

CJK_Radicals: tuple[int, ...]

Undocumented

Hangul_Jamo: tuple[int, ...]

Undocumented

Hangul_Syllables: tuple[int, ...]

Undocumented

Katakana_Hangul_Halfwidth: tuple[int, ...]

Undocumented

Phags_Pa: tuple[int, ...]

Undocumented

ranges

Undocumented

Supplementary_Ideographic_Plane: tuple[int, ...]

Undocumented