Character Set Mapping and Transliteration
The CSMT component simplifies working with character strings in almost all common
character sets. The component is fully Unicode enabled and offers the important functions of a modern
string class to COM, C++ and Java. It can map between 40 different character sets and
transliterate non-Latin characters into Latin characters.
- Mapping between 40 different character sets, including UTF-8, ISO 8859-1, GBK, BIG5, JIS, EBCDIC
- Character filter on 'a'-'z', 'A'-'Z' and '0'-'9'
- Correct "removal" of diacritics according to language-specific rules
- HTML and URL encoding and decoding
- Unix <-> Windows line break conversions
- Greek Transliteration (BGN/PCGN 1962, ISO 843:1997)
- Cyrillic Transliteration (BGN/PCGN 1947, ISO 9:1995)
- Hebrew Transliteration
- Japanese Katakana, Hiragana and Kanji Transliteration
- Chinese Pinyin Transliteration (Mandarin, Cantonese)
- Korean Hangul Transliteration
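
To make a few of the features above concrete, here is a minimal sketch in Java that uses only the standard library. It does not use the CSMT API, which is not shown here; the class name CharsetSketch and all method names are hypothetical. The diacritic removal shown is the naive Unicode approach and does not apply language-specific rules.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical illustration class; not part of the CSMT component.
public class CharsetSketch {

    // Map bytes from one character set to another by decoding them
    // into a Java String and encoding that String again.
    static byte[] remap(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    // Naive diacritic removal: decompose to NFD, then drop combining marks.
    // Not language specific: German "ü" becomes "u" here, not "ue".
    static String stripDiacritics(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                         .replaceAll("\\p{M}+", "");
    }

    // Keep only ASCII letters and digits.
    static String filterAlphanumeric(String s) {
        return s.replaceAll("[^A-Za-z0-9]", "");
    }

    // Windows (CR LF) to Unix (LF) line breaks.
    static String toUnixLineBreaks(String s) {
        return s.replace("\r\n", "\n");
    }

    // Unix (LF) to Windows (CR LF); normalizes first so that
    // existing CR LF pairs are not doubled.
    static String toWindowsLineBreaks(String s) {
        return toUnixLineBreaks(s).replace("\n", "\r\n");
    }

    public static void main(String[] args) {
        // "Crème brûlée", written with escapes so the source file stays ASCII.
        String text = "Cr\u00e8me br\u00fbl\u00e9e";

        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = remap(latin1, StandardCharsets.ISO_8859_1,
                            StandardCharsets.UTF_8);
        System.out.println(latin1.length + " -> " + utf8.length); // 12 -> 15

        System.out.println(stripDiacritics(text));            // Creme brulee
        System.out.println(filterAlphanumeric("CSMT v1.0!")); // CSMTv10
    }
}
```

The byte count grows from 12 to 15 because each of the three accented letters takes one byte in ISO 8859-1 and two bytes in UTF-8.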

Sample application written with the help of the CSMT component.