Sanskrit and Unicode character encoding

Jeroen Hellingman jhelling at cs.ruu.nl
Tue Jun 22 10:51:00 UTC 1993


  
  
  In the Unicode list, Michael Everson writes:
  > I'm reviewing the proposed Unicode 16-bit character encodings
  > for Tibetan and Sinhalese; both the proposals depart from the
  > standard ISCII layout which Devanagari, Bangali, Gurmukhi,
  > Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, and Burmese
  > follow (with some reservations re: Tamil). Given the body of
  > Sanskrit literature written in the Tibetan script and Pali
  > literature written in the Sinhalese script, it seems to me
  > that encoding parallel to ISCII is important for both Tibetan
  > and Sinhalese scripts. Does anyone agree with this opinion?
  > Parallel encoding can simplify the transfer of a text from
  > script to script because the positions of the characters
  > are the same even though the actual code addresses are not.
  > By keeping Tibetan and Sinhalese out of the fold, script
  > transfer could be made more troublesome.
  > 
  > Has anyone out there expertise on this question? I know,
  > for instance, that Tibetan TSEG may cause troubles.

I am not really an expert, but I think the problems will not be to difficult.
Using a table lookup, you can easily find the equivalent Tibetan character for
a Devanagari character. I think that the parallell encoding of the various
indic scripts is misleading, in that sense that you still need to do table
lookup if you want to transliterate one in the other, as in most Indic scripts
there are minor differences in encoded characters, for example, in Tamil, both
aspirated and voiced versions of consonants are missing, and when trying to
display Malayalam in Tamil script, these should be mapped to other characters
(which indicates that round trip mapping is not possible).

The additional problem with Sinhala is then that it includes some more
characters (A, AE, AA, AEAE, and nasalized characters) which cannot be coded
in parallel, as they are not known in the Devanagari prototype.

The filling nature of TSEG is still under investigation, as far as I know.



-- 
Jeroen Hellingman                 E-mail: <jhelling at cs.ruu.nl>
't Zand 2                         Phone: +31-3473-73935 (home)
4133 TB Vianen                    (18.00--21.00 GMT)
The Netherlands                   Answer in English, German, or Dutch.
 






More information about the INDOLOGY mailing list