Is Unicode welcome yet on the list?
Stefan Baums
sbaums at GMX.NET
Wed Jan 30 20:06:35 UTC 2002
Dear list,
Please bear with the length of this post; I am addressing your
questions about Unicode one by one.
Regarding Valerie Roebuck's concern about the memory requirements of
Unicode text:
The form of Unicode used on the web and in email is called UTF-8.
It continues to encode all ASCII letters as single bytes, and uses
2 or 3 bytes for letters with Sanskrit diacritics (2 for a with
macron or s with acute, 3 for the letters with a dot below or
above). It thus has roughly the same memory requirements as
Harvard-Kyoto ('.n' etc. are also two bytes each).
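For anyone who wants to check the byte counts themselves, here is a
minimal Python sketch. The choice of sample strings and the helper
name utf8_len are illustrative assumptions, not part of any standard.

```python
# Sample strings (an illustrative selection): plain ASCII, an ASCII
# transliteration digraph, and four letters with diacritics.
SAMPLES = [
    "a",        # plain ASCII letter
    ".n",       # two-character ASCII transliteration
    "\u0101",   # a with macron
    "\u015b",   # s with acute
    "\u1e47",   # n with dot below
    "\u1e63",   # s with dot below
]


def utf8_len(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


for sample in SAMPLES:
    # ascii() keeps the output readable even on a non-Unicode terminal.
    print(ascii(sample), utf8_len(sample), "byte(s)")
```

Running it prints one line per sample, so the sizes of Unicode letters
and ASCII transliterations can be compared directly.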
About those of us using older computers and software (voluntarily or
perforce):
This would seem to be the greatest obstacle. While it is
relatively straightforward to update one's home computer to
Unicode-supporting software (any computer running Windows 9x or
higher should be powerful enough), those using email clients like
Pine on a university network terminal have little influence over
the software installed on that system. One could, however, prod the
university computer service to upgrade to Unicode-aware program
versions (the newest versions of Pine, for example, are
Unicode-aware, and there is always Mutt 1.3.x / 1.4.x,
http://www.mutt.org; see below).
Note to Richard Mahoney concerning Mutt:
Mutt 1.3.x and the upcoming 1.4.x have excellent support for
Unicode. This requires XFree86 4.x.x and an iconv implementation
(there's one included in the GNU C library, and for BSD people
there is a standalone implementation). I am myself using Mutt.
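What an iconv implementation provides is charset conversion: bytes in
one encoding are turned into bytes in another. As a rough illustration
only (this is not Mutt's code, and the function name convert_charset is
a hypothetical one), the same idea can be sketched with Python's
built-in codecs:

```python
def convert_charset(data: bytes, source: str, target: str) -> bytes:
    """Re-encode `data` from the `source` charset into the `target` charset."""
    # Decode to abstract Unicode text first, then encode to the target.
    return data.decode(source).encode(target)


# A name with one non-ASCII letter (u with diaeresis), as ISO-8859-1 bytes.
latin1_bytes = "J\u00fcrgen".encode("iso-8859-1")
utf8_bytes = convert_charset(latin1_bytes, "iso-8859-1", "utf-8")

# The UTF-8 form is one byte longer, because the non-ASCII letter
# takes two bytes there instead of one.
print(len(latin1_bytes), len(utf8_bytes))
```

A mail client does the reverse as well, converting incoming text to
whatever charset the terminal displays.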
Note to Jürgen Hanneder concerning Unicode character input under
GNU/Linux:
I'm using GNU Emacs. I'll send you my input method offlist.
Anybody else who wants it, just email me.
About Windows / Macintosh / GNU/Linux:
I find it interesting that the two of you who did report success
reading my Unicode email were using GNU/Linux. (So much for
Microsoft's assertion that free operating systems could never
achieve complexities such as "deep Unicode support".) What is
required is basically just XFree86 4.x.x and glibc 2.2.x plus a
suitable email client (mutt, kmail, emacs, ...) (all included in
recent distributions). Detailed, somewhat technical instructions
are at http://www.cl.cam.ac.uk/~mgk25/unicode.html, or email me for
help.
Under Windows, one probably just needs to set the (up-to-date)
email program's font to one that includes the proper diacritics.
Arial Unicode MS is fine, or use Andrew Glass's excellent Gandhari
Unicode font (http://staff.washington.edu/asg/Pages/Fonts.html).
Unicode is possible on Macs, but I don't know the details.
Unicode and the list:
It does not make sense to use Unicode on the list as long as that
excludes people from reading one's posts. Unicode is, however, the
best answer to Indologist encoding needs, so I suggest we discuss
this issue again in the future. One could also set up a webpage on
the Indology site with hints on configuring systems for (the
Indologically interesting part of) Unicode.
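As a starting point for such a page, one could list the relevant
letters with their code points and official Unicode names. A minimal
Python sketch follows; the selection of lowercase transliteration
letters is my own assumption and is not meant to be complete.

```python
import unicodedata

# Hypothetical selection of lowercase letters used in Sanskrit
# transliteration, written as escapes so the file itself stays ASCII.
LETTERS = (
    "\u0101\u012b\u016b"   # a, i, u with macron
    "\u1e5b\u1e5d\u1e37"   # vocalic r, long vocalic r, vocalic l
    "\u1e45\u00f1"         # n with dot above, n with tilde
    "\u1e6d\u1e0d\u1e47"   # t, d, n with dot below
    "\u015b\u1e63"         # s with acute, s with dot below
    "\u1e25\u1e43"         # h with dot below, m with dot below
)


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in LETTERS:
    print(describe(ch))
```

A table like this also tells font users which code points a font must
cover to display transliterated text.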
Best regards,
Stefan Baums
--
Stefan Baums
Asien-Instituttet
Københavns Universitet