[INDOLOGY] Diacriticals in unicode, single or multiple glyphs

Harry Spier hspier.muktabodha at gmail.com
Fri Nov 18 12:58:58 UTC 2016


Dear list members,

In Unicode, a character with a diacritical can be written either as a single
glyph (one precomposed code point) or as two glyphs (the base character
followed by a separate combining diacritical).

This is a problem when one searches Sanskrit e-texts.

For example, the letters with diacriticals in the Muktabodha digital
library are written with one glyph, and as far as I can see GRETIL does the
same thing. But the transcoding utility at "The Sanskrit Library"
http://sanskritlibrary.org/transcodeText.html
combines letters with their diacriticals in two glyphs.
So if you used the Sanskrit Library utility to create a transliterated
word such as *śākti* and then searched texts from either GRETIL or
Muktabodha for that word, your search wouldn't find anything, because the
two spellings look identical on screen but are stored as different
character sequences.
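
The mismatch described above can be reproduced with a minimal Python sketch.
It relies only on the standard-library unicodedata module; the helper name
normalize_for_search is hypothetical and introduced here for illustration,
and it is an assumption (not something confirmed for any of the sites
mentioned) that normalizing both the search term and the text to the same
form before comparing would be an acceptable workaround.

```python
import unicodedata

# Hypothetical helper (not part of any library named in this message):
# bring a string into one Unicode normalization form before comparing.
def normalize_for_search(text: str, form: str = "NFC") -> str:
    return unicodedata.normalize(form, text)

# The example word written with single precomposed code points:
# U+015B (s with acute) and U+0101 (a with macron).
precomposed = "\u015b\u0101kti"

# The same word written as base letters followed by combining marks:
# s + U+0301 (combining acute), a + U+0304 (combining macron).
decomposed = "s\u0301a\u0304kti"

# The two strings render alike but do not compare equal,
# so a plain substring search for one will not find the other.
strings_match_raw = precomposed == decomposed

# After normalizing both sides to the same form, they do compare equal.
strings_match_nfc = (
    normalize_for_search(precomposed) == normalize_for_search(decomposed)
)

print(len(precomposed), len(decomposed))
print(strings_match_raw, strings_match_nfc)
```

NFC (composed) is used as the default here only because the texts described
above appear to use single glyphs; normalizing both sides to NFD would work
equally well for comparison purposes.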

Thanks,
Harry Spier

