Sanskrit lexicon

pesch at indoger.unizh.ch
Thu Feb 16 17:36:33 UTC 1995


In response to Thomas Malten's sample page of Monier-Williams
I have been reflecting on the types of information I would like
to be able to find (electronically); a simple reproduction of the
printed page might not be sufficient. And IF the conversion into electronic
format is done manually (and Malten's sample looks like it), and IF it is done
by trained people, it would seem highly advisable to tag the various types
of information. A main entry may contain information concerning:
-- source texts (individual or groups, with textual reference or without,
mostly tied to specific meaning)
-- sublemmas (distinguished according to part of speech, e.g. nouns as 
sublemmas to adjectives, marked by italic endings in parentheses)
-- compounds with identical first member (indicated by leading hyphen if
sandhi allows that)
-- compounds with the lemma-word as second member
-- parallel lemmas 
-- pointers to other lemmata (entries without indication of meanings)
-- "homonyms" (different meanings) dependent on the grammatical tag (e.g. kut2
as "cl. 6.P." or as "cl. 4.P.")
-- explanations (e.g. "there being eight elephants of the cardinal
points")
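
As a minimal sketch of how the information types listed above might be kept
apart in an electronic entry, the following Python record gives each type its
own field. All class and field names are hypothetical, and the glosses are
placeholders; only the lemma "kut2" and its two class tags are taken from the
list above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    """One meaning, with the sources and explanations tied to it."""
    gloss: str
    sources: List[str] = field(default_factory=list)       # source texts, with or without reference
    explanations: List[str] = field(default_factory=list)  # explanatory remarks


@dataclass
class Entry:
    """A main entry; every field name here is an assumption, not a standard."""
    lemma: str
    grammar: Optional[str] = None                                  # grammatical tag, e.g. "cl. 6.P."
    homonym: Optional[int] = None                                  # counter distinguishing homonyms
    senses: List[Sense] = field(default_factory=list)
    sublemmas: List["Entry"] = field(default_factory=list)         # e.g. nouns under an adjective
    compounds_first: List["Entry"] = field(default_factory=list)   # lemma as first member
    compounds_second: List[str] = field(default_factory=list)      # lemma as second member
    parallels: List[str] = field(default_factory=list)             # parallel lemmas
    pointers: List[str] = field(default_factory=list)              # references to other lemmata


def homonyms(entries: List[Entry], lemma: str) -> List[Optional[str]]:
    """Return the grammatical tags under which `lemma` occurs."""
    return [e.grammar for e in entries if e.lemma == lemma]


# Placeholder glosses; the real meanings would come from the dictionary.
entries = [
    Entry(lemma="kut2", grammar="cl. 6.P.", homonym=1, senses=[Sense(gloss="<meaning A>")]),
    Entry(lemma="kut2", grammar="cl. 4.P.", homonym=2, senses=[Sense(gloss="<meaning B>")]),
]
print(homonyms(entries, "kut2"))
```

With the types separated in this way, a query such as "all homonyms of a
lemma" or "all entries citing a given source" no longer depends on the layout
of the printed page.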

I do not see that the typography in Monier-Williams would allow these
(and other) types of information to be distinguished automatically.

In my project of computerizing Mylius' Woerterbuch Sanskrit-Deutsch I
have no "manual labour" at my disposal and must restrict the tagging
to what can be achieved by interpreting the typography of the printed book:
1. Counter for homonyms
2. lemma
3. grammatical tags 
4. meaning or meanings (including specifications concerning semantic context,
syntax, etc.)
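
A rough sketch of such typography-driven tagging, in Python, might look as
follows. It assumes that the scanned text is already available as a sequence
of (typeface, text) runs, and that each typeface maps to exactly one of the
four fields above. That mapping, the semicolon as separator of meanings, and
all names in the code are assumptions made for illustration; they are not the
actual conventions of the printed book.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical mapping from typeface to field; the real conventions would
# have to be checked against the printed dictionary.
STYLE_TO_FIELD = {
    "superscript": "counter",
    "bold": "lemma",
    "italic": "grammar",
    "roman": "meaning",
}

Run = Tuple[str, str]  # (typeface, text)


@dataclass
class TaggedEntry:
    counter: Optional[int] = None                       # 1. counter for homonyms
    lemma: str = ""                                     # 2. lemma
    grammar: List[str] = field(default_factory=list)    # 3. grammatical tags
    meanings: List[str] = field(default_factory=list)   # 4. meaning or meanings


def tag_entry(runs: List[Run]) -> TaggedEntry:
    """Assign each typographic run to one of the four fields."""
    entry = TaggedEntry()
    for style, text in runs:
        kind = STYLE_TO_FIELD.get(style)  # unknown typefaces are skipped
        text = text.strip()
        if kind == "counter":
            entry.counter = int(text)
        elif kind == "lemma":
            entry.lemma = text
        elif kind == "grammar":
            entry.grammar.append(text)
        elif kind == "meaning":
            # Assumed: individual meanings are separated by semicolons.
            entry.meanings.extend(m.strip() for m in text.split(";") if m.strip())
    return entry


# Placeholder content, not taken from the dictionary.
runs = [
    ("superscript", "2"),
    ("bold", "lemma"),
    ("italic", "m."),
    ("roman", "Bedeutung A; Bedeutung B"),
]
record = tag_entry(runs)
print(record)
```

Anything the typeface does not distinguish (sources, explanations,
cross-references) ends up undifferentiated inside the meaning field.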

I suppose some kind of agreement as to what is advisable and/or
necessary should be reached before each of us begins to encode his/her
55K of Monier-Williams. Does the Text Encoding Initiative provide
us with a model?!

Peter Schreiner

(PS: I am NOT concerned with the details of transliteration or tagging;
Malten's system is beautifully unambiguous. But I would like to understand
the requirements for the "logic" of an electronic dictionary.)
 