compound analysis in e-texts

Jakub Cejka jakub at unipune.ernet.in
Fri Aug 30 19:58:07 UTC 1996


Compound- (AND sandhi-) analysed texts are useful, no doubt (at least 
for some purposes), and it is certainly good to make them.
The problem is that they cannot be considered fully reliable.
As Birgit Kellner rightly stresses, the analysis requires competence. And 
even when it is done by a competent specialist, it is in many cases a 
matter of individual interpretation. 
It is indeed necessary to have an unanalysed text side by side with an 
analysed one, as Madhav Deshpande noted. Let me just recall what I think 
is clear: an analysed text (at least in formats like the TZ, which is a 
great merit of this scheme) can be automatically reconverted into a 
"samhitapatha". 
One important point mentioned by Prof. Deshpande is that manuscripts 
usually observe no word boundaries, even where there is no sandhi. 
Inserting them is therefore already an individual interpretation. A 
typical example that comes to my mind is Kiratarjuniya 1.1 (or 1.2?): 
if the edition reads the sequence
       prajaasu  vrttim
the possibility of interpreting it as a compound
       prajaa-suvrttim   
may be overlooked by anyone who is not familiar with the way manuscripts 
are written, and it will necessarily be overlooked by any software 
analysing the text. The same holds for avagraha, which is also already 
an interpretation. 
So once we use such interpretative editions and call them un-analysed 
(which is not fully true), we can use even more analysed ones, but they 
should come together with a NOT A BIT analysed text (that is, without 
spaces between words, avagrahas, etc.). Just as in epigraphy an edition 
of an inscription is not fully useful unless it is accompanied by photos 
of the whole inscription, so it is, unfortunately, with our Sanskrit texts.
   I look forward to the time when texts will be published not in a 
slightly analysed form but in fully analysed and fully unanalysed form 
side by side.
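
To make the point concrete, here is a rough sketch (an illustration only, 
not a description of the TZ scheme or of any existing software). It 
assumes an ASCII transliteration in which "-" marks a compound boundary 
and "'" marks avagraha, and it uses a hypothetical four-entry word list. 
It shows that both readings of the example above collapse into the same 
unanalysed string, and that the string alone cannot tell us which reading 
was meant. It only covers cases where no sandhi has been undone.

```python
import re

# Assumed markers (hypothetical, chosen for this example only):
# whitespace = word boundary, "-" = compound boundary, "'" = avagraha.
ANALYSIS_MARKS = re.compile(r"[\s\-']+")

# Toy word list, invented for the illustration.
TOY_LEXICON = {"prajaasu", "vrttim", "prajaa", "suvrttim"}


def unanalyse(text):
    """Return the text with all assumed analysis marks removed."""
    return ANALYSIS_MARKS.sub("", text)


def segmentations(s, lexicon):
    """List every way of cutting s into consecutive lexicon entries."""
    if not s:
        return [[]]
    results = []
    for i in range(1, len(s) + 1):
        head = s[:i]
        if head in lexicon:
            for rest in segmentations(s[i:], lexicon):
                results.append([head] + rest)
    return results


if __name__ == "__main__":
    two_words = "prajaasu vrttim"
    compound = "prajaa-suvrttim"
    # Both analysed readings give the same unanalysed string.
    print(unanalyse(two_words) == unanalyse(compound))  # True
    # The unanalysed string admits both segmentations.
    for reading in segmentations(unanalyse(two_words), TOY_LEXICON):
        print(reading)
```

Going from the analysed to the unanalysed form is mechanical; going back 
is not, since the program can only list the candidates, and choosing 
between them remains the editor's interpretation.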
 
While we are discussing principles of analysis, it would be really good 
if those who have developed, or are extensively using, a 
transliteration-cum-analysis coding system could elaborate on the rules 
they have adopted for cases like those mentioned by B. Kellner (prefixes 
and "prepositions", compound members which do not exist independently 
(-da, -ja, etc.)), stating what they separate and what they do not. It 
would be useful to attempt at least a partial unification.


Ad Birgit Kellner's opposition to romanized texts: 

Yes, Sanskrit is a foreign language (foreign to everyone, even), but I do 
not see why romanized texts should do it any harm. We should not forget 
that devanagari is not THE Sanskrit script. The original Sanskrit texts 
(in mss) are written in devanagari, grantha, telugu, bangla, sarada and 
other scripts. Even today, students in India read Sanskrit not only in 
devanagari, which has only recently been selected as the script (perhaps 
because Hindi is so widely learnt). In West Bengal I saw that M.A. 
students always preferred to read their student editions of Sanskrit 
texts in Bangla lipi, and similarly elsewhere. If Hindi had not been 
promoted in India, together with devanagari becoming the scholarly script 
for Sanskrit, the case would be similar to that of Pali. Why should Pali 
be studied in, say, Sinhala script rather than in any other script or in 
romanized transliteration, according to need? 

______________________________________________________________________________
Mr. Jakub Cejka
Dept. of Sanskrit, University of Pune
Ganeshkhind, Pune, India  411 007

e-mail:  jakub at unipune.ernet.in   (until July 97 at the latest)








