frequencies

Wed Feb 18 11:09:39 UTC 2009

I do agree that statistical and other analysis of the corpus of Sanskrit
texts would deliver interesting insights, for instance into the way technical
terms were invented by different sects in order to set themselves apart
from the 'competition'. Perhaps one could see in the Sanskrit texts
interference from the mother tongues of the authors. I think texts of Jains
and Buddhists should be included.
Victor

-----Original Message-----
From: Indology [mailto:INDOLOGY at liverpool.ac.uk] On Behalf Of Dominik Wujastyk
Sent: Wednesday 18 February 2009 11:53
To: INDOLOGY at liverpool.ac.uk
Subject: Re: frequencies

I agree with Alexandra.

I also think that one mustn't put the cart before the onager.  There 
exists a corpus of language material called "Sanskrit", and it is 
therefore available for corpus-analysis, just like the British National 
Corpus (http://www.natcorp.ox.ac.uk/using) or any other similar project. 
One can dream up all sorts of fascinating questions to ask of the 
material.  One can attend to the layering of the material, or the division 
into subject fields.  Following the analysis, one may arrive at 
interesting or unexpected conclusions, try to explain them sociologically, 
etc. etc.  I feel certain that there are undiscovered patterns of syntax 
and usage in Sanskrit that probably persisted over very long periods of 
time.  Eli's assertion that one can't analyse the stuff because it's so 
hermetically divided by genre is itself something it would be interesting 
to test quantitatively.  This kind of work might, for example, show a 
clearer distinction than any of us are currently aware of between core 
usage and extended genre-specific usage (in lexemes, syntax, etc.).
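
As an illustration of how the genre question could be tested
quantitatively, here is a minimal Python sketch. It is not anyone's actual
method from this thread: the function names, the toy token lists and the
genre labels are all invented for the example, and it assumes the texts
have already been split into words (for real Sanskrit that would require
sandhi splitting and lemmatisation, which is not attempted here). It
computes how much of a text the N most frequent words cover, and how well
the frequent words of one genre cover another.

```python
from collections import Counter


def top_n_coverage(tokens, n):
    """Fraction of all tokens accounted for by the n most frequent word types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(tokens)


def cross_coverage(source_tokens, target_tokens, n):
    """Fraction of target tokens covered by the n most frequent types of source."""
    if not target_tokens:
        return 0.0
    vocab = {word for word, _ in Counter(source_tokens).most_common(n)}
    hits = sum(1 for token in target_tokens if token in vocab)
    return hits / len(target_tokens)


# Toy, invented word lists standing in for two genres (assumption: real
# input would be lemmatised tokens from an actual corpus).
genres = {
    "epic": "raja vana gacchati raja yuddha ca vana ca".split(),
    "shastra": "dharma ca raja danda dharma ca vidhi ca".split(),
}

if __name__ == "__main__":
    for name, tokens in genres.items():
        print(name, "top-3 coverage of itself:", top_n_coverage(tokens, 3))
    print("epic top-3 words cover shastra:",
          cross_coverage(genres["epic"], genres["shastra"], 3))
```

If genres really were hermetically divided, the cross-genre figure would
stay far below the within-genre figure even for large N; a high
cross-genre figure would instead point to a shared core vocabulary.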

One of the main interests of corpus analysis in other languages has been 
precisely that generally held assumptions about language use have been 
overturned and that all sorts of interesting features have been discovered 
that nobody previously knew about.

Incidentally, the generation of Sanskrit scholars working in the 19th 
century, Roth, Aufrecht, Boehtlingk and others, treated the Veda very much 
as a closed corpus, and this attitude informed to some extent their 
approaches to the rest of Skt literature.  There was a clear underlying 
assumption that it could all be definitively grasped, submitted to lexical 
analysis, and nailed down.

Best,
Dominik

-- 
Dr Dominik Wujastyk

On Wed, 18 Feb 2009, Alexandra Vandergeer wrote:

> That makes it even more interesting to perform statistical tests. When you
> read Latin descriptions of new species in the 19th century, you can't help
> seeing a native language substratum underlying this 'Latin'. Why wouldn't
> this have been the case for Sanskrit? Everybody repeats the same thing:
> Sanskrit is pure, holy and so on, but has this ever been measured? Being
> holy doesn't make it invulnerable to impacts from a 'lower' level, including
> the bazaar, if you like. Sanskrit also underwent an evolution from
> within, so to say.
>
> (I'm not a linguist either; even if my PhD is on a linguistic subject, I'm
> more interested in the statistics of language use than in the derivation
> of word stems :-) ).
>
> Alexandra
>
>> Obviously Sanskrit is a language functioning in a timeless never-never
>> world. It is the language of the Brahmanical sacred world-order. Thus it
>> would probably never have been meant to be a vehicle of daily
>> communication.
>> Sanskrit is timeless, pure and holy, at least certainly since the second
>> millennium (C.E.). The use of Sanskrit by Buddhists in the first
>> millennium is certainly remarkable. Could it indicate a strong tendency
>> on the part of Buddhists to adapt themselves even more to Brahmanical
>> norms than the texts of the Pali canon seem to indicate? The comparison
>> with Hebrew is interesting, for Hebrew is another ancient sacred language
>> of scriptures and not of daily communication on worldly matters. Latin
>> and Arabic also developed these tendencies.
>> But I'm no linguist.
>> Victor van Bijlert
>>
>>
>> -----Original Message-----
>> From: Indology [mailto:INDOLOGY at liverpool.ac.uk] On Behalf Of
>> franco at RZ.UNI-LEIPZIG.DE
>> Sent: Tuesday 17 February 2009 15:55
>> To: INDOLOGY at liverpool.ac.uk
>> Subject: Re: frequencies
>>
>> Frequency in Sanskrit does not work in the same way as in English and
>> other modern languages. It is possible to compile a list of 3000 words
>> in English that cover 70-80% of "all" conversations, newspaper
>> articles, etc. This is just not possible in the case of Sanskrit--if
>> it were possible, it would have been done a long time ago--because the
>> vocabulary is highly specialized according to literary genres. On the
>> other hand, if one moves within the same genre, one can go back and
>> forth hundreds of years without any difficulty, something that cannot
>> be done in English, German, French and so on. Hebrew is an exception,
>> but this is a special case.
>> Best wishes,
>> EF
>>
>>
>>
>>
>>   Quoting Jonathan Silk <kauzeya at GMAIL.COM>:
>>
>>> Just a quick note (in addition to correcting the misprint pointed out by
>>> Jan--yes, of course, linguist!): Whether or not one wants to include the
>>> lexicon of Buddhist texts as "Sanskrit"--and there was long ago more than
>>> one discussion about this, about whether we also want to speak of Jaina
>>> Sanskrit, architectural Sanskrit and so on--the language of these texts is
>>> not in any sense "derived from Pali". While the two are related, to be
>>> sure, and some portion of Buddhist(ic) Sanskrit vocabulary may have been
>>> borrowed or adapted from Middle Indic (--that is, *some* Buddhist[ic] Skt
>>> is 'Sanskritized Prakrit'), I am not aware of any case in which it can be
>>> shown that the Middle Indic in question is Pali (but I have not looked
>>> into this--has anyone?).
>>>
>>> This is slightly off the topic, but the point is that if one wants to
>>> decide to exclude particularly Buddhist lexica from a lexicon of Skt, the
>>> grounds for this cannot be that the words are not Skt.
>>>
>>> On Tue, Feb 17, 2009 at 2:39 PM, Alexandra Vandergeer
>>> <geeraae at geol.uoa.gr> wrote:
>>>
>>>> Naturally, but the same is valid for present-day English. Frequency lists
>>>> are based on a wide spectrum, including newspapers, books, literature,
>>>> spoken language, but not necessarily poems. In the case of Skt, I'd
>>>> expect epics, philosophical texts in the broadest sense, shastras,
>>>> [Buddhist texts not, likely derived from Pali] to give a reasonable
>>>> sample of the Sanskrit language as is.
>>>>
>>>> And I agree with Jonathan that the lexicon suggested by Himal is likely a
>>>> 'useful' vocabulary to read average Skt texts. Anyway, thanks Himal for
>>>> the suggestion.
>>>>
>>>> Alexandra van der Geer
>>>> Athens
>>>>
>>>>> I am not sure whether the question is even meaningful for classical
>>>>> Sanskrit. Frequency where? In Epic literature? In philosophical
>>>>> literature? In dharmasaastra or Buddhist texts? Each genre has its own
>>>>> special vocabulary, and its own frequencies.
>>>>> Best wishes
>>>>> EF
>>>>
>>>
>>>
>>>
>>> --
>>> J. Silk
>>> Instituut Kern / Universiteit Leiden
>>> Postbus 9515
>>> 2300 RA Leiden
>>> Netherlands
>>>
>>>
>>
>>
>>
>>
>