[INDOLOGY] Sandhi and compound splitting model

Jan E.M. Houben jemhouben at gmail.com
Wed Aug 29 15:29:54 UTC 2018


Dear Oliver,
I hope to be able to use the sandhi and word splitter; it will definitely
be useful.
As for RV 1.1.1: this in no way affects your syntactic analysis, which is
your main aim, but even in a coarse annotation dA and dhA should not and
need not be confounded. In fact, it was not J&B but Geldner, half a century
earlier, who showed the way to a more correct interpretation, not in his
translation but in his note ad loc...
Best,
Jan

On Wed, 29 Aug 2018 at 12:02, Oliver Hellwig <hellwig7 at gmx.de> wrote:

> Dear Jan,
>
> thanks for the positive feedback on the word splitter. Hope it turns out
> to be useful for our research community.
>
> Regarding RV 1.1.1: The analysis does not imply that dhAtama is morphologically
> derived from dA "to give", although one may get this impression by the term
> "giving" in that line. "giving" is just a coarse word semantic annotation
> of dhAtama, which is - it's meant to be coarse! - not too far away from
> Jamison + Brereton 2014 ("most richly conferring treasure"). Same for the
> English terms (if any) in other lines.
>
> Best wishes, Oliver
>
> On 29/08/2018 09:58, Jan E.M. Houben wrote:
>
> Dear Oliver,
> Congratulations and thanks for sharing again a very useful research tool.
> Also for the tool you shared earlier (see below),
> which, incidentally, contains a mistake in the very first line:
>
> 1#1#1#2#2ratnadhātamam#2#dhātamam#dhātama###219609#4443604#1#ADJ#3#1#1#_##giving~130047~2
> The mistake -- and you are not the only one to make it -- is that the
> adjectival word part -dhātama- (you have chosen to neglect tama, probably
> consciously) is not derived from dā (cp. Gk. didōmi "I give, confer") but
> from dhā (cp. Gk. tithēmi "I establish").
> Warm regards,
> Jan
>
> ***
> I would like to announce the release of a full annotation of the Rigveda
> with morphological, lexical and verb-argument information.
>
> Data are stored in a publicly accessible repository at
> https://git.adwmainz.net/open/rigveda
>
> Details of the annotation process are described in the LREC paper, which
> is stored at the upper level of the repository.
>
>
>
>
> On Wed, 29 Aug 2018 at 07:24, Oliver Hellwig via INDOLOGY <
> indology at list.indology.info> wrote:
>
>> Dear all,
>>
>> Sebastian Nehrdich and I have developed a machine learning model that
>> splits Sandhis and compounds in "raw" Sanskrit text.
>>
>> You can find further details, the model, code and the data it was built with
>> (~600,000 lines of Sanskrit text from the DCS) at
>> https://github.com/OliverHellwig/sanskrit/tree/master/papers/2018emnlp
>>
>> The pdf in the github directory contains further technical information.
>>
>> If you know researchers who work on this topic and may be interested in
>> the model or the data, it would be great if you could forward this mail
>> to them.
>>
>> Oliver
>>
>> ---
>> Oliver Hellwig
>> IVS Zurich / SFB 991, Düsseldorf
>>
>>
>
>
> --
>
> *Jan E.M. Houben*
>
> Directeur d'Études, Professor of South Asian History and Philology
>
> *Sources et histoire de la tradition sanskrite*
>
> École Pratique des Hautes Études (EPHE, PSL - Université Paris)
>
> *Sciences historiques et philologiques *
>
> 54, rue Saint-Jacques, CS 20525 – 75005 Paris
>
> *johannes.houben at ephe.sorbonne.fr*
>
> *johannes.houben at ephe.psl.eu*
>
> *https://ephe-sorbonne.academia.edu/JanEMHouben*
>

