[INDOLOGY] New version of the Digital Corpus of Sanskrit

hellwig7 at gmx.de hellwig7 at gmx.de
Sun Jul 17 14:36:10 UTC 2016

Dear list,

after several years, a new version of the Digital Corpus of Sanskrit has 
come out. It contains, among other texts, the complete morphological and 
lexical annotation of the Mahabharata except for three prose chapters.

Although you are still redirected from the old URL, you may note the new web 

A few notes on the new release:
(1) I find the multi-word search rather useful: You can now search for text 
lines that must contain two or more lemmata (click on the "Add to multi-word 
q." links after a search result on the query page). To start with, try 
something popular such as rāma and sītā; will display all text lines that 
contain any inflected form of rāma and sītā.
(2) Global and text dictionaries have been merged into one. Contrary to 
former versions, the lexicographic database now contains all lemmata given 
in my digital dictionary, even if they don't occur in a text.
(3) You should, in principle, be able to type IAST Unicode directly in the 
search interface.
(4) The information contained under "Similar and related words" is only a 
gimmick at the moment, at least for less popular words. It displays the 
cosine similarity between neural embeddings built with word2vec 
(https://en.wikipedia.org/wiki/Word_embedding for more information). They 
seem to capture some semantic similarites; check, for instance, 'rāma' or 
(5) This release relies heavily on JavaScript. The website will look quite 
unresponsive when JS is deactivated in your browser.
(6) Not sure why I chose the former design. The readability of the site 
should now be better, esp. on small screens.
(7) Access to parts of the semantic annotation layer will be added in the 
next weeks.

I'm considering quite seriously to make this version of the DCS open source. 
If you are interested in collaborating, please send me your github user 
name, so that I can invite you to the project.

Finally, my special thanks go to the patient and helpful team of the KJC at 
the University of Heidelberg!


Oliver Hellwig, University of Düsseldorf, Germany 

More information about the INDOLOGY mailing list