Dear list members,

While I am not an IVC expert and have absolutely no skin in the game, I have been asked repeatedly over the last months to say something about this, since I am doing something with "computers and Sanskrit" and Yajnadevam (YD) seems to be using computers as well. While I am not very interested in his work, it didn't sit well with me that these claims keep showing up again and again, so I finally spared an evening and took a look at what YD is actually doing.

The paper can be found here (it was never peer reviewed, and I doubt it would ever pass that process): https://www.academia.edu/78867798/A_cryptanalytic_decipherment_of_the_Indus_Script

Rohan Pandey has done nice work examining the decipherment method applied by Yajnadevam and how it fails: https://x.com/khoomeik/status/1882058141145403817

On the methodological end there are countless issues with YD. Most strikingly, the assumption that a Bronze Age script can simply be mapped one-to-one onto a language that is likely thousands of years younger is wild. Even if there were a clear continuity from the IVC to Sanskrit, it is very unlikely to work this way.

Forcing a 1-to-1 mapping with a cipher algorithm is not very difficult if you have a reasonably big dictionary (we know that Monier-Williams, MW, is not lacking possible entries) and relax constraints, such as ignoring aspirates, until you get matches, which is what YD did. After doing this, the burden of proof is still on the author to show that the generated language "works" and is not just a random combination of dictionary entries.
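To illustrate why relaxed matching makes dictionary hits cheap, here is a minimal sketch. It is not YD's algorithm; the tiny lexicon, the plain-ASCII transliteration, and the names relax, build_index and candidates are all assumptions made up for this example. It only shows that once aspiration is ignored, a single reading matches more dictionary entries, and readings that matched nothing start to match something.

```python
import re
from collections import defaultdict

# Toy lexicon in a simplified ASCII transliteration (hypothetical sample,
# standing in for a large dictionary such as MW).
TOY_LEXICON = ["kara", "khara", "phala", "pala", "bala", "bhara"]


def relax(word: str) -> str:
    """Ignore aspiration: drop an 'h' that directly follows a stop consonant."""
    return re.sub(r"([kgcjtdpb])h", r"\1", word)


def build_index(lexicon, relaxed: bool):
    """Map each lookup key to all lexicon entries sharing that key."""
    index = defaultdict(list)
    for word in lexicon:
        key = relax(word) if relaxed else word
        index[key].append(word)
    return index


def candidates(reading: str, index, relaxed: bool):
    """Return every lexicon entry that a proposed reading is allowed to match."""
    key = relax(reading) if relaxed else reading
    return index.get(key, [])


strict_index = build_index(TOY_LEXICON, relaxed=False)
relaxed_index = build_index(TOY_LEXICON, relaxed=True)

for reading in ["kara", "pala", "bara"]:
    print(reading,
          "strict:", candidates(reading, strict_index, relaxed=False),
          "relaxed:", candidates(reading, relaxed_index, relaxed=True))
```

With a real dictionary and several relaxations stacked on top of each other, the number of candidates per reading grows quickly, which is why matches alone prove very little.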

So, coming from the opposite end, I recently looked at the problem from the point of view of Sanskrit computational linguistics.

For anybody with Sanskrit knowledge, looking at these "decipherments" creates the impression that while they consist of Sanskrit words, their choice and combination is very, very strange to say the least.

In order to quantify this, the simple intuition is: "if this is Sanskrit, it should parse as such". So I used our (Vedic) Sanskrit parser (https://arxiv.org/abs/2409.13920) and tested how many of the dependency trees it generates for the alleged "Sanskrit decipherments" are valid (single root, no circular subtrees, no purely flat sentences, etc.). I ran the same experiment on a number of control texts, both (Vedic) Sanskrit and text in other, randomly picked languages. This is not a thorough evaluation of publishable scientific quality, but it should at least give a rough indication of where things are heading.
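For those who want to see what such a validity check can look like, here is a minimal sketch. It is not the code I ran; it assumes each parse is given as a list of head indices in the CoNLL-U convention (1-based token positions, 0 for the root), and the function names as well as the threshold for calling a sentence "purely flat" are assumptions chosen for illustration.

```python
from typing import List


def is_valid_tree(heads: List[int], min_len_for_flat_check: int = 4) -> bool:
    """heads[i] is the 1-based head of token i+1; 0 marks the root."""
    n = len(heads)
    if n == 0:
        return False

    # Every head must point inside the sentence, and never at the token itself.
    for i, h in enumerate(heads, start=1):
        if h < 0 or h > n or h == i:
            return False

    # Exactly one root.
    roots = [i for i, h in enumerate(heads, start=1) if h == 0]
    if len(roots) != 1:
        return False

    # No cycles: walking up from any token must reach the root.
    for start in range(1, n + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]

    # Not purely flat: in longer sentences, not every token may hang
    # directly off the root (the length threshold is an assumption).
    root = roots[0]
    if n >= min_len_for_flat_check and all(
        h == root for i, h in enumerate(heads, start=1) if i != root
    ):
        return False

    return True


def error_rate(trees: List[List[int]]) -> float:
    """Fraction of parses that fail the validity check."""
    if not trees:
        raise ValueError("need at least one parse")
    invalid = sum(1 for heads in trees if not is_valid_tree(heads))
    return invalid / len(trees)
```

For example, `error_rate([[2, 0, 4, 2], [0, 0, 1], [2, 0, 2, 2], [2, 0, 2]])` counts the second parse (two roots) and the third (purely flat) as invalid.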

Here are the results:

Yajnadevam's "Sanskrit" is the bar labeled yd-parsed.txt. As you can see, the error rate is very high, close to that of parsing random sections of German or perhaps Icelandic text with the same parser. Interestingly, the parser does a much better job at parsing even Latin into valid structures than at parsing YD's "Sanskrit".

As you can see, the three control texts (random sections sampled from e-texts of the Chāndogya Upaniṣad, the Śatapatha Brāhmaṇa, and different Atharvaveda recensions) parse with an error rate below 10%. One can argue that this is still a bit high and hints at a not-so-perfect setup of the parser, but I did this ad hoc in one evening since I don't have much time to spare on this.

While this is not a sufficiently scientific treatment, and I am not sure YD's work warrants one, since it was never submitted to proper peer review in the first place but was instead hyped up directly by the media, it might still be interesting to those of you who wonder what to make of these claims.

Best,

Sebastian