An Exciting New Development

Mehta, Shailendra Mehta at
Fri Feb 21 21:46:19 UTC 1997

Nai Dunia is a well-respected and authoritative Hindi daily, published
from Indore, in Madhya Pradesh. I remember many summers spent at my
grandmother's place in Burhanpur, when I devoured it with gusto, every
day. It is a superb paper. It has also been at the forefront in terms of
applying information technology in its day-to-day running.

So it was quite a treat to learn that it is now the first Internet Hindi
daily. What is interesting is that they have accomplished this without
using complicated gif or postscript files. Once you download a font,
which they make available free of charge, you can use the capabilities
available in the latest generation of browsers (for example Netscape
3.0) to view the pages directly in Devanagari script. Since they own the
font and the software associated with it, no additional permission is
required from anyone.  You can access their site at

In one swoop they have opened up several exciting new possibilities.
I would like to explore some of them here in the hope that others will
do likewise.

1. Vinay Chhajlani, who owns a software firm in addition to belonging to
the family which owns Nai Dunia, has kindly offered to make the font and
the associated Windows files available to everyone. This will then give
all of us the capability to exchange mail in Devanagari and to read it
from any word processor. From what I understand, it automatically
creates the necessary ligatures.

2. Nai Dunia has enormous libraries of material in Hindi, which they
created and own, and which can be used for teaching and learning Hindi
at all levels: material suitable for children, for adults, cartoons,
etc. They are willing to put all of this on line. In addition one now
has the capability to immediately put all sorts of Devanagari texts on
line. In particular, the entire Indology corpus of Sanskrit e-texts can
go on line in Devanagari, capable of being viewed instantly by any
browser. Sophisticated dictionaries can also be placed on line in an
identical fashion. This would apply equally well to the enormous amount of
material pertaining to Hindi films (particularly Hindi film songs) which
has been accumulated on line, as also to resources for teaching Sanskrit
which have also been proliferating on the net. (Both of these efforts
have been hampered by the necessity of using the Roman script to mediate
any information flow.)

3. We can go one step further. The main problem of the Indian
subcontinent is not the proliferation of languages, since most of us can
understand each other's speech quite well, especially as it gets more
technical and the vocabulary gets Sanskritised. The main problem is the
proliferation of scripts. Provided that we adopt standards that
encompass all of the Indic Scripts in a uniform way, it would not be
difficult to solve this problem, at least on the net. We can then read
all the Indic content on the Web in the script of our choice -- Tamil,
Bengali, Devanagari and yes, Roman (for the large and growing Indian
diaspora). So a Hindi lover of Tagore could read him in Devanagari. And
I daresay, understand most of it. Similarly a Telugu lover of the Urdu
Ghazal could read it in the Telugu script.  Indeed, as some of you are
aware, the Economics Journal "Artha Vijnana" published by the Gokhale
Institute in Pune, used to publish papers in all Indian languages, but
in the Devanagari script. Since all Indian languages (except Urdu) use
(with minor exceptions) identical technical terminology, it was possible
to read the papers without any loss of comprehension with a minimal
glossary which was provided at the end for smoothening out some nuances.
Sadly, this brave experiment was stopped some years ago. However, this
can be resurrected on the Net on a major scale. I truly look forward to
reading the Tirukkural in Devanagari along with a Hindi translation.
Indeed Hindi readers have long accessed Urdu and Persian classics this
way, with almost total comprehension.
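
To make the idea of reading the same text in the script of one's choice
concrete, here is a minimal sketch in Python. It rests on an assumption
that is not part of the Nai Dunia font or of anything described above:
it uses the Unicode encoding, in which the major Indic scripts occupy
parallel 128-position blocks (a layout derived from ISCII), so that a
letter can often be moved to another script by shifting its code point.
The names BLOCK_START and convert_script are hypothetical, and the
result is only approximate, since the scripts do not all have the same
letters.

```python
import unicodedata

# Start of each script's block in Unicode (assumption: Unicode text,
# not the font-specific encoding of any particular newspaper).
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80  # each block spans 128 code points


def convert_script(text, source, target):
    """Shift every character of the source script into the target script.

    Characters outside the source block (spaces, digits, Roman letters)
    pass through unchanged. If the target script has no character at the
    corresponding position, the original character is kept.
    """
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            candidate = chr(cp - src + dst)
            # "Cn" is the Unicode category for unassigned code points.
            if unicodedata.category(candidate) == "Cn":
                out.append(ch)
            else:
                out.append(candidate)
        else:
            out.append(ch)
    return "".join(out)


word = "कमल"  # "kamal" (lotus) in Devanagari
print(convert_script(word, "devanagari", "telugu"))
print(convert_script(word, "devanagari", "bengali"))
```

A real converter would need per-script tables for the letters that do
not line up (Tamil in particular lacks many of the consonants), and a
separate table altogether for Roman.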

4. As those of us who try to speak very pure forms of Indic languages
are aware, there is always a term which stumps you. I am quite capable
of carrying on a literate conversation about Science, Philosophy or
Economics in Hindi (and routinely do with several of my colleagues and
friends here) but often a new term will stump one. How do you say, for
example, "database" in Hindi? In such situations many people use
Sanskrit as a lego set and improvise. More often than not, one manages
to get the right term or something close to it. However, one needs a
dictionary to be absolutely sure. Now, as it happens, the Government of
India has been compiling hundreds of such technical dictionaries.
(Indeed, this is the daily mandate of several thousand individuals.)
I own more than a dozen of these myself (one of which has over
300,000 of the most common scientific terms, and is a very competent
effort) but every once in a while I will require a word not in any of
these. For this one needs access to the full database (constantly
updated by the scholars working in conjunction with Shastri Bhavan) of
several million terms. It turns out that it is already in machine
readable form. Perhaps we could persuade the government to put it on
line.

5. Unfortunately, to the best of my knowledge, the Government of India
undertakes this exercise only for Hindi. The responsibility for the
other Indian languages resides with their respective state governments,
which in most cases do not have comparable resources. This is where the
fun starts. Assuming that more than 99.9% of these technical words will
be common to all Indic languages (except Urdu), then by changing the
script in which the browser views the database, the entire corpus is
made available to every Indic language. To take one example, the phrase
"elasticity of supply" will be purti-locha in both Kannada and Bengali,
and readers of both can access this phrase by looking at the same
database.
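
As a rough illustration of one database viewed in several scripts, the
sketch below stores a single glossary entry once, in Devanagari, and
renders it on request in Kannada or Bengali. Everything in it is an
assumption made for the example: the text is taken to be in Unicode
(whose Indic blocks are laid out in parallel), the Devanagari spelling
is my own rendering of "purti-locha", and the names GLOSSARY,
view_table and lookup are hypothetical. It is not the government
database or its format.

```python
# One shared glossary, stored once. The spelling is an assumed
# Devanagari rendering of "purti-locha", for illustration only.
GLOSSARY = {
    "elasticity of supply": "पूर्ति-लोच",
}

# Devanagari occupies U+0900..U+097F in Unicode; the other scripts sit
# at a fixed distance from it (assumption: Unicode-encoded text).
DEVANAGARI = range(0x0900, 0x0980)
OFFSET_FROM_DEVANAGARI = {
    "devanagari": 0x000,
    "bengali": 0x080,
    "kannada": 0x380,
}


def view_table(script):
    """Build a translation table from Devanagari into the chosen script."""
    shift = OFFSET_FROM_DEVANAGARI[script]
    return {cp: cp + shift for cp in DEVANAGARI}


def lookup(term, script):
    """Fetch a term from the shared glossary in the reader's script."""
    return GLOSSARY[term].translate(view_table(script))


for script in ("devanagari", "kannada", "bengali"):
    print(script, lookup("elasticity of supply", script))
```

The point of the design is that the entry is typed and corrected in one
place only; the script is chosen at the moment of viewing. A plain shift
like this ignores letters that exist in one script but not another, so a
working system would need exception tables.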

6. One could go a lot further. For example we can have readings of texts
(via real audio) keyed to viewings of the text on a browser. Several
prominent theatre personalities have expressed a keen interest in creating
a public resource in this way. I know that the "India on the Internet"
project hosted by Vedika Software in Singapore will be happy to provide
a permanent home to any such efforts, along with a fast Internet
connection, and they could be mirrored world wide. This database could
then be added to, on a volunteer basis.

However, before we get to all of that and more, I have one apprehension
which is the main motivation for posting this message. We need a common
standard. We certainly do not want the situation with Hindi typewriter
keyboards (there are nearly a dozen different layouts) repeated here.
Competing standards create numerous problems. Even if one standard
eventually wins out, the work of Brian Arthur and Paul David shows (and
this is one of the areas that I research) that the winner need not be
the best when network externalities are present, i.e. when a standard
is valued not only for its technical efficiency but also for the number
of people who use it. Can we devise a way for Nai Dunia, ISO, ITRANS and
others to all come under a common umbrella?
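
For those who have not met the argument, the following toy simulation
in Python may help. It is only a sketch in the spirit of such models,
not a reproduction of Arthur's or David's own formulations, and every
name and number in it (the two standards, their quality values, the
network weight, the number of adopters) is an assumption chosen for
illustration. Adopters arrive one at a time; each has a private taste,
and each values a standard for its intrinsic quality plus a bonus
proportional to the number of people already using it.

```python
import random


def simulate_adoption(n_adopters, quality_a, quality_b, network_weight, rng):
    """Let adopters choose, one by one, between standards A and B."""
    users_a = users_b = 0
    for _ in range(n_adopters):
        # This adopter's private leaning towards A (negative favours B).
        taste = rng.gauss(0.0, 1.0)
        value_a = quality_a + network_weight * users_a + taste
        value_b = quality_b + network_weight * users_b
        if value_a > value_b:
            users_a += 1
        else:
            users_b += 1
    return users_a, users_b


def share_of_runs_won_by_a(runs, network_weight, seed=0):
    """Fraction of simulated histories in which A ends up ahead of B."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(runs):
        # A is the technically weaker standard by assumption (1.0 vs 1.2).
        a, b = simulate_adoption(500, 1.0, 1.2, network_weight, rng)
        wins += a > b
    return wins / runs


print("no network effect:  ", share_of_runs_won_by_a(200, 0.0))
print("with network effect:", share_of_runs_won_by_a(200, 0.1))
```

Without the network bonus the weaker standard practically never ends up
ahead. With it, a few lucky early adopters can be enough to tip the
whole market towards the weaker standard, which is exactly the outcome
we want to avoid for Indic scripts.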

The software experts, and I refer to Messrs. Wujastyk, Pandey, Sibal
among others, should certainly be able to enlighten us on this score as
they have done on numerous previous occasions.

Shailendra Raj Mehta
mehta at
