OCR of (transcribed) Indic texts

Juergen Neuss jneuss at ZEDAT.FU-BERLIN.DE
Tue Nov 24 19:37:55 UTC 1998

Dear list-members,
some of you may be aware of the difficulties that arise when one wants
to scan transcribed (not to speak of original) Indic texts. The scanning
itself is of course not the problem; the difficulty lies in the
subsequent conversion of the image file into a text file by means of OCR
(optical character recognition) programs. These programs often recognise
only the usual set of ASCII characters. Some of them include extended
features, which means that in certain cases the user may direct the
program to read a particular difficult character in a particular way. As
far as I know, diacritical signs are a problem for most, if not all, of
these programs. If any of you has experience with OCR programs in this
respect, I would be grateful for your recommendations. Moreover, I would
like to know whether any OCR programs are available which recognise
Indian characters of any kind. I hope this message does not provoke any
response which violates the non-commercial spirit of this list.
Thanks for reading.
jneuss at zedat.fu-berlin.de

Juergen Neuss
Freie Universitaet Berlin
Institut für Indische Philologie und Kunstgeschichte
Königin-Luise-Str. 34a
14195 Berlin
