[INDOLOGY] Transcoding

Peter Flugel pf8 at soas.ac.uk
Sun Jun 19 09:10:01 UTC 2016

Dear Peter,

Thank you very much for pointing me to this valuable tool!



On 18 June 2016 at 14:37, Peter Scharf <scharfpm7 at gmail.com> wrote:

> Dear Peter and others,
> The most convenient way to transform Sanskrit text from one encoding to
> another is to use the transcoding software developed by the Sanskrit
> Library.  This transcoding software can be used in one of two ways:
> 1. Strings of text of any length can be transcoded in toto by pasting them
> into the transcoding window on the web at:
> http://sanskritlibrary.org/transcodeText.html
> Simply select the input and output encodings from the two menus at the
> bottom of the page.
> 2. Download the transcoding software, install it locally on your own
> machine, and run it under Unix to transcode from and to a great number of
> encodings.  On a Mac or Linux system this is easy.  I don't know how to
> do it on a PC.  The downloaded software permits very sophisticated
> delineation of which strings to transcode within a document of mixed text.
> One can tag strings in a certain way, for example with specific start and
> end character strings or xml tags, and then transcode all strings with
> those tags in one way and all strings with another tag in another, e.g.
> transcode <s>kfzRa</s> to Devanagari and <r>kfzRa</r> to Roman.  Or one can
> select text within a document to transcode using regular expressions.  The
> software is available for download near the bottom of the alphabetical list
> of downloadable software on the Sanskrit Library downloads page:
> http://sanskritlibrary.org/downloads.html.  Look for TranscodeFile (Java
> program) <http://sanskritlibrary.org/software/transcodeFile.html>
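
The tagged-string example in item 2 can be sketched in a few lines of Python. This is only an illustration of the idea, not the Sanskrit Library's TranscodeFile program: it assumes the input is in the SLP1 encoding (which kfzRa appears to use), covers only the basic sounds, and the names transcode_tagged, slp1_to_devanagari and slp1_to_iast are hypothetical.

```python
import re
import unicodedata

VIRAMA = "\u094D"

# SLP1 consonants with their Devanagari and IAST equivalents, in the same order.
SLP1_CONS = "kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
DEVA_CONS = "कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह"
IAST_CONS = ("k kh g gh ṅ c ch j jh ñ ṭ ṭh ḍ ḍh ṇ t th d dh n "
             "p ph b bh m y r l v ś ṣ s h").split()

# SLP1 vowels: independent letters, dependent vowel signs, IAST values.
SLP1_VOWELS = "aAiIuUfFxXeEoO"
DEVA_INDEP = ("\u0905\u0906\u0907\u0908\u0909\u090A\u090B\u0960"
              "\u090C\u0961\u090F\u0910\u0913\u0914")
DEVA_SIGNS = ["", "\u093E", "\u093F", "\u0940", "\u0941", "\u0942", "\u0943",
              "\u0944", "\u0962", "\u0963", "\u0947", "\u0948", "\u094B", "\u094C"]
IAST_VOWELS = "a ā i ī u ū ṛ ṝ ḷ ḹ e ai o au".split()

CONS = dict(zip(SLP1_CONS, zip(DEVA_CONS, IAST_CONS)))
VOWELS = dict(zip(SLP1_VOWELS, zip(DEVA_INDEP, DEVA_SIGNS, IAST_VOWELS)))
OTHER = {"M": ("\u0902", "ṃ"), "H": ("\u0903", "ḥ")}  # anusvāra, visarga


def slp1_to_iast(text):
    """Map each SLP1 character to its IAST value; pass anything else through."""
    out = []
    for ch in text:
        if ch in CONS:
            out.append(CONS[ch][1])
        elif ch in VOWELS:
            out.append(VOWELS[ch][2])
        elif ch in OTHER:
            out.append(OTHER[ch][1])
        else:
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))


def slp1_to_devanagari(text):
    """Convert SLP1 to Devanagari, inserting virāma between adjacent consonants."""
    out = []
    after_consonant = False
    for ch in text:
        if ch in CONS:
            if after_consonant:
                out.append(VIRAMA)        # no vowel intervened: form a conjunct
            out.append(CONS[ch][0])
            after_consonant = True
        elif ch in VOWELS:
            independent, sign, _ = VOWELS[ch]
            out.append(sign if after_consonant else independent)
            after_consonant = False
        else:
            if after_consonant:
                out.append(VIRAMA)        # consonant with no following vowel
            out.append(OTHER[ch][0] if ch in OTHER else ch)
            after_consonant = False
    if after_consonant:
        out.append(VIRAMA)
    return "".join(out)


TAGGED = re.compile(r"<(s|r)>(.*?)</\1>", re.DOTALL)


def transcode_tagged(document):
    """Replace <s>...</s> with Devanagari and <r>...</r> with IAST, dropping the tags."""
    def replace(match):
        tag, body = match.group(1), match.group(2)
        return slp1_to_devanagari(body) if tag == "s" else slp1_to_iast(body)
    return TAGGED.sub(replace, document)


print(transcode_tagged("<s>kfzRa</s> and <r>kfzRa</r>"))
```

Text outside the tags is left untouched, which is the point of tagging in a document of mixed text.
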
> I have made a number of transcoding rules for my own use which I'm glad to
> share if you want help getting started.
> Yours,
> Peter
> *************************
> Peter M. Scharf
> scharfpm7 at gmail.com
> *************************
> On 18 Jun 2016, at 5:18 AM, Peter Flugel wrote:
> Dear Peter
> Thank you for this really interesting information.
> I have a question which you may be able to answer as well: what is the
> best way to transform texts written in Nagari characters into roman
> script? I am trying to integrate two databases.
> Yours
> Peter
> Sent from my iPhone
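
For the Nagari-to-roman direction asked about here, the core difficulty is the inherent vowel: every consonant letter carries an implicit a unless a virāma or a dependent vowel sign follows. A minimal Python sketch of that logic is given below. It assumes Unicode Devanagari input and Sanskrit conventions (no Hindi schwa deletion), covers only the basic letters, and the function name devanagari_to_iast is hypothetical.

```python
import unicodedata

VIRAMA = "\u094D"

CONSONANTS = dict(zip(
    "कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह",
    ("k kh g gh ṅ c ch j jh ñ ṭ ṭh ḍ ḍh ṇ t th d dh n "
     "p ph b bh m y r l v ś ṣ s h").split()))

VOWEL_VALUES = "ā i ī u ū ṛ ṝ ḷ ḹ e ai o au".split()
# Independent vowel letters (U+0906 ...) and dependent vowel signs (U+093E ...),
# listed in the same order as VOWEL_VALUES.
INDEPENDENT = dict(zip(
    "\u0906\u0907\u0908\u0909\u090A\u090B\u0960\u090C\u0961\u090F\u0910\u0913\u0914",
    VOWEL_VALUES))
INDEPENDENT["\u0905"] = "a"
MATRAS = dict(zip(
    "\u093E\u093F\u0940\u0941\u0942\u0943\u0944\u0962\u0963\u0947\u0948\u094B\u094C",
    VOWEL_VALUES))
SIGNS = {"\u0902": "ṃ", "\u0903": "ḥ", "\u093D": "'"}  # anusvāra, visarga, avagraha


def devanagari_to_iast(text):
    out = []
    pending_a = False  # a consonant was emitted and its inherent 'a' is unresolved
    for ch in text:
        if pending_a:
            pending_a = False
            if ch == VIRAMA:
                continue                    # virāma cancels the inherent vowel
            if ch in MATRAS:
                out.append(MATRAS[ch])      # vowel sign replaces the inherent vowel
                continue
            out.append("a")                 # nothing cancelled it: write the 'a'
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in INDEPENDENT:
            out.append(INDEPENDENT[ch])
        elif ch in SIGNS:
            out.append(SIGNS[ch])
        else:
            out.append(ch)                  # spaces, punctuation, digits, etc.
    if pending_a:
        out.append("a")
    return unicodedata.normalize("NFC", "".join(out))


print(devanagari_to_iast("\u0915\u0943\u0937\u094D\u0923"))  # the word kṛṣṇa
```

A sketch like this does not distinguish a vowel hiatus a + i from the diphthong ai in its roman output, so for database integration a lossless encoding may be a safer intermediate form.
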
> On 17 Jun 2016, at 20:28, Peter Scharf <scharfpm7 at gmail.com> wrote:
> Dear Indologists,
> I have just completed a comparison of the ligature formation produced by
> several Devanagari fonts and thought it might be useful to share the
> results of the comparison.  I compared 1260 ligatures formed by the LaTeX
> Skt package with seven Unicode fonts.  The ligatures compared were the
> combined set of all those listed by Ulrich Stiehl in his document, *Conjunct
> Consonants in Sanskrit*, Heidelberg, 21 April 2003, pp. 4--34, and those
> listed in the Skt package documentation *Sanskrit for LaTeX2e*, pp.
> 22--35.
> 1. LaTeX Skt package
> 2. Chandas
> 3. Uttara
> 4. Sanskrit2003
> 5. Praja
> 6. Arial Unicode MS
> 7. Devanagari MT
> 8. Mangal
> The LaTeX Skt package comes with the TeXLive installation available at
> https://www.tug.org/texlive/.  The Chandas and Uttara fonts were produced
> by Mihail Bayaryn and are available at
> http://www.sanskritweb.net/cakram/.  The Sanskrit2003 font was produced
> by Ulrich Stiehl and is available at
> http://www.omkarananda-ashram.org/Sanskrit/itranslator2003.htm.  These
> fonts are all available free of cost.  Praja was produced by Peter Freund
> and is available for $35 at
> https://secure.bmtmicro.com/servlets/Orders.ShoppingCart?CID=5115&PRODUCTID=51150002.
> Arial Unicode MS is available with Microsoft Office, FrontPage and
> Publisher, with the installation of international support.  Devanagari MT
> is available with Mac systems with the Asian languages support.  Mangal is
> available with Windows systems with supplemental language support.
> The comparison showed that Chandas and Uttara are able to form all
> conjuncts correctly, without the interruption of an inappropriate virāma,
> with the exception of seven sequences: *ṅkṣṇva*, *ṅrvya*, *ṭhthya*,
> *dḍḍa*, *ddbra*, *ddvra*, *l̃la*.  The LaTeX Skt package handles all
> but 29.  Sanskrit2003 lacked 80, Praja 187, Arial Unicode MS 201,
> Devanagari MT 232, and Mangal 236.  I also checked the behavior of the
> fonts in handling the accents in the Devanagari Extended and Vedic
> Extensions Unicode pages.  Only the Praja font handled them all properly;
> the LaTeX Skt package handles most Vedic accentuation, while most fonts
> handled only the common accentual system.  A test of Vedic accents with any
> font can be performed by visiting the Sanskrit Library's interactive Vedic
> Unicode character phonetic value table at
> http://sanskritlibrary.org/accents.html.  Simply set your browser to use
> the font you would like to test.
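
In Unicode a conjunct is encoded as its consonants joined by the virāma sign (U+094D); whether the result is drawn as a ligature or with a visible virāma depends on the font. The Python sketch below builds the encoded strings for six of the seven problem sequences so that they can be pasted into a page and viewed in the font under test. The helper name conjunct is hypothetical, and l̃la is left out because it additionally involves the combining candrabindu.

```python
VIRAMA = "\u094D"

# Only the consonants needed for the sequences below.
DEVA = {"ṅ": "ङ", "k": "क", "ṣ": "ष", "ṇ": "ण", "v": "व", "r": "र",
        "y": "य", "ṭh": "ठ", "th": "थ", "d": "द", "ḍ": "ड", "b": "ब"}


def conjunct(consonants):
    """Join consonant letters with virāma; the last keeps its inherent a."""
    return VIRAMA.join(DEVA[c] for c in consonants)


# Each sequence is given as an explicit list so that aspirates such as
# "ṭh" are not confused with "ṭ" followed by "h".
SEQUENCES = {
    "ṅkṣṇva": ["ṅ", "k", "ṣ", "ṇ", "v"],
    "ṅrvya": ["ṅ", "r", "v", "y"],
    "ṭhthya": ["ṭh", "th", "y"],
    "dḍḍa": ["d", "ḍ", "ḍ"],
    "ddbra": ["d", "d", "b", "r"],
    "ddvra": ["d", "d", "v", "r"],
}

for name, parts in SEQUENCES.items():
    encoded = conjunct(parts)
    codepoints = " ".join(f"U+{ord(c):04X}" for c in encoded)
    print(f"{name}\t{encoded}\t{codepoints}")
```
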
> The first five fonts listed are therefore commendable; the last three are
> inadequate for Sanskrit.  It would be desirable for Mihail Bayaryn and
> Ulrich Stiehl to upgrade their fonts, which otherwise handle conjuncts very
> comprehensively, to handle the Vedic characters in the two Unicode pages
> mentioned including in particular the combining candrabindu with semivowels
> *l*, *y*, and *v*.
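
The counts reported above can be put on a common scale by expressing them as the share of the 1260 ligatures each font forms correctly. The short Python sketch below does only that arithmetic on the figures given in the message; it says nothing about Vedic accent handling, which also enters into the assessment. The names MISSING and coverage are hypothetical.

```python
TOTAL = 1260  # ligatures compared

# Number of ligatures each font failed to form, as reported in the message.
MISSING = {
    "Chandas": 7,
    "Uttara": 7,
    "LaTeX Skt package": 29,
    "Sanskrit2003": 80,
    "Praja": 187,
    "Arial Unicode MS": 201,
    "Devanagari MT": 232,
    "Mangal": 236,
}


def coverage(missing, total=TOTAL):
    """Percentage of the compared ligatures that were formed correctly."""
    return 100.0 * (total - missing) / total


for font, missing in sorted(MISSING.items(), key=lambda item: item[1]):
    print(f"{font:<18} {missing:>4} missing  {coverage(missing):6.2f}% formed")
```
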
> Other Indic fonts not tested are described on the University of Chicago's
> South Asia Language Resource Center page at
> http://salrc.uchicago.edu/resources/fonts/available/hindi/.
> Yours,
> Peter
> *************************
> Peter M. Scharf
> scharfpm7 at gmail.com
> *************************
> _______________________________________________
> INDOLOGY mailing list
> INDOLOGY at list.indology.info
> indology-owner at list.indology.info (messages to the list's managing
> committee)
> http://listinfo.indology.info (where you can change your list options or
> unsubscribe)

Dr Peter Flügel
Chair, Centre of Jaina Studies
Department of Religions and Philosophies
Faculty of Arts and Humanities
School of Oriental and African Studies
University of London
Thornhaugh Street
Russell Square
London WC1H 0XG

Tel.: (+44-20) 7898 4776
E-mail: pf8 at soas.ac.uk
