[Anita Lowry <lowry at cunixf.cc.columbia.edu>: [Elaine Brennan <EDITORS at BROWNVM.brown.edu>: 6.0405 Rs: Sanscrit and Hungarian E-Texts (2/216)]]

Mon Dec 14 12:07:07 UTC 1992

Cross-posting, for your info.   /David Magier Columbia University
                ---------------

Return-Path: <lowry>
Received: by cunixf.cc.columbia.edu (5.59/FCB/jba)
	id AA10809; Sat, 12 Dec 92 22:55:55 EST
Date: Sat, 12 Dec 92 22:55:54 EST
From: Anita Lowry <lowry at cunixf.cc.columbia.edu>
To: magier
Cc: scottr
Subject: [Elaine Brennan <EDITORS at BROWNVM.brown.edu>: 6.0405 Rs: Sanscrit and
        Hungarian E-Texts (2/216)]
Message-Id: <CMM.0.90.4.724218954.lowry at cunixf.cc.columbia.edu>

David,
FYI.
/Anita
                ---------------

Return-Path: <HUMANIST at BROWNVM.brown.edu>
Received: from brownvm.brown.edu by cunixf.cc.columbia.edu (5.59/FCB/jba)
	id AA04939; Fri, 11 Dec 92 20:13:54 EST
Message-Id: <9212120113.AA04939 at cunixf.cc.columbia.edu>
Received: from BROWNVM.BROWN.EDU by BROWNVM.brown.edu (IBM VM SMTP V2R2)
   with BSMTP id 0528; Fri, 11 Dec 92 20:07:32 EST
Received: from BROWNVM.BITNET by BROWNVM.BROWN.EDU (Mailer R2.08 R208004) with
 BSMTP id 5066; Fri, 11 Dec 92 15:46:48 EST
Date:         Fri, 11 Dec 1992 15:12:09 EST
Reply-To: Elaine Brennan <EDITORS at BROWNVM.brown.edu>
Sender: "HUMANIST: Humanities Computing" <HUMANIST at BROWNVM.brown.edu>
From: Elaine Brennan <EDITORS at BROWNVM.brown.edu>
Subject: 6.0405  Rs: Sanscrit and Hungarian E-Texts  (2/216)
To: Multiple recipients of list HUMANIST <HUMANIST at BROWNVM.brown.edu>

Humanist Discussion Group, Vol. 6, No. 0405. Friday, 11 Dec 1992.

(1)   Date:     Thu, 10 Dec 92 12:00:10 WST                   (21 lines)
      From:     Thomas B. Ridgeway <ridgeway at blackbox.hacc.washington.edu>
      Subject:  Re: 6.0397  E-Text Query  (1/10)

(2)   Date:     Wed, 9 Dec 1992 16:17 EST                     (195 lines)
      From:     Paul Mangiafico <PMANGIAFICO at guvax.acc.georgetown.edu>
      Subject:  Sanskrit and Hungarian E-texts

(1) --------------------------------------------------------------------
Date: Thu, 10 Dec 92 12:00:10 WST
From: Thomas B. Ridgeway <ridgeway at blackbox.hacc.washington.edu>
Subject: Re: 6.0397  E-Text Query  (1/10)

John Haviland of Reed enquires re Sanskrit (or Hungarian) e-texts:
   A small sample of Sanskrit e-texts is available for anonymous
   ftp from blackbox.hacc.washington.edu in the directory pub/indic
   (Brihatsamhita, Panini Sutras, Buddhacarita and Saundaryalahari
    to be specific).
   These are encoded in the proposed Classical Sanskrit Extended
   standard for encoding romanized Indic languages.  For more
   discussion on this and related matters, I refer you to
   the listserv group Indology-l, based at liverpool.ac.uk.
Tom
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Thomas Ridgeway, Director,
Humanities and Arts Computing Center/NorthWest Computing Support Center
35 Thomson Hall, University of Washington, DR-10
Seattle, WA 98195   phone: (206)-543-4218            *  Ask me about  *
Internet: ridgeway at blackbox.hacc.washington.edu      *    Unix TeX    *
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(2) --------------------------------------------------------------206---
Date: Wed, 9 Dec 1992 16:17 EST
From: Paul Mangiafico <PMANGIAFICO at guvax.acc.georgetown.edu>
Subject: Sanskrit and Hungarian E-texts

Regarding John Haviland's request for Sanskrit and Hungarian e-texts,
I was able to find a few with a quick search in Georgetown
University's CPET (Catalogue of Projects in Electronic Text). I have
included a list below, beginning with the ten categories under which
the data is classified, and followed by info on projects with Sanskrit
e-texts and one project with Hungarian e-texts. I hope this information is
useful to many HUMANISTs.

If you are in search of other e-texts, the Georgetown CPET may be of use
to you as well. The CPET database can be accessed via Telnet or modem,
or if you send me particulars on what you are looking for I can do a
quick search for you and email the results. In any case, if you would
like more information on this service, just send me a note.

Paul Mangiafico, project assistant
Center for Text & Technology
Georgetown University

pmangiafico at guvax.georgetown.edu

**** CATEGORIES OF CLASSIFICATION ****

0.   Identifying acronym or short reference.
1.   Name and affiliation of operation (with collaborators noted).
     References to any published description.
2.   Contact person and/or vendor with addresses (including
     telephone and email if possible).
3.   Primary disciplinary focus (and secondary interests) [e.g.
     Literature, Language, Linguistics, Music, Art, etc.].
4.   Focus: time period, location, individual, genre, or medium.
5.   Language(s) encoded [English, French, German, et al.].
6.   Intended use(s) [e.g. textbank, database, bibliography] with
     Goal (or statement of purpose) and Size [number of works, or
     entries, or citations].
7.   Format(s), including choice of sequential text or database
     excerpts, file formats, analytical programs and programming
     languages, text markup and encoding schemes, hardware and
     operating systems, etc. To what extent are the formats
     consistent throughout the archive?
8.   Form(s) of access: if online, what policies? If tape, what
     track, bpi, block size, labels, parity setting? If diskette,
     what size and operating system or microcomputer? If CD-ROM,
     what format? What software is needed for accessing? Is it
     provided with the package? Availability and price.
9.   Source(s) of the archival holdings: encoded in-house, or
     obtained from elsewhere (where)? Textual authority used for
     encoding? Titles of the works held, bibliographical
     information on them.

**** PROJECTS WITH SANSKRIT E-TEXTS ****

Bamberg (Otto Friedrich Universität)/ Thesaurus of Texts in
          Ancient Indo-European Languages
     CPET#184
     0. THESIETEXT (Thesaurus of Texts in Ancient Indo-European
          Languages)
     1. Thesaurus Indogermanischer Textcorpora; Universität
          Bamberg, Germany. See the journal "Die Sprache," Vol. 32/2.
     2. Dr. Jost Gippert
          Universität Bamberg, Orientalistik
          Postfach 1549
          D-W-8600 Bamberg, Germany
     3. Literature, language, linguistics, history
     4. From beginning of literacy to 17th century; Eurasia
     5. Old Indic (Sanskrit), Old Iranian (Avestan, Old Persian),
          Hittite, Tokharian, Old Germanic, Greek (Ancient), Italic
          languages, Armenian (Old), and several other I.- E.
          languages.
     6. Textbank
     7. Sequential text; encoding scheme of DOS, WordCruncher,
          and WordPerfect 5.1
     8. Access on diskettes, CD-ROM (planned)
     9. Encoded by various scholars in different parts of Europe.

Hamburg (Univ)/ Sanskrit medical encyclopaedias
     CPET#191
     1. Sanskrit medical encyclopaedias
     2. Prof. R.E. Emmerick
          Iranian Studies
          University of Hamburg Germany
     3. Medicine
     4. Caraka, Susruta, Astangahrdaya, Astangasamgraha, and the
          Siddhasara of Ravigupta
     5. Sanskrit

Tübingen (Seminar für Indologie und Vergleichende
     Religionswissenschaft)/ Tübingen Purana Project
     CPET#308
     1. Tübingen Purana Project. Peter Schreiner, Renate
          Söhnen, Heinrich v. Stietencron. Publications:
          Indices and Text of the Brahmapurana. Wiesbaden:
          Harrassowitz [1987]
     2. Professor Dr. Heinrich v. Stietencron
          Seminar für Indologie und Vergleichende
          Religionswissenschaft
          Münzgasse 30
          D-7400 Tübingen Germany
          Tel. 0049-7071-292675
     3. Indology (Indian studies), Sanskrit
     4. Classical Hinduism; Puranas, Brahmapurana
     5. Sanskrit
     6. Published indices on microfiche; deposit of the input
          with the Oxford Text Archive has been announced but not
          yet carried out. The Brahmapurana is a single Sanskrit
          text with ca. 14000 verses.
     7. Straightforward transliteration with marking of sandhi,
          nominal compounds, references; TUSTEP format (ASCII
          format possible). TUSTEP programs for KWIC-index, reverse
          index, word forms, etc.
     9. Encoded in-house.

Zurich (Univ)/ Sanskrit texts
     CPET#268
     1. Sanskrit texts
     2. Prof. Peter Schreiner
          Abteilung für Indologie
          Universität Zürich
          Rämistr. 68
          CH-8001 Zürich Switzerland
          tel. 0041-1-2572036
     3. Indology, Sanskrit, Hinduism, Indian philosophy
     4. Visnupurana, Manu, Sakuntala, Asvaghosa, Buddhacarita,
          Gaudapada-Karika, Adisesa, Paramarthasara, Bhagavadgita,
          Narayaniyam, Mahabharata, Svetasvatara-Upanisad.
     5. Sanskrit
     6. deposit with Oxford Text Archive intended
     7. Straightforward transliteration with marking of sandhi,
          nominal compounds, references; TUSTEP format (ASCII
          format possible). TUSTEP programs for KWIC-index, reverse
          index, word forms, etc.
     8. Presently none
     9. Encoded in-house

TX Austin (University of Texas)/ Thesaurus Linguae Sanskritae
     CPET#101
     1. Thesaurus Linguae Sanskritae, University of Texas
     2. Prof. R. Lariviere
          University of Texas
          Austin, Texas 78712
          tel. (512) 471-5811
     5. Sanskrit
     9. Texts include Mahabharata and Ramayana

**** PROJECT WITH HUNGARIAN E-TEXTS ****

PA Pittsburgh (Carnegie Mellon Univ)/ CHILDES Database
     CPET#95
     0. CHILDES (Child Language Data Exchange System)
     1. CHILDES Database, Carnegie Mellon Univ
          See "The Child Language Data Exchange System:  An
          Update," Journal of Child Language, [1990].  Snow,
          Catherine. "The Child Language Data Exchange System",
          ICAME Journal (No.14). Bergen, Norway: Norwegian
          Computing Center [April 1990]. Carterette, E. & Jones,
          M.H. Informal Speech. Berkeley: University of California
          Press [1974].  MacWhinney, B. & Snow, C. "The Child
          Language Data Exchange System", Journal of Child Language
          (Vol.12, pages 271-296). [1985].
     2. Brian MacWhinney
          Department of Psychology
          Carnegie Mellon University
          Pittsburgh, PA 15213
          BITNET: brian at andrew.bitnet
          Internet: brian at andrew.cmu.edu
     3. Linguistics; psycholinguistics
     4. Transcripts of children's dialogue
     5. English, Afrikaans, Danish, Dutch, French, German,
          Hebrew, Hungarian, Italian, Polish, Slobin, Spanish,
          Tamil.
     6. Database of 40 sets of corpora of parent-child and
          child-child interactions (13 languages in total); the
          corpora are divided into six major directories: English,
          non-English, narratives, books, language impairments, and
          second language acquisition; includes three major tools
          for child language research: (1) the CHILDES database of
          transcripts, (2) the CHAT system for transcribing and
          coding data, and (3) the CLAN programs for analyzing CHAT
          files; 140 million characters (140 MB).
     7. Database excerpts; available on floppies and tapes;
          detailed coding scheme has been devised and the data are
          put in that format
     8. (Planned) CD-ROM
     9. Obtained from researchers