Remarkable New Resource

Birgit Kellner birgit.kellner at UNIVIE.AC.AT
Tue Apr 11 08:05:37 UTC 2006


Jonathan Silk wrote:
> Cross-posted to H-Buddhism and Indology
>
> My friend and colleague Dan Martin (Jerusalem) has asked me to post 
> the following information.
>
> Dan has prepared and posted for free public use a tremendous 
> bibliography/biographical resource, taking account mostly of Indian 
> and Tibetan works primarily of Buddhist interest. Toward this end he 
> has posted his 1769 page (!!) bibliography on a web site in two forms, 
> pdf and rtf. These are available for free download for a limited time 
> (since they take up space on someone else's server).  Go to:
>
>
> http://www.eecs.berkeley.edu/~keutzer/martin/
>
> [...]
> A development that has just taken place in the last few days is this: 
> As I noticed and as anyone else who attempts to use the files will 
> notice, their arrangement in a text document is not the most efficient 
> way to arrange things. For example, cross-references are impossible. 
> The citations are in fact alphabetical by author (with a large section 
> of anonymous = mostly canonical works at the beginning--note that each 
> author is given what biographical information is available, a huge 
> resource in its own right!). However, David Germano at Virginia has 
> already volunteered to arrange and post the work in a Wikipedia kind 
> of way, especially such that users will be able to add citations. This 
> will not, naturally, be implemented immediately, however.

Dear All,

first of all, thanks to Dan Martin for making this immensely useful 
resource available, and thanks to Jonathan for pointing it out.

As with the recent announcement of SARDS2 
(http://www.indologie.uni-halle.de/Sards2/), I have some suggestions 
for improving the accessibility of the data, or at least some 
reflections that might lead to such suggestions. Since David Germano 
has already volunteered to take part in efforts towards this end, I'm 
CC-ing this message to him.

[Warning: the following may contain an overdose of technical information 
that could put you to sleep if you're not terribly interested in 
structuring data and conceiving database applications.]

First of all, a Wiki is probably the easiest and fastest way of making 
Dan's data available. This would just involve some fiddling with 
regular expressions to structure the data into general author and work 
headers, some data checking, and finally some cut-and-paste into a 
MediaWiki system. But because a large part of the data is structured 
(bibliographical entries), a Wiki may not be the best way to build a 
useful resource that is easy to maintain in the long run.
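
To make the "fiddling with regular expressions" a little more concrete, 
here is a minimal Python sketch. It assumes a purely hypothetical text 
layout in which author headers start with "## " and work headers with 
"-- "; the real files are laid out differently, so the two patterns 
would have to be adapted. The sample names are placeholders.

```python
import re

# ASSUMPTION: hypothetical markers, not the actual layout of the files.
AUTHOR_RE = re.compile(r"^## (?P<name>.+?)\s*$")
WORK_RE = re.compile(r"^-- (?P<title>.+?)\s*$")


def parse_entries(text):
    """Group lines into authors, each with biography lines and works."""
    authors = []
    current_author = None
    current_work = None
    for line in text.splitlines():
        m = AUTHOR_RE.match(line)
        if m:
            current_author = {"name": m.group("name"), "bio": [], "works": []}
            authors.append(current_author)
            current_work = None
            continue
        m = WORK_RE.match(line)
        if m and current_author is not None:
            current_work = {"title": m.group("title"), "citations": []}
            current_author["works"].append(current_work)
            continue
        stripped = line.strip()
        if not stripped or current_author is None:
            continue  # skip blank lines and anything before the first author
        if current_work is None:
            current_author["bio"].append(stripped)
        else:
            current_work["citations"].append(stripped)
    return authors


def to_wikitext(author):
    """Render one author as MediaWiki markup (headings and bullet lists)."""
    lines = ["== %s ==" % author["name"]]
    lines.extend(author["bio"])
    for work in author["works"]:
        lines.append("=== %s ===" % work["title"])
        lines.extend("* " + citation for citation in work["citations"])
    return "\n".join(lines)


# Placeholder sample in the hypothetical layout.
SAMPLE = """## Author A
Placeholder biographical note.
-- Work One
Placeholder citation 1.
Placeholder citation 2.
## Author B
-- Work Two
Placeholder citation 3.
"""

if __name__ == "__main__":
    for author in parse_entries(SAMPLE):
        print(to_wikitext(author))
        print()
```

The intermediate list of dictionaries is the useful part: the same 
parsed data could feed a Wiki page or a database table.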

As I suggested for SARDS2 a few weeks ago, an automated way to import 
and export bibliographical citations in standard-compliant formats 
(such as BibTeX or TEI) would not only greatly improve data 
accessibility, but would also serve as a powerful incentive for more 
people to contribute their data to such centralized resources. After 
all, they could then also get data out of these systems and insert it 
into their own bibliographies with (next to) no editorial effort. As for 
SARDS2, Walter Slaje has already informed me that such import/export 
facilities are envisaged for the future.
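
As a rough illustration of what such an export/import facility amounts 
to once entries are structured, here is a minimal Python sketch. The 
field names and the sample record are placeholders of my own, not taken 
from SARDS2 or any other resource, and the import function only reads 
back the flat layout that the export function writes; it is not a 
general BibTeX parser.

```python
import re


def to_bibtex(entry):
    """Render a flat dict as a BibTeX entry; 'type' and 'key' are reserved."""
    fields = [(k, v) for k, v in entry.items() if k not in ("type", "key") and v]
    body = ",\n".join("  %s = {%s}" % (k, v) for k, v in fields)
    return "@%s{%s,\n%s\n}" % (entry["type"], entry["key"], body)


# These patterns only cover the one-field-per-line layout written above.
# Values containing line breaks are NOT handled (an assumed simplification).
HEADER_RE = re.compile(r"@(\w+)\{([^,\s]+),")
FIELD_RE = re.compile(r"^[ \t]*(\w+)[ \t]*=[ \t]*\{(.*)\},?[ \t]*$", re.MULTILINE)


def from_bibtex(text):
    """Read back an entry produced by to_bibtex()."""
    header = HEADER_RE.match(text.strip())
    if header is None:
        raise ValueError("not a BibTeX entry in the expected flat layout")
    entry = {"type": header.group(1), "key": header.group(2)}
    for name, value in FIELD_RE.findall(text):
        entry[name] = value
    return entry


# Placeholder record; every value here is invented for illustration.
RECORD = {
    "type": "book",
    "key": "example1995",
    "author": "Example, Author",
    "title": "An Example Title",
    "address": "Göttingen",
    "year": "1995",
}

if __name__ == "__main__":
    print(to_bibtex(RECORD))
```

A contributor could paste the printed entry straight into a .bib file, 
which is the "next to no editorial effort" I have in mind.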

A second complex of issues emerges when Dan Martin's massive resource is 
compared with one that's similar in approach, but currently available 
only in German; this is our own little digitization of 
Steinkellner/Much's systematic overview of the literature of the 
logico-epistemological school of Buddhism (published in Göttingen 1995), 
called "SUEBS online" 
(http://www.istb.univie.ac.at/cgi-bin/suebs/suebs.cgi). ("SUEBS" is the 
acronym used for the print edition.)

SUEBS online is not yet finished; there are still formatting and design 
issues, and the data is not yet thoroughly checked. An English 
translation of the German bits that it contains would be desirable, but 
there are no concrete plans to implement it for the time being. (Since 
SUEBS online is currently my own "hobby", all that happens to it or 
doesn't happen to it depends on my own spare time, which is becoming 
increasingly limited.)

More importantly, SUEBS online is intended for collaborative enhancement 
and maintenance of data, allowing scholars and students to contribute 
further data and to edit what they already contributed. This involves 
creation of different permission levels, and will also require some 
editorial control for maintaining consistency of data in the future.

Because SUEBS online was converted from a text document with very little 
effort, its bibliographical entries are unfortunately not structured, so 
automated import/export of data is not possible at the moment. This 
should definitely change in the future.

Dan's resource and SUEBS online are structured in almost the same way: 
data is entered under the headings of authors and their works. There is 
some overlap between the two resources in the field of Buddhist pramana 
in India, where SUEBS, because it is intended as a specialized resource 
in this area, contains more detailed information (until 1994, that is, 
when SUEBS went into print). To me, this raises the question of whether 
some combination of these two resources might be desirable: not a 
complete merging of data, but a technical solution where data from one 
or both collections can be accessed through a meta search engine or 
something of the kind.
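
To sketch what I mean by a meta search engine: each collection keeps its 
own data and its own search routine, and a thin layer on top queries 
both and labels every hit with its source. The Python below is only an 
illustration under assumptions. The record layout, the source names and 
the sample records are invented, and the in-memory lists stand in for 
whatever database or web interface each resource really has. Diacritics 
are folded away so that a plain-ASCII query still finds the Unicode 
forms.

```python
import unicodedata


def fold(text):
    """Lower-case and strip combining marks, so 'pramana' matches 'pramāṇa'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def make_source(name, records):
    """Wrap a list of records as a (name, search function) pair."""
    def search(query):
        q = fold(query)
        return [r for r in records if q in fold(r["author"]) or q in fold(r["work"])]
    return name, search


def meta_search(query, sources):
    """Query every source separately; nothing is merged, hits are only labelled."""
    hits = []
    for name, search in sources:
        for record in search(query):
            hits.append({"source": name, **record})
    return hits


# Illustrative records only; not extracted from either resource.
SOURCES = [
    make_source("martin", [
        {"author": "Dharmakīrti", "work": "Pramāṇavārttika"},
    ]),
    make_source("suebs", [
        {"author": "Dharmakīrti", "work": "Pramāṇaviniścaya"},
        {"author": "Dignāga", "work": "Pramāṇasamuccaya"},
    ]),
]

if __name__ == "__main__":
    for hit in meta_search("dharmakirti", SOURCES):
        print(hit["source"], hit["author"], hit["work"])
```

Because each source is just a search function, one of them could later 
be replaced by a call to a remote server without touching the other.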

SUEBS online exists as a suite of Perl scripts; data is stored in MySQL 
database tables, and diacritics are all in Unicode (UTF-8). User 
management with different permission levels is implemented in general, 
but needs some tweaking towards real collaborative entry and maintenance 
of data. I'm willing to contribute the code, as well as the database 
structure, to any more comprehensive project that moves in a similar 
direction; it could be a starting point for building something more 
efficient and comprehensive. If David Germano is interested in building 
a more structured resource than a Wiki, the SUEBS online code, or at 
least the concrete approach that it embodies, could serve as a basis 
for further discussion.

As I've said above, a Wiki is the fastest way to make Dan's data 
available, but perhaps not the most meaningful. When dealing with such 
data, the choice is always between the two extremes of fast availability 
and time-consuming and potentially tedious manipulation of data (and 
creation of program code) for a more structured approach.

My own perspective is pragmatic and therefore somewhere in between. I do 
believe that investing effort in structuring data always pays off in 
the end, and that opting for what at first sight seems to be the fastest 
solution often has considerable drawbacks for future use and growth of 
the data. For instance, if bibliographical entries in SUEBS online 
remain unstructured, chances are that people won't be terribly 
enthusiastic about contributing data because they can't get it back in a 
reasonable form. Moreover, whoever adds data always has to remember the 
conventions used for data entry; this is tedious and consumes creative 
energies that are better directed elsewhere, and it inevitably results 
in mistakes and inconsistent data.

At the same time, I also believe that one does not need to structure 
data "atomistically" down to the smallest conceivable unit. Doing so 
consumes far more programming time before resources become available, 
and it makes database resources too unwieldy and complicated, both for 
future users who are encouraged to contribute and for future programmers 
who have to maintain the code.

Best regards,

Birgit Kellner
Institute for South Asian, Tibetan and Buddhist Studies
University of Vienna




