Finding Indological full-book PDFs on Google Books
David Magier
magier at COLUMBIA.EDU
Mon Jun 18 11:08:07 UTC 2007
Dear All,
Another angle on mass digitization projects (perhaps less true of Google
than of other, unnamed mass projects whose name begins with a large number)
that has librarians and scholars very concerned is the impact of the
digitization process on fragile original books and manuscripts. SOME such
projects are now well known to march into libraries and archives (particularly
underfunded ones in the Subcontinent), to "wow" the local staff and
administrators with high-end equipment and high-profile publicity, secure
agreements and grandiose press releases, and then proceed to do shoddy
digitization work (including horrible metadata) while literally destroying
the books in the process. As librarians, we are concerned about preserving
original materials and content as part of the effort of preserving
knowledge for future generations. In the rush to digitize, many people
(certainly the general public) have lost sight of what for us is a basic
mantra:
"Digitization is a wonderful medium for DISSEMINATION. It is not a method
of long-term PRESERVATION."
For the latter, one must use conservation techniques to extend the life of
the book, and/or duplication of the content onto a proven long-term storage
medium (and format) that will be usable many generations into the future.
This latter category includes microfilm (chemical/physical studies say
microfilm, properly produced and stored, will last and be readable --
without any technology other than a lens -- up to 500 years from now), as
well as archival-quality preservation photocopying onto acid-free paper
(which, properly bound, creates a new copy of the book that should last
many hundreds of years). Does anyone really believe that the thousands of
books being digitized now are going to still be usable, as current-standard
PDFs on their hard drives or CD-ROMs, even 50 years from now??
Digital file content can (and some probably will) be carried forward in
usable formats into the future only by very active, very expensive, and
ONGOING permanent intervention via constant "refreshment" of the data into
each successive wave of current file formats and storage devices, as the
technologies involved continue to change at ever-increasing rates. Who is
going to make that investment, continually, into the future? For which
specific materials? At a reasonable guess, there will be lots of attrition
and lots of content will fall behind. (And if the original books from which
it was derived are not *preserved* as above, then the books and their
content are lost forever). It is really only the most commercially valuable
content that will continue to get the digital preservation investment
needed to refresh the data and keep the content viable. Do we really
believe that our indological books fall into that category?
I'm a strong advocate of digitization and dissemination, but I am
constantly fighting against a widespread, general misunderstanding under
which people feel that once a book has been digitized we can rest easy: it
has been "taken care of". Particularly given the dismal actual record of
what happens to books getting digitized, I feel scholars everywhere must
take much more active notice of the distinction between digitization and
preservation, and must make sure that appropriate attention is given to the
latter, even if it is so much less "sexy" than the former.
David Magier
South Asia Librarian
Columbia University
and
President, Center for South Asia Libraries
--On June 17, 2007 9:00:46 PM -0700 Jonathan Silk <silk at humnet.ucla.edu>
wrote:
> Dear Tim,
>
> Just a quick note: I do understand the logic that something can be better
> than nothing. But I think the concern is that if one is going to do
> something, it should be done right (not so much as a philosophical stance
> as a practical one). And if something is done by Google, even if badly,
> is it then likely that it will be done later better? Is it not a case of
> bad coin driving out good?
>
> Then, specifically:
>
>> . I did look at the pages Jonathan mentioned, and although several
>> had distortions along the left edge of the pages, they were quite
>> legible.
>
> Page 211--left side distorted, but yes, legible
> 212: approximately 1/5 of the [right side of the] page missing because it
> was placed on the scanner at a diagonal--to me, this does not count as
> 'quite legible'
> 213 more or less = 211
> 214 --the page was moved during scanning, such that a large part of the
> right side is indeed not legible.
>
> All of these problems and worse can be found throughout the whole
> book--at a very rough guess about every second or third page has this
> type of trouble, which almost systematically leaves part of the text
> illegible-- the rate of trouble is astonishing (and far beyond that even of
> the old Indian reprints, or the work of even sloppy student assistants).
> I would not fear contradiction to say that as now available Burnouf's
> book is unreadable in the Google version. And this does not even address
> the issue of huge portions of some books missing, the book scanned not
> being the book catalogued (e.g., who in the Google group would scan PW
> when their records indicate that they already have it? yet, as I said,
> it is not PW at all...) etc.
>
> Sorry--my quick note was not so quick. With this, I'm done with this
> topic, with the wish that those of us with a professional interest in a
> relatively narrow field might profitably discuss (in future, in a
> different forum?) how to prepare the relatively limited corpus of key
> materials we all are likely to find useful to have on our hard-drives.
>
> JAS
> --
> Jonathan Silk
> Department of Asian Languages & Cultures
> Center for Buddhist Studies
> UCLA
> 290 Royce Hall
> Box 951540
> Los Angeles, CA 90095-1540
> phone: (310) 206-8235
> fax: (310) 825-8808
> silk (at) humnet.ucla.edu
>
>
> From July 15, 2007:
>
> Prof. Dr. Jonathan Silk
> Instituut Kern / Universiteit Leiden
> Postbus 9515
> 2300 RA Leiden