Finding Indological full-book PDFs on Google Books
Paul G. Hackett
ph2046 at COLUMBIA.EDU
Mon Jun 18 17:24:26 UTC 2007
At 9:12 AM -0700 6/18/07, Jonathan Silk wrote:
>one does not need the full Acrobat to do this. There is a nifty little program
Great utility, Jonathan. Thanks for the link (I have the full
version of Acrobat, so I sometimes forget what features are missing
in the "Reader").
>the question of the utility of time spent replacing pdf pages,
True. Although I would argue that it's *still* faster than scanning
the original oneself. But this leads me to a bigger issue, which
David Magier raised.
At 7:08 AM -0400 6/18/07, David Magier wrote:
>one must use conservation techniques to extend the life of the book,
>and/or duplication of the content onto a proven long-term storage
>medium (and format) that will be usable many generations into the
>future. This latter category includes microfilm (chemical/physical
>studies say microfilm, properly produced and stored, will last and
>be readable -- without any technology other than a lens -- up to 500
>years from now), as well as archival-quality preservation
>photocopying onto acid-free paper (which, properly bound, creates a
>new copy of the book that should last many hundreds of years). Does
>anyone really believe that the thousands of books being digitized
>now are going to still be usable, as current-standard PDFs on their
>hard drives or CD-ROMs, even 50 years from now??
Certainly, but I don't think anyone would be foolish enough to think
that today's media will still be readable in the long-term or even
medium-term future. I agree with you, in principle, about the
distinction, but I don't think the issue should be framed as
"digitization" vs. "preservation", precisely because the two are
not comparable. Nonetheless, the first *can* be leveraged
into the second.
>the impact of the digitization process on fragile original books and
>manuscripts.
<snip>
>As librarians, we are concerned about preserving original materials
>and content as part of the effort of preserving knowledge for future
>generations.
Sure, but students destroy books every day by repeated photocopying,
and libraries themselves likewise "destroy" books every day ... by
which I mean binding and re-binding books, which destroys their
artefactual value. A perfect example is the treatment of Tibetan
books by some libraries.
To illustrate my point, however, I would point out the work being
done by Gene Smith at the TBRC <http://www.tbrc.org>, where they have
been digitizing Tibetan blockprints and manuscripts for some time now.
The advantage of their high-resolution digitization is that, once
digitized, the originals never need be handled again. Moreover, Gene
Smith has actually set up an agreement with a publisher to take the
TBRC digital images and produce custom printings (on preservation-
quality, acid-free paper) of the books already scanned, replicating
their traditional format. For that matter, one could even produce
microfilm from the digital images ... microfilm that would *actually*
be clean, readable and usable, as opposed to much of what is still
being produced to this day by conventional photographic means.
Anyone who has ever attempted to get a clean, readable image off of
microfilm knows exactly what I mean.
The issue of data migration is not a small one and I am not trying
to trivialize or downplay the concerns you raise, but the simple fact
is that one needs to think about digital library issues in a much
broader context, fully integrating them into existing library
structures. IMHO, a good start would be the creation of "digital
preservation" departments in libraries, with knowledgeable, trained
staff (trained in *both* library and IT fields), rather than
relegating the job to often non-uniform (and often ad hoc) "tech
support" staff.
It's one thing to complain about Google and other, perhaps less
reputable, organizations taking on these tasks, but if librarians and
their institutions aren't willing to step up and take the challenge,
then those others are the people who will do the job, and the end
user communities will be stuck with whatever they produce. Sure, the
metadata is shoddy on most of these items, but so were (and still
*are*) many of the pre-MARC card catalog records. I don't think
anyone would have argued that the retrospective conversion of card
catalogs should be held up until the data was verified and corrected.
The situation with all this e-data seems comparable.
This is why -- speaking as a researcher now rather than a librarian
-- I maintain my own digital archive of books. I take what I can
find on the web, download it, proof it for errors, retrieve the
original if need be, selectively re-scan pages, and catalog and archive
it for my own personal use. It is my hope that someday there will be a
proper forum for the many academics who have done, and continue to do,
things like this to share our resources, rather than every individual
having to duplicate such admittedly tedious work. I keep hoping some
reputable university would at least make an attempt, but I have yet
to see anything. I guess the question for me, at least, is how can
this process be influenced in a more positive direction, since it
seems clear that such digitization initiatives will take place with
or without input from the academic community. I think "with" would
be better.
Sorry if this has turned into a long-winded rant, but I feel these
*are* important issues that you raise, David, and think they really
need to be discussed.
Paul Hackett
Columbia University
At 9:12 AM -0700 6/18/07, Jonathan Silk wrote:
>In re:
>
>> you could just download only the pages that are corrupted in the
>>Google version and replace them with the DLI Hyderabad images (I
>>think you would need the "Full" version of Acrobat to do this, not
>>just the reader).
>
>Leaving aside the question of the utility of time spent replacing
>pdf pages, one does not need the full Acrobat to do this. There is a
>nifty little program (sorry this time! Mac only :-) ) called
>"Combine PDFs" which allows one to, as the web site says, "Drop some
>PDF or picture files on the application or the main window. Reorder
>or remove pages as you want. Enter some meta information like the
>Title and save the new PDF."
>
>http://www.monkeybreadsoftware.de/Freeware/CombinePDFs.shtml
>
>It's nice and easy to use! JAS