Availability of sTog Palace Manuscript Kanjur on CD-ROM
Birgit Kellner
birgit.kellner at UNIVIE.AC.AT
Thu Apr 29 01:04:14 UTC 2004
Richard MAHONEY wrote:
>I don't think that there is any doubt about the value of this
>collection. My only reservation is that all digital texts seem to be
>available only in Adobe Acrobat format (PDF files):
>
> http://www.tbrc.org/catalog/order.php
>
>This is a little disappointing. I would have preferred the texts to
>have been available in the Tag(ged) Image File Format (TIFF files):
>
> http://partners.adobe.com/asn/developer/PDFS/TN/TIFF6.pdf (specs)
>
>TIFF images of digital texts tend to be easy to manipulate using a
>wide variety of applications.
>
>So my question is whether any readers have been able to successfully
>convert TBRC PDF files into TIFF files (preferably multi-page
>files). I have experimented on my BSD machine with `pdftoppm' (from the
>Xpdf bundle) followed by `ppm2tiff' (from the libtiff bundle). This does
>work, but it is slow and tedious. I can imagine that this route would
>soon become a nightmare if one had a good number of TBRC PDF files to
>convert.
>
>I would very much appreciate any thoughts on this issue.
>
>
>Best regards,
>
> RBM
>
>
>
Richard,
I also used pdftoppm, but as part of an (equally slow and rather tedious)
process with different goals. Being mainly interested in the bsTan-'gyur
data, I wanted to extract all individual folios from the PDF files, which
often contain many works, and then build a navigation model to make
individual folios easier to find. This process can be partly automated;
the rest will take some time, but I'm not in a hurry and plan to
continue with it depending on the texts I'm reading. (I'm doing
this just for my own convenience, but would be happy to share any usable
results with the TBRC or others.)
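
The partly automated step described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used: it assumes that each PDF page holds exactly one folio side, that a work starts on side "a" of folio 1, and that the page range of each work within a PDF is already known. The file name, work name, page range and the helper names `folio_label`, `extraction_plan` and `navigation_index` are all hypothetical. The sketch only builds `pdftoppm` command lines (using its `-r`, `-f`, `-l` and `-gray` options) and a lookup table; it does not run them.

```python
def folio_label(offset):
    """Label the page at a 0-based offset within a work.

    Assumption: one folio side per PDF page, starting on side 'a'
    of folio 1, so 0 -> '1a', 1 -> '1b', 2 -> '2a', and so on.
    """
    return f"{offset // 2 + 1}{'ab'[offset % 2]}"


def extraction_plan(pdf_path, work, first_page, last_page, dpi=150):
    """Return one entry per PDF page of a work.

    Each entry records the folio label, the PDF page number, and a
    pdftoppm command that would render just that page in greyscale.
    """
    plan = []
    for page in range(first_page, last_page + 1):
        label = folio_label(page - first_page)
        # Output file root passed to pdftoppm; the naming is hypothetical.
        prefix = f"{work}_{label}"
        command = [
            "pdftoppm",
            "-r", str(dpi),    # resolution in DPI
            "-f", str(page),   # first page to render
            "-l", str(page),   # last page to render
            "-gray",           # greyscale output (PGM)
            pdf_path,
            prefix,
        ]
        plan.append(
            {"work": work, "folio": label, "page": page, "command": command}
        )
    return plan


def navigation_index(plan):
    """Map (work, folio label) to the PDF page that holds it."""
    return {(entry["work"], entry["folio"]): entry["page"] for entry in plan}


if __name__ == "__main__":
    # Hypothetical example: a work occupying pages 5-8 of "volume.pdf".
    for entry in extraction_plan("volume.pdf", "work1", 5, 8):
        print(entry["folio"], " ".join(entry["command"]))
```

The commands could then be run one by one (for instance with `subprocess.run`), and the index written to a file to serve as a simple navigation aid. Works that begin on a verso, or scans that put both sides on one page, would need a different labelling rule.
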
I also wanted to fit the entire bsTan-'gyur data on one CD by compressing
the images (JPEG) as much as possible, with screen legibility rather than
print quality as the main target. In this I didn't quite succeed, which may
be due to my own inexperience with image manipulation.
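
Whether such a target is reachable comes down to simple arithmetic: the usable capacity of the disc divided by the number of images gives the average size each image may have. The sketch below shows that calculation only. The nominal capacity of 700 million bytes, the image count in the example, and the function name `budget_per_image` are assumptions for illustration; the actual number of folio images is not stated here.

```python
def budget_per_image(image_count, capacity_bytes=700 * 10**6, reserved_bytes=0):
    """Return the average number of bytes available per image.

    capacity_bytes: assumed nominal CD capacity (700 million bytes here).
    reserved_bytes: space set aside for index or navigation files.
    """
    if image_count <= 0:
        raise ValueError("image_count must be positive")
    usable = capacity_bytes - reserved_bytes
    if usable <= 0:
        raise ValueError("no space left after the reserved amount")
    return usable // image_count


if __name__ == "__main__":
    # Hypothetical image count, chosen only to make the arithmetic visible.
    print(budget_per_image(70_000))
```

If the resulting budget is far below what a legible JPEG of a folio needs, no choice of compression settings will make the collection fit on a single disc.
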
I haven't worked much with TIFF data; could you perhaps spell out some
of the things one could do with these TIFF files, apart from converting
them into other formats? OCR is the most obvious candidate, but is there
already reasonable software around for Tibetan blockprints?
Best regards,
Birgit Kellner