A misconception regarding the PDF format (Re: Text processing in Unicode)

rajam rajam at EARTHLINK.NET
Fri Mar 26 23:27:56 UTC 2010

Just wanted to make sure that we understand the principle behind the  
acronym PDF ( "Printable Document Format").

So, if we want to have a searchable PDF, we should ask the powers
that be in the IT industry to develop something like an
"SPDF" ("Searchable Printable Document Format").

Hope you can understand what I mean.


On Mar 26, 2010, at 2:33 PM, rajam wrote:

> PDF documents are searchable--but we have to abide by the rules of  
> the PDF technology or we should devise our own technique to get
> around them.
> We need to respect the technology (PDF or other) which has its own  
> characteristics as any other software in the industry.
> I agree with JLC that PDF files are "E-paper" and the format was  
> not "primarily invented for being a text storage format and it has  
> never been guaranteed that round-trip conversions is always  
> possible between PDF files and text files."
> I'd like to add that expecting something "post-inventional" won't  
> help us unless we do something about it -- for example, tell the  
> creators/inventors of the software what we want to see the software  
> do for us now or in the future. That's why the IT world has "tech  
> support" departments and "feedback" channels.
> Most importantly, I feel that our wishes like this one (that PDF  
> documents should "be searchable") would be more effective if we  
> direct them to the IT industry (for example to Adobe or any PDF  
> developers) rather than expressing them only here in an academic  
> forum as if we are just complaining about technology.
> --vsr
> (<www.letsgrammar.org>)
> On Mar 26, 2010, at 7:16 AM, George Hart wrote:
>> I have been playing around with unicode in both Tamil and  
>> Devanagari.  On the Mac (Snow Leopard), it is not possible to  
>> search pdf's in either writing system -- nor is it possible to use  
>> Acrobat to export such files into rtf or other editable format.   
>> Using Nisus on the Mac, searching works perfectly for both writing  
>> systems, and Rajam's problem does not appear.  Many documents are  
>> available as pdf's, and it is quite important that they be  
>> searchable.  Unfortunately, that is not the case at this point  
>> with at least two important Indic writing systems.  George Hart

More information about the INDOLOGY mailing list