Hans,
I think this conversation is slightly tangential to the original question, but I believe it is useful for all members of the group. I hope it is okay to continue the discussion a little further.
I should have spoken about charsets in my previous email. The use of the wrong charset is perhaps the most likely cause of garbled text, much more so than the use of non-Unicode applications. Most modern word-processing applications are Unicode compatible, but if the default character encoding is set to anything other than Unicode (UTF-8), you are likely to see garbled text when you open or edit a document that was saved in Unicode.
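If it helps to see the failure mode concretely, here is a rough Python illustration. The sample text and the exact garbled output are only for demonstration; the idea is simply that UTF-8 bytes read back under a windows-1252 default turn into mojibake:

    # Unicode text, as a Unicode-aware editor would save it
    text = "saṃskṛta upaniṣad"
    utf8_bytes = text.encode("utf-8")

    # An application whose default charset is windows-1252 interprets
    # the same bytes like this:
    garbled = utf8_bytes.decode("windows-1252")
    print(garbled)   # roughly "saá¹ƒská¹›ta upaniá¹£ad"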
Recent versions of Microsoft Word do not let you change the default encoding away from "Unicode (UTF-8)" easily, but earlier versions of Microsoft Word had the default encoding set to windows-1252 (a superset of ISO/IEC 8859-1). If the default encoding is set to one of those pre-Unicode code pages, then a Unicode document saved in that application will keep only the characters that exist in that code page and will replace the rest, including diacritics such as ā, ī, ū, ē and ō, with question marks, garbled text or blank spaces. I think this is the primary cause of most of the issues with your volume.
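To make that concrete, here is a small sketch of the same thing in the other direction, saving Unicode text into a legacy code page; the exact behaviour varies by application, so treat this only as an illustration:

    # Characters that exist in windows-1252 survive; everything else
    # degrades to "?" when the document is saved in that code page.
    text = "Kṛṣṇa, café, ā ī ū"
    saved = text.encode("windows-1252", errors="replace")
    print(saved.decode("windows-1252"))   # "K???a, café, ? ? ?"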
For a long time, Yahoo Mail and Yahoo Groups also had the default encoding set to ISO-8859-1, although they supported Unicode (UTF-8) as well. That means anyone receiving an email in Unicode could read it fine, but if they forwarded or replied to it, the new message was re-encoded in ISO-8859-1 and the next recipient saw garbled text for any characters outside that charset. I think Yahoo switched fully to Unicode by 2010, a bit later than most other popular web portals.
As for non-Unicode applications, Adobe PageMaker has been the biggest offender in my experience. Several publishing houses in India still use Adobe PageMaker to prepare material for print, although PageMaker cannot accept Unicode characters. Recently, we had to write a program to map all the Unicode-based diacritics to their equivalents in a specific legacy font in order to publish the book in India.
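The details of that program are tied to the particular font, so the snippet below is only a rough sketch of the idea; the code points on the right are made-up stand-ins for whatever slots the legacy font actually uses:

    # Hypothetical mapping from Unicode diacritics to the code points
    # a legacy (non-Unicode) font expects. Values depend on the font.
    LEGACY_FONT_MAP = str.maketrans({
        "ā": "\u00e2",
        "ī": "\u00ee",
        "ū": "\u00fb",
        "ṛ": "\u00e5",
        "ṣ": "\u00ed",
    })

    def to_legacy_font(text: str) -> str:
        """Convert Unicode IAST text to the legacy font's private encoding."""
        return text.translate(LEGACY_FONT_MAP)

    print(to_legacy_font("upaniṣad"))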
I believe there is virtually no excuse in this day and age not to use Unicode all the way. Some specialized encodings may be more compact than the Unicode encodings for certain languages, but unless you are storing terabytes of very specialized text, there is usually no reason to change your encoding from the default Unicode (UTF-8). If you have older documents that used other character encodings, it is better to convert them to "Unicode (UTF-8)" as well.
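If you want to batch-convert older files, something along these lines works; I am assuming the old files are plain text and that you know (or can guess) their original encoding, and the folder name is just an example:

    from pathlib import Path

    def convert_to_utf8(path: Path, source_encoding: str = "windows-1252") -> None:
        """Rewrite a text file in place as UTF-8."""
        text = path.read_text(encoding=source_encoding)
        path.write_text(text, encoding="utf-8")

    # Convert every .txt file in a (hypothetical) folder of old documents.
    for p in Path("old_documents").glob("*.txt"):
        convert_to_utf8(p)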
Regards,
Suresh.