7 Bit tutorial
Toru Tomabechi
Toru.Tomabechi at orient.unil.ch
Mon Dec 16 22:24:50 UTC 1996
A few words on 7bit/8bit issue.
There seems to be some confusion between character-sets and their
encodings. No official character set (except for UNICODE which uses
16bits) is necessarily either 8bit- or 7bit-encoded. For example,
ISO-8859-1 (Latin-1 character set) can be encoded by the both encoding
methods. ISO-8859-1 is encoded using 8th bit when it is a part of X
Window system's compound-text encoding, but it uses only 7 bits within
IS0-2022-JP encoding. It should also be noted that huge
character-sets, such as Japanese Kanji, can be encoded with two
7bit-sequences (see Ken Lunde's _Understanding Japanese Information
Processing_, O'Reilly & Associates, 1993).
As for the mail-transfer protocol, it uses only 7 bits. Nevertheless,
it can handle correctly ISO-8859-1 characters along with
US-ASCII. This is done by an encoding-decoding convention which is to
be found in the RFC documentation. When you write an e-mail containing
some so-called `8bit' characters, if your mailer program is
intelligent enough to encode it according to such a convention, the
e-mail is treated correctly. There are some types of such
convention. The one often used in one byte area is the method called
`quoted-printable', while in two-byte character area, such as Japan,
the method with escape sequence is widely used. The Japanese
mailing-list for my favorite text editor, Mule (Multilingual
Enhancement to GNU Emacs), uses the latter encoding method, and people
there discuss using not only Japanese Kanji and US-ASCII, but also ISO
Latin 1, 2, 3, 4, 5, Cyrillic (ISO-8859-5), Greek (ISO-8859-7),
Chinese GB, Korean KSC, Thai TIS620, ... characters (devanagari is now
under development). And these character-sets are all encoded with
only 7 bits (or two 7bit sequences).
Apologies if Claude already knows what I wrote above. My point is: the
discussion should be not on the `8bit-7bit' issue, but on the
character-set which we need.
Toru Tomabechi
University of Lausanne
More information about the INDOLOGY
mailing list