7 Bit tutorial

Claude_Setzer at fcsmtp.mum.edu Claude_Setzer at fcsmtp.mum.edu
Mon Dec 16 19:00:07 UTC 1996

In my recent posting, I innocently assumed that the people I was addressing
knew more about computers and file systems than I do. Since they had been very
active in discussing 7 bit systems and even writing software to use them, this
seemed a reasonable assumption. I was quite surprised to be accused of "not
speaking English." I was also very surprised that many people, including some
that are very knowledgeable computer experts, have many misunderstandings
about how characters are generated and processed by a computer and by email. I
do not in any way claim to be an expert, but will try to help the situation
with this comment.

The main point of my posting was to try to remove what I believe is a very
incorrect belief that many people have. It seems that the computer people have
convinced most of the world population that there is a serious technical reason
that prevents computers from using 8 bit characters. It is my strong belief
that this is completely untrue and a relatively easy problem to fix,
especially compared to all the tens of thousands of hours that have been put
into attempts to consistently use 7 bit systems and get them to cooperate with
8 bit systems. Based on recent postings, I don't think it will ever be
possible to get people to agree on a single 7 bit coding system. Many people
are very emotionally attached to their own favorite. On the other hand, it
doesn't seem that there is much disagreement on 8 bit character standards,
since most people already use those characters for handwritten messages. This
seems to be, in every way, a less expensive and more effective system to use.

7 and 8 Bits Tutorial:

A bit is the most fundamental piece of information processed by a computer.
Physically it corresponds to one of two voltage levels in an electronic
circuit. Logically, these two voltage levels are represented by either a one
or a zero. To simplify the circuits and to ensure accuracy, computers store
and transfer information in groups of bits, often called words. Since the late
1970s, all computers have used 8 bit (or larger) "words."

Therefore, when you type a character on your keyboard, it always gets
transmitted as an 8 bit "word." There is no choice because the computer is not
capable of transmitting an odd length word, such as 7. Actually most fonts are
8 bit fonts, too. The problem arises when some software or hardware starts
ignoring the last bit in every word. Then the 8 bit characters are translated
into 7 bit ones, often with dire consequences.
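As a minimal sketch of what that loss looks like, the following Python
fragment clears one bit in every byte. It assumes that the "last bit" means
the high-order (eighth) bit and that the text is stored in ISO 8859-1
(Latin-1); the function name strip_high_bit is made up for this illustration.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the high-order bit of every byte, as a 7 bit channel might."""
    return bytes(b & 0x7F for b in data)


# Assumption: the character is stored as a single Latin-1 byte.
original = "\u00e9".encode("latin-1")   # e with acute accent, byte 0xE9
stripped = strip_high_bit(original)

print(format(original[0], "08b"))   # 11101001
print(format(stripped[0], "08b"))   # 01101001
print(stripped.decode("ascii"))     # i
```

Plain ASCII text passes through unchanged, because its eighth bit is already
zero. Only characters in the upper half of the 8 bit range are turned into
something else.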

An eight bit computer word allows a character (represented by a string of 8
ones and zeros) to take on one of 256 values. In most languages, 256
characters are adequate to represent at least the most commonly used
characters. When we only look at the first seven bits, however, half of those
potential characters are ignored, or worse, translated into a different
character. With a 7 bit font, only 128 characters can be used. Although this
seems like a lot, many of those are taken up by punctuation and control
signals, so it actually puts a rather severe limitation on communication. In
the case of transliterated Devanagari, you lose all or most of the
diacritical marks.
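The arithmetic can be checked with a short Python sketch. It assumes ASCII as
the 7 bit character set, and it uses Unicode normalization only as a
convenient way to show a word with its diacritical marks removed; the helper
name drop_diacritics is hypothetical.

```python
import unicodedata

SEVEN_BIT_CODES = 2 ** 7   # 128 possible values
EIGHT_BIT_CODES = 2 ** 8   # 256 possible values

# In ASCII, codes 0-31 and 127 are control codes, not printable characters.
ascii_control = [c for c in range(SEVEN_BIT_CODES) if c < 32 or c == 127]
ascii_printable = SEVEN_BIT_CODES - len(ascii_control)


def drop_diacritics(text: str) -> str:
    """Return text with combining marks removed, leaving the base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


word = "Devan\u0101gar\u012b"   # "Devanagari" with macrons on two vowels

print(SEVEN_BIT_CODES, EIGHT_BIT_CODES)       # 128 256
print(len(ascii_control), ascii_printable)    # 33 95
print(drop_diacritics(word))                  # Devanagari
```

Of the 95 printable ASCII codes, one is the space and 32 are punctuation or
symbols, which leaves little room beyond the unaccented letters and digits.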

With respect to email, there are two separate problems: body text and attached
files. It is possible for an email system to independently use 7 or 8 bit
capabilities in each area. Body text relates to what you type into the message
area. An attached file can often be sent correctly with 8 bit fonts even when
your email does not support 8 bit fonts in the body text. The problem gets
more complicated when an email system decides your body text is too large and
converts it into an attached file. This usually happens at the receiver end.
When you send and receive an attached file, both email systems must be capable
of interpreting the type of encoding that is used for transmission, and both
computers must have a font with the same character placement.
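Both requirements can be illustrated with a small Python sketch. It assumes
Base64 as the transmission encoding (one common choice, not necessarily what
any particular mail system uses) and takes Latin-1 and the IBM PC code page
437 as two examples of character tables with different placement.

```python
import base64

body = "na\u00efve caf\u00e9"       # contains two accented letters
raw = body.encode("latin-1")        # 8 bit bytes; two have the high bit set

# Base64 re-expresses arbitrary bytes using only 7 bit safe characters,
# so the data survives a 7 bit mail path if both ends understand Base64.
wire = base64.b64encode(raw)
seven_bit_safe = all(b < 128 for b in wire)
received = base64.b64decode(wire)

print(seven_bit_safe)                # True
print(received == raw)               # True

# The same bytes read with two different character tables.
print(received.decode("latin-1"))    # the original text
print(received.decode("cp437"))      # other characters at the same byte values
```

The bytes arrive intact in both cases. What differs is only the table used to
turn byte values back into characters, which is why sender and receiver need
matching character placement.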

Another misunderstanding comes in with respect to the relationship between the
keyboard, the computer display screen, and what is stored in a file. There is
no fixed relationship between these. A software package may let you type in on
the keyboard using your choice of 7 or 8 bit fonts, display on the screen in a
16 bit font (for example, Peter Freund and Ralph Bunker are working on a
Devanagari font set that has over a thousand character possibilities; they call
this a font tray, I believe, but it could have been a 16 bit font), and save
to a file using a different 7, 8 or 16 bit font. For example, you could type
in using your favorite 7 bit Romanized font of English characters, and the
screen could display what is typed, or it could display a Romanized font with
diacritical marks, or it could display just as easily in Devanagari or Telugu
fonts that could be 8 or even 16 bits in size. When the file is saved, it
could be in any of these formats, or a totally different one.
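A rough Python sketch of this separation follows. The typing convention (a
doubled vowel stands for a long vowel) is a hypothetical one invented for the
example, not any of the schemes discussed on the list, and UTF-8 and UTF-16
merely stand in for "a different 8 or 16 bit format".

```python
# Hypothetical 7 bit input convention: a doubled vowel means a long vowel.
TYPED_TO_DISPLAY = {"aa": "\u0101", "ii": "\u012b", "uu": "\u016b"}


def to_display(typed: str) -> str:
    """Convert the typed 7 bit form into text with diacritical marks."""
    for typed_form, shown in TYPED_TO_DISPLAY.items():
        typed = typed.replace(typed_form, shown)
    return typed


typed = "giitaa"              # what the keyboard sends: plain 7 bit letters
shown = to_display(typed)     # what the screen shows: macrons on i and a

# The same text can be saved in several independent formats.
as_typed = typed.encode("ascii")        # 7 bit form, 6 bytes
as_utf8 = shown.encode("utf-8")         # 8 bit units, 6 bytes
as_utf16 = shown.encode("utf-16-be")    # 16 bit units, 8 bytes

print(len(shown))                                   # 4
print(len(as_typed), len(as_utf8), len(as_utf16))   # 6 6 8
```

Nothing about the keyboard input fixes the display form or the stored form;
each step is a separate mapping chosen by the software.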

The point I am trying to make is that computers use 8 bit fonts anyway, and
our lives will probably become much easier and simpler if we put our time into
standardizing on an 8 bit system than if we continue working on the much more
limited and emotionally charged 7 bit systems. Yes, we do need 7 bit systems
today, but I don't think there is any technical reason why we should need them
tomorrow. And I think it will be much easier to please everyone with 8 bit

Sincerely,

Claude Setzer       csetzer at mum.edu    or    cssetzer at mum.edu



More information about the INDOLOGY mailing list