CSX+: final draft

John Smith jds10 at CUS.CAM.AC.UK
Thu Jul 23 14:11:51 UTC 1998

This message is being posted to both the Indology and Conv-dev mailing
lists. Apologies to those who receive it twice.

Some weeks ago I announced a proposal for an enhanced version of the
CSX character set, to be named CSX+: the aim was to provide a broadly
CSX-compatible encoding containing a number of new characters whose
absence in CSX had caused problems to users. Specifically, it was
proposed that CSX+ should implement all the characters required by the
draft ISO/TC46/SC2 standard for transliteration of Indian languages:
this has been under discussion for the past year on the conv-dev
mailing list, and is now nearing completion.

Recent contributions to the conv-dev discussion have made it
impossible to finalise CSX+ until now, since significant changes to
the draft standard have continued to be debated. However, it does now
appear that a broad consensus may have been reached, and I think the
time has come to agree a definitive version of the encoding.

The attached file contains the proposed definition of the CSX+
character set. It differs in three respects from the previous version:
(1) character 156 is now e-macron-tilde, not sterling (pound sign);
(2) character 180 is now m-breve, not l-underring-acute; (3) character
192 is now o-macron-tilde, not M-overdot. The characters sacrificed
appear to me to be the most readily disposable ones, and their
replacements are necessary: m-breve is used for the Sinhalese
half-nasal before a labial, and e-macron-tilde and o-macron-tilde are
required for citing forms from northern Indian languages in a southern
(or pan-Indian) context.

I would welcome comments on this final draft of the CSX+ "standard".
Assuming it meets with general approval, I shall release a set of
fonts implementing it in about one week's time.

John Smith

Dr J. D. Smith                *  jds10 at cam.ac.uk
Faculty of Oriental Studies   *  Tel. 01223 335140 (Switchboard 01223 335106)
Sidgwick Avenue               *  Fax  01223 335110
Cambridge CB3 9DA             *  http://bombay.oriental.cam.ac.uk/index.html

# CSX+ encoding for mkt1font and vpl2vpl
# Enhanced version of CSX (Classical Sanskrit eXtended encoding)
# for the representation of Indian languages in Roman script
# CSX+ aims to be downward compatible with CSX, save for moving aacute
# away from the slot (decimal 160) used as non-breaking space on PCs.
# It also seeks to implement the (draft) ISO/TC46/SC2 standard, while
# retaining a useful set of European accented characters and adding
# dashes and directional double quotes.

128	C cedilla
129	u dieresis
130	e acute
131	a circumflex
132	a dieresis
133	a grave
134	a ring
135	c cedilla
136	e circumflex
137	e dieresis
138	e grave
139	i dieresis
140	i circumflex
141	i grave
142	A dieresis
143	A ring
144	E acute
145	ae
146	AE
147	o circumflex
148	o dieresis
149	o grave
150	u circumflex
151	u grave
152	ae macron		# Was y dieresis in CSX
153	O dieresis
154	U dieresis
155	u breve			# Was cent in CSX
156	emacron tilde		# Was sterling in CSX
157	r underring		# Was yen in CSX
158	a acute
159	r underbar
160	space			# Non-breaking space on PC: was a acute in CSX
161	i acute
162	o acute
163	u acute
164	n tilde
165	N tilde
166	l tilde
167	m overdot
168	amacron breve
169	imacron breve
170	umacron breve
171	amacron tilde
172	imacron tilde
173	n underbar
174	runderring macron	# Was guillemotleft in CSX
175	l underring		# Was guillemotright in CSX
176	lunderring macron
177	runderring acute
178	runderring grave
179	runderringmacron acute
180	m breve
181	amacron acute
182	amacron grave
183	imacron acute
184	imacron grave
185	e macron
186	o macron
187	R underring
188	y overdot
189	umacron acute
190	umacron grave
191	r breve
192	omacron tilde
193	m candrabindu
194	t underbar
195	E macron
196	O macron
197	n breve
198	runderdot acute
199	runderdot grave
200	K h			# Overwritten by next definition
200	Kh underbar
201	k underbar
202	space			# Non-breaking space on Macintosh
203	AE macron
204	k h			# Overwritten by next definition
204	kh underbar
205	g overdot
206	c circumflex
207	runderdotmacron acute
208	a tilde
209	i tilde
210	u tilde
211	e tilde
212	o tilde
213	e breve
214	o breve
215	l underbar
216	umacron tilde
217	G overdot
218	C circumflex
219	h underbar
220	h underbreve
221	endash
222	emdash
223	quotedblleft
224	a macron
225	germandbls
226	A macron
227	i macron
228	I macron
229	u macron
230	U macron
231	r underdot
232	R underdot
233	runderdot macron
234	Runderdot macron
235	l underdot
236	L underdot
237	lunderdot macron
238	Lunderdot macron
239	n overdot
240	N overdot
241	t underdot
242	T underdot
243	d underdot
244	D underdot
245	n underdot
246	N underdot
247	s acute
248	S acute
249	s underdot
250	S underdot
251	quotedblright
252	m underdot
253	M underdot
254	h underdot
255	H underdot

More information about the INDOLOGY mailing list