ACII, ISCII, Unicode [was: CSX+ fonts &c]

Jaap Pranger yaap at XS4ALL.NL
Thu Jun 17 01:51:49 UTC 1999

On Tue, 15 Jun 1999 John Smith wrote: 

>On Tue, 15 Jun 1999, Jaap Pranger wrote:
>> ...
>> Context sensitive rendering will also be available through UniScribe
>> (the Windows Unicode Script Processor) in Windows NT 5.0,

>This was news to me. Is there any indication of when an Indian-language
>renderer might appear?

Not really, "UniScribe is expected to ship with Internet Explorer 5.0" 

For details of Uniscribe: 

To wet your appetite, some relevant paragraphs below.              

<copy> --- <> ---  

Multilanguage text support in Windows 2000
[article by] F. Avery Bishop, David C. Brown, David M. Meltzer

Multilingual Features in Windows NT 5.0 

All language versions of Windows NT 5.0 will be enabled for all supported 
languages, including European, and Far Eastern. This includes languages 
written with complex scripts such as Arabic, Hebrew, Thai, Devanagari, 
and Tamil. Applications that display plain text using Unicode can handle
mixed text from any of the supported scripts. For example, you can pass 
a Unicode string containing French, Thai, Hindi, Korean, and Arabic text 
to ExtTextOutW, which will display the whole string in one pass. Thus, 
a Unicode application (one compiled with the -DUNICODE option) can display 
text in any of the supported scripts, without changing locales, on any 
language version of Windows NT 5.0. [............................................]

As you can see from the sample code in Figure 1, Indic scripts must be 
handled separately because they have no charset values. Since there is 
no default ACP value for Indic scripts, none of the Win32® ANSI entry 
points (the A routines) will work with Indic text. Indic text is not 
automatically translated to Unicode. There are ways to force translation 
of Indic text to or from Unicode using MultiByteToWideChar and 
WideCharToMultiByte by specifying the appropriate code page. 
However, an Indic input locale can only pass Indic text to a Unicode 
application, so full support for Indic scripts requires a Unicode 
application. [............................................]

There are two requirements for displaying complex scripts correctly 
using the standard Win32-based applications. First, applications should 
save characters in a buffer and display the whole line of text at once 
rather than, for example, calling ExtTextOut on each character as it is 
typed in by the user. When characters are written out one by one, the 
complex script shaping modules cannot determine the context for correct 
reordering and glyph shaping.   [............................................]

Uniscribe [Unicode Script Processor (USP10.DLL)]

The new Unicode Script Processor (USP10.DLL), also known as Uniscribe, 
is a collection of APIs that enables a text layout client to format complex 
scripts. Uniscribe supports the complex rules found in scripts such as 
Arabic, Indian, and Thai. Uniscribe also handles scripts written from 
right-to-left such as Arabic or Hebrew, and supports the mixing of scripts. 
For plain-text clients, Uniscribe provides a range of ScriptString functions 
that are similar to TextOut, with additional support for caret placement. 
The remainder of the Uniscribe interfaces provide finer control to clients. 

Although native to Windows NT 5.0, the Uniscribe DLL may also be distributed 
for use on Windows NT 4.0, Windows 95, and Windows 98-based systems. 
USP10.DLL is also expected to ship with Microsoft® Internet Explorer 5.0. 

Uniscribe uses multiple shaping engines that contain the layout knowledge for 
particular scripts (see Figure 4 ). It also takes advantage of the OpenType 
layout shaping engine for handling font-specific script features such as glyph 
generation, extent measurement, and word-breaking support. 



Jaap Pranger 


More information about the INDOLOGY mailing list