ACII, ISCII, Unicode [was: CSX+ fonts &c]
yaap at XS4ALL.NL
Thu Jun 17 01:51:49 UTC 1999
On Tue, 15 Jun 1999 John Smith wrote:
>On Tue, 15 Jun 1999, Jaap Pranger wrote:
>> Context sensitive rendering will also be available through UniScribe
>> (the Windows Unicode Script Processor) in Windows NT 5.0,
>This was news to me. Is there any indication of when an Indian-language
>renderer might appear?
Not really, "UniScribe is expected to ship with Internet Explorer 5.0"
For details of Uniscribe:
To wet your appetite, some relevant paragraphs below.
<copy> --- <http://www.microsoft.com/globaldev/> ---
Multilanguage text support in Windows 2000
[article by] F. Avery Bishop, David C. Brown, David M. Meltzer
Multilingual Features in Windows NT 5.0
All language versions of Windows NT 5.0 will be enabled for all supported
languages, including European, and Far Eastern. This includes languages
written with complex scripts such as Arabic, Hebrew, Thai, Devanagari,
and Tamil. Applications that display plain text using Unicode can handle
mixed text from any of the supported scripts. For example, you can pass
a Unicode string containing French, Thai, Hindi, Korean, and Arabic text
to ExtTextOutW, which will display the whole string in one pass. Thus,
a Unicode application (one compiled with the -DUNICODE option) can display
text in any of the supported scripts, without changing locales, on any
language version of Windows NT 5.0. [............................................]
As you can see from the sample code in Figure 1, Indic scripts must be
handled separately because they have no charset values. Since there is
no default ACP value for Indic scripts, none of the Win32® ANSI entry
points (the A routines) will work with Indic text. Indic text is not
automatically translated to Unicode. There are ways to force translation
of Indic text to or from Unicode using MultiByteToWideChar and
WideCharToMultiByte by specifying the appropriate code page.
However, an Indic input locale can only pass Indic text to a Unicode
application, so full support for Indic scripts requires a Unicode
There are two requirements for displaying complex scripts correctly
using the standard Win32-based applications. First, applications should
save characters in a buffer and display the whole line of text at once
rather than, for example, calling ExtTextOut on each character as it is
typed in by the user. When characters are written out one by one, the
complex script shaping modules cannot determine the context for correct
reordering and glyph shaping. [............................................]
Uniscribe [Unicode Script Processor (USP10.DLL)]
The new Unicode Script Processor (USP10.DLL), also known as Uniscribe,
is a collection of APIs that enables a text layout client to format complex
scripts. Uniscribe supports the complex rules found in scripts such as
Arabic, Indian, and Thai. Uniscribe also handles scripts written from
right-to-left such as Arabic or Hebrew, and supports the mixing of scripts.
For plain-text clients, Uniscribe provides a range of ScriptString functions
that are similar to TextOut, with additional support for caret placement.
The remainder of the Uniscribe interfaces provide finer control to clients.
Although native to Windows NT 5.0, the Uniscribe DLL may also be distributed
for use on Windows NT 4.0, Windows 95, and Windows 98-based systems.
USP10.DLL is also expected to ship with Microsoft® Internet Explorer 5.0.
Uniscribe uses multiple shaping engines that contain the layout knowledge for
particular scripts (see Figure 4 ). It also takes advantage of the OpenType
layout shaping engine for handling font-specific script features such as glyph
generation, extent measurement, and word-breaking support.
More information about the INDOLOGY