[r6rs-discuss] on rationale 9.6 Characters and Strings

From: John Cowan <cowan>
Date: Wed, 27 Jun 2007 02:25:13 -0400

Thomas Lord scripsit:

> The draft, on the other hand,
> enumerates a list of abstract characters which must be present and which
> are the only characters that /may/ be present in an implementation.

I don't have a problem with allowing Scheme systems to have
implementation-dependent characters over and above Unicode scalar
values, provided that char->integer assigns them values in excess
of #x10FFFF.
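
To make the proposed partition concrete, here is a minimal sketch in Python rather than Scheme. The names classify_char_code, MAX_SCALAR, and the category labels are hypothetical and chosen for illustration only; nothing here is taken from the R6RS draft. It assumes that an implementation would map its extra characters to integers above #x10FFFF, as suggested above.

```python
# Hypothetical illustration: partition the integers that a char->integer-like
# procedure might return into Unicode scalar values and everything else.

MAX_SCALAR = 0x10FFFF          # largest Unicode code point
SURROGATE_FIRST = 0xD800       # first surrogate code point
SURROGATE_LAST = 0xDFFF        # last surrogate code point


def is_unicode_scalar_value(n: int) -> bool:
    """True if n is a Unicode scalar value (a code point that is not a surrogate)."""
    if n < 0 or n > MAX_SCALAR:
        return False
    return not (SURROGATE_FIRST <= n <= SURROGATE_LAST)


def classify_char_code(n: int) -> str:
    """Classify an integer under the scheme sketched in the message above."""
    if is_unicode_scalar_value(n):
        return "unicode-scalar"
    if n > MAX_SCALAR:
        # Assumption: this range is free for implementation-dependent characters.
        return "implementation-dependent"
    # Negative numbers and surrogate code points are not characters here.
    return "invalid"
```

With these definitions, classify_char_code(0x41) falls in the Unicode range, classify_char_code(0x110000) falls in the implementation-dependent range, and classify_char_code(0xD800) is rejected as a surrogate.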

> This portion of the rationale is simply confused. The phrase "carries
> no semantic information at all" is particularly inexplicable (because,
> of course: sequences of UTF-16 code values have perfectly well-defined
> semantics!).

Sequences, yes. Individual surrogate characters, no.
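
As an illustration of the distinction, the sketch below (Python; the function name combine_surrogates is hypothetical) shows that a high/low surrogate pair maps to exactly one scalar value through the standard UTF-16 arithmetic, whereas a single surrogate supplies only half of the bits and designates no character by itself.

```python
# Hypothetical illustration of UTF-16 surrogate pair decoding.

def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode scalar value."""
    if not (0xD800 <= high <= 0xDBFF):
        raise ValueError("not a high surrogate: %#x" % high)
    if not (0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a low surrogate: %#x" % low)
    # Each surrogate carries 10 bits; together they index the
    # supplementary planes, which start at #x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The pair D834 DD1E denotes U+1D11E (MUSICAL SYMBOL G CLEF).
g_clef = combine_surrogates(0xD834, 0xDD1E)
```

Neither #xD834 nor #xDD1E taken alone identifies a character; only the pair does.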

> Not all texts are expressed over a taxonomy of writing systems which has
> been recognized by the Unicode consortium and, indeed, some texts are
> understood to be in writing systems that the Unicode consortium has
> explicitly declined to encode.

Can you give an example? Klingon was rejected for Unicode encoding
because in fact no one could point to texts in the Klingon language
written in Klingon script.

-- 
John Cowan      cowan at ccil.org        http://www.ccil.org/~cowan
        Is it not written, "That which is written, is written"?
Received on Wed Jun 27 2007 - 02:25:13 UTC
