[r6rs-discuss] unicode (re comment #134) from John Cowan on 2006-12-17 (r6rs-discuss.mbox)

From: John Cowan <cowan>
Date: Sun Dec 17 00:46:48 2006

Thomas Lord scripsit:

> Noncharacter code points are explicitly described as suitable
> for internal use.

So they are, and R5.91RS explicitly permits them. Noncharacter code
points are not the same as surrogate code points, which are *not*
explicitly described as suitable (and are not suitable) for internal use.

Specifically, allowing the representation of surrogate code points
means that UTF-16 cannot be used as an internal representation at all,
because it cannot distinguish two consecutive surrogate code points
from a single non-BMP character. It also means that UTF-8 and UTF-32
cannot be used directly either, but only in the form of non-standard
variants.
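
The UTF-16 collision can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the helper names
(utf16_units, strict_utf8_rejects) are hypothetical, and the encoder
deliberately passes surrogate code points through unchanged, which is
what an implementation would have to do if surrogates were legal
character values. U+1F600 is an arbitrary non-BMP example.

```python
def utf16_units(code_points):
    """Encode code points as 16-bit units.

    Assumption for illustration: surrogate code points (U+D800..U+DFFF)
    are allowed as input and are emitted unchanged as single units.
    """
    units = []
    for cp in code_points:
        if cp >= 0x10000:
            # Standard UTF-16 surrogate-pair encoding of a non-BMP character.
            offset = cp - 0x10000
            units.append(0xD800 + (offset >> 10))
            units.append(0xDC00 + (offset & 0x3FF))
        else:
            units.append(cp)
    return units


# Two consecutive surrogate code points versus one non-BMP character.
surrogate_pair = [0xD83D, 0xDE00]
non_bmp_char = [0x1F600]

# Both inputs produce the same code units, so a decoder cannot tell them apart.
same_units = utf16_units(surrogate_pair) == utf16_units(non_bmp_char)


def strict_utf8_rejects(cp):
    """Return True if Python's standard UTF-8 encoder refuses this code point."""
    try:
        chr(cp).encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# A lone surrogate can only be written out with a non-standard variant,
# here Python's "surrogatepass" error handler.
variant_bytes = chr(0xD83D).encode("utf-8", "surrogatepass")
```

Here same_units is true, and the standard UTF-8 encoder rejects U+D83D;
only the "surrogatepass" variant will write it out.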

> For every natural number (integers greater than or equal to 0)
> there exists a distinct CHAR value. The set of all such
> values are called "simple characters".

Whatever for? There does not exist a countable infinity of simple
characters to represent, Galactic Empire or no. The number is
*always* going to be finite, by the nature of graphical representations:
if there were a countable infinity of characters, there would be for
each character infinitely many that are essentially indistinguishable
from it, since each character can be represented as a pixel grid of
finite size.

I omit the rest, since it depends on this original and useless notion.

-- 
"But I am the real Strider, fortunately,"       John Cowan
he said, looking down at them with his face     cowan_at_ccil.org
softened by a sudden smile.  "I am Aragorn son  http://www.ccil.org/~cowan
of Arathorn, and if by life or death I can
save you, I will."  --LotR Book I Chapter 10

Received on Sun Dec 17 2006 - 00:46:38 UTC