[r6rs-discuss] [Formal] the CHAR? type

From: John Cowan <cowan>
Date: Fri Nov 17 14:40:48 2006

Thomas Lord scripsit:

> The restriction in section 9.14, prohibiting the domain of
> INTEGER->CHAR from including surrogates, should be relaxed.
> Implementations should be permitted, not required, to adopt
> that restriction.

I'm against it.
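
A minimal sketch of the restriction being debated, written in Python rather than Scheme. It assumes the domain of INTEGER->CHAR is the set of Unicode scalar values, that is, [0, #xD7FF] together with [#xE000, #x10FFFF]. The names is_unicode_scalar_value and integer_to_char are hypothetical and are not taken from the draft report.

```python
def is_unicode_scalar_value(n: int) -> bool:
    """Return True when n is a code point outside the surrogate range."""
    return 0 <= n <= 0xD7FF or 0xE000 <= n <= 0x10FFFF


def integer_to_char(n: int) -> str:
    """Hypothetical analogue of INTEGER->CHAR with the surrogate restriction."""
    if not is_unicode_scalar_value(n):
        raise ValueError(f"not a Unicode scalar value: {n:#x}")
    return chr(n)
```

Under the relaxed model proposed above, the check would be dropped or widened, and any integer in the supported range would map to a character.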

> 1. In general, the less restricted model is simpler and more
> powerful. In an implementation without the restriction,
> the CHAR? type can simply be isomorphic with a set of
> exact integers in some (possibly improper) superset of
> [0,#xFFFFFFFF].

If you want u32 vectors, you know where to find them.

> That enables things like "bucky bits"
> (a fine lisp tradition).

Such a fine old tradition, in fact, that they were made optional in CLtL1
and removed altogether from CLtL2/ANSI CL. They were also accompanied
in CLtL1 by a type called "string-char", which implementations could
define as a subset of "char" that excluded some or all of the bucky bits.

Allowing arbitrary u32 values without creating a string-char type means
that at least one means of representing strings must be as a u32 vector.
Using the Unicode definition makes it possible to use UTF-8 or UTF-16
internally throughout.
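
The representation point can be illustrated with a short Python sketch. It relies on the behavior of Python 3's standard codecs, which refuse to encode lone surrogates and cannot represent values above #x10FFFF at all. The helper name can_encode is hypothetical.

```python
def can_encode(n: int, encoding: str) -> bool:
    """Report whether integer n can be stored as one character in `encoding`."""
    try:
        # chr() rejects values above 0x10FFFF; the codec rejects surrogates.
        chr(n).encode(encoding)
        return True
    except (ValueError, OverflowError):
        # UnicodeEncodeError is a subclass of ValueError.
        return False


for n in (0x41, 0x1F600, 0xD800, 0x110000):
    print(hex(n), can_encode(n, "utf-8"), can_encode(n, "utf-16"))
```

Scalar values such as 0x41 and 0x1F600 encode in both forms, while a surrogate or an out-of-range integer does not. A string type that must hold such values therefore needs a wider representation, such as a u32 vector.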

> It is certainly easy to teach and learn. It seems to be simpler to
> implement, too.

So is weak typing a la C.

> 2. The I/O issues can be solved in a clever way -- by
> reinterpreting ill-formed UTF-8 and UTF-16 as spellings
> of sequences of certain private-use codepoints.
> Round-trips with processes that don't understand these
> private use characters are perfectly robust to the
> extent that those processes are conforming.

Those who try to reinvent Unicode, etc. There are several ways to resolve
ill-formed byte sequences: replace with U+FFFD, throw an exception,
ignore junk. This is just what is already provided.
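
The three policies listed above can be sketched with the error handlers of Python's built-in UTF-8 decoder. This is only an illustration in another language, not a description of what the R6RS draft specifies. The sample bytes and the function name decode_with_policy are hypothetical.

```python
def decode_with_policy(data: bytes, policy: str) -> str:
    """Decode UTF-8 using one of: 'replace', 'strict', 'ignore'."""
    # 'replace' substitutes U+FFFD, 'strict' raises UnicodeDecodeError,
    # and 'ignore' silently drops the ill-formed bytes.
    return data.decode("utf-8", errors=policy)


ill_formed = b"ab\xffcd"  # 0xFF can never appear in well-formed UTF-8

print(repr(decode_with_policy(ill_formed, "replace")))  # 'ab\ufffdcd'
print(repr(decode_with_policy(ill_formed, "ignore")))   # 'abcd'
try:
    decode_with_policy(ill_formed, "strict")
except UnicodeDecodeError as exc:
    print("strict policy raised:", exc.reason)
```

None of the three policies needs characters outside the set of Unicode scalar values.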

-- 
But you, Wormtongue, you have done what you could for your true master.  Some
reward you have earned at least.  Yet Saruman is apt to overlook his bargains.
I should advise you to go quickly and remind him, lest he forget your faithful
service.  --Gandalf             John Cowan <cowan_at_ccil.org>
Received on Fri Nov 17 2006 - 14:40:35 UTC
