[R6RS] Unicode scalar value escape sequences
Michael Sperber
sperber
Thu Mar 3 10:44:01 EST 2005
>>>>> "Marc" == Marc Feeley <feeley at IRO.UMontreal.CA> writes:
>> Well, sure. But given that you have all kinds of procedures that
>> operate on strings of length 1 only, I don't see how you're making the
>> character data type go away in any real sense---you still effectively
>> have a separate type. It's just that the type is wedged into the
>> language in a way that, to me, makes way less sense than the current
>> setup.
Marc> How would you define (char-upcase (integer->char #x00df))?
Marc> What about (char-ci<? (integer->char #x00df) #\T)?
We've been through this a zillion times: via the standard Unicode case
mapping.
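For readers following along, here is a small sketch of why U+00DF is the awkward test case. It uses Python's built-in string methods as a stand-in for the Unicode case mappings; that substitution is an assumption made purely for illustration, and nothing here is a Scheme procedure or part of any proposal in this thread.

```python
# Illustration only: Python's str methods stand in for the Unicode case
# mappings; they are not the Scheme procedures under discussion.
sharp_s = "\u00df"  # LATIN SMALL LETTER SHARP S, i.e. (integer->char #x00df)

# Full case mapping: the uppercase form is a two-character string, which a
# procedure returning exactly one character cannot represent.
upper = sharp_s.upper()

# Case folding, the usual basis for case-insensitive comparison.
folded = sharp_s.casefold()

# A case-insensitive "less than" against #\T, done on the folded forms.
ci_less = folded < "T".casefold()

print(repr(upper), len(upper), repr(folded), ci_less)
```

Whether a character-level char-upcase should follow the single-character mapping or a string-level operation should follow the full mapping is exactly the kind of decision the standard case mapping tables are meant to settle.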
Marc> I'm against it. I think the syntax would be strange, overly complex
Marc> and redundant.
I don't get this argument: Matthew's proposal has three different
escape sequences for scalar values. Implicit termination incurs
complexity. Also, specifying scalar values via the vanilla numerical
literal syntax *removes* redundancy, as you can just re-use the lexer
for the numerical literals.
Marc> Here's another proposal. Keep the \xhh and \uhhhh syntaxes for
Marc> compatibility with C and Java (i.e. exactly 2 and 4 hexadecimals
Marc> respectively) but require a delimiter for \U and allow any number of
Marc> hexadecimals, i.e.
Marc> "\U20;\U00000021;\Ua;" = " !\n"
I would consider this an improvement over the current status as well.
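To make the shape of Marc's proposal concrete, here is a minimal sketch of a decoder for it: \xhh takes exactly 2 hex digits, \uhhhh exactly 4, and \U takes any number of hex digits up to a mandatory semicolon. The function name decode_escapes, the use of Python, and the rejection of surrogates and out-of-range values are my assumptions, not part of the proposal. The sketch also ignores escaped backslashes and every other escape sequence.

```python
import re

# Hypothetical decoder for the three escape forms in the proposal.
_ESCAPE = re.compile(
    r"\\x([0-9a-fA-F]{2})"    # \xhh    : exactly 2 hex digits, no delimiter
    r"|\\u([0-9a-fA-F]{4})"   # \uhhhh  : exactly 4 hex digits, no delimiter
    r"|\\U([0-9a-fA-F]+);"    # \Uh...; : any number of digits, ';' required
)

def decode_escapes(text):
    """Replace the \\x, \\u and \\U...; escapes in text with characters."""
    def replace(match):
        digits = match.group(1) or match.group(2) or match.group(3)
        code = int(digits, 16)
        # Assumption: only Unicode scalar values are accepted, so surrogates
        # and values above U+10FFFF are rejected.
        if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError("not a Unicode scalar value: " + digits)
        return chr(code)
    return _ESCAPE.sub(replace, text)

# The example from the proposal, written as a raw string so that Python
# itself does not interpret the backslashes.
print(repr(decode_escapes(r"\U20;\U00000021;\Ua;")))
```

The fixed-length forms need no terminator because the digit count is known in advance, so a hex digit that follows them is ordinary text; only the variable-length \U form needs the semicolon.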
--
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla