[r6rs-discuss] perhaps i should be formal, but....

From: MichaelL_at_frogware.com <MichaelL>
Date: Wed Mar 14 19:05:37 2007

> On the contrary. When you *allow* surrogate values as
> actual character values you don't mandate them. Therefore,
> an implementation that uses UTF-16 as you describe is
> *also allowed*.

It's possible that I misunderstand you, but I think we're on the same
page. I'm also concerned that R6RS, as currently written, seems to require
UCS-4/UTF-32 strings. The problem is that string-ref returns characters,
and characters can't be surrogates. Given that Windows, Mac, Java, and
IBM's ICU all use UTF-16, that would be a Bad Thing. My position is
actually more extreme: I lament the loss of single-byte and multibyte
strings in general (which would include UTF-8). They're still useful for
low-level work. In fact, they'll still be needed--think of the various
Scheme-to-C compilers, for example, which will need a char
equivalent--they just won't be standardized anymore.
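
To make the string-ref concern concrete, here is a rough sketch in Python
rather than Scheme. The helper names utf16_code_units and is_surrogate are
made up for illustration. It shows that a character outside the Basic
Multilingual Plane is one code point but two UTF-16 code units, both of
them surrogates, so indexing a UTF-16 buffer by code unit can hand back
values that are not characters in the sense the draft seems to require.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    raw = text.encode("utf-16-be")  # big-endian, no byte-order mark
    return list(struct.unpack(">%dH" % (len(raw) // 2), raw))

def is_surrogate(unit):
    """True if unit lies in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= unit <= 0xDFFF

# 'a' followed by MUSICAL SYMBOL G CLEF (U+1D11E), which is outside the BMP.
s = "a\U0001D11E"

# Indexing by code point (the UCS-4/UTF-32 view): two characters.
code_points = [ord(c) for c in s]

# Indexing by code unit (the UTF-16 view): three units, two are surrogates.
units = utf16_code_units(s)

print([hex(cp) for cp in code_points])   # ['0x61', '0x1d11e']
print([hex(u) for u in units])           # ['0x61', '0xd834', '0xdd1e']
print([is_surrogate(u) for u in units])  # [False, True, True]
```

An implementation that stores UTF-16 internally would therefore have to
either scan for surrogate pairs on each string-ref or be allowed to return
the surrogate halves themselves, which is the point under discussion.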

In retrospect I think it would have been better if Unicode characters and
strings were added as new types rather than replacing the existing ones.
Then we'd have uchar and ustring and, perhaps, fewer
backward-compatibility issues. Of course, that was *nearly* done: old
strings are basically bytevectors now. But there's no bytevector-upper or
bytevector-<? and such, so no, something was lost, at least for "low
level" work.
Received on Wed Mar 14 2007 - 19:05:09 UTC
