[r6rs-discuss] perhaps i should be formal, but....

From: MichaelL_at_frogware.com <MichaelL>
Date: Wed Mar 14 19:57:09 2007

> > It's possible that I misunderstand you, but I think we're on the same
> > page. I'm also concerned that R6RS, as currently written, seems to
> > require UCS-4/UTF-32 strings. The problem is that string-ref returns
> > characters, and characters can't be surrogates. Given that Windows,
> > Mac, Java, and IBM's ICU all use UTF-16, that would be a Bad Thing. In
> > fact, my position would be even more extreme: I lament the loss of
> > single/multi byte strings in general (which would include UTF-8).
> > They're still useful for low-level work. In fact, they'll still be
> > needed--think of the various Scheme to C compilers, for example, that
> > will need a char equivalent--they just won't be standardized anymore.
>
> Is there any reason why bytevectors will not fill the need for
> single-byte strings?

They can, but...

First, from a practical perspective many useful operations (string<?,
string-downcase, etc.) have been lost. (If they were replaced they would
have rather funny names--bytevector<?, bytevector-downcase, etc!) Second,
from a clarity perspective bytevectors are meant to be much more
general-purpose than strings; they have, for example, operations for
getting and setting integer and floating point numbers. Those are rather
odd operations for a string!
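
To make the second point concrete, here is a rough sketch in Python
rather than Scheme. That choice is an assumption made only so the sketch
is easy to run: bytearray stands in for an R6RS bytevector, and the
variable names are made up for illustration. It shows the same kind of
buffer holding a floating point number and holding text, with nothing to
stop the text from being read back as an integer.

```python
import struct

# A general-purpose byte buffer, standing in for a bytevector.
buf = bytearray(8)

# Store a 64-bit float at offset 0 (little-endian), then read it back.
struct.pack_into("<d", buf, 0, 1.5)
as_float = struct.unpack_from("<d", buf, 0)[0]

# The same kind of buffer can hold text bytes...
text = bytearray(b"Scheme")

# ...and nothing stops us from reading the first two bytes, "Sc",
# as a 16-bit unsigned integer. Legal for a byte buffer, odd for a string.
first_two_as_u16 = struct.unpack_from("<H", text, 0)[0]
```

The last read is perfectly reasonable for a byte buffer, but it is not
an operation anyone would expect a string type to offer.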

Bytevectors are definitely a very useful low-level addition to Scheme. But
single/multi-byte strings were, I think, an unnecessary loss, especially
for those who do lots of operating system- and library-level work.

(Automatic and unavoidable conversion to and from single/multi-byte strings
isn't a good idea because a) there is a potentially avoidable performance
hit and b) the conversion isn't guaranteed, so there are questions about
what to do in the case of failure.)
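
A small sketch of point b), again in Python purely for illustration.
The helper names decode_or_none and encode_or_none are hypothetical, and
returning None is just one possible failure policy, not a recommendation.
It shows that conversion can fail in both directions, so some policy has
to be chosen by someone.

```python
def decode_or_none(data, encoding="utf-8"):
    """Try to turn raw bytes into a character string; None on failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


def encode_or_none(text, encoding="utf-8"):
    """Try to turn a character string into raw bytes; None on failure."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# Latin-1 bytes for "café": the trailing 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9"

from_utf8 = decode_or_none(raw)               # conversion fails
from_latin1 = decode_or_none(raw, "latin-1")  # conversion succeeds
to_ascii = encode_or_none("caf\u00e9", "ascii")  # no ASCII form for é
```

If the conversion is automatic and unavoidable, the choice among
"signal an error", "substitute a replacement character", and "drop the
data" is made for the programmer rather than by the programmer.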
Received on Wed Mar 14 2007 - 19:56:38 UTC
