[r6rs-discuss] Strings

From: Abdulaziz Ghuloum <aghuloum>
Date: Mon Mar 26 07:38:33 2007

On Mar 25, 2007, at 10:32 PM, MichaelL_at_frogware.com wrote:

> But I'll tell you what. Find a document, written by someone with
> substantial Unicode experience, that recommends UTF-32 as the best
> overall in-memory encoding.

For some "all-Scheme" systems, even UTF-32 may be suboptimal: string-ref
would incur two additional instructions (shift and tag) and string-set!
one additional instruction (untag), whereas ordinarily each could be done
with a single machine instruction. Representing strings as an array of
tagged characters may be a win for all Scheme operations and would lose
only for cross-language communication (which may lose anyway, depending
on the encoding of the interfaced-to environment, or on the number of
types of foreign libraries or operating systems).
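
To make the instruction counts concrete, here is a minimal sketch in C.
It assumes a hypothetical tagging scheme (code point shifted left by 8
bits, constant tag 0x0F in the low byte) and 32-bit string elements; the
names, tag value, and shift amount are all invented for illustration and
are not taken from any particular Scheme implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tagging scheme, chosen only for illustration. */
#define CHAR_SHIFT 8
#define CHAR_TAG   0x0Fu

typedef uintptr_t value;  /* a tagged Scheme value held in a machine word */

value tag_char(uint32_t cp)  { return ((value)cp << CHAR_SHIFT) | CHAR_TAG; }
uint32_t untag_char(value v) { return (uint32_t)(v >> CHAR_SHIFT); }

/* UTF-32 storage: each element is a raw code point. */
value utf32_string_ref(const uint32_t *s, size_t i) {
    return tag_char(s[i]);      /* load, then shift and tag */
}
void utf32_string_set(uint32_t *s, size_t i, value c) {
    s[i] = untag_char(c);       /* untag, then store */
}

/* Tagged storage: each element is already a tagged character.
   A tagged code point (at most 0x10FFFF << 8 | tag) fits in 32 bits. */
value tagged_string_ref(const uint32_t *s, size_t i) {
    return (value)s[i];         /* just a load */
}
void tagged_string_set(uint32_t *s, size_t i, value c) {
    s[i] = (uint32_t)c;         /* just a store */
}
```

In this sketch both representations use four bytes per character, so the
tagged array costs nothing extra in space; the difference is only in who
pays for the conversion, Scheme code or foreign code.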

I would not expect a Unicode expert to know about the implementation
details of optimizing Scheme implementations, which are far different
from the details and constraints of a C library, a browser, or a
stand-alone XSLT processor. I would take their advice as a rule of thumb
(as in: follow it when you don't know any better). I trust that the
editors know better.

Aziz,,,
Received on Mon Mar 26 2007 - 00:20:15 UTC
