> On 3/15/07, MichaelL_at_frogware.com <MichaelL_at_frogware.com> wrote:
> > > If string-ref also required O(1) time complexity, then you'd be right.
> > > But it doesn't; it's perfectly fine to implement string-ref on top of
> > > underlying UTF-8 or UTF-16 character sequences; you just have to settle
> > > for O(N) performance.
> >
> > Are you suggesting that indexes represent code points rather than code
> > units? I haven't seen anyone do that, not as the one-and-only interface to
> > elements of a string. Have you? And do you think UTF-8/UTF-16
> > implementations should be *required* to do that? (Obviously, then,
> > string-length would have to return the number of code points rather than
> > the number of code units.)
>
> SBCL does that.
>
> http://sbcl.sourceforge.net/sbcl-internals/Character-and-String-Types.html

I think SBCL uses UCS-4-sized (32-bit) code units when Unicode is enabled. If
that's correct, then no, it doesn't do "that"; it simply chooses an
encoding that avoids the problem (at the expense of space).
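
To make the trade-off concrete, here is a rough sketch in Python (not Scheme, and not how SBCL or any particular implementation does it). The function names are made up for illustration. The first function indexes by code point over UTF-8 bytes and has to scan from the start, so it is O(N); the second indexes into fixed-width 32-bit code units, so it is O(1) but uses four bytes per character.

```python
def utf8_code_point_at(data: bytes, index: int) -> str:
    """Return the code point at position `index` by scanning UTF-8 bytes: O(N).

    Hypothetical helper; assumes `data` is well-formed UTF-8.
    """
    if index < 0:
        raise IndexError(index)
    pos = 0   # byte offset (code units)
    seen = 0  # code points passed so far
    while pos < len(data):
        lead = data[pos]
        # The lead byte tells us how many code units this code point spans.
        if lead < 0x80:
            width = 1
        elif lead >> 5 == 0b110:
            width = 2
        elif lead >> 4 == 0b1110:
            width = 3
        elif lead >> 3 == 0b11110:
            width = 4
        else:
            raise ValueError("invalid UTF-8 lead byte")
        if seen == index:
            return data[pos:pos + width].decode("utf-8")
        pos += width
        seen += 1
    raise IndexError(index)


def utf32_code_point_at(data: bytes, index: int) -> str:
    """Return the code point at position `index` in UTF-32-LE bytes: O(1).

    Hypothetical helper; every code point occupies exactly 4 bytes.
    """
    start = index * 4
    if index < 0 or start + 4 > len(data):
        raise IndexError(index)
    return data[start:start + 4].decode("utf-32-le")


text = "a\u00e9\u20ac\U0001d11ez"      # 1-, 2-, 3-, 4-, and 1-byte code points in UTF-8
utf8 = text.encode("utf-8")            # 11 bytes
utf32 = text.encode("utf-32-le")       # 20 bytes
print(utf8_code_point_at(utf8, 3) == utf32_code_point_at(utf32, 3))
```

Both calls return the same character; the difference is only in how much work the lookup does and how much space the string takes.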
Received on Thu Mar 15 2007 - 17:36:29 UTC