Marcin 'Qrczak' Kowalczyk <qrczak_at_knm.org.pl> wrote:
> A disadvantage of UTF-16 is that character predicates like
> char-alphabetic? break for characters above U+FFFF.
This kind of bug is pretty common in Java, but it isn't a
necessary consequence of using UTF-16.
Nor does focusing on scalar values fix the problem:
(define (all-alphabetic? s)
  (for-all char-alphabetic? (string->list s))) ;BUG
This bug is both subtler and more likely to bite.
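To make the bug concrete, here is a minimal sketch, assuming an
R6RS-style system in which char-alphabetic? follows the Unicode
Alphabetic property. The name decomposed is mine, introduced only
for illustration. The string holds what a reader sees as one
letter, e with an acute accent, spelled as two scalar values:

```scheme
(import (rnrs))

;; Same definition as above, repeated so the example stands alone.
(define (all-alphabetic? s)
  (for-all char-alphabetic? (string->list s)))

;; Hypothetical test string: #\e followed by U+0301 COMBINING ACUTE
;; ACCENT. A reader sees one accented letter.
(define decomposed (string #\e #\x301))

(all-alphabetic? "e")        ; #t
(all-alphabetic? decomposed) ; #f, since U+0301 is a mark, not Alphabetic
```

No character above U+FFFF is involved, so the choice of UTF-16
or scalar values makes no difference to this result.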
You could fix both by providing higher-level APIs:
(string-first s) ===> the first grapheme cluster
(string-rest s) ===> everything else
and so on. This leads toward realigning all the
string/character APIs around grapheme clusters rather
than scalar values. I offer this because, if the
editors want to do something unconventional, I think
this is the way to go.
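A rough sketch of how string-first and string-rest might behave,
under a loud simplifying assumption: a cluster is one scalar value
plus any combining marks (general categories Mn, Mc, Me) that
follow it. This is not the full Unicode segmentation algorithm,
and combining-mark? and cluster-end are hypothetical helpers of
mine. Judging a cluster by its first scalar value in
all-alphabetic? is likewise just one possible design choice.

```scheme
(import (rnrs))

;; Assumption: only Mn, Mc and Me characters extend a cluster.
(define (combining-mark? c)
  (memq (char-general-category c) '(Mn Mc Me)))

;; Index just past the first cluster of s (0 for the empty string).
(define (cluster-end s)
  (let ((n (string-length s)))
    (let loop ((i (min 1 n)))
      (if (and (< i n) (combining-mark? (string-ref s i)))
          (loop (+ i 1))
          i))))

(define (string-first s)
  (substring s 0 (cluster-end s)))

(define (string-rest s)
  (substring s (cluster-end s) (string-length s)))

;; Cluster-wise version: test the first scalar value of each
;; cluster and skip over the marks attached to it.
(define (all-alphabetic? s)
  (or (string=? s "")
      (and (char-alphabetic? (string-ref (string-first s) 0))
           (all-alphabetic? (string-rest s)))))
```

With these definitions, (string-first (string #\e #\x301 #\b))
is the two-scalar string for the accented e, string-rest of the
same argument is "b", and all-alphabetic? accepts the whole string.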
-j
Received on Sat Mar 24 2007 - 23:12:51 UTC