[r6rs-discuss] Comparing Strings
The C runtime library has two string comparison functions, strcmp and
strcoll. strcmp is not locale-aware while strcoll is. Some implementations
add case-insensitive variants of those functions, stricmp and stricoll.
In R6RS the case-sensitive string comparison functions look like strcmp
while the case-insensitive comparison functions look somewhat like
stricoll. (Not really, but I'll get to that.) If that's true, there's a
funny and unexpected asymmetry between the two sets of functions.
I believe the main issue is that strings use a different case folding
algorithm than characters do. From the specification:
(char-upcase #\ß) ===> #\ß
(string-upcase "Straße") ===> "STRASSE"
(string-ci=? "Straße" "Strasse") ===> #t
(string-ci=? "Straße" "STRASSE") ===> #t
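To make the asymmetry concrete outside Scheme, here is a minimal Python
sketch. It is only an analogy, not the R6RS algorithm: Python's str.upper
and str.casefold stand in for the string-level operations, while
simple_fold_char and charwise_ci_equal are hypothetical helpers that
approximate a one-to-one, char-level folding by refusing any mapping that
changes the length.

```python
def simple_fold_char(c):
    """Approximate a one-to-one (char-level) case fold.

    Assumption: when the full fold of c expands to several characters
    (as it does for the sharp s), a char-level mapping has to leave c alone.
    """
    folded = c.casefold()
    return folded if len(folded) == 1 else c


def charwise_ci_equal(a, b):
    """Hypothetical stricmp-style comparison: fold one character at a time."""
    if len(a) != len(b):
        return False
    return all(simple_fold_char(x) == simple_fold_char(y) for x, y in zip(a, b))


def full_ci_equal(a, b):
    """String-level comparison using full case folding."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(simple_fold_char("ß"))                    # the sharp s is left unchanged
    print("Straße".upper())                         # STRASSE
    print(full_ci_equal("Straße", "Strasse"))       # True
    print(charwise_ci_equal("Straße", "Strasse"))   # False: lengths differ
```

The char-wise version treats "Straße" and "Strasse" as different strings,
which is what I would expect from a stricmp-like function; the string-level
version treats them as equal.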
I would expect the stricmp-equivalent variants of string comparison to use
the algorithm that characters use rather than the one they currently use.
The current functions may be useful, but I believe that a) there's a
missing set of functions and b) the existing set of functions, if they
remain, should have different names from the ones they have now. (They
represent a different concept, rather like strcmp and strcoll do.)
In the end, are the existing functions *really* useful? Honestly, I can't
think how. The string-upcase example is cute, but the case-insensitive
comparison functions that use it are useless for any serious work. They
have a semblance of locale awareness, but they aren't locale aware and
that fact would show through rather quickly. In fact, it even shows
through in case-folding: (string-downcase "STRASSE") ===> "strasse", not
"straße". Whether that's right or wrong depends on where you are.
I think it would make more sense for R6RS to define a full set of truly
locale-aware functions and place them in a separate (and optional)
library. In any event, I question the benefit of the functions as they're
currently defined.
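The case-folding round trip mentioned earlier can be reproduced with any
locale-independent Unicode case mapping. The Python sketch below is an
analogy, not R6RS code: round_trip is a hypothetical helper, and
str.upper and str.lower stand in for string-upcase and string-downcase.
Neither call takes a locale, so the result is the same wherever it runs.

```python
def round_trip(s):
    """Uppercase then lowercase using locale-independent Unicode mappings."""
    return s.upper().lower()


if __name__ == "__main__":
    original = "straße"
    print(round_trip(original))              # strasse
    print(round_trip(original) == original)  # False: the sharp s is not restored
```

A truly locale-aware library would need some way to say which convention
applies; nothing in the mapping itself carries that information.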
Comments?
Received on Wed Feb 14 2007 - 12:00:39 UTC