[r6rs-discuss] Strings

From: Thomas Lord <lord>
Date: Tue Mar 27 14:52:14 2007

Jon Wilson wrote:
> Hi Tom,
> Just a nitpick, peripheral to the actual content of the thread.

It's a fair nitpick of the language I used, but let me clarify briefly.

>> (UTF16? x) implies (CHARLIKE? x)
>> (CHAR? x) implies (CHARLIKE? x)
>> (GRAPHEME? x) implies (CHARLIKE? x)
>> etc.
>> In that sense, we're left arguing mostly over the names of
>> things and I'm on the side that says the proper name for
>> the imagined CHARLIKE? type is actually, gosh, CHAR?.
> This is not identical to arguing over the names of things. The above
> bits of pseudocode mean that the set of all UTF16s is a subset of the
> set of all CHARLIKEs, the set of all CHARs is a subset of the set of
> all CHARLIKEs, and the set of all GRAPHEMEs is a subset of the set of
> all CHARLIKEs, etc.
> Saying that the proper name for the imagined CHARLIKE? type is
> actually CHAR? implies that the set of all CHARs is equal to the set
> of all CHARLIKEs. This further implies that the set of all UTF16s is
> a subset of the set of all CHARs, and the set of all GRAPHEMEs is a
> subset of the set of all CHARs, something entirely non-trivial from
> the first bunch of relations. Saying that we are arguing over the
> names of things implies that we are arguing over something trivial,
> because names are arbitrary provided they don't collide.
>

My formal comment advocates for a situation in which it is /permitted/ that:

    UTF8 is a subset of CHAR
    UTF16 is a subset of CHAR
    CODEPOINT is a subset of CHAR
    GRAPHEME is a subset of CHAR
    TRAFFICLIGHTSIGNAL is a subset of CHAR
    etc.

    and, for example,

    CODEPOINT and TRAFFICLIGHTSIGNAL are disjoint
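
As a rough illustration of the relations listed above (not part of the formal comment), the types can be modelled as finite sets of tagged values. Every member below is a made-up placeholder, and Python sets are only a stand-in for Scheme type predicates:

```python
# Toy model: each type is a finite set of tagged values. All members are
# hypothetical placeholders chosen for illustration only.
UTF8 = {("utf8", 0x41), ("utf8", 0xCE)}
UTF16 = {("utf16", 0x0041), ("utf16", 0xD83D)}
CODEPOINT = {("codepoint", 0x41), ("codepoint", 0x3BB)}
GRAPHEME = {("grapheme", "e\u0301")}
TRAFFICLIGHTSIGNAL = {("signal", "red"), ("signal", "green")}

# CHAR is whatever this imagined implementation admits as a character.
CHAR = UTF8 | UTF16 | CODEPOINT | GRAPHEME | TRAFFICLIGHTSIGNAL


def permitted_layout_holds():
    """Check the subset and disjointness relations from the list above."""
    subsets = all(
        s <= CHAR
        for s in (UTF8, UTF16, CODEPOINT, GRAPHEME, TRAFFICLIGHTSIGNAL)
    )
    disjoint = CODEPOINT.isdisjoint(TRAFFICLIGHTSIGNAL)
    return subsets and disjoint
```

In this model CODEPOINT is a proper subset of CHAR, which is the situation the proposal permits without requiring it.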


In my proposal, it is /required/ that a small subset of CODEPOINT is
present in all implementations, and it is /suggested/ ("/should/" in the
language of the standard) that /if/ a larger set of codepoints is
supported, those too be CHARs, have the natural CHAR->INTEGER
mapping, and be what INTEGER->CHAR returns for those values.

My contention is that the 5.92 draft, in contrast, /requires/ that CODEPOINT
is equal to CHAR and /requires/ that all of CODEPOINT be supported.
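
The contrast can be sketched as two checks over an implementation's set of characters. This is a hedged model only: the "small subset" is not specified in this message, so the ASCII range below is a hypothetical stand-in, and "all of CODEPOINT" is assumed to mean the Unicode scalar values (0 to 0x10FFFF, excluding the surrogate range).

```python
# Hypothetical stand-in for the required "small subset" (assumption: ASCII).
REQUIRED = range(0x00, 0x80)

# Number of Unicode scalar values: all code points minus the surrogates.
SCALAR_VALUE_COUNT = 0x110000 - 0x800


def is_scalar_value(n):
    """True for integers in 0..0x10FFFF outside the surrogate range."""
    return 0 <= n <= 0x10FFFF and not (0xD800 <= n <= 0xDFFF)


def satisfies_proposal(chars):
    """Proposal: the required codepoints are present; other chars are allowed."""
    return all(("codepoint", n) in chars for n in REQUIRED)


def satisfies_5_92_reading(chars):
    """Reading of 5.92 argued above: CHAR equals CODEPOINT, fully supported."""
    only_codepoints = all(
        tag == "codepoint" and is_scalar_value(value) for tag, value in chars
    )
    # chars is a set, so matching the count means every scalar value is present.
    return only_codepoints and len(chars) == SCALAR_VALUE_COUNT
```

An implementation offering only the assumed required range plus, say, `("signal", "red")` passes the first check and fails the second; one offering exactly the scalar values passes both.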


>
> PS: I'm using terms from naive set theory here (which I know you
> dislike) because I am not sufficiently familiar with the various
> axiomatic set theories to employ their terminology and because it is
> not immediately obvious that anything beyond naive set theory is
> needed here, as we are not being rigorous enough to invoke the various
> paradoxes. Of course, I guess the reason why naive set theory is
> naive is that it is not immediately obvious that anything beyond it is
> required. hmmm....

I'll see if I can clear that up in a separate message.

-t


Received on Tue Mar 27 2007 - 15:02:26 UTC
