[r6rs-discuss] perhaps i should be formal, but....

From: Thomas Lord <lord>
Date: Wed Mar 14 18:11:36 2007

Well, I wasn't gonna say it but I'm glad *someone* brought
up this use-case.

(And, no, this use-case isn't my main inspiration, but it is on
the list of things I think about in this area.)

-t


Shiro Kawai wrote:
> From: William D Clinger <will_at_ccs.neu.edu>
> Subject: Re: [r6rs-discuss] perhaps i should be formal, but....
> Date: Wed, 14 Mar 2007 16:16:11 -0400
>
>
>> I am posting this as an individual member of the Scheme
>> community. I am not speaking for the R6RS editors.
>>
>> Thomas Lord wrote:
>>
>>> Earlier revisions of the standard defined a portable character set,
>>> allowing implementations to freely expand beyond that set.
>>> In a portable program, if only the portable character set is
>>> used, reliably portable behavior obtains.
>>>
>> What's different now is that Unicode has become an
>> established standard, and the portability advantages
>> of requiring Scheme programs to use Unicode (which
>> is more than just a character set) appear far larger
>> than any advantages that might still be derived from
>> allowing implementations and programs to choose their
>> own character sets.
>>
>
> I also want to reserve the possibility of using different
> character sets / encodings, but I agree that Unicode is
> the only practical standard for portable programs. I'm
> happy as long as R6RS does not prohibit an implementation
> from using alternative character sets / encodings if it wishes.
>
> For example, Japan's official family registration system
> uses its own character set and codepoints, since it needs
> to distinguish subtler differences between characters than
> Unicode does. (There's no clear line between abstract characters
> and glyphs---it is context-dependent, and for family names
> the line gets closer to glyphs).
>
> The range restriction of integer->char seems reasonable
> to guarantee portable behavior. The implementation
> can have another procedure that can deal with a non-Unicode
> range/character set.
>
> That said, I feel it would be better if the standard used clearer
> names for the integer/character conversions, such as
> unicode-scalar-value->char (or some abbreviation of it), which
> makes it clear that one can't pass a non-Unicode scalar value.
> This isn't a strong desire, though.
>
> The wording of the "character" object definition, however, could
> be changed. It is unclear to me whether (char? <obj>) can return
> #t if <obj> is in the implementation's extended character
> set but not in Unicode. If it can't, I can still provide
> (extended-char? <obj>) for example, but it's a bit awkward.
> The same goes for other procedures that deal with characters---in
> (string <char> ...), should each <char> be in Unicode? (I hope not!)
>
> --shiro
Received on Wed Mar 14 2007 - 18:20:40 UTC
