[R6RS] draft Unicode SRFI
Matthew Flatt
mflatt
Thu Jun 30 09:08:07 EDT 2005
At Thu, 30 Jun 2005 08:24:33 +0200, Michael Sperber wrote:
> > * I added the \<eol><whitespaces> and \<space> string escapes, as
> > discussed on this list.
>
> I think I probably missed \<space>---does this mean I can only
> terminate a variable-length escape sequence with a space?
No. It's so you can terminate a \<eol> sequence and continue with spaces.
See Kent's message here for an example:
http://mailman.iro.umontreal.ca/mailman/private/r6rs/2005-June/000655.html
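For readers without access to the archived message, here is a minimal sketch in Python of how the two escapes might interact. It assumes that \<eol><whitespaces> expands to nothing (the newline and the indentation after it are skipped) and that \<space> expands to a single space. The name decode_escapes is hypothetical, and the function handles only these two escapes.

```python
def decode_escapes(body: str) -> str:
    """Decode only the two escapes under discussion (assumed semantics)."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt == "\n":
                # \<eol><whitespaces>: skip the newline and any spaces or
                # tabs that start the next line.
                i += 2
                while i < len(body) and body[i] in " \t":
                    i += 1
                continue
            if nxt == " ":
                # \<space>: one literal space. It also ends the run of
                # skipped whitespace, so the spaces after it are kept.
                out.append(" ")
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


# A line continuation, four spaces of indentation, an escaped space,
# then two ordinary spaces that should survive.
with_space_escape = decode_escapes("abc\\\n    \\   def")

# The same continuation without \<space>: all leading spaces are skipped.
without_space_escape = decode_escapes("abc\\\n      def")
```

Under these assumptions, the first string keeps three spaces between "abc" and "def", while the second keeps none.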
> > * I left octal escapes for strings intact for compatibility with C.
> > (Also, I actually use them --- perhaps from spending too much time
> > with UTF-8 encoding.) There's no octal for characters, though.
>
> Did you use octal escapes to denote UTF-8 code units or actually
> scalar values?
Code units.
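To make the distinction concrete, a small Python sketch: the same character spelled as octal escapes for its UTF-8 code units, next to its scalar value in octal. The helper name octal_escapes is hypothetical, and the three-digit width is an assumption borrowed from common C usage.

```python
def octal_escapes(text: str) -> str:
    """Spell each UTF-8 code unit of text as a three-digit octal escape."""
    return "".join("\\%03o" % byte for byte in text.encode("utf-8"))


lam = "\u03bb"  # GREEK SMALL LETTER LAMDA, scalar value U+03BB

# Two code units (0xCE 0xBB), so two octal escapes.
as_code_units = octal_escapes(lam)

# The scalar value itself, written in octal, is a single number.
scalar_in_octal = "%o" % ord(lam)
```

The two spellings do not coincide, which is why it matters which one an octal escape is taken to mean.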
> > * I added an extension for symbols that allows any non-whitespace
> > character above 127 where a <letter> is allowed. Is this too
> > liberal?
>
> Yes. Shouldn't we at least restrict to Unicode letters and numbers?
I think we want Unicode symbols to be in Scheme symbols, for example.
> It seems to me we at least should exclude Unicode separators.
Separators are defined by SRFI-14 to be whitespace, right?
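A rough Python sketch of the rule as proposed, for checking individual characters. It assumes that SRFI-14 whitespace can be approximated by the Unicode separator categories (Zs, Zl, Zp) plus U+0009 through U+000D; both function names are hypothetical.

```python
import unicodedata


def is_whitespace(ch: str) -> bool:
    # Assumption: approximates SRFI-14 whitespace as the Unicode separator
    # categories plus the ASCII control whitespace U+0009..U+000D.
    return unicodedata.category(ch) in ("Zs", "Zl", "Zp") or "\t" <= ch <= "\r"


def allowed_by_extension(ch: str) -> bool:
    # The proposed extension: any non-whitespace character above 127.
    # Characters at or below 127 are governed by the existing grammar.
    return ord(ch) > 127 and not is_whitespace(ch)


for_all_ok = allowed_by_extension("\u2200")      # FOR ALL, a math symbol
lambda_ok = allowed_by_extension("\u03bb")       # a Greek letter
nbsp_ok = allowed_by_extension("\u00a0")         # NO-BREAK SPACE (Zs)
line_sep_ok = allowed_by_extension("\u2028")     # LINE SEPARATOR (Zl)
```

Under this reading, symbols such as U+2200 are admitted and the separators are already excluded by the whitespace test.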
> - Could the SRFI please have an issues section where the things we
> haven't agreed on are listed?
Ok, I'll add that.
> - I think the delimiter issue for character literals could use an
> example. Otherwise, the point may get lost on the casual reader.
I'll add that, too.
> - The document says "any C string literal is also a Scheme string
> literal": I don't believe that's true anymore, as the \x syntax is
> variable-length in C.
In that case, I favor changing \x, but...
> (The sentence is literally true, I guess, but
> not in a meaningful way.) As a result, I'm pretty confused on the
> compatibility issue---if we're not compatible with C, we could also
> make octal escapes fixed-length at least, to make the whole
> scalar-value-literal issue a little less patchwork than it seems
> now. Compatibility with C and Java should also be in the issues
> section probably.
... there seems to be more support among the editors to ditch octal and
not worry about complete compatibility with C. That's ok with me.
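The \x problem can be sketched in Python. C consumes every hex digit that follows \x, so the same source text reads differently under a fixed-length rule. The function read_hex_escape and the two-digit limit are hypothetical, chosen only to illustrate the difference; the thread does not settle on a length.

```python
import string

HEX = set(string.hexdigits)


def read_hex_escape(text: str, start: int, max_digits=None):
    """Read hex digits at text[start:], the position just after "\\x".

    max_digits=None consumes every following hex digit (the C rule);
    max_digits=2 stops after two digits (a hypothetical fixed-length rule).
    Returns (value, index of the first unconsumed character).
    """
    end = start
    while end < len(text) and text[end] in HEX:
        if max_digits is not None and end - start >= max_digits:
            break
        end += 1
    if end == start:
        raise ValueError("\\x must be followed by at least one hex digit")
    return int(text[start:end], 16), end


literal = r"\x41BC"  # six characters: backslash, x, 4, 1, B, C

# C rule: all four hex digits belong to the escape.
c_value, c_rest = read_hex_escape(literal, 2)

# Fixed two-digit rule: the escape is 0x41 and "BC" is ordinary text.
fixed_value, fixed_rest = read_hex_escape(literal, 2, max_digits=2)
```

With the C rule the escape denotes 0x41BC and nothing is left over; with the two-digit rule it denotes 0x41 ("A") followed by the letters "BC".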
> - What are your plans wrt the reference implementation? In my mind,
> we could and should provide one for most of it. I'd be happy to
> donate code.
I had no plans. I'm happy to assemble code starting with yours.
> - I don't understand how I could portably use the locale functionality
> in my code, since the document doesn't specify a single string I
> might use as a locale name. Also, the locale stuff could (and, to
> my mind should) have a reference implementation for at least some
> locales from the Unicode standard. (We could bum a starting point
> off Alex Shinn, I think.)
I'll investigate standards on locale names, which I've never done
before. In MzScheme, it's effectively defined as "whatever setlocale()
likes".
> - The sentence on UnicodeData.txt should probably be expanded a little
> bit and include a link to
> http://www.unicode.org/Public/UNIDATA/UnicodeData.txt to be
> understandable by non-Unicode-wizards.
Ok.
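For non-Unicode-wizards, a short Python sketch of what a UnicodeData.txt record looks like: one line per code point, with semicolon-separated fields. The sample line is the entry for U+0041; the function name parse_record and the choice of which fields to extract are hypothetical.

```python
# The UnicodeData.txt entry for LATIN CAPITAL LETTER A.
SAMPLE = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"


def parse_record(line: str) -> dict:
    """Pull a few fields out of one UnicodeData.txt line."""
    fields = line.rstrip("\n").split(";")
    return {
        "code_point": int(fields[0], 16),   # field 0: code point in hex
        "name": fields[1],                  # field 1: character name
        "general_category": fields[2],      # field 2: e.g. Lu, Ll, Zs
        # field 13: simple lowercase mapping, empty when there is none
        "simple_lowercase": int(fields[13], 16) if fields[13] else None,
    }


record = parse_record(SAMPLE)
```

Here the record says that U+0041 is an uppercase letter (Lu) whose simple lowercase mapping is U+0061.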
> - The section on here strings should probably refer to the scsh
> manual, and possibly to the manuals of PLT Scheme and Gambit-C.
Ok.
> Typos:
Thanks.
> - The document makes out Neil Van Dyke as an R6RS editor.
Sorry, Anton! (You, Neil, and David were faceless northeastern "Van"s
to me on the plt mailing list, at first. I don't confuse the people
anymore, but I sometimes use the wrong name --- just like I sometimes
call my kids by the wrong names.)
Matthew