[R6RS] draft Unicode SRFI

Fri Jul 1 11:02:47 EDT 2005

At Fri, 1 Jul 2005 10:24:57 -0400, Marc Feeley wrote:
> On 1-Jul-05, at 9:41 AM, Matthew Flatt wrote:
> 
> > I agree with this, but it doesn't solve the problem when a "text" file
> > is already open in binary mode,
> 
> I don't understand what you mean here.  Do you mean that the
> compiler/interpreter has opened the source file in binary mode?  Why
> would it do that?  If it is opened in text mode, using the "universal"
> end-of-line encoding ({CR+LF, CR, LF} map to #\newline) would solve
> the problem, no?  You probably want this behavior anyway for here
> strings, and to correctly keep track of source code location.

First, this doesn't handle the \r\r\n case, which the "universal"
conversion turns into two newlines, unless you allow newline as one of
the whitespaces in \<eol><whitespace>.
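
To make the \r\r\n case concrete, here is a minimal sketch in Python
(not Scheme, and not taken from any implementation). The names
`universal_newlines' and `skip_continuation' are hypothetical, and the
continuation rule is only an assumed reading of \<eol><whitespace>:
whitespace means spaces and tabs, plus newlines only when asked.

```python
import re

def universal_newlines(text):
    """Port-level conversion: map each CR+LF, lone CR, or lone LF to one newline."""
    return re.sub(r"\r\n|\r|\n", "\n", text)

def skip_continuation(body, newline_is_whitespace=False):
    """Drop each backslash, the newline after it, and the whitespace that follows.

    Assumption: "whitespace" is spaces and tabs, and also newlines when
    newline_is_whitespace is true.
    """
    ws = r"[ \t\n]*" if newline_is_whitespace else r"[ \t]*"
    return re.sub(r"\\\n" + ws, "", body)

# A string body containing a backslash followed by CR CR LF.
raw = "abc\\\r\r\n  def"

# CR CR LF becomes two newlines: the first CR is a lone CR, the rest is CR+LF.
converted = universal_newlines(raw)

# Only the first newline is consumed, so a newline stays in the string.
strict = skip_continuation(converted)

# Counting newline as whitespace consumes both.
lenient = skip_continuation(converted, newline_is_whitespace=True)
```

Under these assumptions the strict rule leaves a stray newline in the
string, while the lenient rule joins the two pieces.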

Second, we're talking about reader syntax, not just code syntax, so
someone might run into the same confusion through `open-input-file'
and `read'.

Finally, I'm not sure that a port-level conversion is the right thing
for here-strings. I can easily imagine wanting the source characters
intact in a here-string, including \r and \n.

More generally, I favor a "just-in-time" solution to any encoding
issue, which in this case means sorting out newlines at the level of
the parser. I can see your position, though.
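
As a rough illustration of the parser-level approach, the Python sketch
below recognizes line terminators on the raw characters only where the
syntax needs them, so a here-string body keeps its \r and \n intact.
The functions `eol_length' and `here_string_body' and the "EOF"
terminator are hypothetical, chosen for this example only; this is not
how any particular reader is implemented.

```python
def eol_length(text, i):
    """Length of the line terminator starting at text[i], or 0 if none.

    Recognizes CR+LF, lone CR, and lone LF on the raw characters.
    """
    if text.startswith("\r\n", i):
        return 2
    if i < len(text) and text[i] in "\r\n":
        return 1
    return 0

def here_string_body(text, terminator):
    """Return the raw text before the first line equal to `terminator`.

    Terminators are recognized only to find line boundaries; the
    characters of the returned body, including CR and LF, are untouched.
    If no line matches, the whole text is returned.
    """
    i = 0
    while i < len(text):
        j = i
        # Scan to the end of the current line.
        while j < len(text) and eol_length(text, j) == 0:
            j += 1
        if text[i:j] == terminator:
            return text[:i]
        # Step over the terminator, whatever form it takes.
        i = j + eol_length(text, j)
    return text

# Mixed line endings survive in the body; only the parser interprets them.
body = here_string_body("line one\r\nline two\rEOF\n", "EOF")
```

A port-level conversion would have rewritten both terminators before
the parser ever saw them, which is the difference at issue here.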

Is there anyone who *doesn't* think I should change to the more
conventional assumption that newlines have been sorted out before
parsing?

Matthew