From: Chris Hanson <cph_at_csail.mit.edu>
Subject: Re: [r6rs-discuss] Scheme should not be changed to be case sensitive
Date: Tue, 21 Nov 2006 01:33:04 -0500
> Shiro Kawai wrote:
> > By prefix you mean some mark like #!fold-case? Well, it works,
> > but the story can be complicated.
[...]
> > [...more reasons why this could be a problem...]
>
> How is this different from a BOM? It seems to me that exactly the same
> problems crop up.
Not exactly. A BOM is handled at the interface between
internal characters and external data chunks, most naturally
at ports. The Scheme world beyond the character ports would never
see a BOM, and ports usually know where to look for one or where
to place one.
On the other hand, for symbol case, 'read' and 'write' need to
know how to deal with it. They don't know whether they're
dealing with something in the middle of a larger data chunk
or not.
But they are sort of similar, indeed. I thought that _was_ a
problem, but it's just an annoying kind of problem rather than a
show-stopper, so I won't argue further.
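To make the port-versus-reader distinction concrete, here is a rough sketch, in Python only for brevity. The names `open_text_port` and `read_symbol` are made up for illustration and do not correspond to any Scheme implementation; the only real library behaviour relied on is that Python's "utf-8-sig" codec drops a leading BOM while decoding.

```python
import io


def open_text_port(raw_bytes):
    """Hypothetical 'character port': decodes bytes and swallows a leading BOM.

    The 'utf-8-sig' codec strips the UTF-8 BOM if present, so nothing
    reading from this port ever sees it.
    """
    return io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8-sig")


def read_symbol(token, fold_case):
    """Hypothetical reader step: the case decision has to be made here,
    per token, because the reader cannot tell whether it is at the start
    of a file or in the middle of a larger data chunk."""
    return token.lower() if fold_case else token


# The BOM is gone by the time characters leave the port.
chars = open_text_port(b"\xef\xbb\xbf(Foo bar)").read()

# The reader, by contrast, needs to be told which case mode applies.
folded = read_symbol("Foo", fold_case=True)
kept = read_symbol("Foo", fold_case=False)
```

The BOM never reaches `read_symbol`, whereas the `fold_case` flag has to be passed in from somewhere, which is exactly the part that gets awkward.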
> As for the "spirit of Scheme", I think you should be a little more
> careful before invoking that particular genie. It appears in different
> forms to different people, as was made painfully clear in the previous
> standardization efforts.
I take your advice. Actually, in practice, I agree that it would
be convenient if we had a choice. That would make the spec bigger,
though, so it's a trade-off between convenience and spec simplicity.
If most of us don't mind the spec being a bit bigger to include this
stuff, that's OK with me.
> > [...module system makes this moot since old files won't work...]
>
> I don't buy that particular argument, precisely because it too breaks
> backwards compatibility. The library spec will have to be fixed, and
> I'll get to that in due course.
Point taken.
> > It may be still a convenient means to tag the file to
> > indicate which mode the file wants to be read. I've been
> > using such kind of tag to indicate the character encoding
> > of the file, and it is working very well (I'm working
> > in the environment where the source files are written in
> > utf-8, euc-jp, shift-jis, and latin-1).
>
> I think this is exactly right. In fact, I'll go one step further:
> Scheme files should have an extensible header that can specify this as
> well as other things (e.g. character coding, language, locale). But
> regardless of the details, I think the information about case
> sensitivity should be associated with the file containing the symbols,
> and not in the code that reads it.
>
> So let me suggest an alternative compromise. Suppose we assume some
> kind of marker that appears early in the file, prior to any symbol.
> (Normally I would say the first line, but I can see that this might be
> an issue for unix scripts.) The marker is not dynamic -- it's allowed
> to appear once, in a very limited scope, and it's easy to tell whether
> it's present or absent. Assume that there is a default behavior in the
> absence of a marker -- I'll get to the issue of _what_ the default is
> shortly. Also, by default, output files have no markers and conform to
> the default behavior for input files.
Generally I agree. There are issues to be addressed in the
technical details, but I'll discuss them in a separate
message.
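As a minimal sketch of the marker idea quoted above (again in Python, purely for illustration): the function below looks for a marker that appears before any datum, allowing only blank lines, ';' comments, and a unix script line ahead of it. `#!fold-case` is the name used earlier in this thread; `#!no-fold-case`, the function name `detect_case_mode`, and the mode strings are assumptions of this sketch, not a proposal for the actual syntax.

```python
import re

FOLD = "#!fold-case"        # marker name taken from the thread
NO_FOLD = "#!no-fold-case"  # hypothetical counterpart, assumed for illustration


def detect_case_mode(text, default="sensitive"):
    """Return 'insensitive', 'sensitive', or `default` if no marker is found.

    Only blank lines, ';' line comments, and a unix script line on the
    first line may precede the marker. The marker is not dynamic: once
    the first datum is reached, scanning stops.
    """
    for i, line in enumerate(text.splitlines()):
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            continue  # blank line or comment: keep looking
        if i == 0 and re.match(r"#!\s*/", stripped):
            continue  # script line such as '#!/usr/bin/env scheme'
        token = stripped.split()[0]
        if token == FOLD:
            return "insensitive"
        if token == NO_FOLD:
            return "sensitive"
        return default  # first datum reached without seeing a marker
    return default
```

A marker that shows up after the first datum is simply ignored here, and the `default` argument stands in for whatever the default behaviour turns out to be.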
> Now, assuming that the above suggestion is agreeable, we've reduced our
> disagreement to the question of the default behavior.
[...]
> You seem to be arguing that the default should be case sensitivity.
That's my preference, but I can live with an implementation-dependent
default and two kinds of explicit tags.
> On a slightly different note: one thing that worries me in this
> discussion is that several of the participants seem to think that case
> sensitivity is the obvious choice, and that those in favor of case
> insensitivity bear the burden of proof. I am relatively agnostic about
> the technical choice (except as it concerns compatibility), but I feel
> strongly that the choice is anything but obvious.
>
> We are having a discussion about making a break with over 40 years of
> Lisp history (and nearly 30 years of Scheme history), and for some
> reason this doesn't seem to be a part of the discussion.
My guess is that it's an artifact of the limited capabilities
of early computers (upper-case only) that has just stuck through
inertia. Anyway, it seems to me that more implementations are moving
towards case sensitivity. So it is good to provide a choice, at least.
BTW, I incorporated the feedback on my little case-sensitivity
survey and put the result on the wiki:
http://practical-scheme.net/wiliki/schemexref.cgi/Concept:CaseSensitivity
--shiro
Received on Wed Nov 22 2006 - 16:19:15 UTC