[r6rs-discuss] [Formal] Non-ASCII characters should not be treated all alike

From: Thomas Lord <lord>
Date: Fri Dec 1 20:39:22 2006

John Cowan wrote:
> Thomas Lord scripsit:
>
>
>> At the same time, the excuse "because it contradicts Unicode
>> best practices" is not, in and of itself, sufficient reason to assign
>> any source text an error meaning. That is, implementations should
>> be permitted to assign such source texts a non-error meaning.
>>
>
> Section 3 says:
>
> An implementation is not permitted to extend the lexical or read
> syntax in any way, with one exception: it need not treat the
> syntax #!<identifier>, for any <identifier> (see section 3.2.3)
> that is not r6rs, as a syntax violation, and it may use specific
> #!-prefixed identifiers as flags indicating that subsequent
> input contains extensions to the standard lexical syntax.
>
> Since the characters under dispute are not derivable from any production
> in Chapter 3 (other than the <character> and <string element> productions),
> they are forbidden in conforming R6RS programs, and conforming R6RS
> implementations must reject them (though nothing prevents a given
> implementation from providing one or more modes in which it does not
> claim to conform to R6RS).
>

The restriction against certain characters appears to make it necessary
that implementations have a feature which rejects source texts containing
those characters. Yet as the Report begins: "Programming languages should
be designed not by piling feature on top of feature, but by removing
the weaknesses and restrictions that make additional features appear
necessary."

Removing the restriction removes the need for a feature that enforces
the restriction and otherwise has no impact on the execution of portable,
non-divergent R6RS programs. One could argue that if the restriction
is *not* removed before R6RS is finalized then, in spirit, R6RS is
simply no longer Scheme because it places demands on implementors
that originate out of nothing more than the aspirations of the authors
to prevent certain programs from being written.

The principle "be strict in what you transmit, be tolerant in what
you receive" loosely applies here. R6RS can transmit with discipline
by, yes, specifying a portable lexical syntax; yet, figuratively, R6RS
can receive with tolerance by simply remaining (mostly) silent on
the required meaning of non-portable texts.


> On your argument, the program
>
> (let ((x 1) (y 2) (cons x y))
>
> could be assigned a meaning by a conforming R6RS implementation.
>
>


Now we are discussing, very generally, source texts which either
lexically or syntactically fail to have a non-diverging meaning
assigned by the Report. This is a good category of source texts
to consider. Yes, an important question is whether or not
the Report should require implementations to treat such texts as
errors. The Report might instead simply decline to define the meaning
of such texts.

The relevant lexical and syntactic properties of the programs in question
are statically decidable and programs that make that evaluation can be
written in Scheme. Such a program is a tool that tells you whether
or not, at least lexically and syntactically, a source text has a robust
meaning under the Report. Such a program is conventionally called a
"lint" tool and lint tools are often very valuable.
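
A minimal sketch of such a check, under loud simplifying assumptions:
it only counts parentheses and flags non-ASCII characters outside
strings and ";" comments. That is far cruder than the Report's actual
lexical grammar (it does not understand character literals such as
#\( or block comments, and it does not report unterminated strings).
The names lint-text, skip-comment and skip-string are hypothetical,
chosen for illustration only.

```scheme
(import (rnrs))

;; Drop the rest of a ";" comment, up to and including the newline.
(define (skip-comment cs)
  (cond ((null? cs) '())
        ((char=? (car cs) #\newline) (cdr cs))
        (else (skip-comment (cdr cs)))))

;; Drop the rest of a string literal, honouring backslash escapes.
(define (skip-string cs)
  (cond ((null? cs) '())
        ((char=? (car cs) #\") (cdr cs))
        ((and (char=? (car cs) #\\) (pair? (cdr cs)))
         (skip-string (cddr cs)))
        (else (skip-string (cdr cs)))))

;; Return a list of symbols naming the problems found, in source
;; order; the empty list means this (very partial) lint has no
;; complaint. "Portable" is approximated here as "ASCII only".
(define (lint-text text)
  (let loop ((cs (string->list text)) (depth 0) (problems '()))
    (cond
      ((null? cs)
       (reverse (if (> depth 0)
                    (cons 'unclosed-paren problems)
                    problems)))
      ((char=? (car cs) #\;) (loop (skip-comment (cdr cs)) depth problems))
      ((char=? (car cs) #\") (loop (skip-string (cdr cs)) depth problems))
      ((char=? (car cs) #\() (loop (cdr cs) (+ depth 1) problems))
      ((char=? (car cs) #\))
       (if (= depth 0)
           (loop (cdr cs) depth (cons 'stray-close-paren problems))
           (loop (cdr cs) (- depth 1) problems)))
      ((> (char->integer (car cs)) 127)
       (loop (cdr cs) depth (cons 'non-ascii-character problems)))
      (else (loop (cdr cs) depth problems)))))
```

Applied to the text "(let ((x 1) (y 2) (cons x y))", lint-text
returns (unclosed-paren). The point of the sketch is only that the
check lives in a separate tool: an implementation remains free to do
whatever it likes with a text the tool complains about.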

That would be a proper solution, in keeping with the introduction
to the Report: to yes, please, go ahead and define (with a reference
implementation, preferably) a "lint" program for R6RS -- but do not
require that an implementation accept only programs which pass
the "lint" test. There is simply no need for such a restriction.

-t

Received on Fri Dec 01 2006 - 20:41:00 UTC
