[r6rs-discuss] [Formal] Trivial Enhancement of macros in v5.91: capture-syntax from William D Clinger on 2006-11-29 (r6rs-discuss.mbox)

From: William D Clinger <will>
Date: Wed Nov 29 22:12:17 2006

I am posting this as an individual member of the Scheme
community. I am not speaking for the R6RS editors, and
this message should not be confused with the editors'
eventual formal response.

I am pursuing this conversation because I think this
stuff is subtle, and discussing these issues with genuine
experts has been of tremendous help to me as I try to
understand just where the draft R6RS library system went
wrong.

Andre van Tonder wrote:

> > That means I cannot write a reference implementation for R6RS
> > arithmetic that guards against a procedural macro that creates
> > a macro-time bignum and tries to insert it as a quoted constant
> > into the macro-expanded code.
>
> I'm not sure this is correct, though. I think datum->syntax
> must be applied to datums, which bignums presumably would not be.

By bignum, I meant an exact integer that is not a fixnum.
With that definition, it becomes absolutely clear that a
bignum is a datum in the sense defined by the draft R6RS.

The problem is that there are at least three distinct
languages involved here. One of those languages, which
I'll call L0, is the language in which the R6RS-conformant
script is written. Another language, which I'll call L1,
is the language in which the procedural macro's transformer
is written. A third language, which I'll call M for meta,
is the language in which the macro expander is written, and
I will assume that the library expander is also written in
M.

Some might perceive a fourth language, L--, in which the
reference implementation of arithmetic is written. From
my point of view, however, the reference implementation
of arithmetic (and any other libraries that use it) is
just a subset of the libraries that are imported (possibly
indirectly) by the (r6rs) library, which (let us assume)
is imported by the script.

Whether L-- is regarded as identical with L0 turns out not
to matter too much for the point I am making, which is that
the draft R6RS makes the mistake of treating L--, L0, and
L1 as the same language. With separated bindings, and an
invoke-separately-for-each-phase semantics, L0 and L1 will
not be the same language because the reference implementation
of arithmetic will be invoked twice, making some of their
fundamental types incompatible. This happens regardless
of whether you or I prefer to regard L-- and L0 as the
same language.

That means there will be two different versions of bignums:
those used by the script written in L0, and those used by
the procedural macro written in L1. The bignums used by
the procedural macro written in L1 really are the bignums
of language L1, so datum->syntax cannot complain when it
is passed an L1-bignum as an argument.
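
The effect of invoking a library once per phase can be modelled
outside Scheme. The following Python sketch is an illustration only.
It assumes that bignums are implemented as a generative record type,
and the names instantiate_arithmetic, Bignum, and is_bignum are
hypothetical; none of them come from the draft R6RS or from any
reference implementation.

```python
from types import SimpleNamespace


def instantiate_arithmetic():
    """Model one invocation of a (hypothetical) arithmetic library.

    Each call defines a fresh Bignum class, much as each invocation of a
    library that defines a generative record type creates a new type.
    """

    class Bignum:
        def __init__(self, digits):
            self.digits = str(digits)  # decimal digits kept as a string

    def is_bignum(obj):
        # Type predicate belonging to this particular instantiation.
        return isinstance(obj, Bignum)

    return SimpleNamespace(Bignum=Bignum, is_bignum=is_bignum)


# One instantiation for the script's phase (L0) and one for the
# transformer's phase (L1), as with invoke-separately-for-each-phase.
L0 = instantiate_arithmetic()
L1 = instantiate_arithmetic()

l1_big = L1.Bignum(2 ** 100)

print(L1.is_bignum(l1_big))  # True: it is a bignum of L1
print(L0.is_bignum(l1_big))  # False: it is not a bignum of L0
```

In this model the two Bignum classes are structurally identical but
distinct, which is the sense in which L0 and L1 end up with
incompatible fundamental types.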

It appears that the existing reference implementations of
R6RS libraries will just embed that L1-bignum into the
macro-expanded code, without any marshalling [1].

With separate compilation based on conventional files, the
external representation of the L1-bignum will be written
to a file. Later the file will be read by the L0-linker,
and the external representation of the L1-bignum will be
read as an L0-bignum, and all will be well.

Without separate compilation, however, the L1-bignum would
probably cause a syntax exception to be raised. That syntax
exception appears to be allowed by the draft R6RS. Indeed,
that syntax exception appears to be required, in some sense,
by the draft R6RS.

We therefore observe that separate compilation changes the
behavior of the program. One could argue that the behavior
of separate compilation is at fault for not raising the
exception. In my opinion, however, the behavior at fault
is the one that raises the syntax exception. In my opinion,
the L1-bignum should have been marshalled into an L0-bignum,
and the fact that the reference implementation of libraries
fails to do so is the root of the problem.
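
The contrast between embedding the L1-bignum directly and passing it
through its external representation can be sketched as follows. This
is a Python model under stated assumptions, not a description of how
any reference implementation behaves: SyntaxViolation, write, read,
and quote_constant are hypothetical names, and the check inside
quote_constant stands in for whatever L0 does when it meets a value
it does not recognize as one of its own datums.

```python
from types import SimpleNamespace


class SyntaxViolation(Exception):
    """Stand-in for the syntax exception discussed above."""


def instantiate_arithmetic():
    class Bignum:
        def __init__(self, digits):
            self.digits = str(digits)

    def write(obj):
        # External representation: plain decimal digits.
        return obj.digits

    def read(text):
        # Reading always produces a bignum of *this* instantiation.
        return Bignum(text)

    def quote_constant(obj):
        # Accept only this instantiation's own bignums as constants.
        if not isinstance(obj, Bignum):
            raise SyntaxViolation("not a datum of this phase")
        return obj

    return SimpleNamespace(Bignum=Bignum, write=write, read=read,
                           quote_constant=quote_constant)


L0 = instantiate_arithmetic()
L1 = instantiate_arithmetic()
l1_big = L1.Bignum(2 ** 100)

# Without separate compilation: the L1 object is embedded directly.
try:
    L0.quote_constant(l1_big)
    embedded_directly = True
except SyntaxViolation:
    embedded_directly = False

# With separate compilation, or with explicit marshalling: the value
# travels as its external representation and is re-read in L0.
marshalled = L0.read(L1.write(l1_big))
l0_const = L0.quote_constant(marshalled)

print(embedded_directly)                 # False
print(l0_const.digits == str(2 ** 100))  # True
```

The write/read round trip is the marshalling step: it is what the
file-based path does by accident and what the direct path omits.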

I cannot consider the exception to be a bug in the reference
implementation of libraries, however, because the draft R6RS
appears to allow the exception to be raised. That, I think,
is a bug in the draft R6RS.

On the other hand, I sort of understand why the R6RS doesn't
require all L1-values to be marshalled into L0-values.
Marshalling of values between different languages (e.g. L1
and L0) tends to be unreliable; something usually gets lost
in translation. (Maybe nothing would be lost when translating
bignums from L1 to L0, but translating a flonum from L1 to L0
might change its precision and lose accuracy.)

On the third hand, the semantics of both L1 and L0 are under
the control of the R6RS. The R6RS would be within its rights
to require marshalling of all values that flow from L1 to L0.

How could such a requirement be implemented within
language M? (Remember language M? It's the language used
to implement the macro expander and library system.) How
would language M even know which of the L1-values are
datums?

The answer to those questions is that the implementors of
the Scheme system on which the L0-script is running should
be required to do whatever is necessary to make the macro
expander and library system, written in M, know how to
marshal L1-values into L0-values. If the implementors
can't figure out how to do that, then they shouldn't try
to implement the R6RS.

As things stand, however, the semantics of scripts that
use libraries is rather implementation-dependent.

That isn't quite as bad as it sounds. Programs that don't
use procedural macros (via fenders or syntax-case) have a
reasonably portable semantics, so Scheme programmers can
work around the problems we've been discussing by avoiding
all procedural macros.

Unfortunately, the draft R6RS provides no way for a library
to protect itself against being imported by a procedural
macro. If we could fix that, then it would become possible
to write portable libraries in R6RS Scheme.

That, in my opinion, would represent a major advance over
the current draft R6RS.

> > That is an example of the mixing both of us deplore.
>
> I don't actually mind a constant number meaning the same
> thing at all levels.

As you can see from this message, I think the R6RS should
*require* it (and all other datums [sic]) to mean the same
thing at all levels.

The R6RS should also allow implementations of R6RS Scheme
to be written substantially in R6RS Scheme. At least four
of the R6RS editors agreed that an implementation of Scheme
should be allowed to implement "a variety of builtin types
including pairs, vectors, strings, and the numeric tower"
as record types [2,3,4,5].

As can be seen from the argument above and from previous
messages, that would imply serious marshalling between
levels---unless the R6RS also allows L1 to be the same
language as L0, in the specific sense that is achieved
by sharing bindings between levels. In implementations
that use shared bindings, the marshalling can be trivial.
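
Why shared bindings make the marshalling trivial can be seen in a
small Python model. It is a sketch only: arithmetic_for_phase and its
shared flag are hypothetical, and the cache merely imitates an
implementation that instantiates a library either once for all
phases or once per phase.

```python
from types import SimpleNamespace

_instances = {}  # cache of library instantiations


def instantiate_arithmetic():
    class Bignum:
        def __init__(self, digits):
            self.digits = str(digits)

    return SimpleNamespace(Bignum=Bignum)


def arithmetic_for_phase(phase, shared):
    # With shared bindings every phase maps to the same cache key,
    # so the library is instantiated only once.
    key = "all-phases" if shared else phase
    if key not in _instances:
        _instances[key] = instantiate_arithmetic()
    return _instances[key]


# Shared bindings: L0 and L1 see the same Bignum type, so an
# L1-bignum is already an L0-bignum and needs no conversion.
L0 = arithmetic_for_phase(0, shared=True)
L1 = arithmetic_for_phase(1, shared=True)
print(isinstance(L1.Bignum(2 ** 100), L0.Bignum))  # True

# Separate bindings: each phase gets its own Bignum type.
S0 = arithmetic_for_phase(0, shared=False)
S1 = arithmetic_for_phase(1, shared=False)
print(isinstance(S1.Bignum(7), S0.Bignum))  # False
```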

That, I believe, is the core of the argument for allowing
shared bindings. Not allowing them would add complexity
and inefficiency, all for the sake of a feature that many
Scheme programs just do not need: procedural macros.

Will

[1] http://lists.r6rs.org/pipermail/r6rs-discuss/2006-November/001197.html
[2] http://www.r6rs.org/r6rs-editors/2005-July/000786.html
[3] http://www.r6rs.org/r6rs-editors/2005-July/000787.html
[4] http://www.r6rs.org/r6rs-editors/2005-July/000789.html
[5] http://www.r6rs.org/r6rs-editors/2005-August/000812.html
Received on Wed Nov 29 2006 - 22:12:00 UTC
