[r6rs-discuss] Sharp questions and things like bytevectors

From: r6rsguy at free-comp-shop.com <r6rsguy>
Date: Thu, 21 Jun 2007 21:24:02 -0400

> From: Michael Sperber
>
> > r6rsguy at free-comp-shop.com writes:
> >
> > I am concerned that the syntax #vu8(...) for a bytevector
> > is irregular, hard to remember, and ugly.
> >
> > The name of these objects has been changed to "bytevector",
> > why not write the external representation as #bytevector(...)?
>
> This post has an explanation:
>
> http://srfi.schemers.org/srfi-4/mail-archive/msg00002.html

That post has a good explanation of the #\v, given that the
first character of the "mnemonic" name determines the syntax.
It says essentially "we are stuck in a corner and must choose
a letter that hasn't been used".

It is not an argument against #bytevector per se; it
is an argument against any word that starts with B, D,
E, F, I, O, T, or X. Everybody got that list memorized?
Now add V to it.

From: ludovic.courtes at laas.fr
>
> The read syntax is also acceptable IMO, especially
> since it doesn't conflict with syntaxes specified by
> SRFIs such as SRFI-4.

Of course, the real question is not only what past SRFIs
may be in conflict, but also whether there is an irregularity
in the lexical structure that will preclude future SRFIs.

So the question is, in general, what follows a sharp?

Initially, I was going to point out that #vu8(...) conflicts
with SRFI-10, "#, External Form", which would have the bytevector
written as #,(u8 ...). Then I saw that R5.94 specifies that
#,foo (sharp comma foo) is an abbreviation of (unsyntax foo),
so R6RS kills SRFI-10, and I think rightly so. SRFI-10 could
be re-written with a different character after the sharp,
while the comma _should_ mean un-[quote,syntax].

Then it occurred to me that #tag(<datum>*) could itself be
a reasonable lexical syntax for a read-time application
of a reader procedure called #tag to the specified datums.

So, what can the tag be?

I spent some time working on a proposal that it be any
sequence of letters, but that is torqued around the fact
that #xbabe is an integer.
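
To make the clash concrete, here is a rough sketch in Python (not part of any
report or SRFI). The "letters only" tag rule and the names LETTER_TAG,
reads_as_integer, and is_ambiguous are hypothetical, and only radix-prefixed
integers are modelled, not the full <number> grammar.

```python
import re

# Hypothetical tag rule from the abandoned idea: '#' followed only by letters.
LETTER_TAG = re.compile(r"#[A-Za-z]+\Z")

# Simplified: radix-prefixed integers only; the real <number> grammar is larger.
RADIX_DIGITS = {
    "b": "01",
    "o": "01234567",
    "d": "0123456789",
    "x": "0123456789abcdef",
}


def reads_as_integer(token):
    """Return the integer value if token is '#<radix><digits>', else None."""
    if len(token) < 3 or token[0] != "#":
        return None
    radix, digits = token[1].lower(), token[2:].lower()
    allowed = RADIX_DIGITS.get(radix)
    if allowed is None or any(ch not in allowed for ch in digits):
        return None
    # The number of allowed digits is the base (2, 8, 10 or 16).
    return int(digits, len(allowed))


def is_ambiguous(token):
    """True when token fits both the letter-tag rule and the integer rule."""
    return bool(LETTER_TAG.match(token)) and reads_as_integer(token) is not None


print(reads_as_integer("#xbabe"))   # 47806
print(is_ambiguous("#xbabe"))       # True
print(is_ambiguous("#bytevector"))  # False
```

Under these assumptions #bytevector is not itself ambiguous, but a reader
cannot tell it apart from a binary number until it has looked past the "b".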

Here is a list of uses of sharp in R5.94:

<boolean> -> #t #f #T #F
<exactness> -> #i #I #e #E
<radix> -> #b #B #o #O #d #D #x #X
<unknown digit> -> #
<comment> -> #;
             #|...|#
<abbrev prefix> -> #' #` #, #,@
<vector> -> #(<datum>*)
<lexical flag> -> #!<identifier>
       where r6rs is the only identifier with an assigned meaning.
<character> -> #\<any character>
               #\<char name>
               #\x<hex value>

The definition seems to indicate that in a character specified as
a hex value the #\x must be in lower case. Should #\X<hex value>
be added to that last production?

<bytevector datum> -> #vu8(...)

Is #VU8 forbidden? Except for this, if the character after the
sharp is a letter, then it can be in either upper or lower case.

It seems that the #\: (colon) is still unused after a sharp.
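
The list above can be turned into a small lookup table to check that claim.
This is only a Python sketch of my reading of the list; the names
USED_AFTER_SHARP, use_after_sharp, and unused_punctuation are made up for the
illustration, letters are folded to lower case, and the <unknown digit> sharp
is left out because it occurs inside a number rather than as a prefix.

```python
import string

# First character after '#', taken from the list of uses above (R5.94 draft).
USED_AFTER_SHARP = {
    "t": "boolean", "f": "boolean",
    "i": "exactness", "e": "exactness",
    "b": "radix", "o": "radix", "d": "radix", "x": "radix",
    ";": "comment", "|": "comment",
    "'": "abbrev prefix", "`": "abbrev prefix", ",": "abbrev prefix",
    "(": "vector",
    "!": "lexical flag",
    "\\": "character",
    "v": "bytevector",
}


def use_after_sharp(ch):
    """Return the listed use of '#' followed by ch, or None if it is free."""
    return USED_AFTER_SHARP.get(ch.lower())


def unused_punctuation():
    """ASCII punctuation characters that the table does not assign a use."""
    return [ch for ch in string.punctuation if use_after_sharp(ch) is None]


print(use_after_sharp("X"))         # radix
print(use_after_sharp(":"))         # None
print(":" in unused_punctuation())  # True
```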

So here's the modest proposal:

<datum> -> <lexeme datum> | <compound datum> | <special datum>
<special datum> -> <reader prefix> <datum>
<reader prefix> -> #:<identifier>

A <lexical flag> may change the lexical structure of the
entire following text, but a <reader prefix> changes only the
way the following <datum> is read. In particular, it does
not change the syntax of a <datum>. For that reason, a
<reader prefix> uses a new character after the sharp.

So what is a reader prefix? I don't know. As of R6RS
there is only one, #:bytevector (or #:vu8, if you must).
It must be followed by a list of small integers, and it
is a <datum> and a <bytevector>. The point is that
there is now room for R7RS, future SRFIs, or crazed
rogue programmers to define some new ones, whatever
they are.
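
For what it is worth, a reader for this could look roughly like the Python
sketch below. Everything in it is an assumption made for illustration: the
READER_PREFIXES registry, the function names, the toy tokenizer (parentheses,
integers and symbols only), and the choice to represent a bytevector as a
Python bytes object with elements restricted to 0..255.

```python
import re

# Toy tokenizer: a reader prefix, a parenthesis, or any other atom.
TOKEN = re.compile(r"\s*(#:[^\s()]+|[()]|[^\s()]+)")


def make_bytevector(datum):
    """Hypothetical handler: turn a list of small integers into bytes."""
    if not isinstance(datum, list) or not all(
            isinstance(n, int) and 0 <= n <= 255 for n in datum):
        raise ValueError("#:bytevector expects a list of integers in 0..255")
    return bytes(datum)


# Hypothetical registry: reader-prefix name -> procedure applied to the datum.
READER_PREFIXES = {"bytevector": make_bytevector, "vu8": make_bytevector}


def tokenize(text):
    return TOKEN.findall(text)


def read_datum(tokens):
    """Consume one datum from the front of the token list and return it."""
    if not tokens:
        raise ValueError("unexpected end of input")
    tok = tokens.pop(0)
    if tok.startswith("#:"):
        name = tok[2:]
        if name not in READER_PREFIXES:
            raise ValueError("unknown reader prefix: " + tok)
        # The prefix affects only the single datum that follows it.
        return READER_PREFIXES[name](read_datum(tokens))
    if tok == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_datum(tokens))
        if not tokens:
            raise ValueError("missing )")
        tokens.pop(0)  # drop the closing parenthesis
        return items
    if tok == ")":
        raise ValueError("unexpected )")
    try:
        return int(tok)
    except ValueError:
        return tok  # anything else is kept as a symbol name


def read(text):
    return read_datum(tokenize(text))


print(read("#:bytevector (1 2 255)"))  # b'\x01\x02\xff'
print(read("(a #:vu8(0 1) 3)"))        # ['a', b'\x00\x01', 3]
```

Note that the prefix handler receives an ordinary <datum> that was read by the
ordinary rules; adding a new prefix means adding one entry to the registry,
not touching the tokenizer.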

See SRFI-4 and SRFI-10 for a list of things they might be.

   -- Keith
Received on Thu Jun 21 2007 - 21:24:02 UTC
