[r6rs-discuss] on rational 6.7 Compound library names

From: Thomas Lord <lord>
Date: Wed, 27 Jun 2007 01:00:57 -0700

Anton van Straaten wrote:
> I said "general purpose, Turing complete". My point, in case you
> really missed it, is that there's a distinct lack of general purpose
> programming languages that use the mechanism that you claimed was
> "industry best practice". Therefore, it is not industry best
> practice, in the context of general purpose programming languages.

Sorry, but I think that XQuery is very much "general purpose" (and of
course it is also Turing complete). That aside...

You ignored my point; I did not miss yours. URIs are "industry best
practice" for externalizable, lightly-structured, distributed-allocation
names. The need for such names arises in many languages but few
languages besides Scheme have the privilege of defining such a namespace
in the historic context of URIs-as-best-practice.





>
>> I don't see how URIs "don't fit in with Scheme's approach to naming"
>> -- you'll have to explain that one for me.
>>
>> I'm sure I don't know what you mean by "Scheme's approach to symbolic
>> abstraction" either.
>
> When representing some external structure in Scheme, it's typical to
> use the symbol data type to model the names (keywords, identifiers,
> words, etc.) in that external structure.

Hardly. Quite the opposite. As the popular interfaces for opening
files illustrate, it is quite common to use *strings* for external
names. Those things which are commonly named by Scheme identifiers
are more typically *not* subject to this kind of externalization (e.g.,
variables). Strings make good common sense for names with a large
external significance: they are intrinsically serializable in simple ways.


> The canonical example of this is treating Scheme source code as data.
> In that case, symbols are used to represent identifiers (i.e. names)
> in the code being operated on.
>

Identifiers are used for matters internal to programs, such as variable
names. By careful design, library names are not variable names and,
generally, do not behave like Scheme's internal names. For example, in
library source, the library name and import names can not be produced by
macro expansion. They are not part of a tree of lexical scopes. They
are explicitly designed (in the draft and in my proposed changes to the
draft) to be processed by external tools that are simplistic and
insensitive to Scheme's scoping rules.



> Another relevant example of this is SXML, which represents XML data in
> this way. One can, of course, represent XML within Scheme using
> strings, but there are many benefits to using S-expressions and
> list/symbol structure instead, having to do with Scheme's intrinsic
> support for those structures.

I am having trouble making sense of you there but I think you are
confusing surface syntax with logical structure and with internal
representation.

XML has its native surface syntaxes and sometimes those are very
convenient. It would even be convenient to mix such surface syntaxes
into Scheme source. It is also sometimes very convenient to use a
surface syntax for XML that is derived from Scheme's syntactic
datums. I think we agree about that much, no?

Of course, in the preferred API at run-time, XMLish datums ought to have
APIs whose logical structure follows the DOM definition, at least
insofar as we want to manipulate those datums with procedures that enjoy
closure properties over the domain of XML datums.

And, yes, lists and symbols are likely suspects when it comes time to
pick a run-time representation of those XMLish datums.

What of all this? I don't see how it makes any point that would lead
you to argue with me.

Instead, you ought to embrace and explore canonicalizing XMLish types
and APIs within Scheme for many of the same reasons that there is
justified enthusiasm about canonicalizing Unicode types and APIs.





>
> URIs can certainly be embedded as strings within S-expressions, or
> even as monolithic symbols or identifiers, but to manipulate them at
> the component level, the first thing you would need to do is invoke a
> conversion procedure to turn them into a more usable form.

That is true. The reason that such rules are defined separately from,
say, the rules for cons pairs and symbols, is because of the added
constraints of global utility, serializability, and distributed
allocation. When you are talking about names for variables that occur
only within some textually bounded lexical scope, the design constraints
are one way -- when you are talking about names for entities that will
be exchanged as commodities among members of a globally distributed and
decentralized population, the design constraints are another thing.
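A minimal sketch of the conversion step under discussion, in Python
purely for illustration. The library URI is hypothetical, and urlsplit
is the standard urllib.parse function; nothing here is syntax from the
draft.

```python
from urllib.parse import urlsplit

# Hypothetical library name, used only for illustration.
name = "http://schemers.org/libs/sorting"

# One conversion call exposes the components of the name.
parts = urlsplit(name)
scheme = parts.scheme        # how to interpret the authority
authority = parts.netloc     # who allocated the name
# Drop the empty string produced by the leading "/".
path_segments = [seg for seg in parts.path.split("/") if seg]

print(scheme, authority, path_segments)
```

The string form is what gets exchanged and stored; the split form is
computed on demand by whichever tool needs the components.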

> That more usable form would typically at least resemble the library
> naming system specified by R6RS. Using URIs imposes the need for
> conversion, which isn't necessary otherwise, and that is an indicator
> of the sense in which they don't fit in with Scheme's approach to
> naming or symbolic abstraction.
>

"Things should not be treated as simpler than they really are and yadda
yadda yadda...." I think Einstein said that.




> (It would be a little different, btw, if standard Scheme was more
> polymorphic, and could treat a URI as though it were a list. Although
> there'd still be an argument against URIs from Scheme aesthetics.)

URIs are not lists. They just aren't. So, it is nonsense to talk
about treating a URI as if it were a list. You are just thinking
wishfully. "If wishes and buts were candy and nuts we'd all have a
bowl of granola." It is possible that Heisenberg said that.


>
> While library names have some external denotation, they also have a
> meaning within the language, and that meaning has nothing to do with
> URIs.

Yes, that's right, there are both internal and external constraints to
satisfy simultaneously. That is part of what makes it a non-trivial
engineering problem.



> For example, what purpose does the protocol field of a URI serve, when
> it appears as a library name in Scheme code? I know the W3C/REST
> answer to this, but it really isn't relevant to Scheme source code.

The "protocol field" (formally the "scheme" in the URI specifications;
I call it the "method" here -- examples are "http:", "ftp:", "mailto:",
and "urn:") names a commonly accepted way to interpret an "authority"
field (examples are "//google.com", "//schemers.org", and "uri:/").

The URI design's combination of method fields and authority fields is
the essence of how URIs achieve a decentralized, mostly non-rival
allocation strategy.

We would like programmers all around the world to be able to publish
named Scheme libraries on an equal footing with one another, no? That
is, we do not wish them to have to appeal to a central authority to be
assigned a library name, yet we would like all separately published
libraries to have distinct names, right? The URI structure of method
and authority fields achieves those goals in a very pragmatic way.
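As a rough illustration of that allocation argument, the Python sketch
below has two hypothetical publishers pick the same local path without
coordinating. The domain names and the allocation_key helper are
assumptions made up for this example.

```python
from urllib.parse import urlsplit

def allocation_key(uri):
    """Return the (method, authority) pair that scopes a library name."""
    parts = urlsplit(uri)
    return (parts.scheme, parts.netloc.lower())

# Two hypothetical publishers who both chose the path "/scheme/sorting".
alice = "http://alice.example/scheme/sorting"
bob = "http://bob.example/scheme/sorting"

# Group local paths by the namespace they were allocated in.
registry = {}
for uri in (alice, bob):
    registry.setdefault(allocation_key(uri), []).append(urlsplit(uri).path)

for key, paths in registry.items():
    print(key, paths)
```

The two full names differ even though the local paths are identical,
because each publisher allocates only inside its own authority.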





>
> Using URIs as library names in Scheme source code represents a
> conflation of concerns.


What conflation? We would, ideally, like all published libraries to be
fully compositional, at least among libraries published by non-malicious
parties. That requires a distributed allocation strategy and you won't
come up with a good mechanism for that until you either adopt or
re-invent the URI method-plus-authority structure.



>
>>> Well, we could specify the syntax of Java as being Scheme syntax,
>>> and then it would no longer be a foreign syntax, would it? But
>>> somehow that doesn't make such a course of action any more appropriate.
>>
>>
>> It would be very handy, I suspect, to have a Scheme reader extension
>> library which interpreted Java syntax as denoting either syntax trees
>> of a particular sort or as a sequence of calls to constructors.
>
> Is a reader extension needed? PLT Scheme has libraries which support
> reading Java code and manipulating the resulting ASTs. But that
> doesn't make Java syntax part of the Scheme language.

It is a small step from what you describe to being able to embed Java
syntax in source texts. That is somewhat unexciting but if we talk
instead about embedding XML syntax, the result is likely to be quite handy.


> Having a way to read URIs is distinct from specifying URIs as a basic
> way to name certain entities within the Scheme language.
>


That's true. But here, in the case of libraries, we have entities that
need externalizable names that satisfy design constraints like
distributed allocation and practical serializability. URIs are a good
answer to such constraints and syntactic s-exps are not.



>> In contrast, Scheme currently has no syntax for identifiers which are
>> intended to have significance external to Scheme itself, such as
>> library names. It has no model for such identifiers.
>> Serializability suggests that such identifiers should be recognizable
>> as strings. Common sense and current best practices (and the sound
>> principles behind them) suggest that those strings should contain URIs.
>
> Common sense suggests no such thing. It does suggest to me that more
> specific and substantive arguments are lacking, though. The claim
> about best practice is empty at best, until a relevant example is cited.

The main area where people wrestle over externalizable identifiers is in
network protocols and, these days, if you aren't using URIs for your
application-independent identifiers, you need to provide a positive
rationale.



>
> Besides, it would be necessary to have a mapping from URIs to the
> actual, non-globally-unique libraries which they denote. There's no
> advantage to having that mapping be from URIs to that external
> referent, as opposed to from an S-expression. It's also
> straightforward to define a mapping from an S-expression to a URI if
> needed, and I'm sure that'll be done at some point.
>


There is considerable advantage. For example, many database systems
have excellent, richly detailed, highly accurate support for treating
URIs as database keys -- such as might be used to map from a library
name to the contextually appropriate source text.

I don't know of any serious database that uses Scheme s-exps as primary
keys.
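A small sketch of the library-name-to-source mapping, using Python's
standard sqlite3 module. The table layout, the URI, and the file path
are all hypothetical. SQLite has no dedicated URI type, so the key is
stored as plain text here; this shows only that a string name drops
into a primary key with no extra encoding step.

```python
import sqlite3

# In-memory database; the schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE library (uri TEXT PRIMARY KEY, source_path TEXT NOT NULL)"
)

# Hypothetical library name and source location.
uri = "http://example.org/scheme/sorting"
conn.execute("INSERT INTO library VALUES (?, ?)", (uri, "/srv/scheme/sorting.sls"))

# Look the source text location up by library name.
row = conn.execute(
    "SELECT source_path FROM library WHERE uri = ?", (uri,)
).fetchone()
print(row[0])
```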


>> At best, you are saying that in order to make it incrementally easier
>> to write such tools in Scheme, we must make it monumentally harder to
>> write or adapt such tools in other languages.
>
> How so? Such tools in other languages have to, at minimum, deal with
> S-exps to some extent even to get to the library names. So what's so
> monumentally hard?


The library syntax of the draft is carefully and explicitly designed so
that the extent to which tools must understand s-exp syntax is severely
minimized. It wanders, aimlessly, away from that design pattern when it
introduces library names which are syntactic (recursive) lists of Scheme
identifiers and syntactic numbers.
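To make the difference in tool effort concrete, here is a hedged Python
sketch of an external tool pulling the name out of a library form. The
list-named form is simplified from the draft, the string-named form is
a hypothetical rendering of the proposal, and both function names are
invented for this example. The tokenizer ignores strings, comments, and
every other kind of datum, so it is far short of a real reader.

```python
import re

def read_list_name(source):
    """Read a nested-list library name, which needs paren matching."""
    tokens = re.findall(r"[()]|[^\s()]+", source)
    if tokens[:3] != ["(", "library", "("]:
        raise ValueError("not a list-named library form")

    def read(pos):
        # tokens[pos] is "("; return the parsed list and the next position.
        items = []
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                sub, pos = read(pos)
                items.append(sub)
            else:
                items.append(tokens[pos])
                pos += 1
        return items, pos + 1

    name, _ = read(2)
    return name

def read_string_name(source):
    """Read a string library name with a single regular expression."""
    match = re.match(r'\s*\(\s*library\s+"([^"]*)"', source)
    return match.group(1) if match else None

list_form = "(library (acme sorting (1 2)) (export sort))"
string_form = '(library "http://example.org/scheme/sorting" (export sort))'
print(read_list_name(list_form))
print(read_string_name(string_form))
```

Even in this toy version the list-named case needs a recursive reader,
while the string-named case is a single pattern match.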


>
> And why should Scheme's design be skewed towards having its source be
> processed by other languages, at the expense of its own norms?

Because Scheme is an engineering project, among other things. It
exists within a larger world.



>
>> A Scheme URI "parser" (in the weak sense that you refer to) for URIs
>> is but a few lines of code and can be vastly useful in many
>> applications that range far afield of handling library source code.
>
> Such a parser indeed makes a useful library function, and many
> implementations offer one.
>
>> In contrast, a PHP "parser" for s-exp-based library names would
>> require implementing significant chunks of a Scheme reader and would
>> have little utility outside of handling Scheme library texts.
>
> Aside from the point that they'd need to be able to minimally parse
> S-expressions anyway, I see no reason why Scheme's design should be
> influenced by such considerations.


For reasons similar to why Scheme should work on Unicode rather than
making up its own global text encoding: because someone else has
already worked out a model that follows principles that make sense --
there is no need for a gratuitously different, redundant model of a
single concept.


>
> The S-exp format has proven so useful that people implement it for
> native use in other languages, such as C and OCaml. Let them eat
> S-expressions.
>


Blah blah.


>> The design principle I think should be emphasized here is something
>> like "Situationally Appropriate Consonance"
>
> Oh, I agree. In the case of Scheme source code, it's clear that
> embedding URIs to represent library names is not situationally
> appropriate, and is not any kind of consonance. Again, the same
> appears to be true of all other general purpose, Turing-complete
> languages.
>

Except that it doesn't. See above.




>> and the many uses for externalized library names suggest that URIs
>> are by far the most parsimonious choice.
>
> Redundantly repeating a protocol every time I name a Scheme library
> does not strike me as parsimonious.


Then introduce a semantic model of dynamic context when library sources
are processed and permit relative URIs. That is the pattern commonly
applied in response to such a design constraint.
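A minimal sketch of that pattern, using urljoin from Python's standard
urllib.parse module. The base URI stands in for the dynamic context and
is hypothetical, as are the relative names.

```python
from urllib.parse import urljoin

# Hypothetical context established once when library sources are processed.
base = "http://example.org/scheme/"

# Relative names are resolved against the context.
sibling = urljoin(base, "sorting")
cousin = urljoin(base, "../util/strings")
# An absolute name ignores the context entirely.
foreign = urljoin(base, "http://other.example/lists")

print(sibling)
print(cousin)
print(foreign)
```

With a context in place, the method and authority are written once
rather than repeated at every import.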




>
>> I don't think that the statement of "scope" was intended to be
>> normative in the way you imply. If it was, then it is
>> correspondingly faulty.
>
> I'm not saying it's normative, but it's meaningful, and it's been out
> there for years for feedback and discussion.
>


And here we are, enjoying feedback and discussion. This is no small
part of the problem with the very idea of continuing the revised report
series: the transaction costs for each new revision apparently must
include a multi-year process. That is no way to evolve a language.


>> No, it does not. If the denotational semantics were still in place
>> I would want an ExecEnv category whose structure describes "bag of
>> source files". I guess in an operational semantics I'd want an
>> ExecEnv set or something.
>>
>> More pragmatically, I'd like reflection on ExecEnv, such as getting a
>> list of available source files. Perhaps even a standard API for
>> adding new source files (gasp!).
>
> Do you really believe that the lack of a fully formal specification
> here is going to be a problem for the very first incarnation of a
> standard library system for Scheme?
>

Yes, I do.



>> I can't make sense of that. Java is observably not the product of
>> a single organization.
> ...
>> So, really, there isn't a single point in your claim there that I can
>> accept.
>
> The Java Language Specification was put out and long controlled by Sun
> Microsystems, Inc. I'm not making any judgements about that, but it
> has led to Java being in a very different situation than Scheme.
>


Please, whatever you do, just throw around lots of vague innuendo
combined with vacuous truisms.


> Another huge factor is that the Java specification also included a
> portable bytecode format, which goes a long way towards reducing
> divergence between implementations. That alone can explain much of
> the difference between Scheme and Java standardization.

A bold claim and perhaps a launching point for polemic but, really, I
don't think you are saying anything serious there.


>
>> Why is this an R7 issue rather than a contemporary issue? Can we
>> make progress in no other way than if we sequence work according to
>> your priorities? Are your suggested priorities obviously the
>> fastest way to make progress? How did you arrive at these conclusions?
>
> They are not my priorities, or even the editors' priorities. See the
> documents I referenced, going back to the charter produced by the
> Strategy Committee in 2004. The formation of those priorities has
> been based on a great deal of community discussion going back years
> before 2004. Arguably, going back to at least 1991, by which time the
> attempts to standardize a record system for R4RS had failed.
>
> It's wonderful that we're now at a point where standardization is
> going so well that we can argue about whether Scheme should be
> pioneering a new approach for library naming in the source code of
> general-purpose languages. But it does make sense to me to first
> settle on some more basic features before worrying about what really
> are incredible luxuries, compared to where Scheme standardization has
> been at.
>

As a social matter, I think we should just lower the cost of
seriously-taken formal specifications by emphasizing SRFIs and putting
the "revised report" series into its grave. We need to coordinate what
is written down as peer-reviewed spec with what implementors actually
do. We are now, mostly, wasting time on a pointless and probably
harmful newly "revised report" when, instead, we could regard the draft
as a mature but not-yet stable living document, push implementations to
at least approximately support it, and get on with growing the language.

There is something utterly Unschemey (in the sense of having nothing to
do with essential gists of concepts) to wrestling over something as
expensive yet arbitrary as the contents of an imagined R6RS.



> URI library names allegedly solve a problem that Scheme can, right
> now, only wish it had.

Someone once said "Be careful what you pretend to be because that's what
you turn out to be" or words to that effect. The same logic with a
different spin says "Pretend you are what you want to be because, to the
extent you can do that, you can become what you want to be."



> And it's a problem that every other general purpose language seems to
> be able to live with. That's why this is not an R6RS issue.
>


You keep imagining or assuming some big superstructure in which terms
like "an R6RS issue" make good sense for good reason.



>> (Yet another example of why publishing R6 at all is a bad idea -- it
>> leads to this kind of totemic struggle over names (and hence timing)
>> of efforts.)
>
> I don't know about any totemic struggles.

Above, you made a claim about what is and what is not "an R6RS issue".
That is an example. Whether or not something is "an R6RS issue" should
be an utterly pointless and unproductive question, yet here we are
going on about it.



> I was just trying to expand on a technical rationale that I thought
> was being misunderstood.

A lot of crap in politics functions in just the way you are demonstrating
here. There's this vague, made-up, highly problematic epistemological
concept of "an R6RS issue" and in your statement there you are
conflating that very messed up concept with a technical rationale.




>
>>>> That is to say that conservatively coded bundles of Java source
>>>> code files have a sufficiently well defined semantics that they can
>>>> be exchanged as commodities and executed on generic, virtual,
>>>> fungible platforms (e.g., commodity grids).
>>>
>>>
>>> That should, in fact, largely apply to a conservatively coded bundle
>>> of R6RS Scheme code, particularly if implementations follow the
>>> recommendations in the non-normative appendices.
>>
>>
>>
>> An interesting thought-experiment or, perhaps, real project: can
>> you specify (perhaps presuming the draft) a SRFI that specifies a
>> commodity form for Scheme-based system images?
>
> That's very different from "conservatively coded bundles of ... source
> code files". The draft already specifies what's needed for those to
> work. But system images? What's the goal, considering that Schemes
> vary so dramatically in their implementation styles?
>

Why can't a conservatively coded bundle of source code files be a system
image?

-t
Received on Wed Jun 27 2007 - 04:00:57 UTC
