[r6rs-discuss] [Formal] Allow inline hex escapes anywhere

From: John Cowan <cowan>
Date: Thu Mar 22 00:42:13 2007

Alan Watson scripsit:

> Many current Schemes have lexers written for ASCII (or Latin-1)
> character sets. Conversion of these lexers to the new standard would be
> easier if the report allowed inline hex escapes to appear anywhere in
> Scheme code. One would simply add a pass before the lexer that converts
> non-ASCII characters to inline hex escapes and converts inline hex
> escapes representing ASCII characters to ASCII characters, and would
> modify the lexer to handle inline hex escapes as appropriate.

The difficulty with that idea (which is what Java does with its \uXXXX
escapes) is that it changes the meaning of a\x20;b. If escapes are
expanded before lexing, this is no longer a single symbol of three
characters (a, space, b); it is equivalent to a b, a sequence of two
symbols.
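
To make the difference concrete, here is a minimal sketch in Python (not
Scheme, and not any real implementation's lexer). The names HEX_ESCAPE,
expand_escapes and toy_lex are hypothetical, and the "lexer" simply
splits on whitespace, which is an assumption made only for illustration.

```python
import re

# Matches an inline hex escape such as \x20; (backslash, x, hex digits, semicolon).
HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]+);")

def expand_escapes(text):
    """Replace every inline hex escape with the character it names."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

def toy_lex(text):
    """Hypothetical lexer: fix token boundaries first, then expand escapes
    inside each token, so an escaped space never separates tokens."""
    return [expand_escapes(tok) for tok in text.split()]

source = r"a\x20;b"

# Escapes handled inside the lexer: one symbol of three characters.
lex_time = toy_lex(source)

# Escapes expanded by a pass before the lexer: the space now splits the symbol.
pre_pass = expand_escapes(source).split()

print(lex_time)   # ['a b']
print(pre_pass)   # ['a', 'b']
```

The only difference between the two results is when the escape is
expanded relative to tokenization.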

> This modification would also create a means to portably interchange
> programs using only ASCII, although I'm not sure if this is especially
> useful given UTF-8.

In order to make the conversion reversible, it's necessary to use
a pattern like Java's:

# The Java programming language specifies a standard way of transforming a
# program written in Unicode into ASCII that changes a program into a form
# that can be processed by ASCII-based tools. The transformation involves
# converting any Unicode escapes in the source text of the program to
# ASCII by adding an extra u - for example, \uxxxx becomes \uuxxxx - while
# simultaneously converting non-ASCII characters in the source text to
# Unicode escapes containing a single u each.
#
# This transformed version is equally acceptable to a compiler for the
# Java programming language ("Java compiler") and represents the exact
# same program. The exact Unicode source can later be restored from this
# ASCII form by converting each escape sequence where multiple u's are
# present to a sequence of Unicode characters with one fewer u, while
# simultaneously converting each escape sequence with a single u to the
# corresponding single Unicode character.
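
The round trip described in that passage can be sketched roughly as
follows, again in Python. This is an illustration under simplifying
assumptions, not Java's actual algorithm: it handles only characters in
the Basic Multilingual Plane, and it ignores Java's rule about how many
backslashes precede the u. The names to_ascii and from_ascii are
hypothetical.

```python
import re

# Either an existing escape (\u, \uu, ... plus four hex digits)
# or a single non-ASCII character.
TO_ASCII = re.compile(r"\\(u+)([0-9A-Fa-f]{4})|([^\x00-\x7f])")
FROM_ASCII = re.compile(r"\\(u+)([0-9A-Fa-f]{4})")

def to_ascii(text):
    """Add one u to every existing escape and, in the same pass,
    turn each non-ASCII character into a single-u escape."""
    def repl(m):
        if m.group(3) is not None:
            # Assumes a BMP character, so four hex digits suffice.
            return "\\u%04x" % ord(m.group(3))
        return "\\u" + m.group(1) + m.group(2)
    return TO_ASCII.sub(repl, text)

def from_ascii(text):
    """Undo to_ascii: a single-u escape becomes its character,
    a multiple-u escape loses one u."""
    def repl(m):
        us, digits = m.group(1), m.group(2)
        if len(us) == 1:
            return chr(int(digits, 16))
        return "\\" + us[1:] + digits
    return FROM_ASCII.sub(repl, text)

# One real non-ASCII character, then an escape written literally in the source.
original = "caf\u00e9 \\u00e9"
ascii_form = to_ascii(original)
restored = from_ascii(ascii_form)

print(ascii_form)            # caf\u00e9 \uu00e9
print(restored == original)  # True
```

The extra u is what lets the decoder tell an escape that was in the
original source apart from one that stands for a converted character.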

IMHO all this is more trouble than it's worth. Let's stick to the
R5.92RS version.

-- 
John Cowan   cowan_at_ccil.org    http://ccil.org/~cowan
I come from under the hill, and under the hills and over the hills my paths
led. And through the air. I am he that walks unseen.  I am the clue-finder,
the web-cutter, the stinging fly. I was chosen for the lucky number.  --Bilbo
Received on Thu Mar 22 2007 - 00:42:07 UTC
