[R6RS] external representation for bytes objects
William D Clinger
will at ccs.neu.edu
Mon Aug 14 18:07:48 EDT 2006
I have checked in some minor edits for document/bytes.tex.
I would like to propose an external representation for
bytes objects, e.g. #u8(...).
The rationale for this becomes pretty clear if you look
at the current version of unicode/normalization.sch.
That file defines eight tables, five of which are or
contain bytes objects. The three tables that aren't
bytes objects and don't contain bytes objects can be
quoted constants; with separate compilation, they
will probably compile to some representation that is
at least as compact as their run-time representation.
The five tables that are bytes objects or contain
bytes objects are unlikely to compile to so compact
a representation. For example,
    (define canonical-compositions
      (vector
       (list
        (list->bytevector
         '(#x0 #x41 #x0 #x45 #x0 #x49 #x0 #x4e ...))
        (list->bytevector
         '(#x0 #xc0 #x0 #xc8 #x0 #xcc #x1 #xf8 ...)))
       (list
        (list->bytevector '(...))
        (list->bytevector '(...)))
       ...))
is likely to compile to a representation that
contains one pair for every byte that will end up
in a bytes object, plus some code that puts the
table together at run time. That's a factor of 8
to 10 in space, for a large table, and it wouldn't
be necessary if we had an external representation
for bytes objects:
    (define canonical-compositions
      '#((#u8(#x0 #x41 #x0 #x45 #x0 #x49 #x0 #x4e ...)
          #u8(#x0 #xc0 #x0 #xc8 #x0 #xcc #x1 #xf8 ...))
         (#u8(...)
          #u8(...))
         ...))
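The space factor can be checked with a rough estimate. The sketch
below assumes a 32-bit implementation in which a pair occupies two
4-byte words and a bytes object stores one byte per element plus a
one-word header. Those sizes, and the names pair-size, header-size,
list-cost, and bytes-cost, are illustrative assumptions rather than
figures from any particular system.

```scheme
;; Assumed sizes for a hypothetical 32-bit implementation.
(define pair-size 8)     ; two 4-byte words per pair (assumption)
(define header-size 4)   ; one-word header on a bytes object (assumption)

;; Space for a quoted list of n byte values: one pair per byte.
(define (list-cost n) (* n pair-size))

;; Space for a bytes object holding n bytes.
(define (bytes-cost n) (+ header-size n))

;; Ratio for a 1000-byte table; it approaches 8 as n grows.
(/ (list-cost 1000) (bytes-cost 1000))
```

Under these assumptions the list form costs close to 8 times as much
as the bytes object itself, before counting the code that assembles
the table at run time.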
Will