[r6rs-discuss] Stateful codecs and inefficient transcoding

From: Shiro Kawai <shiro>
Date: Tue Oct 31 02:08:33 2006

From: William D Clinger <will_at_ccs.neu.edu>
Subject: [r6rs-discuss] Stateful codecs and inefficient transcoding
Date: Tue, 31 Oct 2006 00:53:20 -0500

> It doesn't surprise me that there are standard encodings
> that would require unlimited lookahead. In my opinion,
> it is probably not practical for an implementation to
> support those encodings via the transcoders of the draft
> R6RS. In my opinion, the transcoders of the draft R6RS
> are barely adequate for the Unicode encodings, and are
> obviously not adequate for all possible translations
> from binary to text and vice versa.

I agree.  I just feel it is regrettable that an
implementation needs to provide a different mechanism while
the language standard defines

> > Are there any other examples that
> > shows the usefulness of the transient encoder?
>
> I am told that XML files can contain several different
> encodings of text, and it appears that the "transient
> encoder" technique is useful for XML files.

Well, speaking just of switching transcoding schemes, one
practical example is reading a MIME (RFC 2045) document.  But both
examples apply a transcoder to a certain chunk of data, not
character by character.  Layering a transcoding port on the
original source port is another way of handling it.
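
To illustrate the chunk-wise layering idea, here is a minimal sketch in
Python (not Gauche or R6RS code).  It assumes the length and charset of
each chunk are already known, as they would be after parsing a MIME part
header.  The names read_text_chunk and demo_chunks are hypothetical.

```python
import io

def read_text_chunk(source, nbytes, encoding):
    """Read nbytes from a binary source and decode them via a layered text port."""
    chunk = source.read(nbytes)
    # Layer a transcoding (text) port on top of a byte port holding the chunk.
    layered = io.TextIOWrapper(io.BytesIO(chunk), encoding=encoding, newline="")
    return layered.read()

def demo_chunks():
    # Two chunks in different encodings, concatenated in one byte stream.
    part1 = "héllo".encode("latin-1")
    part2 = "こんにちは".encode("euc-jp")
    source = io.BytesIO(part1 + part2)
    first = read_text_chunk(source, len(part1), "latin-1")
    second = read_text_chunk(source, len(part2), "euc-jp")
    return first, second
```

Each chunk gets its own transcoder, so the underlying source port never
has to switch encodings in the middle of a character stream.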

> Agreed. Note, however, that mixing binary with textual
> i/o, and mixing different transcoders for textual i/o,
> was (in my opinion) regarded as a requirement. That
> requirement already implies cutting streams into pieces
> that use different transcoders (or none at all).

Right.  In Gauche, we support mixing binary and textual I/O
natively in the port, while supporting transcoding by port layering.
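
As a rough picture of what mixing binary and textual reads on one port
can look like, here is a small Python sketch.  It is not how Gauche
implements its ports.  The class MixedPort and its methods are
hypothetical, and the per-character decoder shown is only suitable for
a stateless encoding such as UTF-8.

```python
import codecs
import io

class MixedPort:
    """Hypothetical port offering both byte reads and character reads."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw
        self.encoding = encoding

    def read_bytes(self, n):
        # Binary read: hand back raw bytes untouched.
        return self.raw.read(n)

    def read_char(self):
        # Textual read: consume exactly as many bytes as one character needs.
        # A fresh decoder per character is an assumption that only holds
        # for stateless encodings.
        decoder = codecs.getincrementaldecoder(self.encoding)()
        while True:
            b = self.raw.read(1)
            if not b:
                # Returns "" at a clean end of input, raises on a partial character.
                return decoder.decode(b"", final=True)
            ch = decoder.decode(b)
            if ch:
                return ch

def demo_mixed():
    data = b"\x00\x03" + "aé".encode("utf-8") + b"\xff"
    port = MixedPort(io.BytesIO(data))
    header = port.read_bytes(2)
    c1 = port.read_char()
    c2 = port.read_char()
    tail = port.read_bytes(1)
    return header, c1, c2, tail
```

Because no bytes are read ahead, a binary read after a character read
starts exactly where the character ended.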

The main concern about layering is probably efficiency.
Although our transcoding ports perform efficient block I/O
on the underlying source/sink ports, layering does incur the
overhead of extra copying, especially if the source/sink is
something like a string port.  The implementation could provide
a backdoor in such ports to share the buffer, however.
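
The buffer-sharing idea can be sketched in Python, where io.BytesIO
plays the role of an in-memory port and its getbuffer() method serves
as the "backdoor".  This is an analogy only, not Gauche's mechanism.
The function names are hypothetical.

```python
import io

def decode_via_bytes(port, encoding):
    # Ask the port for its contents as a separate bytes object, then decode.
    # Depending on the implementation this may involve an extra copy.
    return port.getvalue().decode(encoding)

def decode_shared(port, encoding):
    # getbuffer() exposes the port's internal buffer as a memoryview
    # without copying it.  The view is released when the block exits,
    # after which the port can be resized again.
    with port.getbuffer() as view:
        return str(view, encoding)
```

Decoding still builds a new string.  What the shared view avoids is the
intermediate copy of the bytes between the port and the transcoder.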

--shiro
Received on Tue Oct 31 2006 - 02:10:03 UTC
