[r6rs-discuss] Stateful codecs and inefficient transcoding

From: Marcin 'Qrczak' Kowalczyk <qrczak>
Date: Tue Nov 7 19:21:19 2006

Shiro Kawai <shiro_at_lava.net> writes:

> Are the default (current-input-port) and (current-output-port)
> text ports or binary ports? If they are text ports, then
> we need to have a way to specify their transcoders before the
> Scheme process begins. If they are binary ports, probably
> large number of scripts need to take extra steps to set up
> text I/O:

In my language they are text ports, but their encoding is established
the first time they are accessed (accessed in the sense of applying
(current-input-port), not in the sense of I/O). The encoding is taken
from a dynamic parameter containing the default encoding.

This allows the default encoding to be changed before it takes effect.
Program invocation arguments are also decoded lazily.
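
A minimal Python sketch of this lazy scheme, to make the timing concrete.
It is not the actual implementation: the names default_encoding and
LazyTextPort are hypothetical, and a module-level variable stands in for
the dynamic parameter.

```python
import io

# Hypothetical stand-in for the dynamic parameter holding the default encoding.
default_encoding = "utf-8"

class LazyTextPort:
    """Text port whose encoding is fixed on first access, not at creation."""

    def __init__(self, binary):
        self._binary = binary   # underlying binary port, always available
        self._text = None       # text layer, created on demand

    def get(self):
        # The first access freezes the encoding; later changes to
        # default_encoding no longer affect this port.
        if self._text is None:
            self._text = io.TextIOWrapper(self._binary, encoding=default_encoding)
        return self._text

# Sample bytes in ISO-8859-2 (a Polish word with non-ASCII letters).
raw = io.BytesIO("\u017c\u00f3\u0142w\n".encode("iso-8859-2"))
port = LazyTextPort(raw)

default_encoding = "iso-8859-2"   # changed before first access: takes effect
line = port.get().readline()
default_encoding = "utf-8"        # changed after first access: ignored
frozen = port.get().encoding
```

Because the binary port is kept separately, a program could also ignore
port.get() entirely and wrap raw in a text layer of its own.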

The underlying binary ports are also available (but fixed, not
dynamically settable), so another possibility is to explicitly make
text ports on top of them, forgetting about the original current
text ports.

> * The behavior of flushing transcoded output port should be
> clarified. If the transcoding is stateful, should flush 'reset'
> the state, or merely empty the buffer but keeping the state?
> Resetting the state may produce some extra output, such as
> escape sequence.

My design has a flush mode (gzip and bzip2 compression distinguish
all of these):

* none - no flushing
* sync - ensure that a reader will be able to read everything
  written so far; this is the default when flushing is invoked
  explicitly
* reset - reset the state to the initial state
* end - no more data will be provided

'none' makes it possible to use a mode to specify whether flushing is done at all.
A buffered output port is associated with a flush mode to be used
automatically after each write operation, and a mode to be used after
each line.
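
The four modes can be tried out with Python's zlib module, which exposes
comparable flush constants. The mapping below is an assumption made for
illustration (in particular 'reset' to Z_FULL_FLUSH); FLUSH_MODES and
write are hypothetical names, not part of the design described here.

```python
import zlib

# Assumed mapping from the four modes to zlib's flush constants.
FLUSH_MODES = {
    "none": zlib.Z_NO_FLUSH,     # no flushing
    "sync": zlib.Z_SYNC_FLUSH,   # reader can read everything written so far
    "reset": zlib.Z_FULL_FLUSH,  # also resets the compression state
    "end": zlib.Z_FINISH,        # no more data will be provided
}

comp = zlib.compressobj()
decomp = zlib.decompressobj()
received = bytearray()           # what the reader has decoded so far

def write(data, mode):
    """Compress data, flush with the given mode, and feed the reader."""
    chunk = comp.compress(data) + comp.flush(FLUSH_MODES[mode])
    received.extend(decomp.decompress(chunk))

write(b"hello, ", "none")        # may stay buffered inside the compressor
write(b"world", "sync")
after_sync = bytes(received)     # everything written so far is readable
write(b"!", "reset")
after_reset = bytes(received)
write(b"", "end")                # terminates the stream
stream_ended = decomp.eof
```

A buffered port in this style would hold two such mode names, one applied
after each write and one applied after each line.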

> However, if the underlying transcoder uses libiconv, the only way
> to enforce the resetting is to call iconv_close(),

No; when inbuf or *inbuf is NULL, iconv() resets the state (which may
produce some output). In general this call is necessary before
finishing a conversion; iconv_close doesn't gather any pending
output.
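
The same effect can be observed without libiconv. The sketch below uses
Python's incremental encoder for ISO-2022-JP as a stand-in for an iconv
descriptor; passing final=True plays the role of the iconv() call with a
NULL inbuf. This is an analogy, not the libiconv API.

```python
import codecs

# Stateful encoder: ISO-2022-JP switches character sets with escape sequences.
encoder = codecs.getincrementalencoder("iso2022_jp")()

# Encoding a hiragana letter shifts the encoder into JIS X 0208 mode.
body = encoder.encode("\u3042")

# Resetting emits the escape sequence that returns to ASCII, even though
# no further input is supplied.
tail = encoder.encode("", final=True)

complete = body + tail
```

Dropping the encoder after body alone would leave the output stuck in
JIS X 0208 mode, which is the situation the reset call exists to avoid.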

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak_at_knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
Received on Tue Nov 07 2006 - 19:21:09 UTC
