[r6rs-discuss] Stateful codecs and inefficient transcoding

From: Per Bothner <per>
Date: Mon Oct 30 15:30:48 2006

William D Clinger wrote:
> In other words, you can expect get-char to deliver
> about half the performance of C's getc when reading
> characters one at a time.

In many cases, sure. But consider a call to get-char. It cannot
"transcode-ahead", since the next call could be a get-u8. Well,
it probably could, as long as it can back up if need be, but
that could make for a complicated implementation. (One issue is
ignoring errors when transcoding-ahead.)

Converting "simple" decodings can be done directly in the
Scheme implementation. By "simple" I mean UTF-8, Latin-1,
UTF-16BE, and UTF-16LE.
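
To illustrate why the "simple" case is easy, here is a rough sketch
(in Python rather than Scheme, purely for illustration) of a port
whose get_char decodes exactly one UTF-8 character and never reads
past it, so a following get_u8 sees the very next byte. The class
and method names are hypothetical and are not taken from any R6RS
draft or implementation.

```python
import io


class MixedPort:
    """Hypothetical port mixing byte and character reads (UTF-8 only)."""

    def __init__(self, data: bytes):
        self._stream = io.BytesIO(data)

    def get_u8(self):
        """Return the next byte as an int, or None at end of input."""
        b = self._stream.read(1)
        return b[0] if b else None

    def get_char(self):
        """Decode exactly one UTF-8 character without reading ahead."""
        first = self._stream.read(1)
        if not first:
            return None
        lead = first[0]
        # The lead byte alone says how many continuation bytes follow.
        if lead < 0x80:
            extra = 0
        elif 0xC0 <= lead < 0xE0:
            extra = 1
        elif 0xE0 <= lead < 0xF0:
            extra = 2
        elif 0xF0 <= lead < 0xF8:
            extra = 3
        else:
            raise ValueError("invalid UTF-8 lead byte: 0x%02x" % lead)
        rest = self._stream.read(extra)
        # bytes.decode validates the sequence and raises
        # UnicodeDecodeError on malformed or truncated input.
        return (first + rest).decode("utf-8")


port = MixedPort("\u00e9".encode("utf-8") + b"\xff" + b"a")
char_1 = port.get_char()   # two bytes consumed, nothing more
byte_1 = port.get_u8()     # the raw 0xFF byte right after it
char_2 = port.get_char()
```

Because the length of each character is known from its first byte,
no backing up is ever needed; the same is trivially true of Latin-1
and nearly so of the two fixed-endian UTF-16 forms.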

More complex table-driven decoding would be ridiculous
to do in Scheme. Not a priori, but because it makes much more
sense to use existing libraries, such as iconv.

So you really have to explain how you would implement
character decoding using iconv while still being only
twice as slow as C, and allowing a get-char to be followed
by a get-u8.
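
The cost can be sketched as follows, with Python's incremental
decoders from the standard codecs module standing in for iconv (an
assumption made for illustration; this is not the iconv API). The
only obvious way to keep the byte position exact is to hand the
library one byte at a time until a character comes out, which means
one library call per byte. The class name and the decoder_calls
counter are hypothetical.

```python
import codecs
import io


class LibraryDecodedPort:
    """Hypothetical port that delegates decoding to a library decoder."""

    def __init__(self, data: bytes, encoding: str):
        self._stream = io.BytesIO(data)
        self._decoder = codecs.getincrementaldecoder(encoding)()
        self.decoder_calls = 0  # counts calls into the library decoder

    def get_u8(self):
        """Return the next byte as an int, or None at end of input."""
        b = self._stream.read(1)
        return b[0] if b else None

    def get_char(self):
        """Feed single bytes to the decoder until one character appears."""
        while True:
            b = self._stream.read(1)
            if not b:
                # End of input: flushing raises if a character was
                # left incomplete, otherwise there is nothing to return.
                self._decoder.decode(b"", final=True)
                return None
            self.decoder_calls += 1
            text = self._decoder.decode(b)
            if text:
                return text


port = LibraryDecodedPort("\u00e9".encode("utf-8") + b"\x07z", "utf-8")
first_char = port.get_char()    # two decoder calls for one character
raw_byte = port.get_u8()        # byte position is still exact
second_char = port.get_char()   # one more decoder call
```

Handing the decoder a large buffer instead would amortize the call
overhead, but then the port no longer knows which input byte
follows the character it just returned.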

Recommendation: allow get-char/read/... after get-u8/get-bytes-n/...
but do not require an implementation to support the converse
(reading bytes after reading characters). Or only require support
for reading bytes after reading characters for a few simple
standard encodings - primarily UTF-8.
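
One possible reading of this recommendation, sketched in Python
with the standard codecs module again standing in for a decoding
library: characters may always follow bytes, but bytes may follow
characters only for encodings on a short whitelist. Those are
decoded a byte at a time so the position stays exact; everything
else is decoded in large chunks. The whitelist contents, the chunk
size, and all names here are assumptions for illustration only.

```python
import codecs
import io

# Encodings for which bytes may be read after characters (assumed set).
EXACT_POSITION_ENCODINGS = {"utf-8"}


class RestrictedPort:
    """Hypothetical port enforcing the bytes-after-characters rule."""

    def __init__(self, data: bytes, encoding: str):
        self._stream = io.BytesIO(data)
        name = codecs.lookup(encoding).name  # normalized codec name
        self._exact = name in EXACT_POSITION_ENCODINGS
        self._decoder = codecs.getincrementaldecoder(name)()
        self._pending = ""       # decoded but not yet returned
        self._chars_read = False

    def get_u8(self):
        if self._chars_read and not self._exact:
            raise OSError("bytes after characters are not supported "
                          "for this encoding")
        b = self._stream.read(1)
        return b[0] if b else None

    def get_char(self):
        self._chars_read = True
        # Whitelisted encodings: one byte per step, so nothing is
        # decoded ahead. Others: decode ahead in large chunks.
        step = 1 if self._exact else 4096
        while not self._pending:
            chunk = self._stream.read(step)
            self._pending = self._decoder.decode(chunk, final=not chunk)
            if not chunk:
                break
        if not self._pending:
            return None
        ch, self._pending = self._pending[0], self._pending[1:]
        return ch


port = RestrictedPort(b"\x01" + "hi".encode("utf-16-le"), "utf-16-le")
header = port.get_u8()     # bytes first: always allowed
letter = port.get_char()   # then characters: always allowed
try:
    port.get_u8()          # bytes after characters: refused here
    refused = False
except OSError:
    refused = True
```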

-- 
	--Per Bothner
per_at_bothner.com   http://per.bothner.com/

Received on Mon Oct 30 2006 - 15:30:41 UTC
