[r6rs-discuss] Stateful codecs and inefficient transcoding
I am posting this as an individual member of the Scheme
community. I am not speaking for the R6RS editors.
Per Bothner quoting me:
> > In other words, you can expect get-char to deliver
> > about half the performance of C's getc when reading
> > characters one at a time.
>
> In many cases, sure. But consider a call to get-char. It cannot
> "transcode-ahead", since the next call could be a get-u8.
The claim you quoted was qualified: "With the default
transcoder". I measured the performance without any
"transcode-ahead", so the performance I reported is
what you should expect even if the next call could be
to get-u8.
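
As an illustration only, here is a minimal C sketch of character-at-a-time UTF-8 decoding with no "transcode-ahead". It is not the code that was measured and is not taken from any Scheme implementation. The name get_char_utf8 is hypothetical, and the sketch assumes UTF-8 input with no end-of-line translation and does not reject overlong or surrogate encodings. The point it shows is that the lead byte alone says how many bytes belong to the character, so the decoder never consumes a byte that a later byte-level read would need.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the R6RS draft or any implementation:
   decode one UTF-8 character from fp, consuming exactly the bytes that
   encode it. Returns the code point, or -1 at end of file or on
   malformed input. */
long get_char_utf8(FILE *fp)
{
    int c = getc(fp);
    if (c == EOF) return -1;
    if (c < 0x80) return c;               /* one-byte (ASCII) character */

    int extra;                            /* continuation bytes expected */
    long cp;                              /* code point being assembled  */
    if ((c & 0xE0) == 0xC0)      { extra = 1; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
    else return -1;                       /* invalid lead byte */

    while (extra-- > 0) {
        c = getc(fp);
        if (c == EOF || (c & 0xC0) != 0x80) return -1;
        cp = (cp << 6) | (c & 0x3F);      /* append 6 payload bits */
    }
    return cp;
}
```

Because nothing is read beyond the last byte of the character, a plain getc(fp) issued right after get_char_utf8(fp) returns the very next byte of the stream, which is the get-char followed by get-u8 situation.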
> More complex table-driven decoding would be ridiculous
> to do in Scheme. Not a priori, but because it makes much more
> sense to use existing libraries, such as iconv.
Agreed.
> So you really have to explain how you would implement
> character decoding using iconv while still being only
> twice as slow as C, and allowing a get-char to be followed
> by a get-u8.
In my previous message, I explained:
  * how character-at-a-time decoding with the default
    transcoder would be only twice as slow as C's getc,
    while allowing any get-char to be followed by a
    get-u8
  * how character decoding could call iconv, with about
    the same performance as in C, in programs that do
    not need to follow a get-char with a get-u8
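
For the second case, a rough sketch of block decoding through iconv follows. It is an assumption about what such a decoder might look like, not the scheme described in the earlier message. The name decode_block is hypothetical, the encoding names "UTF-8" and "UTF-32LE" are assumed to be accepted by the local iconv, and reading the output as uint32_t assumes a little-endian host.

```c
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a whole buffer of UTF-8 bytes into UTF-32
   code points with a single call into iconv. Returns the number of code
   points written, or (size_t)-1 on any error (including input that ends
   in the middle of a character). */
size_t decode_block(const char *in, size_t in_len,
                    uint32_t *out, size_t out_cap)
{
    iconv_t cd = iconv_open("UTF-32LE", "UTF-8");   /* (to, from) */
    if (cd == (iconv_t)-1) return (size_t)-1;

    char *src = (char *)in;               /* iconv takes non-const input */
    char *dst = (char *)out;
    size_t src_left = in_len;
    size_t dst_left = out_cap * sizeof(uint32_t);

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1) return (size_t)-1;

    /* dst_left counts unused output bytes; convert to code points used. */
    return out_cap - dst_left / sizeof(uint32_t);
}
```

Here the converter is handed many bytes at once, so by the time the first character is delivered the bytes of later characters have already been consumed. That is why this style suits programs that never follow a get-char with a get-u8.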
> Recommendation: allow get-char/read/... after get-u8/get-bytes-n/...
> but do not require an implementation to support the converse
> (reading bytes after reading characters).
I do not understand how this restriction would improve
performance beyond what we can already expect with the
draft R6RS semantics.
> Or only require support
> for reading bytes after reading characters for a few simple
> standard encodings - primarily UTF8.
That is exactly what the draft R6RS does. The complete
list of encodings for which the draft R6RS would require
support for reading bytes after characters is:
Latin-1
UTF-8
UTF-16LE
UTF-16BE
UTF-32LE
UTF-32BE
Will
Received on Mon Oct 30 2006 - 16:24:08 UTC