[R6RS] BOM-based codecs

William D Clinger will at ccs.neu.edu
Wed Aug 16 14:12:47 EDT 2006


Mike wrote:
> > The utf-bom-codec you are proposing would not do away with
> > the need for the utf-16-codec and utf-32-codec that I have
> > been advocating.  Those codecs differ from the utf-bom-codec
> > you are proposing in that they:
> >
> >  * are Unicode standards
> >  * default in the standard way when there is no byte order mark
> >  * never default to an unexpected width
> 
> OK, I'll make that change.  I'm not sure the FAQ qualifies as the
> definitive standard, though.

Definitely not.  See below, however.

> I was working off the table on page 325
> of this chapter of the standard document:
> 
> http://www.unicode.org/unicode/uni2book/ch13.pdf

Thank you for that reference.  It's for an old version,
though; in the most recent version that is available
online, version 4.0, that chapter is chapter 15:
http://www.unicode.org/versions/Unicode4.0.0/ch15.pdf

The UTF-16 and UTF-32 encoding schemes (not to be
confused with the UTF-16 and UTF-32 encoding forms,
which are described in the same chapter) are described
in chapter 3.  I'm having trouble downloading version
4.0 of that chapter, but version 3.0 describes the
official semantics of byte order marks in section
3.10, starting on page 78, and summarizes in table
3-7 on page 80.  The FAQ is consistent with that part
of the official standard.
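
For illustration only, here is a minimal Python sketch of the
defaulting behavior discussed above, as I understand it: a codec for
the UTF-16 or UTF-32 encoding scheme honors a byte order mark when
one is present, falls back to big-endian when there is none, and
never switches to a different width.  The name decode_scheme and its
width parameter are hypothetical; this is not an R6RS API and not
the proposed utf-bom-codec.

```python
import codecs

# BOM byte sequences for each width, paired with the Python codec
# that decodes the bytes following the BOM.
_BOMS = {
    16: [(codecs.BOM_UTF16_BE, "utf-16-be"),   # FE FF
         (codecs.BOM_UTF16_LE, "utf-16-le")],  # FF FE
    32: [(codecs.BOM_UTF32_BE, "utf-32-be"),   # 00 00 FE FF
         (codecs.BOM_UTF32_LE, "utf-32-le")],  # FF FE 00 00
}

def decode_scheme(data, width):
    """Decode bytes in the UTF-16 or UTF-32 encoding scheme.

    width is 16 or 32 and is fixed by the caller; it is never
    inferred from the data (hypothetical helper, for illustration).
    """
    if width not in _BOMS:
        raise ValueError("width must be 16 or 32")
    for bom, codec in _BOMS[width]:
        if data.startswith(bom):
            # A BOM is present: strip it and use the order it signals.
            return data[len(bom):].decode(codec)
    # No BOM: assume big-endian, absent a higher-level protocol.
    return data.decode("utf-%d-be" % width)
```

For example, decode_scheme(b"\x00A", 16) treats the two bytes as
big-endian because no BOM is present, whereas a BOM-sniffing codec
that also guesses the width would have to decide between 16 and 32
bits from the same input.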

Will


