Brian C. Barnes scripsit:
> 1. String->utfX
>
> a. Should the resulting byte-vector contain byte order marks for utf16
> and utf32?
Definitely not.
> 2. utfX->string
>
> a. What is the expected result if a user specifies (for example)
> little endian, but the bytevector itself contains a byte order mark for big endian?
The string will begin with the noncharacter #\xFFFE,
and the rest of it will be garbled. (Don't do that, in other words.)
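
To make the failure concrete, here is a small illustration in Python rather
than Scheme. It assumes Python's fixed-endianness codecs ("utf-16-le" and
"utf-16-be"), which do not interpret a BOM, as a stand-in for a utfX->string
procedure that is told the endianness outright. The sample bytes are made up
for the example.

```python
# Big-endian UTF-16 for "A", preceded by a big-endian BOM (bytes FE FF).
data = bytes([0xFE, 0xFF, 0x00, 0x41])

# Decode with fixed-endianness codecs; neither one interprets the BOM.
wrong = data.decode("utf-16-le")  # caller claimed little endian
right = data.decode("utf-16-be")  # caller claimed big endian

print([hex(ord(c)) for c in wrong])  # ['0xfffe', '0x4100']
print([hex(ord(c)) for c in right])  # ['0xfeff', '0x41']
```

Read little endian, the BOM bytes come out as U+FFFE and "A" becomes U+4100;
read big endian, the BOM survives as the ordinary character U+FEFF.
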
IMHO it would be better if the utf{16,32}->string functions were able
to take an additional argument specifying whether the endianness is
mandatory (BOM is treated as a character) or optional (if BOM is present,
believe it, otherwise use the endianness as a default).
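
A rough sketch of what such an argument might look like, again in Python for
illustration only. The function name utf16_to_string and its "mandatory"
parameter are hypothetical, invented for this example; they are not part of
any existing library or of the draft under discussion.

```python
import codecs

def utf16_to_string(data: bytes, endianness: str, mandatory: bool = False) -> str:
    """Hypothetical helper. endianness is "big" or "little".

    mandatory=True:  trust the caller; a leading BOM is decoded as a character.
    mandatory=False: if a BOM is present, believe it and strip it;
                     otherwise fall back to the given endianness.
    """
    if endianness not in ("big", "little"):
        raise ValueError("endianness must be 'big' or 'little'")
    if not mandatory:
        if data.startswith(codecs.BOM_UTF16_BE):
            endianness, data = "big", data[2:]
        elif data.startswith(codecs.BOM_UTF16_LE):
            endianness, data = "little", data[2:]
    codec = "utf-16-be" if endianness == "big" else "utf-16-le"
    return data.decode(codec)

be_with_bom = b"\xfe\xff\x00A"
print(utf16_to_string(be_with_bom, "little"))                  # BOM wins: A
print(repr(utf16_to_string(be_with_bom, "little", mandatory=True)))
```

With the default (optional) setting the big-endian BOM overrides the caller's
"little" and the result is "A"; with mandatory=True the same bytes decode to
the garbled string described above.
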
--
John Cowan <cowan at ccil.org>    http://www.ccil.org/~cowan
In politics, obedience and support are the same thing.  --Hannah Arendt
Received on Fri May 25 2007 - 20:16:21 UTC