[r6rs-discuss] string->utfX and utfX-string questions

From: John Cowan <cowan>
Date: Fri, 25 May 2007 20:16:21 -0400

Brian C. Barnes scripsit:

> 1. String->utfX
>
> a. Should the resulting byte-vector contain byte order marks for utf16
> and utf32?

Definitely not.

> 2. utfX->string
>
> a. What is the expected result if a user specifies (for example)
> little endian, but the bytevector itself contains byte marks for big endian?

The string will begin with the non-character #\xFFFE (the BOM with its
bytes swapped), and the rest of it will be garbled. (Don't do that, in
other words.)
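
To make the garbling concrete, here is a minimal sketch in Python rather
than Scheme, using Python's own "utf-16-be" and "utf-16-le" codecs as
stand-ins for the R6RS procedures under discussion. The sample text "hi"
is only an illustration.

```python
# Big-endian UTF-16 bytes with an explicit big-endian BOM (FE FF) in front.
text = "hi"
data = b"\xfe\xff" + text.encode("utf-16-be")

# Decode as little-endian, ignoring what the BOM says.
# Python's "utf-16-le" codec does not inspect or strip a BOM.
wrong = data.decode("utf-16-le")

# Every 16-bit unit is read with its two bytes swapped.
code_points = [hex(ord(c)) for c in wrong]
print(code_points)  # ['0xfffe', '0x6800', '0x6900']
```

The first code point is the byte-swapped BOM, and 'h' (0x0068) and 'i'
(0x0069) come out as 0x6800 and 0x6900.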

IMHO it would be better if the utf{16,32}->string functions were able
to take an additional argument specifying whether the endianness is
mandatory (BOM is treated as a character) or optional (if BOM is present,
believe it, otherwise use the endianness as a default).
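
A rough sketch of that suggestion, again in Python and not in Scheme. The
function name utf16_to_string, the "big"/"little" strings, and the
mandatory flag are all hypothetical names chosen for illustration; this is
not the signature of any R6RS procedure.

```python
def utf16_to_string(data: bytes, endianness: str = "big",
                    mandatory: bool = False) -> str:
    """Decode UTF-16 bytes (hypothetical helper, for illustration only).

    mandatory=True:  always use `endianness`; a BOM is just a character.
    mandatory=False: if a BOM is present, believe it and strip it;
                     otherwise fall back to `endianness` as the default.
    """
    codecs_by_order = {"big": "utf-16-be", "little": "utf-16-le"}
    if endianness not in codecs_by_order:
        raise ValueError("endianness must be 'big' or 'little'")

    if not mandatory:
        if data[:2] == b"\xfe\xff":   # big-endian BOM
            return data[2:].decode("utf-16-be")
        if data[:2] == b"\xff\xfe":   # little-endian BOM
            return data[2:].decode("utf-16-le")

    return data.decode(codecs_by_order[endianness])


be_with_bom = b"\xfe\xff" + "hi".encode("utf-16-be")
optional_result = utf16_to_string(be_with_bom, "little", mandatory=False)
mandatory_result = utf16_to_string(be_with_bom, "little", mandatory=True)
```

With the optional setting the big-endian BOM overrides the caller's
"little" and the result is "hi"; with the mandatory setting the same bytes
decode to the garbled string described earlier.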

-- 
In politics, obedience and support      John Cowan <cowan at ccil.org>
are the same thing.  --Hannah Arendt    http://www.ccil.org/~cowan
Received on Fri May 25 2007 - 20:16:21 UTC