[r6rs-discuss] Strings

From: Per Bothner <per>
Date: Sun Mar 25 19:18:52 2007

Thomas Lord wrote:
> In the assumptions you asked me to consider, performance
> only matters for appending to a string and we can assume
> that in most cases space has already been reserved for this.
> Now you are revoking the assumptions you asked me to
> reason under. We are going in circles.

Sorry.

I think for a large fraction of applications of mutable
strings you don't really need anything more than appending to
the (right) end of the buffer, which we agree is simple,
efficient, and doesn't require marker objects.

I also agree there is a case to be made for only having
immutable strings in Scheme. In most cases a compiler can
probably translate append to internal-append! fairly well.

I also think there are relatively few applications of
string-length and string-ref, and in those cases where
performance (i.e. O(1) time) is important, they can be
replaced by string-codeunit-ref and string-codeunit-length
and similar functions.
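
To illustrate the difference in cost, here is a rough sketch in
Python, assuming a string stored as a plain UTF-8 byte array. The
function names are hypothetical stand-ins for the Scheme procedures
mentioned above, not a proposed API. Code-unit access is a direct
array index, while code-point access has to scan from the start.

```python
def is_continuation(byte: int) -> bool:
    """True for UTF-8 continuation bytes (bit pattern 10xxxxxx)."""
    return (byte & 0xC0) == 0x80


def codeunit_length(buf: bytes) -> int:
    """Number of UTF-8 code units (bytes): O(1)."""
    return len(buf)


def codeunit_ref(buf: bytes, i: int) -> int:
    """The i-th code unit: O(1) array index."""
    return buf[i]


def codepoint_length(buf: bytes) -> int:
    """Number of code points: O(n), counts every non-continuation byte."""
    return sum(1 for b in buf if not is_continuation(b))


def codepoint_ref(buf: bytes, k: int) -> str:
    """The k-th code point: O(n), scans lead bytes from the start."""
    if k < 0:
        raise IndexError(k)
    seen = -1
    n = len(buf)
    for i in range(n):
        if is_continuation(buf[i]):
            continue
        seen += 1
        if seen == k:
            # Extend over the continuation bytes of this code point.
            j = i + 1
            while j < n and is_continuation(buf[j]):
                j += 1
            return buf[i:j].decode("utf-8")
    raise IndexError(k)
```

An algorithm that walks a string sequentially can keep a code-unit
offset and never pay for the scan, which is why the code-unit
procedures are enough when performance matters.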

I agree with John Cowan that there is very little justification
for a "character" type at all. For example replacing a
character in a mutable string is seldom useful. Instead you're
more likely to want to replace a substring by another string.

Finally, *if* you want mutable strings allowing arbitrary
replacements (including insertions and deletions) you might want
to consider a "marker" type.

Bottom line: I can live with a number of different approaches
to strings in Scheme. My recommendations:

* Remove the expectation that string-ref and string-length be
O(1). That expectation precludes common and simple implementation
methods, and the functions aren't useful enough to justify it.

* More generally, write the specification with the assumption
that many/most Scheme implementations will use a simple
UTF-8 array or a UTF-16 array. In the case of mutable
strings, the array may be grown/relocated, and optionally
use a buffer-gap scheme. We should not assume or require
anything more complicated.

* Possibly remove string-set!. At least deprecate it, and note
that it may be inefficient.

* Either make strings immutable, or make them variable-length
and provide string-append!.

* Possibly specify a low-level code-point-level library for
efficiently implementing low-level algorithms.
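
As a rough illustration of the second recommendation, here is a
minimal buffer-gap sketch in Python over a UTF-8 byte array. The
class and method names are hypothetical, and positions are assumed
to be code-unit offsets that fall on code-point boundaries; the
sketch does not check that. Appending at the end only moves the gap
when the previous edit was elsewhere, and the array is reallocated
(doubling) when the gap is too small.

```python
class GapBuffer:
    """UTF-8 code units in one array with a movable gap (illustrative)."""

    def __init__(self, capacity: int = 16) -> None:
        self._buf = bytearray(capacity)
        self._gap_start = 0         # first unused slot
        self._gap_end = capacity    # first used slot after the gap

    def __len__(self) -> int:
        """Length in code units, excluding the gap."""
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos: int) -> None:
        """Shift text across the gap so the gap starts at offset pos."""
        if pos < self._gap_start:
            n = self._gap_start - pos
            self._buf[self._gap_end - n:self._gap_end] = \
                self._buf[pos:self._gap_start]
            self._gap_start -= n
            self._gap_end -= n
        elif pos > self._gap_start:
            n = pos - self._gap_start
            self._buf[self._gap_start:self._gap_start + n] = \
                self._buf[self._gap_end:self._gap_end + n]
            self._gap_start += n
            self._gap_end += n

    def _grow(self, needed: int) -> None:
        """Relocate into a larger array if the gap is below needed."""
        if self._gap_end - self._gap_start >= needed:
            return
        new_cap = max(2 * len(self._buf), len(self) + needed)
        tail = self._buf[self._gap_end:]
        new_buf = bytearray(new_cap)
        new_buf[:self._gap_start] = self._buf[:self._gap_start]
        new_buf[new_cap - len(tail):] = tail
        self._gap_end = new_cap - len(tail)
        self._buf = new_buf

    def insert(self, pos: int, text: str) -> None:
        """Insert text at code-unit offset pos."""
        if not 0 <= pos <= len(self):
            raise IndexError(pos)
        data = text.encode("utf-8")
        self._move_gap(pos)
        self._grow(len(data))
        self._buf[self._gap_start:self._gap_start + len(data)] = data
        self._gap_start += len(data)

    def append(self, text: str) -> None:
        """Append at the right end, the common case."""
        self.insert(len(self), text)

    def delete(self, pos: int, count: int) -> None:
        """Delete count code units at pos by widening the gap."""
        if pos < 0 or count < 0 or pos + count > len(self):
            raise IndexError(pos)
        self._move_gap(pos)
        self._gap_end += count

    def to_str(self) -> str:
        return (self._buf[:self._gap_start]
                + self._buf[self._gap_end:]).decode("utf-8")
```

Replacing a substring by another string is then a delete followed
by an insert at the same offset, with no per-character operation
involved.
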
-- 
	--Per Bothner
per_at_bothner.com   http://per.bothner.com/
Received on Sun Mar 25 2007 - 18:03:30 UTC