[r6rs-discuss] [Formal] bytevector aliasing severely impedes optimizations

From: Bradley Lucier <lucier>
Date: Mon Mar 5 03:35:52 2007

---
This message is a formal comment which was submitted to formal-comment_at_r6rs.org, following the requirements described at: http://www.r6rs.org/process.html
---
Name:                  Brad Lucier
email:                 lucier_at_math.purdue.edu
Type of issue:         Defect
Priority:              Serious, if you're interested in performance
R6RS component:        Bytevectors
Version of the report: Revised5.92 Report on the Algorithmic Language Scheme - Standard Libraries
Summary:              bytevector aliasing severely impedes optimizations
Description:
I presume that many people might want to use bytevectors as described  
in this report to increase computational speed and decrease memory  
requirements by avoiding boxing/unboxing of objects that might  
otherwise be boxed when held in generic vectors.  At least, I don't  
see any other way to get this type of performance or reduce memory in  
this way.
By having only one type of bytevector that aliases all of 32-bit  
integers, 64-bit integers, 32-bit IEEE 754 numbers, and 64-bit IEEE  
754 numbers, optimization opportunities for compilers are severely  
degraded.  One does not know, for example, whether storing a 64-bit  
IEEE double into bytevector A changes the value of a 32-bit integer  
read from bytevector B without actually checking whether A and B are  
the same objects and whether the range of indices used to access A  
and B overlap.
This very problem has been recognized by recent C standards, which  
forbid such types of aliasing except by going through (char *).  (The  
proposed R6RS bytevectors would pose a problem for more than  
Scheme->C compilers, however---it is a library design problem.)  It  
could be said by analogy that the proposed R6RS libraries offer  
*only* a (char*)  (more tamed than in C), which solves one small  
class of problems (how to allow semi-portable, low-level translation  
between data types that can be considered sequences of bytes; how to  
write I/O device drivers; ...)  while completely ignoring a much  
larger and more important class of problems (allowing fast and memory- 
efficient access to large arrays of homogeneous numerical data).
Received on Sat Mar 03 2007 - 12:45:46 UTC
