[r6rs-discuss] [Formal] 5.91: LF should not be the only line separator from Anton van Straaten on 2006-10-03 (r6rs-discuss.mbox)

From: Anton van Straaten <anton>
Date: Tue Oct 3 05:21:49 2006

The following is a copy of a new formal comment from Reinder Verlinde.
(The original was submitted correctly, but incorrectly processed by me.)

---
This message is a formal comment which was submitted to 
formal-comment_at_r6rs.org, following the requirements described at: 
http://www.r6rs.org/process.html
---
Submitter's name: Reinder Verlinde
submitter's email address: anmwzj_at_hetnet.nl
type of issue: Defect
priority: Major
R6RS component: Unicode/Lexical Syntax
version of the report: 5.91
one-sentence summary: LF should not be the only line separator
Section 3.2.1 contains the following production:
     <intra-line whitespace> --> <any <whitespace> that is not  <linefeed>
First, a minor textual issue: the < and > delimiters do not balance (a > is
missing at the end of the line).
The major issue here is the choice to make "line feed" the only inter-line
separator.
<http://en.wikipedia.org/wiki/Newline#Unicode>, although far from 
normative, states:
    "The Unicode standard addresses the problem by defining a large number of
     characters that conforming applications should recognize as line
     terminators:
      LF:    Line Feed, u000A
      CR:    Carriage Return, u000D
      CR+LF: CR followed by LF, u000D followed by u000A
      NEL:   Next Line, u0085
      FF:    Form Feed, u000C
      LS:    Line Separator, u2028
      PS:    Paragraph Separator, u2029"
but in my reading, it is consistent with
<http://www.unicode.org/reports/tr14>. If a goal of the spec is to make
Scheme Unicode-compliant, it must follow the mandatory aspects of that
report.
If this change is not made, be aware that source files will consist of a
single line on some (minority) platforms (examples: Mac OS 9 and earlier
and, if I read things correctly, EBCDIC-based systems).
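To illustrate the single-line problem, here is a rough Python sketch (not part of the report or of any implementation). The function name strip_line_comments and the sample source are hypothetical, and the scanner is deliberately simplified: it ignores strings and character literals. It shows that when only LF ends a line, a ";" comment in a CR-terminated file runs to the end of the file.

```python
def strip_line_comments(source, terminators=("\n",)):
    """Drop ';' comments, which run up to the next line terminator.

    Simplified on purpose: strings and character literals are not handled.
    """
    out = []
    in_comment = False
    for ch in source:
        if in_comment:
            if ch in terminators:
                # The terminator ends the comment and is kept in the output.
                in_comment = False
                out.append(ch)
        elif ch == ";":
            in_comment = True
        else:
            out.append(ch)
    return "".join(out)


# A hypothetical source file using CR-only line endings (classic Mac OS style).
mac_source = "(define x 1) ; first\r(define y 2)\r"

lf_only = strip_line_comments(mac_source)
cr_aware = strip_line_comments(mac_source, ("\n", "\r"))
```

With LF as the only terminator, the second definition is swallowed by the comment; once CR is also recognized, it survives.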
This may mean additional changes to the grammar. I haven't completely 
thought it over, but I think the cleanest approach would be to define 
the lexical syntax as consisting of two phases, the first of which 
normalizes line endings (say, to LF).
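
The first phase suggested above might look roughly like the following Python sketch. The name normalize_line_endings is made up for illustration, and the set of terminators is an assumption: it covers CR+LF, CR, NEL, LS and PS from the quoted list, and leaves FF alone, since whether FF should end a line is a separate decision.

```python
import re

# CR+LF is listed first so that the pair becomes a single LF,
# rather than the CR and the LF each being treated separately.
_LINE_ENDING = re.compile("\r\n|[\r\u0085\u2028\u2029]")


def normalize_line_endings(text):
    """Phase 1: rewrite every recognized line terminator as a single LF."""
    return _LINE_ENDING.sub("\n", text)


def split_lines(text):
    """Phase 2 stand-in: later stages only ever need to look for LF."""
    return normalize_line_endings(text).split("\n")
```

With this split, the rest of the grammar could keep treating LF as the only line separator, because no other terminator would reach it.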

Received on Tue Oct 03 2006 - 05:22:21 UTC
