---
This message is a formal comment which was submitted to formal-comment_at_r6rs.org, following the requirements described at: http://www.r6rs.org/process.html
---

Submitter's name: Reinder Verlinde
submitter's email address: anmwzj_at_hetnet.nl
type of issue: Defect
priority: Major
R6RS component: Unicode/Lexical Syntax
version of the report: 5.91
one-sentence summary: LF should not be the only line separator

Section 3.2.1 contains the following production:

    <intra-line whitespace> --> <any <whitespace> that is not <linefeed>

First, a minor textual issue: the < and > delimiters do not balance (a > is missing at the end of the line).

The major issue here is the choice to make "line feed" the only inter-line separator. <http://en.wikipedia.org/wiki/Newline#Unicode>, although far from normative, states:

"The Unicode standard addresses the problem by defining a large number of characters that conforming applications should recognize as line terminators:

LF: Line Feed, u000A
CR: Carriage Return, u000D
CR+LF: CR followed by LF, u000D followed by u000A
NEL: Next Line, u0085
FF: Form Feed, u000C
LS: Line Separator, u2028
PS: Paragraph Separator, u2029"

In my reading, this is consistent with <http://www.unicode.org/reports/tr14>. If a goal of the spec is to make Scheme Unicode compliant, it must follow the mandatory aspects of that page.

If this change is not made, be aware that source files will consist of a single line on some (minority) platforms (examples: Mac OS 9 and earlier and, if I read things correctly, EBCDIC-based systems).

Recognizing more line separators may mean additional changes to the grammar. I haven't completely thought it over, but I think the cleanest approach would be to define the lexical syntax as consisting of two phases, the first of which normalizes line endings (say to LFs).

Received on Tue Oct 03 2006 - 05:22:21 UTC
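
The two-phase idea suggested in the comment above (normalize line endings first, then lex as if LF were the only line separator) could be sketched roughly as follows. This is only an illustration in Python, not part of the report or of any Scheme implementation: the function names are hypothetical, and the set of terminators is simply the list quoted above, including FF, whose treatment as a line terminator is an assumption taken from that list.

```python
import re

# Line terminators taken from the list quoted in the comment (an assumption,
# not the report's grammar). CR+LF comes first in the alternation so that the
# pair is replaced as one unit rather than as two separate terminators.
_LINE_TERMINATOR = re.compile("\r\n|[\r\u0085\u000c\u2028\u2029]")


def normalize_line_endings(source: str) -> str:
    """Hypothetical phase 1: map every recognized line terminator to LF."""
    return _LINE_TERMINATOR.sub("\n", source)


def count_lines(source: str) -> int:
    """Stand-in for phase 2: a consumer that only knows about LF."""
    return len(normalize_line_endings(source).split("\n"))
```

With this sketch, a source text that uses only CR as its line terminator (as on Mac OS 9 and earlier) is no longer seen as a single line by the LF-only second phase.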