[r6rs-discuss] Why lexers can be simpler when restricted to ASCII
In formal comment 231, I stated:
"Many current Schemes have lexers written for ASCII (or Latin-1)
character sets. Conversion of these lexers to the new standard would be
easier if the report allowed inline hex escapes to appear anywhere in
Scheme code."
The editors replied:
"It is unclear why converting the lexers would be significantly simpler
through this change"
Let me explain my original opinion. Many Schemes currently have lexers
written in C using "char". These need converting to "long" to handle
Unicode. Furthermore, table-driven approaches are practical for ASCII
(128 values), but not practical for Unicode (just over 2^20 code points).
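
To illustrate the table-size point, here is a minimal sketch in C of the kind of per-character lookup table an ASCII lexer can afford. The names (char_class, ascii_class, init_ascii_class, classify_ascii, unicode_table_bytes) and the choice of character classes are hypothetical, not taken from any particular Scheme implementation, and the sketch assumes an ASCII execution character set.

```c
#include <stddef.h>

/* Illustrative character classes; a real Scheme lexer would have more. */
enum char_class { CC_OTHER, CC_SPACE, CC_DIGIT, CC_LETTER, CC_PAREN };

/* One entry per ASCII value: 128 bytes in total. */
static unsigned char ascii_class[128];

/* Fill the table once at startup. Assumes ASCII, where the digit and
   letter ranges are contiguous. */
static void init_ascii_class(void)
{
    int c;
    for (c = 0; c < 128; c++)     ascii_class[c] = CC_OTHER;
    for (c = '0'; c <= '9'; c++)  ascii_class[c] = CC_DIGIT;
    for (c = 'a'; c <= 'z'; c++)  ascii_class[c] = CC_LETTER;
    for (c = 'A'; c <= 'Z'; c++)  ascii_class[c] = CC_LETTER;
    ascii_class[' ']  = CC_SPACE;
    ascii_class['\t'] = CC_SPACE;
    ascii_class['\n'] = CC_SPACE;
    ascii_class['(']  = CC_PAREN;
    ascii_class[')']  = CC_PAREN;
}

/* Classify one input byte with a single array lookup.
   Bytes outside ASCII fall through to CC_OTHER. */
static enum char_class classify_ascii(unsigned char c)
{
    return c < 128 ? (enum char_class)ascii_class[c] : CC_OTHER;
}

/* Size in bytes of the same flat table with one entry per Unicode
   code point (U+0000 through U+10FFFF, i.e. 0x110000 entries). */
static size_t unicode_table_bytes(void)
{
    return (size_t)0x110000 * sizeof ascii_class[0];
}
```

After calling init_ascii_class(), classify_ascii('(') yields CC_PAREN from one lookup. The same flat layout for Unicode needs 0x110000 entries, 8704 times as many, and a table-driven scanner typically keeps one such row per automaton state, so the total grows with the number of states as well.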
In case that isn't clear enough: My Scheme uses flex for its lexer. I
cannot see how to simply convert it to accept Unicode. I think I will
have to dump flex and implement a new lexer by hand.
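
For comparison, the alternative suggested in the formal comment can be sketched as follows: if every non-ASCII character may be written as an inline hex escape of the form \x<hex digits>; then the scanner itself only ever sees ASCII, and a small helper turns each escape into a scalar value held in a "long". This is a hedged sketch, not code from any existing lexer; the function name decode_hex_escape and its return convention are assumptions made for illustration.

```c
#include <stddef.h>

/* Decode an inline hex escape of the form \x<hex digits>; starting at s.
   On success, store the scalar value in *out and return the number of
   characters consumed. Return 0 if s does not start with a valid escape. */
static size_t decode_hex_escape(const char *s, unsigned long *out)
{
    unsigned long value = 0;
    size_t i = 2;

    if (s[0] != '\\' || s[1] != 'x')
        return 0;

    for (; s[i] != ';'; i++) {
        char c = s[i];
        int digit;

        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else
            return 0;               /* not a hex digit (or end of string) */

        value = value * 16 + (unsigned long)digit;
        if (value > 0x10FFFFUL)
            return 0;               /* beyond the Unicode code space */
    }

    if (i == 2)
        return 0;                   /* "\x;" has no digits */
    if (value >= 0xD800UL && value <= 0xDFFFUL)
        return 0;                   /* surrogates are not scalar values */

    *out = value;
    return i + 1;                   /* include the terminating ';' */
}
```

Given the input \x3bb; the helper consumes six characters and yields 0x3BB (Greek small letter lambda). The surrounding scanner, whether written by hand or generated by flex, can then stay byte-oriented and pass the decoded value along with the token.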
Regards,
Alan
Received on Mon Apr 23 2007 - 13:06:05 UTC