Less verbose type definition form (was: [R6RS] Records comments)

Fri Jul 15 10:05:48 EDT 2005

Here is my proposal for a "less verbose" type definition form.

Grammar of "basic" type definition form
---------------------------------------

<type-definition> ::=  ( define-type <type-name> <type-attrib-or-field-def>* )

<type-name> ::=  <identifier>

<type-attrib-or-field-def> ::=  <type-attribute>  |  <field-definition>

<type-attribute>
   ::=  <extends-attribute>
     |  <extensible-attribute>
     |  <id-attribute>
     |  <constructor-attribute>
     |  <predicate-attribute>

<extends-attribute>
   ::=  extends: <type-name>
     |  extends: #f

<extensible-attribute>
   ::=  extensible: #t
     |  extensible: #f

<id-attribute>
   ::=  id: <globally-unique-type-name>
     |  id: #f

<globally-unique-type-name> ::=  <identifier>

<constructor-attribute>
   ::=  constructor: <constructor-name>

<constructor-name> ::=  <identifier>

<predicate-attribute>
   ::=  predicate: <predicate-name>

<predicate-name> ::=  <identifier>

<field-definition>
   ::=  <field-name>
     |  ( <field-name>                             <field-attribute>* )
     |  ( <field-name> <getter-name>               <field-attribute>* )
     |  ( <field-name> <getter-name> <setter-name> <field-attribute>* )

<field-name> ::=  <identifier> not ending with a colon
<getter-name> ::=  <identifier>
<setter-name> ::=  <identifier>

<field-attribute>
   ::=  mutable: #t
     |  mutable: #f

In the following explanation, T is the name of the type being defined  
(the <type-name> identifier immediately after "define-type"), F1 is  
the <field-name> of this type definition's first <field-definition>,  
F2 is the name of the second, etc.

Defaults for type attributes
----------------------------

At most one <extends-attribute> is allowed in a <type-definition>.   
When no <extends-attribute> appears in the <type-definition>, the  
following default is used

   extends: #f

At most one <extensible-attribute> is allowed in a
<type-definition>.  When no <extensible-attribute> appears in the
<type-definition>, the following default is used

   extensible: #f

At most one <id-attribute> is allowed in a <type-definition>.  When  
no <id-attribute> appears in the <type-definition>, the following  
default is used

   id: #f

At most one <constructor-attribute> is allowed in a
<type-definition>.  When no <constructor-attribute> appears in the
<type-definition>, the following default is used

   constructor: make-T

At most one <predicate-attribute> is allowed in a <type-definition>.   
When no <predicate-attribute> appears in the <type-definition>, the  
following default is used

   predicate: T?

Semantics
---------

The <extends-attribute> indicates the type inheritance relationship.

   extends: <type-name>     T is derived from <type-name>
                            which must be extensible.
                            T inherits the fields defined in
                            <type-name>.  Instances of T
                            are also instances of <type-name>.

   extends: #f              T is a base record type.

The <extensible-attribute> indicates whether or not T is extensible.

   extensible: #t           T is extensible.

   extensible: #f           T is not extensible.

The <id-attribute> indicates the identity of the type.  The identity  
is used by the predicate to test if an object is an instance of T.

   id: <globally-unique-type-name>  T's identity is the identifier
                                    supplied.

   id: #f                           T's identity is regenerated
                                    each time the type
                                    definition is evaluated
                                    (i.e. the type is generative).
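
The difference between the two cases can be pictured with a minimal
sketch in plain Scheme.  This is only an illustration, not part of the
proposal: the helper name make-type-identity is hypothetical, and a
real implementation would attach the identity to a type descriptor.

```scheme
;; Hypothetical helper (an assumption for illustration): compute the
;; identity that one evaluation of a type definition would carry.
(define (make-type-identity id)
  (if id
      id                    ; id: <name>  -> the supplied symbol
      (list 'generative)))  ; id: #f      -> a freshly allocated object

;; Two evaluations with the same id share one identity,
;; because symbols with the same name are eq?.
(define same-id-equal?
  (eq? (make-type-identity 'point-v1) (make-type-identity 'point-v1)))

;; Two evaluations with id: #f get distinct identities,
;; because list allocates a new object on every call.
(define generative-ids-equal?
  (eq? (make-type-identity #f) (make-type-identity #f)))
```

A predicate that compares identities with eq? would therefore accept
instances across evaluations only when an id: name is supplied.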

The <constructor-attribute> indicates the name of the constructor  
procedure.

   constructor: <constructor-name>

The constructor procedure can be viewed as a lambda expression of the  
form

   (lambda (I1 I2... F1 F2...) <body>)

where I1, I2... correspond to the field names of the parent type, if  
T is a subtype.

The <predicate-attribute> indicates the name of the predicate procedure.

   predicate: <predicate-name>

The fundamental <field-definition> forms are

   ( F G   )  immutable field named F whose getter procedure
              is named G

   ( F G S )  mutable field named F whose getter procedure
              is named G and whose setter procedure is named S

The other field definition forms can be explained in terms of the  
fundamental field definition forms

   F                     = ( F T-F T-F-set! )

   ( F )                 = ( F T-F T-F-set! )
   ( F mutable: #t )     = ( F T-F T-F-set! )
   ( F mutable: #f )     = ( F T-F          )

   ( F G )               = ( F G            )
   ( F G mutable: #t )   = *error*
   ( F G mutable: #f )   = ( F G            )

   ( F G S )             = ( F G   S        )
   ( F G S mutable: #t ) = ( F G   S        )
   ( F G S mutable: #f ) = *error*

Each field is initialized from the constructor's formal parameter of  
the same name.
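
To make the table concrete, here is a rough sketch of what two of the
field forms would correspond to in terms of define-record-type, assuming
an R7RS-small (or SRFI 9) system.  The type "account" and its field
names are made up for the illustration.  The proposed definition would
read (define-type account (owner account-owner) balance).

```scheme
(import (scheme base))

;; Assumed equivalent of
;;   (define-type account (owner account-owner) balance)
;; "account" and its fields are hypothetical names.
(define-record-type account
  (make-account owner balance)       ; default constructor: make-T
  account?                           ; default predicate: T?
  (owner account-owner)              ; ( F G ): getter only, immutable
  (balance account-balance           ; F: default getter T-F
           account-balance-set!))    ;    and default setter T-F-set!

(define a (make-account 'alice 100))
(account-balance-set! a 150)         ; allowed: balance is mutable
;; There is no account-owner-set!, so owner cannot be changed.
```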

Examples
--------

All of the following type definitions for a "2D point" type are  
equivalent

   (define-type point x y)

   (define-type point
     constructor: make-point
     x
     y)

   (define-type point
     predicate: point?
     (x mutable: #t)
     (y mutable: #t))

   (define-type point
     constructor: make-point
     predicate: point?
     (x point-x point-x-set!)
     (y point-y point-y-set!))
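
For comparison, the same type can be written with define-record-type,
which always requires the fully spelled-out form.  This is a sketch
assuming an R7RS-small (or SRFI 9) system; it shows the bindings that
each of the four definitions above is meant to produce.

```scheme
(import (scheme base))

;; The bindings implied by (define-type point x y) under the defaults:
;; make-point, point?, point-x, point-x-set!, point-y, point-y-set!.
(define-record-type point
  (make-point x y)
  point?
  (x point-x point-x-set!)
  (y point-y point-y-set!))

(define p (make-point 1 2))
(point-x-set! p 10)   ; fields are mutable by default
```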

Here is the definition of an extensible 2D point and a non-extensible  
3D point

   (define-type point2D
     extensible: #t
     x
     y)

   (define-type point3D
     extends: point2D
     z)
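
The intended behavior of these two definitions can be modeled in plain
Scheme.  The following is a minimal sketch under stated assumptions, not
an implementation of the proposal: instances are vectors whose first
slot holds a type descriptor, the helper instance-of? and the
descriptor layout are hypothetical, and the "parent must be extensible"
check is not modeled.

```scheme
;; A type descriptor is a two-element list: (name parent-or-#f).
(define point2D-type (list 'point2D #f))
(define point3D-type (list 'point3D point2D-type))

;; Walk the parent chain of obj's descriptor looking for type.
(define (instance-of? obj type)
  (and (vector? obj)
       (> (vector-length obj) 0)
       (let loop ((t (vector-ref obj 0)))
         (cond ((not (and (pair? t) (pair? (cdr t)))) #f)
               ((eq? t type) #t)
               (else (loop (cadr t)))))))

(define (make-point2D x y) (vector point2D-type x y))
(define (point2D? obj) (instance-of? obj point2D-type))
(define (point2D-x p) (vector-ref p 1))
(define (point2D-y p) (vector-ref p 2))

;; The constructor takes the inherited fields first, then its own.
(define (make-point3D x y z) (vector point3D-type x y z))
(define (point3D? obj) (instance-of? obj point3D-type))
(define (point3D-z p) (vector-ref p 3))

(define q (make-point3D 1 2 3))
;; (point2D? q) is #t: an instance of point3D is also a point2D.
;; (point3D? (make-point2D 1 2)) is #f.
```

The inherited getters work on q unchanged because the parent's fields
occupy the same slots in both layouts.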

Extensions
----------

The basic type definition form specified above could be extended to  
allow arbitrary field initialization.  One approach is to allow  
specifying the parameters of the constructor by extending the  
<constructor-attribute> category

   <constructor-attribute>
     ::=  constructor: <constructor-name>
       |  constructor: ( <constructor-name> <variable>* )

and to add a field initialization form to <field-attribute>

   <field-attribute>
     ::=  mutable: #t
       |  mutable: #f
       |  init: <expression>

The scope of the constructor's formal parameters includes the field  
initialization expressions (i.e. the <expression> in "init:  
<expression>").  If not specified the field F is initialized from the  
formal parameter of the same name.  It is also reasonable to specify  
that the <constructor-attribute>

   constructor: C

is equivalent to

   constructor: ( C F1 F2... )

When inheritance is used, it is not clear how the field  
initialization expressions are interpreted.  For instance in

   (define-type point2D
     extensible: #t
     constructor: (make-point2D x)
     x
     (y init: (* x x)))

   (define-type point3D extends: point2D
     constructor: (make-point3D z)
     z)

how are the fields x and y initialized when make-point3D is called?

I think that the parent type's formal parameters should implicitly be  
prefixed to the constructor's parameters.  So in the example above  
make-point3D would take two parameters: x (inherited from point2D)  
and z.  This means that the names of the formal parameters must be  
distinct from those of the parent's constructor.  This approach  
allows things like

   (define-type point2D
     extensible: #t
     constructor: (make-point2D x)
     x
     (y init: (* x x)))

   (define-type point3D extends: point2D
     (z init: (- x z))) ; note: x is the parent constructor's parameter;
                        ; y is not accessible

An instance initializer form could also be added as a <type-attribute>.

   <type-attribute>
     ::=  <extends-attribute>
       |  <extensible-attribute>
       |  <id-attribute>
       |  <constructor-attribute>
       |  <predicate-attribute>
       |  <init-attribute>

   <init-attribute>
     ::=  init: (lambda (<self-name>) <body>)

   <self-name> ::=  <identifier>

After the record instance is allocated and the field initialization
forms have been evaluated, the procedure given by the <init-attribute>
is called with the record instance as its single argument (after the
<init-attribute> procedures of the parent types have been called, of
course).
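
The calling order can be illustrated with a small model in plain
Scheme.  This is a sketch only: the instance is a bare vector of the
slots x, y and z, and the names run-initializers, point2D-init,
point3D-init and call-order are hypothetical.

```scheme
;; Records the order in which initializers run (most recent first).
(define call-order '())

;; Parent initializer: sets y (slot 1) from x (slot 0).
(define (point2D-init self)
  (set! call-order (cons 'point2D call-order))
  (vector-set! self 1 (* (vector-ref self 0) (vector-ref self 0))))

;; Derived initializer: sets z (slot 2) and can rely on the parent's
;; initializer having already run.
(define (point3D-init self)
  (set! call-order (cons 'point3D call-order))
  (vector-set! self 2 (- (vector-ref self 0) (vector-ref self 1))))

;; Apply the initializers in list order (parents first), then return
;; the instance.
(define (run-initializers inits self)
  (for-each (lambda (init) (init self)) inits)
  self)

(define p (run-initializers (list point2D-init point3D-init)
                            (vector 3 0 0)))
;; p is now #(3 9 -6) and (reverse call-order) is (point2D point3D).
```

Note that both initializers write into the instance after allocation,
so in this model they can only fill slots that are mutable.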

I'm a bit uncomfortable with this because immutable fields can't be  
initialized using the instance initializer form.

In fact, I would be content with the basic type definition form.  If  
you need to initialize fields specially, just define your own  
constructor procedure (outside of the type definition form).  For  
example:

   (define-type point2D
     extensible: #t
     constructor: make-plain-point2D
     x
     y)

   (define (make-point2D x)
     (make-plain-point2D x (* x x)))

   (define-type point3D extends: point2D
     constructor: make-plain-point3D
     z)

   (define (make-point3D x z)
     (make-plain-point3D x (* x x) (- x z)))

Anyway, with these extensions, this proposal provides the same
features as the "featureful syntactic layer" of the
Clinger/Dybvig/Sperber proposal.  However, it (in my view)

1) has a syntax that is more lightweight in the common
    case, i.e. locally used non-extensible small records

2) has a syntax that is easier to parse for humans (not based
    on positional arguments)

3) has a syntax that is more extensible (for adding new type
    attributes or field attributes)

Marc