Quoting/anti-quoting

From Successor ML

Jump to: navigation, search

Contents

Introduction

We propose making the SML/NJ and Moscow ML quoting/anti-quoting system a standard language feature.

Motivation

Standard ML is often used to implement another language L, for example, the syntax of the HOL logic or the syntax of CCS for the Concurrency Workbench. Typically, one defines the abstract syntax of L by a datatype declaration. Then useful functions over the datatype can be defined, e.g., finding the free variables of a formula when L is a logic. It is often the case that concrete syntax is desired as well, being easier for humans to read than abstract syntax, and so the production of parsers and prettyprinters for embedded languages is commonplace. However, strings as the bearers of concrete syntax are a bit inflexible. Hence the motivation for quotations and also antiquotations.

(text stolen from SML/NJ page)

Assumptions

This proposal requires that the use of either two or three ASCII symbols. We follow the design given by SML/NJ and use backtick as starting and ending delimiter and caret as the antiquote, but a different choice could be made.

Additionally, this would require introducing a new type name/constructor to be added to the standard basis. Figure 24 in Appendix C of the Definition would need to have the following appended

 frag |-> ( frag, {quote |-> (∀ 'a. string -> 'a frag, c),
                   antiquote |-> (∀ 'a. 'a * int -> 'a frag, c)})

The nonfix column of Figure 25 in Appendix C of the Definition would need following two lines appeneded

 quote     |-> (∀ 'a. string -> 'a frag, c)
 antiquote |-> (∀ 'a. 'a * int -> 'a frag, c)

Figure 26 in Appendix D of the Definition would need the addition

 frag |-> {quote |-> (quote, c), antiquote |-> (antiquote,c)}

Which could also be understood in terms of the following datatype definition

 datatype 'a frag = quote of string | antiquote of 'a * int

This proposal differs from the datatype used by SML/NJ in that antiquote receives an additional integer argument. The idea behind this is to aid in error reporting by specifying the width of the antiquote in characters. This makes it possible given a fragment list to correctly compute the character positions in quoted strings.

Syntax

   atexp ::= ...
             `qexp`
   qtext ::= ASCII characters minus ^ and `
   qexp  ::= qtext qexp
             ^atexp qexp

Static Semantics

Follows from the definition as a derived form.

Dynamic Semantics

Follows from the definition as a derived form.

Interactions

None known other than syntactic.

Compatibility

This change may require "stealing" some currently free identifiers for use in quotation/anti-quotation. It has been expressed that using backtick as a delimiter may be a poor choice. One possibility would be to instead of using single character delimiters, use two character delimters such as bananas

 atexp ::= ...
           (| qexp |)

As the choice is essentially cosmetic there will likely be no end to the debate on what the correct decision should be.

Implementation

This extension is entirely a derived form the expression `qexp` can just be converted by the parser to a frag list via the following translation

 (|qexp|) =              |->  nil
 (|qexp|) = qtext qexp   |->  (quote "qtext")::(|qexp|) 
 (|qexp|) = ^atexp qexp  |->  (antiquote (atexp, i))::(|qexp|)  (where i is determined by the parser)
Personal tools