Quoting/anti-quoting
From Successor ML
Contents |
Introduction
We propose making the SML/NJ and Moscow ML quoting/anti-quoting system a standard language feature.
Motivation
Standard ML is often used to implement another language L, for example, the syntax of the HOL logic or the syntax of CCS for the Concurrency Workbench. Typically, one defines the abstract syntax of L by a datatype declaration. Then useful functions over the datatype can be defined, e.g., finding the free variables of a formula when L is a logic. It is often the case that concrete syntax is desired as well, being easier for humans to read than abstract syntax, and so the production of parsers and prettyprinters for embedded languages is commonplace. However, strings as the bearers of concrete syntax are a bit inflexible. Hence the motivation for quotations and also antiquotations.
(text stolen from SML/NJ page)
Assumptions
This proposal requires that the use of either two or three ASCII symbols. We follow the design given by SML/NJ and use backtick as starting and ending delimiter and caret as the antiquote, but a different choice could be made.
Additionally, this would require introducing a new type name/constructor to be added to the standard basis. Figure 24 in Appendix C of the Definition would need to have the following appended
frag |-> ( frag, {quote |-> (∀ 'a. string -> 'a frag, c), antiquote |-> (∀ 'a. 'a * int -> 'a frag, c)})
The nonfix column of Figure 25 in Appendix C of the Definition would need following two lines appeneded
quote |-> (∀ 'a. string -> 'a frag, c) antiquote |-> (∀ 'a. 'a * int -> 'a frag, c)
Figure 26 in Appendix D of the Definition would need the addition
frag |-> {quote |-> (quote, c), antiquote |-> (antiquote,c)}
Which could also be understood in terms of the following datatype definition
datatype 'a frag = quote of string | antiquote of 'a * int
This proposal differs from the datatype used by SML/NJ in that antiquote receives an additional integer argument. The idea behind this is to aid in error reporting by specifying the width of the antiquote in characters. This makes it possible given a fragment list to correctly compute the character positions in quoted strings.
Syntax
atexp ::= ... `qexp`
qtext ::= ASCII characters minus ^ and `
qexp ::= qtext qexp ^atexp qexp
Static Semantics
Follows from the definition as a derived form.
Dynamic Semantics
Follows from the definition as a derived form.
Interactions
None known other than syntactic.
Compatibility
This change may require "stealing" some currently free identifiers for use in quotation/anti-quotation. It has been expressed that using backtick as a delimiter may be a poor choice. One possibility would be to instead of using single character delimiters, use two character delimters such as bananas
atexp ::= ... (| qexp |)
As the choice is essentially cosmetic there will likely be no end to the debate on what the correct decision should be.
Implementation
This extension is entirely a derived form the expression `qexp` can just be converted by the parser to a frag list via the following translation
(|qexp|) = |-> nil (|qexp|) = qtext qexp |-> (quote "qtext")::(|qexp|) (|qexp|) = ^atexp qexp |-> (antiquote (atexp, i))::(|qexp|) (where i is determined by the parser)