[xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix

Per Bothner per at bothner.com
Mon Feb 10 16:54:32 PST 2014


On 02/10/2014 11:31 AM, Michael Kay wrote:
> We're currently in a position where the world has discovered better ways of serializing structured data, but hasn't yet discovered a better way of serializing narrative text or of information that mixes narrative text with structured data (which is the domain that I find most interesting).

You might find SRFI-108 interesting:
http://srfi.schemers.org/srfi-108/srfi-108.html
This defines a Scheme language extension for "Named quasi-literal
constructors", which I intended to be useful for both structured data
and rich text.
XQuery's
   <p>Hello <em>{$name}</em>!</p>
would be represented as:
   &p{Hello &em[name]!}

The difference is that SRFI-108 defines a *framework*, in that &p is
syntactic sugar for a call to a function or macro $construct$:p; the meaning
of the latter depends on whatever is in scope according to context.
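
As a rough illustration of that "framework" idea, here is a toy Python
model (not Scheme, and not the SRFI-108 API).  The dictionaries
html_scope and text_scope and the helpers call and greeting are
hypothetical names invented for this sketch; they only stand in for
"whatever binding of $construct$:p is in scope".

```python
# Toy model of scope-dependent constructor lookup; not the SRFI-108 API.

def make_element(tag):
    """Return a constructor that wraps its parts in <tag>...</tag>."""
    def construct(*parts):
        return "<{0}>{1}</{0}>".format(tag, "".join(parts))
    return construct


def plain(*parts):
    """A constructor that ignores markup and just concatenates its parts."""
    return "".join(parts)


# Two hypothetical "vocabularies"; each binds $construct$:p and $construct$:em.
html_scope = {
    "$construct$:p": make_element("p"),
    "$construct$:em": make_element("em"),
}
text_scope = {
    "$construct$:p": plain,
    "$construct$:em": plain,
}


def call(scope, cname, *parts):
    """Model of &cname{...}: look up $construct$:cname in scope and apply it."""
    return scope["$construct$:" + cname](*parts)


def greeting(scope, name):
    """Model of &p{Hello &em[name]!} evaluated in the given scope."""
    return call(scope, "p", "Hello ", call(scope, "em", name), "!")


print(greeting(html_scope, "World"))  # markup-producing vocabulary
print(greeting(text_scope, "World"))  # plain-text vocabulary
```

The same expression yields markup under one set of bindings and plain
text under the other, which is the sense in which the meaning depends
on context.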

There is a related SRFI-109 http://srfi.schemers.org/srfi-109/srfi-109.html
for extended multi-line string literals.  Both use '&' for escapes, as in
XML: character and entity references use the XML syntax, while an embedded
expression uses &[...].  The following is equivalent to Java's
("Hello "+name+"!"):
    &{Hello &[name]!}

The interesting thing is that a simple string:
   &{Hello &[name]!}
can be easily converted to rich text.  For example:
   &p{Hello &[name]!}
or:
   &p{Hello &em[name]!}
assuming you have an HTML "vocabulary" in scope.
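
To make the progression concrete, here is a small Python sketch (again
a toy model rather than Scheme).  The functions string_literal, p, and
em are hypothetical stand-ins for the plain string constructor and an
HTML vocabulary; the point is only that the list of literal parts and
embedded values stays the same while the constructor changes.

```python
# Toy model: the same parts, handed to different constructors.

def string_literal(*parts):
    """Models &{...}: concatenate literal text and embedded values."""
    return "".join(str(part) for part in parts)


def p(*parts):
    """Hypothetical HTML-vocabulary constructor modelling &p{...}."""
    return "<p>" + string_literal(*parts) + "</p>"


def em(*parts):
    """Hypothetical HTML-vocabulary constructor modelling &em[...]."""
    return "<em>" + string_literal(*parts) + "</em>"


name = "World"
as_string = string_literal("Hello ", name, "!")   # &{Hello &[name]!}
as_paragraph = p("Hello ", name, "!")             # &p{Hello &[name]!}
as_emphasis = p("Hello ", em(name), "!")          # &p{Hello &em[name]!}

for result in (as_string, as_paragraph, as_emphasis):
    print(result)
```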

There is also a related embedded-XML syntax SRFI-107
http://srfi.schemers.org/srfi-107/srfi-107.html
This is a superset of XML, but uses the same syntax as SRFI-108/-109
for references and embedded expressions.

Kawa 1.14 implements SRFI-107, SRFI-108, and SRFI-109:
http://per.bothner.com/blog/2013/Kawa-1.14-released/

SRFI-108 defines a language embedding/extension (specifically for
Lisp-family languages) rather than a serialization/interchange format,
but, much as JSON was derived from JavaScript syntax, one could define a
subset or variant of it to serve as a data format.
-- 
	--Per Bothner
per at bothner.com   http://per.bothner.com/

