[xquery-talk] XQuery Style Conventions

Fri Oct 21 10:43:01 PDT 2005

> Also, I personally prefer
> 
> <div>{
>   $content-generated-elsewere 
> }</div>
> 
> over
> 
> <div>
>   { $content-generated-elsewere }
> </div>
> 
> for two reasons:
> 
> 1. It does not generate different results regarding whitespaces
> 2. I can more easily edit (copy/paste) the XQuery expression...
> 

Yes, I also tend to use that style. There are further reasons:

  3. It closely matches the style that's popular in Java

  if (x) {
     stuff;
  } // end if

  4. It leads to one level of indentation rather than two

  <div>{
    for $x in a/b/c
    return $x/d
  }</div>
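
On point 1: as far as I understand it, the two layouts only produce
different results when boundary whitespace is preserved. With the default
setting it is stripped either way. A minimal sketch, assuming the
"declare boundary-space" syntax of XQuery 1.0 and a made-up $content
variable:

```xquery
declare boundary-space preserve;

(: $content is a hypothetical stand-in for content generated elsewhere. :)
let $content := <p>hello</p>
return (
  (: Compact layout: the div has a single child, the p element. :)
  count(<div>{ $content }</div>/node()),

  (: Spread-out layout: the line breaks around the braces become
     whitespace text nodes, so the div has three children. :)
  count(<div>
    { $content }
  </div>/node())
)
```

Under these assumptions the two counts should be 1 and 3. Without the
"preserve" declaration both should be 1.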

However, I can't say I've really cracked the problem of creating a clear
layout for XQuery. I have particular problems with things like:

  count(for $x in 
            ("",  for $y in a/b/c 
                  return string($y)
            )
        return $x/ancestor::*
  )

The problem here is the mixing of syntactic styles. In XQuery, "for" and
"if" are keyword-led and read like statements, whereas function calls,
operators and parenthesized sequences read like conventional expressions,
and in visual terms statement-like constructs don't nest very easily inside
expression-like constructs. Perhaps the style guide needs to say that at
this point it's worth using a function call. But that puts a lot more
burden on the optimizer, and with the current state of the art it's not
something I would advise doing just for aesthetic reasons. Variables are
probably a safer bet: I quite like the pseudo-sequential style

   let $v1 := for $y in a/b/c return string($y)
   let $v2 := ("", $v1)
   let $v3 := for $x in $v2 return $x/ancestor::*
   return count($v3)

I've even (after strongly opposing the feature in the past) allowed myself
to reuse the same variable name in such cases, because it makes it easy to
add an extra processing step in the middle of the sequence:

   let $result := for $y in a/b/c return string($y)
   let $result := ("", $result)
   let $result := for $x in $result return $x/ancestor::*
   return count($result)
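
To make the "extra processing step" point concrete, here is a small
sketch that uses numbers rather than nodes, so it needs no input document.
The values are invented for illustration. Note that each "let" binds a new
variable that hides the previous $result. Nothing is updated in place.

```xquery
let $result := (1, 2, 3)
let $result := for $n in $result return $n * 2
(: A step added later: no other line had to be renamed or touched. :)
let $result := $result[. gt 2]
let $result := ("", $result)
return count($result)
```

Tracing it through, the sequence goes (1, 2, 3), then (2, 4, 6), then
(4, 6), then ("", 4, 6), so the query should return 3.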

But my assumptions about optimization of function calls vis-a-vis variables
may be based too heavily on knowledge of my own product!
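
For comparison, the function-call alternative might look something like
the sketch below. The function name local:all-ancestors and its signature
are hypothetical, and the query is a simplified variant of the example
above, not a direct translation. Like the original, it assumes a context
item for the path a/b/c.

```xquery
(: Hypothetical helper: the name and types are illustrative only. :)
declare function local:all-ancestors($nodes as node()*) as element()* {
  for $x in $nodes
  return $x/ancestor::*
};

(: The statement-like "for" is now hidden behind an ordinary call. :)
count(local:all-ancestors(a/b/c))
```

Whether this costs anything compared with the variable-based version
depends on how well the processor handles user-defined function calls.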

There are a few other things in the style guide that I would tend to do
differently, but these are personal preferences:

* Never use a default function namespace other than the F+O (Functions and
Operators) namespace: it will only confuse your readers. (And never use the
fn: prefix either.)

* Use upper-case names only for those global variables whose value is a
constant, not for those that require run-time evaluation, e.g.

declare variable $lookup := doc("lookup-table.xml");

* I'm not that comfortable with the mix of hyphenation and camelCase, but
it's hard to suggest anything better given that the standard functions use
hyphens and the standard types use camelCase. I'd suggest that variable
names should follow whatever convention is used for element names in the
source document.
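
Putting these preferences together, a main module might look roughly like
this. It is only a sketch: $MAX-ROWS, local:first-rows and the "row"
element name are invented for illustration. Only the $lookup declaration
comes from the example above.

```xquery
(: Constant value: upper-case name. :)
declare variable $MAX-ROWS := 100;

(: Needs run-time evaluation: ordinary lower-case name. :)
declare variable $lookup := doc("lookup-table.xml");

(: User function in the predeclared "local" namespace. The default
   function namespace is left alone. :)
declare function local:first-rows($rows as element()*) as element()* {
  (: Standard function called without the fn: prefix. :)
  subsequence($rows, 1, $MAX-ROWS)
};

local:first-rows($lookup//row)
```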

I would also suggest that the module structure (and choice of namespaces)
should follow the namespace structure used in the source vocabulary: a
library module should be a collection of functions associated with the
elements and types defined in a schema.
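
As a sketch of what that could mean in practice, suppose the source
vocabulary is a purchase-order schema. The namespace URI, the element and
attribute names, and the function po:order-total below are all
hypothetical:

```xquery
(: Library module whose namespace is the namespace of the vocabulary
   it serves, so one prefix covers both elements and functions. :)
module namespace po = "http://example.com/purchase-order";

(: A function associated with the po:order element. :)
declare function po:order-total($order as element(po:order)) as xs:decimal {
  sum(
    for $item in $order/po:item
    return xs:decimal($item/@price) * xs:decimal($item/@quantity)
  )
};
```

A query would then import the module with the same namespace URI that it
uses for the elements in the source document.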

Michael Kay
http://www.saxonica.com/