[xquery-talk] Necessary whitespace
mike at saxonica.com
Mon Apr 27 02:53:09 PDT 2015
To confuse matters, though, I see that we still have the problematic statement in A.2: "When tokenizing, the longest possible match that is consistent with the EBNF is used." To my mind this has always suggested that tokenization is sensitive to the grammatical context. And in some cases it is: you don't want to go looking for QNames or IntegerLiterals when you're in DirElementContent, just because a QName or IntegerLiteral is longer than a Char.

However, it could also be read as meaning that, given "12 div3", tokenizing "div3" as one token is not consistent with the EBNF (it doesn't lead to a valid parse), so it should be tokenized as two tokens, "div" and "3". I don't think that has ever been the intent, and I guess section A.2.2 on delimiting and non-delimiting terminals was added to eliminate this interpretation.
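To make the "longest match, without looking ahead to see whether the parse succeeds" reading concrete, here is a rough Python sketch. It is not taken from the spec or from any implementation: the three terminal patterns are ASCII-only simplifications I made up for illustration, and the names TERMINALS, tokenize and lexemes are hypothetical. It ignores grammatical context entirely, so it says nothing about cases like DirElementContent.

```python
import re

# Simplified, ASCII-only approximations of a few XQuery terminals.
# These patterns are illustrative assumptions, not the grammar from the spec.
TERMINALS = [
    ("Name",   re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")),
    ("Number", re.compile(r"\d+\.\d*|\.\d+|\d+")),
    ("Symbol", re.compile(r"<<|[<>!()/.\-]")),
]

def tokenize(text):
    """Split text into (kind, lexeme) pairs using context-free longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        # Try every terminal at this position and keep the longest match.
        best = None
        for kind, pattern in TERMINALS:
            m = pattern.match(text, pos)
            if m and (best is None or m.end() > best[1].end()):
                best = (kind, m)
        if best is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens

def lexemes(text):
    """Return only the token strings, dropping the kind labels."""
    return [lexeme for _, lexeme in tokenize(text)]

print(lexemes("12 div3"))       # ['12', 'div3']
print(lexemes("12 div-3"))      # ['12', 'div-3']
print(lexemes("12!(.div 3)"))   # ['12', '!', '(', '.', 'div', '3', ')']
print(lexemes("12!(12 div.)"))  # ['12', '!', '(', '12', 'div.', ')']
print(lexemes("1<<a"))          # ['1', '<<', 'a']
```

Under this reading "div3", "div-3" and "div." each come out as a single name, because "." and "-" are legal inside a name, and "<<" is consumed as one token before any "<a>" can be seen. The tokenizer never backs off to a shorter match just because the longer one leads to a parse error.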
mike at saxonica.com
+44 (0) 118 946 5893
On 27 Apr 2015, at 10:18, Ghislain Fourny <g at 28.io> wrote:
> I agree with Christian on the parses/doesn't parse classification.
> My understanding is as follows: 3 and div are non-delimiting terminal
> symbols, and hence must be separated by whitespace.
> This is specified here:
> 12!(12 div.) doesn't parse because the . after a QName requires
> whitespace (. and - are listed as exceptions in the above link). The
> same applies to div-.
> 1<<a>2</a> doesn't parse because << would be recognized as a token.
> 1<<<a>2</a> parses though.
> I hope it helps!
> Kind regards,
> On Mon, Apr 27, 2015 at 10:12 AM, Christian Grün
> <christian.gruen at gmail.com> wrote:
>> Hi Benito,
>> These ones are valid:
>>> 12!(.div 3)
>> ...and these ones are not:
>>> 12!(12 div.)
>>> 12 div-3
>>> 3!(12 div-.)
>> It would take some time to elaborate all the reasons for that (I would
>> surely need to look it up as well), but "12 div-3" is maybe easy to
>> explain: div-3 is also a valid name test and, thus, path expression.