[xquery-talk] Necessary whitespace
jmdyck at ibiblio.org
Tue Apr 28 14:17:05 PDT 2015
On 15-04-28 04:33 PM, Benito van der Zander wrote:
> Hi Michael,
>> I don't think there's a problem with saying it's tokenized as two tokens.
>> Just because a text can be tokenized doesn't mean it's free of syntax
>> errors. And section A.2.2 gives just one of the many requirements that a
>> sequence of tokens must satisfy in order to be error-free. (Specifically,
>> "div" and "3" are adjacent non-delimiting terminal symbols, and so must be
>> separated by Whitespace and/or Comments.)
> What if it parses it in
> 12!(12 div.)
> as two tokens?
> "." is a terminal symbol, and "div" is not a NCName there, just part of a
As pointed out by Ghislain yesterday, the last paragraph of A.2.2 applies:
if a QName or NCName is followed by a "." or "-", the two tokens must be
separated by whitespace and/or Comments.
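
To make the two separation requirements mentioned so far concrete, here is a rough Python sketch. It is an illustration only, not the spec's definition. The names NAME_LIKE, NUMERIC, is_non_delimiting and needs_separator are made up for this example. The patterns are ASCII-only simplifications of NCName and the numeric literals, and the sketch assumes that a keyword such as "div" counts as name-like for the purposes of the "." / "-" rule.

```python
import re

# Simplified, ASCII-only stand-ins (assumptions, not the real productions).
NAME_LIKE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*\Z")  # NCName or keyword shape
NUMERIC = re.compile(r"(\d+\.?\d*|\.\d+)\Z")            # integer/decimal shape


def is_non_delimiting(token: str) -> bool:
    """Names, keywords and numeric literals are treated as non-delimiting."""
    return bool(NAME_LIKE.match(token) or NUMERIC.match(token))


def needs_separator(left: str, right: str) -> bool:
    """True if `left` directly followed by `right` must be separated
    by whitespace and/or comments under the two rules quoted above."""
    # Rule 1: two adjacent non-delimiting terminals, e.g. "div" then "3".
    if is_non_delimiting(left) and is_non_delimiting(right):
        return True
    # Rule 2: a name followed by "." or "-", e.g. "div" then ".".
    return bool(NAME_LIKE.match(left)) and right in (".", "-")
```

Under these assumptions, needs_separator("div", "3") and needs_separator("div", ".") are both true, while needs_separator("12", "!") is false.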
> Or in
> 1<<a>2</a>
> as "<" and "<a>2</a>"
> "<<" is longer, but not consistent.
"<<" is longer than "<", and there are continuations of "1<<" that conform
to the EBNF, so the LMP rule compels the tokenizer to pick "<<", which leads
to raising an error at ">". Ghislain also said this yesterday.
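
The effect of preferring the longer token can be seen in a toy tokenizer. This is a sketch under heavy assumptions: TOKENS is a made-up token set containing only what the example needs, and the function picks the longest candidate purely lexically. It does not check the extra condition that the choice be consistent with the EBNF, and it does not tokenize direct element constructors the way a real XQuery processor would.

```python
# Hypothetical token set for this example only, not the XQuery grammar.
TOKENS = ["<<", "<", ">", "/", "1", "2", "a"]


def tokenize(text: str) -> list[str]:
    """Greedy tokenizer: at each position take the longest known token."""
    out = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1  # whitespace separates tokens but is not itself a token
            continue
        candidates = [t for t in TOKENS if text.startswith(t, i)]
        if not candidates:
            raise ValueError(f"no token at offset {i}")
        longest = max(candidates, key=len)
        out.append(longest)
        i += len(longest)
    return out
```

With this sketch, tokenize("1<<a>2</a>") begins with "1", "<<", "a", ">", so the ">" arrives where no continuation fits. In contrast, tokenize("1 < <a>2</a>") begins with "1", "<", "<", because the whitespace keeps the two "<" characters from being read as one token.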
It's unclear what you mean by "consistent". If you mean that having the
tokenizer pick "<<" is not consistent with parsing the string as:
1 < <a>2</a>
then, yes, that's quite true.