<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Joe,</p>
<p>Forgive the length, but I'm likely to bump my head on this issue
again in the future, so here is a fuller-than-necessary explanation:</p>
<p>Started with the simplest regex that would capture the parens:</p>
<p>1. fn:analyze-string("On February 13, 1968, Secretary of State
Dean Rusk sent a message to Israeli Foreign Minister Abba Eban
calling upon Israel to endorse openly Resolution 242, and on May
13 President Johnson sent a letter to United Arab Republic (UAR)
President Gamal Abdel Nasser, urging him to seize the unique
opportunity offered by the Jarring mission to achieve peace. (79,
171) ", "\(.*\)")<br>
</p>
1. Result: <fn:analyze-string-result
xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>
<fn:non-match>On February 13, 1968, Secretary of State Dean
Rusk sent a message to Israeli Foreign Minister Abba Eban calling
upon Israel to endorse openly Resolution 242, and on May 13
President Johnson sent a letter to United Arab Republic
</fn:non-match><br>
<fn:match>(UAR) President Gamal Abdel Nasser, urging him to
seize the unique opportunity offered by the Jarring mission to
achieve peace. (79, 171)</fn:match><br>
<fn:non-match> </fn:non-match><br>
</fn:analyze-string-result><br>
<br>
OK, so what do we know about the desired matches? Digits inside
parens, with no space between a paren and its digits, and a comma
plus a space between the numbers. Yes?<br>
<br>
2. fn:analyze-string("On February 13, 1968, Secretary of State Dean
Rusk sent a message to Israeli Foreign Minister Abba Eban calling
upon Israel to endorse openly Resolution 242, and on May 13
President Johnson sent a letter to United Arab Republic (UAR)
President Gamal Abdel Nasser, urging him to seize the unique
opportunity offered by the Jarring mission to achieve peace. (79,
171) ", "\(\d+, \d+\)")<br>
<br>
So I match an open paren plus digits, then ", " (comma plus a space),
then digits plus a close paren.<br>
<br>
2. Result: <fn:analyze-string-result
xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>
<fn:non-match>On February 13, 1968, Secretary of State Dean
Rusk sent a message to Israeli Foreign Minister Abba Eban calling
upon Israel to endorse openly Resolution 242, and on May 13
President Johnson sent a letter to United Arab Republic (UAR)
President Gamal Abdel Nasser, urging him to seize the unique
opportunity offered by the Jarring mission to achieve peace.
</fn:non-match><br>
<fn:match>(79, 171)</fn:match><br>
<fn:non-match> </fn:non-match><br>
</fn:analyze-string-result><br>
<br>
I need to split the two numbers, and what better way to do that than
alternation?<br>
<br>
3. fn:analyze-string("On February 13, 1968, Secretary of State Dean
Rusk sent a message to Israeli Foreign Minister Abba Eban calling
upon Israel to endorse openly Resolution 242, and on May 13
President Johnson sent a letter to United Arab Republic (UAR)
President Gamal Abdel Nasser, urging him to seize the unique
opportunity offered by the Jarring mission to achieve peace. (79,
171) ", "\(\d+ | \d+\)")<br>
<br>
3. Result: <fn:analyze-string-result
xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>
<fn:non-match>On February 13, 1968, Secretary of State Dean
Rusk sent a message to Israeli Foreign Minister Abba Eban calling
upon Israel to endorse openly Resolution 242, and on May 13
President Johnson sent a letter to United Arab Republic (UAR)
President Gamal Abdel Nasser, urging him to seize the unique
opportunity offered by the Jarring mission to achieve peace.
(79,</fn:non-match><br>
<fn:match> 171)</fn:match><br>
<fn:non-match> </fn:non-match><br>
</fn:analyze-string-result><br>
<br>
You're probably already laughing because you see my mistake, which I
correct in #4:<br>
<br>
4. fn:analyze-string("On February 13, 1968, Secretary of State Dean
Rusk sent a message to Israeli Foreign Minister Abba Eban calling
upon Israel to endorse openly Resolution 242, and on May 13
President Johnson sent a letter to United Arab Republic (UAR)
President Gamal Abdel Nasser, urging him to seize the unique
opportunity offered by the Jarring mission to achieve peace. (79,
171) ", "\(\d+|\d+\)")<br>
<br>
4. Result: <fn:analyze-string-result
xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>
<fn:non-match>On February 13, 1968, Secretary of State Dean
Rusk sent a message to Israeli Foreign Minister Abba Eban calling
upon Israel to endorse openly Resolution 242, and on May 13
President Johnson sent a letter to United Arab Republic (UAR)
President Gamal Abdel Nasser, urging him to seize the unique
opportunity offered by the Jarring mission to achieve peace.
</fn:non-match><br>
<fn:match>(79</fn:match><br>
<fn:non-match>, </fn:non-match><br>
<fn:match>171)</fn:match><br>
<fn:non-match> </fn:non-match><br>
</fn:analyze-string-result><br>
<br>
The error was here: "\(\d+ | \d+\)". The spaces around the | are
literal, so the first alternative only matches an open paren, digits,
and then a space, whereas the number in question is followed by a
comma with *no space* before it. <br>
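<br>
A related sketch, in case it is useful: as far as I know, the XPath
regex flag "x" strips whitespace from the pattern before matching, so
the spaced-out pattern from #3 should then behave like the one in #4.
The shortened sample string and the variable name $s are mine, not
from the original data:<br>
<pre>(: Hypothetical shortened sample; only the parenthesized parts matter here. :)
let $s := "letter to United Arab Republic (UAR) President Gamal Abdel Nasser. (79, 171)"
(: With "x", the spaces around | are removed, leaving \(\d+|\d+\) :)
return fn:analyze-string($s, "\(\d+ | \d+\)", "x")/fn:match/string()</pre>
That should return the two strings "(79" and "171)", the same matches
as in #4.<br>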
<br>
Know thy data!<br>
<br>
Examples were created on BaseX. BTW, I started from known-good examples
in XQuery Functions 3.1, verified that they worked, and then created
the search strings. <br>
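<br>
If only the digits are wanted, without the parens, here is a minimal
two-pass sketch along the lines of Joe's own suggestion. The shortened
sample string and the names $s and $m are assumptions of mine:<br>
<pre>(: Hypothetical shortened sample; only the parenthesized parts matter here. :)
let $s := "letter to United Arab Republic (UAR) President Gamal Abdel Nasser. (79, 171)"
(: Pass 1: find parenthesized, comma-separated lists of numbers. :)
for $m in fn:analyze-string($s, "\(\d+(, \d+)*\)")/fn:match
(: Pass 2: pull each run of digits out of that match. :)
return fn:analyze-string(string($m), "\d+")/fn:match/string()</pre>
That should return "79" and "171", and skip "(UAR)" because the first
pass requires digits right after the open paren.<br>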
<br>
Hope this helps!<br>
<br>
Patrick<br>
<br>
<div class="moz-cite-prefix">On 04/23/2018 12:22 PM, Joe Wicentowski
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAHwerk2zD+0GSwNFBh6m3PZZz0RMrv441shB98wU22kJRXVimQ@mail.gmail.com">
<meta http-equiv="Context-Type" content="text/html; charset=UTF-8">
<div dir="ltr">Hi all,
<div><br>
</div>
<div>I <span>have encountered an unexpected challenge
constructing a regex for a pattern I am looking for. I </span>am
looking for numbers in parentheses. For example, in the
following string:</div>
<div><br>
</div>
<div>
<div> "On February 13, 1968, Secretary of State Dean Rusk
sent a </div>
<div> message to Israeli Foreign Minister Abba Eban calling
upon Israel to </div>
<div> endorse openly Resolution 242, and on May 13
President Johnson sent a </div>
<div> letter to United Arab Republic (UAR) President Gamal
Abdel Nasser, </div>
<div> urging him to seize the unique opportunity offered by
the Jarring </div>
<div> mission to achieve peace. (79, 171)"</div>
</div>
<div><br>
</div>
<div>... I would like to match "79" and "171" (but not "UAR" or
"13" or "1968"). I have been trying to construct a regex for
use with analyze-string to capture this pattern, but I have
not been successful. I have tried the following:</div>
<div><br>
</div>
<div> analyze-string($string, "(?:\()(?:(\d+)(?:, )?)+(?:\))")</div>
<div><br>
</div>
<div>In other words, there are these 3 components:</div>
<div><br>
</div>
<div> 1. (?:\() a non-capturing group consisting of an open
parens, followed by</div>
<div> 2. (?:(\d+)(?:, )?)+ one or more non-capturing groups
consisting of (a number followed by an optional, non-matching
comma-and-space), followed by</div>
<div> 3. (?:\)) <span>a non-capturing group consisting of<span> a</span></span> close
parens</div>
<div><br>
</div>
<div>I was expecting to get the following output:</div>
<div><br>
</div>
<div>
<div> <fn:analyze-string-result xmlns:fn="<a
href="http://www.w3.org/2005/xpath-functions"
target="_blank" moz-do-not-send="true">http://www.w3.org/2005/xpath-functions</a>"></div>
<div> <fn:non-match>On February 13, 1968, Secretary
of State Dean Rusk sent a </div>
<div> message to Israeli Foreign Minister Abba Eban calling
upon Israel to </div>
<div> endorse openly Resolution 242, and on May 13
President Johnson sent a </div>
<div> letter to United Arab Republic (UAR) President Gamal
Abdel Nasser, </div>
<div> urging him to seize the unique opportunity offered by
the Jarring </div>
<div> mission to achieve peace. </fn:non-match></div>
<div> <fn:match>(<fn:group
nr="1">79</fn:group>, </div>
<div> <fn:group
nr="1">171</fn:group>)</fn:match></div>
<div> </fn:analyze-string-result></div>
</div>
<div><br>
</div>
<div>However, the actual result is that the first number ("79")
is skipped, and only the 2nd number ("171") is captured:</div>
<div><br>
</div>
<div>
<div> <fn:analyze-string-result xmlns:fn="<a
href="http://www.w3.org/2005/xpath-functions"
target="_blank" moz-do-not-send="true">http://www.w3.org/2005/xpath-functions</a>"></div>
<div> <fn:non-match>On February 13, 1968, Secretary
of State Dean Rusk sent a </div>
<div> message to Israeli Foreign Minister Abba Eban calling
upon Israel to </div>
<div> endorse openly Resolution 242, and on May 13
President Johnson sent a </div>
<div> letter to United Arab Republic (UAR) President Gamal
Abdel Nasser, </div>
<div> urging him to seize the unique opportunity offered by
the Jarring </div>
<div> mission to achieve peace. </fn:non-match></div>
<div> <fn:match>(79, </div>
<div> <fn:group
nr="1">171</fn:group>)</fn:match></div>
<div> </fn:analyze-string-result></div>
</div>
<div><br>
</div>
<div>What am I missing? Can anyone suggest a regex that is able
to capture both numbers inside the parentheses? Or do I need
to make a two-pass run through this, finding parenthetical
text with a first analyze-string like "\(.+\)" and then
looking inside its matches with a second analyze-string like
"(\d+)(?:, )?"?</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Joe</div>
</div>
<br>
<pre wrap="">_______________________________________________
<a class="moz-txt-link-abbreviated" href="mailto:talk@x-query.com">talk@x-query.com</a>
<a class="moz-txt-link-freetext" href="http://x-query.com/mailman/listinfo/talk">http://x-query.com/mailman/listinfo/talk</a></pre>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Patrick Durusau
<a class="moz-txt-link-abbreviated" href="mailto:patrick@durusau.net">patrick@durusau.net</a>
Technical Advisory Board, OASIS (TAB)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Another Word For It (blog): <a class="moz-txt-link-freetext" href="http://tm.durusau.net">http://tm.durusau.net</a>
Homepage: <a class="moz-txt-link-freetext" href="http://www.durusau.net">http://www.durusau.net</a>
Twitter: patrickDurusau </pre>
</body>
</html>