<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Joe,</p>

    <p>Forgive the length, but I'm likely to bump my head on this issue
      again in the future, so here is a fuller-than-necessary explanation:</p>

    <p>I started with the simplest regex that would capture the parens:</p>

    <p>1. fn:analyze-string("On February 13, 1968, Secretary of State

      Dean Rusk sent a message to Israeli Foreign Minister Abba Eban

      calling upon Israel to endorse openly Resolution 242, and on May

      13 President Johnson sent a letter to United Arab Republic (UAR)

      President Gamal Abdel Nasser, urging him to seize the unique

      opportunity offered by the Jarring mission to achieve peace. (79,

      171) ", "\(.*\)")<br>

    </p>

    1. Result: <fn:analyze-string-result

    xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>

      <fn:non-match>On February 13, 1968, Secretary of State Dean

    Rusk sent a message to Israeli Foreign Minister Abba Eban calling

    upon Israel to endorse openly Resolution 242, and on May 13

    President Johnson sent a letter to United Arab Republic

    </fn:non-match><br>

      <fn:match>(UAR) President Gamal Abdel Nasser, urging him to

    seize the unique opportunity offered by the Jarring mission to

    achieve peace. (79, 171)</fn:match><br>

      <fn:non-match> </fn:non-match><br>

    </fn:analyze-string-result><br>

    <br>

    OK, so what do we know about the desired matches? They are digits
    sitting directly against a "(" or a ")", with no space in between. Yes?<br>

    <br>

    2. fn:analyze-string("On February 13, 1968, Secretary of State Dean

    Rusk sent a message to Israeli Foreign Minister Abba Eban calling

    upon Israel to endorse openly Resolution 242, and on May 13

    President Johnson sent a letter to United Arab Republic (UAR)

    President Gamal Abdel Nasser, urging him to seize the unique

    opportunity offered by the Jarring mission to achieve peace. (79,

    171) ", "\(\d+, \d+\)")<br>

    <br>

    So I match an opening paren plus digits, then ", " (comma plus a
    space), then digits plus a closing paren.<br>

    <br>

    2. Result: <fn:analyze-string-result

    xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>

      <fn:non-match>On February 13, 1968, Secretary of State Dean

    Rusk sent a message to Israeli Foreign Minister Abba Eban calling

    upon Israel to endorse openly Resolution 242, and on May 13

    President Johnson sent a letter to United Arab Republic (UAR)

    President Gamal Abdel Nasser, urging him to seize the unique

    opportunity offered by the Jarring mission to achieve peace.

    </fn:non-match><br>

      <fn:match>(79, 171)</fn:match><br>

      <fn:non-match> </fn:non-match><br>

    </fn:analyze-string-result><br>

    <br>

    I need to split the two numbers, and what better way to do that than
    alternation?<br>

    <br>

    3. fn:analyze-string("On February 13, 1968, Secretary of State Dean

    Rusk sent a message to Israeli Foreign Minister Abba Eban calling

    upon Israel to endorse openly Resolution 242, and on May 13

    President Johnson sent a letter to United Arab Republic (UAR)

    President Gamal Abdel Nasser, urging him to seize the unique

    opportunity offered by the Jarring mission to achieve peace. (79,

    171) ", "\(\d+ | \d+\)")<br>

    <br>

    3. Result: <fn:analyze-string-result

    xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>

      <fn:non-match>On February 13, 1968, Secretary of State Dean

    Rusk sent a message to Israeli Foreign Minister Abba Eban calling

    upon Israel to endorse openly Resolution 242, and on May 13

    President Johnson sent a letter to United Arab Republic (UAR)

    President Gamal Abdel Nasser, urging him to seize the unique

    opportunity offered by the Jarring mission to achieve peace.

    (79,</fn:non-match><br>

      <fn:match> 171)</fn:match><br>

      <fn:non-match> </fn:non-match><br>

    </fn:analyze-string-result><br>

    <br>

    You're probably already laughing because you see my mistake, which I
    correct in #4:<br>

    <br>

    4. fn:analyze-string("On February 13, 1968, Secretary of State Dean

    Rusk sent a message to Israeli Foreign Minister Abba Eban calling

    upon Israel to endorse openly Resolution 242, and on May 13

    President Johnson sent a letter to United Arab Republic (UAR)

    President Gamal Abdel Nasser, urging him to seize the unique

    opportunity offered by the Jarring mission to achieve peace. (79,

    171) ", "\(\d+|\d+\)")<br>

    <br>

    4. Result: <fn:analyze-string-result

    xmlns:fn=<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/2005/xpath-functions">"http://www.w3.org/2005/xpath-functions"</a>><br>

      <fn:non-match>On February 13, 1968, Secretary of State Dean

    Rusk sent a message to Israeli Foreign Minister Abba Eban calling

    upon Israel to endorse openly Resolution 242, and on May 13

    President Johnson sent a letter to United Arab Republic (UAR)

    President Gamal Abdel Nasser, urging him to seize the unique

    opportunity offered by the Jarring mission to achieve peace.

    </fn:non-match><br>

      <fn:match>(79</fn:match><br>

      <fn:non-match>, </fn:non-match><br>
      <fn:match>171)</fn:match><br>

      <fn:non-match> </fn:non-match><br>

    </fn:analyze-string-result><br>

    <br>

    The error was here: "\(\d+ | \d+\)". The spaces around the "|" are
    literal, so the first alternative only matches "(" plus digits plus
    a space, whereas the number in question, 79, is followed by a comma
    with *no space* before it. <br>
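    <br>
    A minimal sketch of the difference, assuming a shortened test string
    (the variable $s is only for illustration). Without the "x" flag,
    whitespace in the pattern is matched literally; with the flag,
    whitespace in the pattern is ignored:<br>
    <pre>
```xquery
(: Shortened, hypothetical test string; the full sentence behaves the same way. :)
let $s := "achieve peace. (79, 171) "
return (
  (: Spaces around | are literal: only " 171)" can match. :)
  analyze-string($s, "\(\d+ | \d+\)")//fn:match/string(),
  (: No spaces: matches "(79" and "171)". :)
  analyze-string($s, "\(\d+|\d+\)")//fn:match/string(),
  (: The "x" flag strips pattern whitespace, so this acts like the line above. :)
  analyze-string($s, "\(\d+ | \d+\)", "x")//fn:match/string()
)
```
    </pre>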

    <br>

    Know thy data!<br>

    <br>

    The examples were created in BaseX. BTW, I started from known good
    examples in XQuery Functions 3.1, verified that they worked, and
    then created the search strings. <br>
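    <br>
    To get just the numbers themselves (79 and 171), the two-pass idea
    from your message could be sketched roughly as follows. This is only
    a sketch: the test string is shortened, and both patterns are my
    assumptions about what the data looks like (digits separated by
    comma plus space, inside parens). As far as I know, a repeated
    capturing group only reports its last iteration, which would explain
    why a single pass showed only 171 in group 1:<br>
    <pre>
```xquery
(: Shortened, hypothetical test string. :)
let $s := "Republic (UAR) President Nasser, to achieve peace. (79, 171) "
(: Pass 1: parenthesized lists of numbers such as "(79, 171)"; "(UAR)" is skipped. :)
for $list in analyze-string($s, "\(\d+(, \d+)*\)")//fn:match
(: Pass 2: pull each run of digits out of the matched text. :)
return analyze-string(string($list), "\d+")//fn:match/string()
```
    </pre>
    Dates and other numbers outside parentheses never reach the second
    pass, so they are not reported.<br>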

    <br>

    Hope this helps!<br>

    <br>

    Patrick<br>

    <br>


    <div class="moz-cite-prefix">On 04/23/2018 12:22 PM, Joe Wicentowski

      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAHwerk2zD+0GSwNFBh6m3PZZz0RMrv441shB98wU22kJRXVimQ@mail.gmail.com">

      <meta http-equiv="Context-Type" content="text/html; charset=UTF-8">

      <div dir="ltr">Hi all,

        <div><br>

        </div>

        <div>I <span>have encountered an unexpected challenge

            constructing a regex for a pattern I am looking for.  I </span>am

          looking for numbers in parentheses.  For example, in the

          following string:</div>

        <div><br>

        </div>

        <div>

          <div>  "On February 13, 1968, Secretary of State Dean Rusk

            sent a </div>

          <div>    message to Israeli Foreign Minister Abba Eban calling

            upon Israel to </div>

          <div>    endorse openly Resolution 242, and on May 13

            President Johnson sent a </div>

          <div>    letter to United Arab Republic (UAR) President Gamal

            Abdel Nasser, </div>

          <div>    urging him to seize the unique opportunity offered by

            the Jarring </div>

          <div>    mission to achieve peace. (79, 171)"</div>

        </div>

        <div><br>

        </div>

        <div>... I would like to match "79" and "171" (but not "UAR" or

          "13" or "1968").  I have been trying to construct a regex for

          use with analyze-string to capture this pattern, but I have

          not been successful.  I have tried the following:</div>

        <div><br>

        </div>

        <div>  analyze-string($string, "(?:\()(?:(\d+)(?:, )?)+(?:\))")</div>

        <div><br>

        </div>

        <div>In other words, there are these 3 components:</div>

        <div><br>

        </div>

        <div>  1. (?:\() a non-capturing group consisting of an open

          parens, followed by</div>

        <div>  2. (?:(\d+)(?:, )?)+ one or more non-capturing groups

          consisting of (a number followed by an optional, non-matching

          comma-and-space), followed by</div>

        <div>  3. (?:\)) <span>a non-capturing group consisting of<span> a</span></span> close

          parens</div>

        <div><br>

        </div>

        <div>I was expecting to get the following output:</div>

        <div><br>

        </div>

        <div>

          <div>  <fn:analyze-string-result xmlns:fn="<a

              href="http://www.w3.org/2005/xpath-functions"

              target="_blank" moz-do-not-send="true">http://www.w3.org/2005/xpath-functions</a>"></div>

          <div>    <fn:non-match>On February 13, 1968, Secretary

            of State Dean Rusk sent a </div>

          <div>    message to Israeli Foreign Minister Abba Eban calling

            upon Israel to </div>

          <div>    endorse openly Resolution 242, and on May 13

            President Johnson sent a </div>

          <div>    letter to United Arab Republic (UAR) President Gamal

            Abdel Nasser, </div>

          <div>    urging him to seize the unique opportunity offered by

            the Jarring </div>

          <div>    mission to achieve peace. </fn:non-match></div>

          <div>    <fn:match>(<fn:group

            nr="1">79</fn:group>, </div>

          <div>      <fn:group

            nr="1">171</fn:group>)</fn:match></div>

          <div>  </fn:analyze-string-result></div>

        </div>

        <div><br>

        </div>

        <div>However, the actual result is that the first number ("79")

          is skipped, and only the 2nd number ("171") is captured:</div>

        <div><br>

        </div>

        <div>

          <div>  <fn:analyze-string-result xmlns:fn="<a

              href="http://www.w3.org/2005/xpath-functions"

              target="_blank" moz-do-not-send="true">http://www.w3.org/2005/xpath-functions</a>"></div>

          <div>    <fn:non-match>On February 13, 1968, Secretary

            of State Dean Rusk sent a </div>

          <div>    message to Israeli Foreign Minister Abba Eban calling

            upon Israel to </div>

          <div>    endorse openly Resolution 242, and on May 13

            President Johnson sent a </div>

          <div>    letter to United Arab Republic (UAR) President Gamal

            Abdel Nasser, </div>

          <div>    urging him to seize the unique opportunity offered by

            the Jarring </div>

          <div>    mission to achieve peace. </fn:non-match></div>

          <div>    <fn:match>(79, </div>

          <div>      <fn:group

            nr="1">171</fn:group>)</fn:match></div>

          <div>  </fn:analyze-string-result></div>

        </div>

        <div><br>

        </div>

        <div>What am I missing?  Can anyone suggest a regex that is able

          to capture both numbers inside the parentheses?  Or do I need

          to make a two-pass run through this, finding parenthetical

          text with a first analyze-string like "\(.+\)" and then

          looking inside its matches with a second analyze-string like

          "(\d+)(?:, )?"?</div>

        <div><br>

        </div>

        <div>Thanks,</div>

        <div>Joe</div>

      </div>

      <br>


    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Patrick Durusau

<a class="moz-txt-link-abbreviated" href="mailto:patrick@durusau.net">patrick@durusau.net</a>

Technical Advisory Board, OASIS (TAB)

Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300

Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)

Another Word For It (blog): <a class="moz-txt-link-freetext" href="http://tm.durusau.net">http://tm.durusau.net</a>

Homepage: <a class="moz-txt-link-freetext" href="http://www.durusau.net">http://www.durusau.net</a>

Twitter: patrickDurusau </pre>

  </body>

</html>