[xquery-talk] Finding a name and the resulting
Graham Reeds
grahamr at ntlworld.com
Thu Aug 17 02:38:34 PDT 2006
I have an HTML file containing a table with information about projects
and the progress of the people working on those projects.
The file itself is passed through TagSoup to make it well formed. From
this I would like to extract the information. For this I am using
XOM to build the document and NUX to provide the XQuery capability.
Now I am looking for a name followed by a number. This is a sample with
whitespace removed, but linebreaks left in:
<td align="left" colspan="1" rowspan="1" valign="TOP" width="200">
<b>Bob Stevens:
</b>
</td>
<td align="center" colspan="1" rowspan="1" valign="TOP" width="75">
<b/>
<p class="purple">
<b>
<b>SCMM9</b>
</b>
</p>
<b>
</b>
</td>
Other pages have varying numbers of columns but this is the simplest
page with a single name, followed by a letter/number combo. Some early
documents are just a number.
What I need is to look up each name from a pre-generated list, parse out
the document that follows it, and feed the results into an array in list
order. So a list of Tom, Dick and Harry would mean Tom's document is [0]
in the array, Dick's is [1], etc.
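To make the goal concrete, here is a rough sketch of the lookup in Java.
It is only an illustration under several assumptions: it uses the JDK's
built-in XPath 1.0 support rather than XQuery through NUX, the class name
NameLookup and the SAMPLE string (the cells above wrapped in a <tr>) are
made up for the example, the input carries no XML namespace, and the
document code is assumed to sit in the cell immediately after the name.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NameLookup {
    // Hypothetical sample: the two cells from the message, wrapped in a <tr>
    // so that the fragment has a single root element.
    static final String SAMPLE =
        "<tr>"
        + "<td align=\"left\" width=\"200\"><b>Bob Stevens:\n</b></td>"
        + "<td align=\"center\" width=\"75\"><b/><p class=\"purple\">"
        + "<b><b>SCMM9</b></b></p><b>\n</b></td>"
        + "</tr>";

    // Returns one entry per name, in the same order as the list.
    // A name that does not appear in the document leaves its slot null.
    static String[] lookup(String xml, List<String> names) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList cells = (NodeList) xpath.evaluate("//td", doc, XPathConstants.NODESET);

        String[] result = new String[names.size()];
        for (int i = 0; i < cells.getLength(); i++) {
            Node cell = cells.item(i);
            // Cell text with the colon removed and whitespace collapsed,
            // so "Bob Stevens:\n" becomes "Bob Stevens".
            String label = xpath.evaluate("normalize-space(translate(., ':', ''))", cell);
            int slot = names.indexOf(label);
            if (slot < 0) {
                continue; // not one of the names we were asked for
            }
            // Assumption: the document code is in the very next <td>.
            result[slot] = xpath.evaluate("normalize-space(following-sibling::td[1])", cell);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String[] codes = lookup(SAMPLE, List.of("Bob Stevens"));
        System.out.println(codes[0]);
    }
}
```

With the sample above and a list holding only "Bob Stevens", slot [0] of
the array ends up holding SCMM9. Pages with more columns would need a
different rule for picking the cell than "the next one".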
Most sites I have found that discuss XQuery only show simple queries
that turn one type of XML into another. One of the best is
http://www-128.ibm.com/developerworks/xml/library/j-jtp03225.html. I
would like more examples along those lines, if possible.
Thanks, Graham Reeds.