[xquery-talk] Re: Finding a name and the resulting
Graham Reeds
grahamr at ntlworld.com
Wed Aug 30 01:45:39 PDT 2006
Sorry about the delay in replying to the questions - other matters to
attend to.
> Hey, that was not a question that can be answered using yes/no! ;-)
Sorry about that. Didn't read your response properly.
>
> I think you really have to come up with the problem solution in
> non-XQuery terms before we (the list) can help you implement that in
> XQuery. E.g. find out how you can determine which table cell relates
> to which user, what do you want to do with multiple values for one
> user, are there any exceptions etc.
The table that I deemed would be the easiest to work with is 4 cells
wide, with the possibility of a row having just 2 cells populated with data.
The cells are simply name-value pairs, with the first cell holding the name
and the second cell the value (an alphanumeric). To conserve screen space
the original authors placed 2 name-value pairs per row - awkward, I know
(they didn't even hyperlink them; instead you have to go to another screen
to see how far along the workers are on the project).
An ASCII example of the layout:
+-------+--------+------+--------+
| Tom   | ABC123 | Dick | DEF456 |
+-------+--------+------+--------+
| Harry | IJK789 |      |        |
+-------+--------+------+--------+
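In non-XQuery terms, the pairing described above could be sketched roughly as
follows. This is only a minimal sketch, assuming the rows have already been
pulled out of the page as lists of cell strings; the class name
NameValueTable and the method toPairs are hypothetical, and letting a later
value overwrite an earlier one for the same name is an assumption, not a
decision about how duplicates should be handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of Nux, XOM, TagSoup or Saxon.
final class NameValueTable {

    // Reads each row as consecutive (name, value) cells and skips any pair
    // whose name cell is blank, such as the empty half of the last row.
    static Map<String, String> toPairs(List<List<String>> rows) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (List<String> row : rows) {
            for (int i = 0; i + 1 < row.size(); i += 2) {
                String name = row.get(i).trim();
                String value = row.get(i + 1).trim();
                if (!name.isEmpty()) {
                    // Assumption: a repeated name simply overwrites the earlier value.
                    pairs.put(name, value);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
            List.of("Tom", "ABC123", "Dick", "DEF456"),
            List.of("Harry", "IJK789", "", ""));
        // prints {Tom=ABC123, Dick=DEF456, Harry=IJK789}
        System.out.println(toPairs(rows));
    }
}
```

Looking up a single worker is then just a map lookup on the name, which is
the behaviour the eventual query would need to reproduce.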
This table is nested within other tables for layout, and it really is an
antiquated system - for the number of hours I have put into this (between
other tasks) I think I could have written the features and learnt the
finer points of Java in the same time (C++ is my first language).
Currently I have a program that can read in a page using a combination of
Nux, XOM, TagSoup and Saxon. In trying to implement the scraping of the
Yahoo stock quote for IBM from
http://www-128.ibm.com/developerworks/xml/library/j-jtp03225.html
the code below simply gives the output <table /> instead of 81.40. I may
have misinterpreted how to get the value out of the results, but I should
have got slightly more than an empty table. It is entirely possible,
though, that TagSoup has nuked all possibility of extracting the expected
value. That is something I need to look into.
Anyway, thanks for all your continued help.
Graham Reeds.
source:
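import nu.xom.Builder;
import nu.xom.Document;
import nu.xom.Nodes;
import nux.xom.xquery.XQueryUtil;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public void getPage()
{
    try
    {
        // Parse the (possibly malformed) HTML with TagSoup and build a XOM document.
        XMLReader tagsoup =
            XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");
        Document doc =
            new Builder(tagsoup).build("http://finance.yahoo.com/q?s=IBM");

        // Find the cell labelled "Last Trade" and return the cells that follow it.
        String query = "<table>\n"+
            "{\n"+
            " for $d in //td\n"+
            " where contains($d/text()[1], \"Last Trade\")\n"+
            " return <tr><td> { data($d/following-sibling::td) } </td></tr>\n"+
            "}\n"+
            "</table>";

        // Run the query through Nux and print each result node as XML.
        Nodes results = XQueryUtil.xquery(doc, query);
        for (int i = 0; i < results.size(); i++)
        {
            System.out.println(results.get(i).toXML());
            // System.out.println(results.get(i));
        }
    }
    catch (/* the various exceptions */)
    {
        // ...
    }
}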