XSLT, HTML to XML, understanding external Website

Discussion in 'XML' started by Arne Pagel, Jul 15, 2012.

    Dear all,

    I am currently looking for a way to import information from an external website into my
    XSLT/PHP-based lunch order system.
    My current idea is to filter the external website with XSLT and convert the relevant information
    into an XML file.

    At the moment I am trying to import the weekly changing menu from a restaurant.
    The problem is that the restaurant's website is probably maintained through a web-based CMS,
    so the markup of the page is neither clean nor consistent.

    The main problem is that one important delimiter is the line break <BR> within normal text.
    I am stuck on how to react to <BR> tags that sit in the middle of a node's text.
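
    To illustrate the problem: in the parsed DOM, each run of text between two <br> tags should be
    its own text node, with the <br> elements as siblings in between. A minimal, untested XSLT 1.0
    sketch that makes this visible could look as follows. The output element names (lines, line,
    break) are made up for illustration, and it assumes that &nbsp; arrives as the character U+00A0.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

      <!-- Start at the document root; <lines> is a hypothetical wrapper element. -->
      <xsl:template match="/">
        <lines>
          <xsl:apply-templates select="//table/tr/td/div"/>
        </lines>
      </xsl:template>

      <!-- Every text node is handled separately, so text before and after a <br>
           ends up in different <line> elements. Whitespace-only nodes are dropped. -->
      <xsl:template match="text()">
        <!-- Assumption: non-breaking spaces are U+00A0; map them to plain spaces. -->
        <xsl:variable name="t" select="normalize-space(translate(., '&#160;', ' '))"/>
        <xsl:if test="$t != ''">
          <line><xsl:value-of select="$t"/></line>
        </xsl:if>
      </xsl:template>

      <!-- Each <br> becomes an explicit marker (hypothetical element name). -->
      <xsl:template match="br">
        <break/>
      </xsl:template>
    </xsl:stylesheet>
    ```

    The built-in template rules walk through <font> and <b>, so no extra templates are needed for
    those wrapper elements.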

    Below you can find an extract of the original website.

    The current XSLT template only filters out empty nodes:

    - - -
    <xsl:template match="table/tr/td/div">
    <xsl:if test=". != ''">
    DIV:<xsl:value-of select="." /> <br/>
    </xsl:if>
    </xsl:template>
    - - -

    Now I want to add the following functionality:
    - The template should only apply to a table that contains the phrase "Mittagstischkarte".
    - The line breaks <br> within the text should be identified.
    - The menu items are only clearly separated by the price, so
    a number of the format X.XX (written with a decimal comma on the page, e.g. 5,50) should be identified.
    - Rows with only formatting content and no real text (A-Z a-z 0-9) should be ignored.
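
    A rough, untested sketch of how these points might be covered in plain XSLT 1.0, which has no
    regular expressions, so translate() and contains() stand in for pattern matching. The output
    element names (menu, text, price) and the variable name alnum are made up for illustration. The
    price test assumes a decimal comma as in the extract, and &nbsp; is assumed to arrive as U+00A0.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

      <!-- Characters that count as "real text" (hypothetical helper variable). -->
      <xsl:variable name="alnum"
        select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'"/>

      <!-- Only look into a table whose text contains "Mittagstischkarte".
           Selecting the text nodes directly skips the <br> elements, so each
           run of text between two <br> tags is processed on its own. -->
      <xsl:template match="/">
        <menu>
          <xsl:apply-templates
            select="//table[contains(., 'Mittagstischkarte')]/tr/td/div//text()"/>
        </menu>
      </xsl:template>

      <xsl:template match="text()">
        <!-- Assumption: non-breaking spaces are U+00A0; map them to plain spaces. -->
        <xsl:variable name="t" select="normalize-space(translate(., '&#160;', ' '))"/>
        <xsl:choose>
          <!-- Removing A-Z, a-z and 0-9 changes nothing: formatting only, ignore. -->
          <xsl:when test="string-length(translate($t, $alnum, '')) = string-length($t)"/>
          <!-- Map every digit to 0, then look for digit, comma, digit, digit. -->
          <xsl:when test="contains(translate($t, '123456789', '000000000'), '0,00')">
            <price><xsl:value-of select="$t"/></price>
          </xsl:when>
          <xsl:otherwise>
            <text><xsl:value-of select="$t"/></text>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:template>
    </xsl:stylesheet>
    ```

    In this sketch a line such as "Tagessuppe 1,50" ends up as a single price element including the
    dish name, so splitting name and amount would still need a further step, for example in PHP.
    Testing for '0.00' instead of '0,00' would also match the opening hours "12.00 bis 14.00".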

    Do you think this can all be done with XSLT?
    It would also be possible to split this across several templates with different calls from PHP,
    or to add some post-processing or intermediate processing in PHP.


    Here is the extract of the original website (sorry, the content is in German):
    - - -
    <table width="100%" border="0" cellpadding="0" cellspacing="0">
    <tr>
    <td width="30" height="552"></td>
    <td width="529" valign="top">
    <div align="center"><font size="4"><b>Mittagstischkarte</b></font><br><br><font
    size="4"><font size="3">Unser wöchentlich wechselnder Mittagstisch</font></font> <br><font
    size="4"><font size="3">von 12.00 bis 14.00 Uhr</font></font></div>
    <div align="center"></div>
    <div align="center"></div>
    <div align="center"></div>
    <div align="center"></div>
    <div align="center"><font size="3"></font></div>
    <div align="center"><font size="3"></font></div>
    <div align="center"><font size="3"></font></div>
    <div align="center"><font size="3"></font></div>
    <div align="center"><font size="3"></font></div>
    <div align="center"><font size="3"></font>&nbsp;</div>
    <div align="center"><b><font size="3">"Eintopf der Woche"</font></b><br>Linseneintopf mit
    Bockwurst<br>€ 5,50 <br></div>
    <div align="center"><font size="3"></font>&nbsp;</div>
    <div align="center"><font size="6">Tagessuppe &nbsp; 1,50 €<br><br></font>&nbsp;<br></div>
    <div align="center"><font size="3">Kasseler mit Sauerkraut und Kartoffelpüree<br><b>5,50
    €</b><br></font><br><font size="4"><font size="2">__________</font></font><font size="4"><br>kl.
    Schnitzel mit Sauce nach Wahl,<br>Bratkartoffeln und Gemüse<br><br></font><font size="4"><b>5,50
    €<br><br></b></font></div>
    <div align="center"></div>
    <div align="center"></div>
    <div align="center"></div>
    <div align="center">______________</div>
    <div align="center"></div>
    <div align="center"></div>
    <div align="center"></div>
    <div align="center"><font size="4"></font></div>
    <div align="center"><font size="4"></font></div><div align="center"><font size="4"
    color="#f0f090">frische Bratwurst<br>mit Bratkartoffeln und Gemüse<br></font></div><div
    align="center"></div>
    <div align="center"><font size="4"></font></div>
    <div align="center"><font size="4"></font></div>

    <div align="center"><font size="4">5,50 €</font></div>
    <div align="center">____________</div>
    <div align="center"><font size="4"></font></div>
    <div align="center"><font size="4">fruchtiges Hähnchengeschnetzeltes<br>im Reisrand mit
    Salat<br>5,50 €<br>---------<br></font></div>
    <div align="center"></div>
    <div align="center"><font size="4">2 Spiegeleier<br>&nbsp;mit Salzkartoffeln und Blattspinat<br>5,50
    €</font></div>
    <div align="center"><font size="4"></font></div>
    <div align="center"></div>
    <div align="center"><font size="4">_________</font><br><font
    size="5"><br>Dessert&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1,50
    €</font><br><br><br><br></div><font size="4"><br></font>
    <div align="center"><font size="4"><font size="4"></font></font></div><font
    size="4">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br><br></font>
    <div align="center"></div>
    <div align="center"><font size="3"></font></div>
    <div align="center"></div> </td>
    <td width="30"></td>
    </tr>
    </table>

    - - -

    The page is loaded via the DOM function loadHTMLFile.

    Regards Arne