Extract data from XHTML

Discussion in 'Java' started by Damo_Suzuki, Dec 7, 2006.

  1. Damo_Suzuki

    Damo_Suzuki Guest

    Hi,
    I am extracting data from an HTML document. I used JTidy to convert
    it to XHTML. Now that I have the XHTML, how can I extract data from
    it? Say I want to extract a node with the tag <h2 class="r">...</h2>.
    Does anyone know how to achieve this, or have sample code? I've been
    banging my head against a brick wall for a few days now trying to do
    this.
    Thanks
     
    Damo_Suzuki, Dec 7, 2006
    #1

  2. "Damo_Suzuki" wrote:

    > I am extracting data from an HTML document. I used JTidy to convert
    > it to XHTML. Now that I have the XHTML, how can I extract data from
    > it?


    Since a valid XHTML document is well-formed XML, you should be able to
    parse it with either a DOM parser or a SAX parser. Searching for them
    on Google should bring up plenty of examples of how to use them.

    Flo
     
    Flo 'Irian' Schaetz, Dec 7, 2006
    #2
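
To make the DOM suggestion above concrete, here is a minimal sketch, assuming the XHTML is already available as a string and that only <h2> elements whose class attribute is exactly "r" are wanted. The class name HeadingExtractor, the method extractResultHeadings, and the sample markup are made up for illustration and are not from the thread. The sample has no DOCTYPE; real JTidy output usually has one, which may make the parser try to resolve the external DTD.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class HeadingExtractor {

    // Returns the text content of every <h2> whose class attribute is exactly "r".
    // Hypothetical helper; parse failures are rethrown as unchecked exceptions.
    static List<String> extractResultHeadings(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));

            List<String> headings = new ArrayList<>();
            // getElementsByTagName matches on the tag name as written in the document.
            NodeList nodes = doc.getElementsByTagName("h2");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element h2 = (Element) nodes.item(i);
                if ("r".equals(h2.getAttribute("class"))) {
                    // getTextContent also collects text from nested elements such as <a>.
                    headings.add(h2.getTextContent().trim());
                }
            }
            return headings;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XHTML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for JTidy output.
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
                + "<h2 class=\"r\">First result</h2>"
                + "<h2>Not a result</h2>"
                + "<h2 class=\"r\">Second <a href=\"#\">result</a></h2>"
                + "</body></html>";
        System.out.println(extractResultHeadings(xhtml));
    }
}
```

A SAX handler would use less memory on very large pages, but for a single page the DOM version is usually easier to follow.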

  3. Damo_Suzuki

    Damo_Suzuki Guest

    Hi,
    Now that it's in XHTML, can I use DocumentBuilder to extract data from
    it? I don't want to write the XHTML to a file. My code looks like this:

    tidy.parse(in, System.out);


    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    DocumentBuilder builder = domFactory.newDocumentBuilder();
    Document doc = builder.parse(XXXXXXXXXX);

    In the tidy.parse call, 'in' is the document I want to extract data
    from. It's fetched straight off the web, "JTidied" and output to the
    console. Can I somehow pass that output as the parameter where all
    the X's are in the DocumentBuilder parse method?
    Thanks
     
    Damo_Suzuki, Dec 7, 2006
    #3
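
One possible way to avoid the intermediate file, sketched under assumptions: send the tidied output to a ByteArrayOutputStream instead of System.out, then hand those bytes to DocumentBuilder.parse through a ByteArrayInputStream. To keep the sketch runnable without JTidy on the classpath, the hypothetical method writeTidiedXhtml stands in for the tidy.parse(in, out) call; the class name and sample markup are also made up. JTidy additionally offers Tidy.parseDOM, which returns a DOM Document directly and may be worth checking, though its DOM implementation may not cover everything the JAXP one does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class InMemoryXhtml {

    // Hypothetical stand-in for tidy.parse(in, out): writes XHTML to the given stream.
    static void writeTidiedXhtml(OutputStream out) throws IOException {
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
                + "<h2 class=\"r\">Example heading</h2></body></html>";
        out.write(xhtml.getBytes(StandardCharsets.UTF_8));
    }

    // Buffers the tidied output in memory and parses it into a DOM Document.
    static Document parseFromMemory() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeTidiedXhtml(buffer); // with JTidy this would be: tidy.parse(in, buffer);

            DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
            domFactory.setNamespaceAware(true);
            DocumentBuilder builder = domFactory.newDocumentBuilder();
            // The buffered bytes take the place of the XXXXXXXXXX argument.
            return builder.parse(new ByteArrayInputStream(buffer.toByteArray()));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse buffered XHTML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseFromMemory();
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Buffering the whole page in memory is fine for ordinary web pages; for very large documents a piped stream or a temporary file might be a better fit.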
