HTML to XML Conversion - Difficulty with Tidy and TagSoup

Discussion in 'Java' started by Eric, Dec 30, 2003.

  1. Eric

    Eric Guest

    I'm trying to convert HTML pages to XML and I'm having some difficulty
    with the following:

    1. I tried Tidy (with -asxml), but the HTML I'm converting to XHTML
    has so many errors that I spend a lot of time "fixing" the HTML by
    hand before Tidy will process it.

    2. I've tried using TagSoup with JDOM, but SAXBuilder internally
    tries to set the SAX namespace-prefixes feature on the parser, and
    TagSoup does not support that feature.

    I would really appreciate help from someone who has dealt with having
    to crank out lots of XML from poorly formatted HTML. I appreciate
    any help! ;)

    -Eric
    Eric, Dec 30, 2003
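
One possible workaround for point 2, sketched under assumptions: wrap the parser in a SAX filter that quietly ignores a rejected request for the namespace-prefixes feature, so that a caller that insists on setting it does not fail. The class name LenientFeatureFilter is hypothetical and not part of JDOM or TagSoup; the code uses only the standard org.xml.sax classes shipped with the JDK. Whether swallowing this feature is acceptable depends on whether downstream code really needs xmlns attributes reported.

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/**
 * Hypothetical helper: forwards everything to the wrapped XMLReader, but
 * ignores a failure to set the SAX "namespace-prefixes" feature.
 */
class LenientFeatureFilter extends XMLFilterImpl {

    // Standard SAX2 feature URI for reporting xmlns attributes and prefixed names.
    static final String NS_PREFIXES =
            "http://xml.org/sax/features/namespace-prefixes";

    LenientFeatureFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void setFeature(String name, boolean value)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        try {
            // Delegate to the wrapped parser first.
            super.setFeature(name, value);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // Only the namespace-prefixes request is swallowed;
            // every other unsupported feature is still reported.
            if (!NS_PREFIXES.equals(name)) {
                throw e;
            }
        }
    }
}
```

To use it, the wrapped reader would be the TagSoup parser, and the filter would be handed to whatever builds the tree. With JDOM that probably means a SAXBuilder subclass that returns the filter from its parser-creation hook, if the JDOM version in use exposes one; that part is an assumption to check against the JDOM javadoc. Alternatively, the filter can be used as the XMLReader of a SAXSource passed to a plain JAXP identity Transformer, which avoids JDOM's feature handling entirely.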