"toXHTML()", where's the XML?

hawat.thufir

I'm looking at
<http://xmlserv.com/API/com/xmlserv/app/shared/SODocument.html#toXHTML(javax.xml.transform.Source)>
and <http://jtidy.sourceforge.net/> in trying to get some XML extracted
from XHTML. I'm reading two "for dummies" books, "XHTML for dummies" and
"XML for dummies"; they're not the most current but are sufficient, I'm
sure.

I have some Java code which, using JTidy, reads in a URL and kicks out
the JTidy parsed file. The code is at <http://thufir.lecktronix.net/>,
click on "files" and "JTidy".

Where's the XML? I'm looking for either a "toXHTML" or "toXML" method
(function/routine/sub-program) in JTidy, but can't find it.

At this point I'm staying away from Neko
<http://people.apache.org/~andyc/neko/doc/html/> and Xerces
<http://xml.apache.org/xerces2-j/> simply because I have something with
JTidy, although I might switch to Neko later.

Anyhow, in the jar (which can be downloaded from
<http://thufir.lecktronix.net/>), JTidy does parse a URL
(<http://www.yahoo.com/> is hard-coded in) and generates out.html.
Where's the XML, embedded within the XHTML?


thanks,

Thufir
 
hawat.thufir

Martin said:
XHTML is XML so I am not sure what you are looking for, presumably the
out.html is JTidy's attempt to create XHTML from http://www.yahoo.com/
for you.

That was my understanding, that "out.html" is XHTML; XHTML being a
super-set of XML. Having said that, I now realize that I've asked the
wrong question, or at least phrased it wrong, sorry.

I want to insert data from "out.html" into a database, such as
Hibernate <http://www.hibernate.org/> or Cocoon
<http://cocoon.apache.org/>.

To do that, I understand that I must first "transform" XML with XSLT,
somehow. I think I need more information about that process in order
to ask useful questions.

So, perhaps better questions are:

What sort of "output" am I looking for from an XSLT transform (in order
to do a database insert)?

Do I need to do the XSLT myself, or can that be done from a database?



Thanks,

Thufir
 
Martin Honnen

That was my understanding, that "out.html" is XHTML; XHTML being a
super-set of XML.

XHTML is an XML application meaning that any XHTML document is a
well-formed XML document. But XHTML is not a super-set of XML.
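
To illustrate the point that any XHTML document is well-formed XML, here is a minimal sketch using the XML parser that ships with the JDK (JAXP). The class name `XhtmlIsXml`, the method `parse`, and the `SAMPLE` string are made up for illustration; they are not part of JTidy or of the code mentioned in this thread. The sample has no DOCTYPE so that the parser does not try to download a DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XhtmlIsXml {
    // Hypothetical XHTML sample: every element is closed, so it is well-formed XML.
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
      + "<head><title>Demo</title></head>"
      + "<body><p>Hello<br/>world</p></body></html>";

    // Parses the text with an ordinary, namespace-aware XML parser.
    static Document parse(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        // Print the root element's local name and namespace URI.
        System.out.println(doc.getDocumentElement().getLocalName());
        System.out.println(doc.getDocumentElement().getNamespaceURI());
    }
}
```

Tag-soup HTML, for example a paragraph containing an unclosed `<br>`, should make the same parser fail with a parse error, which is why a clean-up step such as JTidy comes first.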

What sort of "output" am I looking for from an XSLT transform (in order
to do a database insert)?

No idea really, standard XSLT output methods are xml, html, and text so
you could use an XSLT stylesheet to process an XHTML document and
transform it to XML or HTML or plain text.
Not sure why you think XSLT helps with a data base insert, unless you
have a data base that stores XML natively/directly, or unless you have a
certain XSLT extension that allows RDBMS access.
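
As a rough sketch of the "process an XHTML document and transform it to XML" case, the following uses the XSLT 1.0 processor bundled with the JDK (`javax.xml.transform`). The stylesheet, the sample page, the `<links>`/`<link>` output vocabulary, and the names `XhtmlToXml` and `transform` are all assumptions made up for this example. Note that the stylesheet has to bind a prefix (`x` here) to the XHTML namespace, otherwise `//a` matches nothing.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XhtmlToXml {
    // Hypothetical stylesheet: copies every XHTML <a> into a plain <link> element.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:x=\"http://www.w3.org/1999/xhtml\""
      + " exclude-result-prefixes=\"x\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/\">"
      + "<links><xsl:for-each select=\"//x:a\">"
      + "<link href=\"{@href}\"><xsl:value-of select=\".\"/></link>"
      + "</xsl:for-each></links>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical input standing in for a tidied out.html.
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
      + "<a href=\"http://example.com/news\">News</a>"
      + "<a href=\"http://example.com/mail\">Mail</a>"
      + "</body></html>";

    // Applies STYLESHEET to the given XHTML text and returns the serialized result.
    static String transform(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE));
    }
}
```

Changing `method="xml"` to `method="text"` in `xsl:output` (and emitting plain text instead of elements in the template) would give the plain-text variant mentioned above.
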

Do I need to do the XSLT myself, or can that be done from a database?

I am not familiar with Cocoon or Hibernate, unless someone else shows up
here with expertise on those you are probably better off asking in one
of the dedicated mailing lists or forums offered on the web sites of the
products.
 
hawat.thufir

Martin Honnen wrote:
....
No idea really, standard XSLT output methods are xml, html, and text so
you could use an XSLT stylesheet to process an XHTML document and
transform it to XML or HTML or plain text.

This is what I want to do, thanks. I'll look further into that.

Not sure why you think XSLT helps with a data base insert, unless you
have a data base that stores XML natively/directly, or unless you have a
certain XSLT extension that allows RDBMS access.

I was browsing the bookstore and came across "Hibernate: A Developer's
notebook", <http://www.oreilly.com/catalog/hibernate/>.

Perhaps I misunderstood. I'm trying to get XML from XHTML with XSLT.
As I read it, Hibernate and Cocoon will take XML as data. Therefore, I
need to get some XML from the XHTML, then feed the XML to the RDBMS.
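
For the "get data out of the XHTML, then feed it to a database" step, one possible sketch that skips XSLT is to pull the values out with XPath using only JDK classes. Everything named here (`LinkRows`, `extract`, the sample page, and the `links` table in the comment) is hypothetical, and neither Hibernate nor Cocoon is involved; the actual insert is left as a comment because it depends on the database in use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LinkRows {
    // Hypothetical input standing in for a tidied out.html.
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
      + "<a href=\"http://example.com/news\">News</a>"
      + "<a href=\"http://example.com/mail\"> Mail </a>"
      + "</body></html>";

    // Returns one {href, text} pair per <a> element found in the XHTML.
    static List<String[]> extract(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            // local-name() avoids having to register the XHTML namespace with XPath.
            NodeList anchors = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//*[local-name()='a']", doc, XPathConstants.NODESET);
            List<String[]> rows = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element a = (Element) anchors.item(i);
                rows.add(new String[] {
                    a.getAttribute("href"), a.getTextContent().trim() });
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("could not read XHTML", e);
        }
    }

    public static void main(String[] args) {
        for (String[] row : extract(SAMPLE)) {
            // A real program might bind these to a JDBC PreparedStatement such as
            // INSERT INTO links (href, label) VALUES (?, ?)   (hypothetical table).
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```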

....
I am not familiar with Cocoon or Hibernate, unless someone else shows up
here with expertise on those you are probably better off asking in one
of the dedicated mailing lists or forums offered on the web sites of the
products.
....

Ok, will do.



Thanks,

Thufir
 
hawat.thufir

Martin Honnen wrote:
....
I am not familiar with Cocoon or Hibernate, unless someone else shows up
here with expertise on those you are probably better off asking in one
of the dedicated mailing lists or forums offered on the web sites of the
products.
....

"Use Cocoon to Create a Well-Formed View of a Web Page, Then Scrape It
for Data"
<http://hacks.oreilly.com/pub/h/2125>

Now to install Cocoon...


-Thufir
 
