parsing XHTML with JavaScript?

jackwootton · Jun 23, 2007

Hello everyone,

I understand that XML can be parsed in JavaScript using the XML
Document object. However, is it possible to parse XHTML using
JavaScript? I currently listen for DOM mutation events; when the
events occur I access the node which was inserted or removed
(event.target). There is only ever about 5 lines of XHTML nested in
the node, but it would be silly for me to parse it manually using
methods like hasChildNodes and parentNode if there is an object that
could do it for me.

Many thanks,

Jack

RobG · Jun 24, 2007

You haven't said what it is that you are trying to do - that is, why
you are parsing the returned XML. The whole point of XML is to
provide structure, but you want to ignore the structure and parse it
yourself. So I've got to ask, why are you using XML?

Perhaps you should be using JSON: <URL: http://www.json.org/ >

You might find textContent useful; it essentially returns the
innerHTML with all the tags stripped. Or you might want to get the
actual node that was modified using the event's relatedNode property.
If you are trying to get attributes of the nodes, you really should
use DOM methods.
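
As a rough illustration of the advice about using DOM methods, the sketch below walks a node's subtree using only standard DOM properties (nodeType, nodeName, attributes, childNodes, nodeValue). The helper name describeNode is made up for this example, and the handler wiring shown in the trailing comment assumes a browser that fires DOMNodeInserted.

```javascript
// Hypothetical helper (not from the thread): summarise a node using
// only standard DOM properties.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

function describeNode(node) {
  var summary = { name: node.nodeName, attributes: {}, text: '' };
  var i;

  // Copy attribute names and values (only element nodes carry attributes).
  if (node.nodeType === ELEMENT_NODE && node.attributes) {
    for (i = 0; i < node.attributes.length; i++) {
      summary.attributes[node.attributes[i].name] = node.attributes[i].value;
    }
  }

  // Concatenate the text of all descendant text nodes, which is roughly
  // what textContent gives you.
  (function collect(n) {
    if (n.nodeType === TEXT_NODE) {
      summary.text += n.nodeValue;
    }
    var children = n.childNodes || [];
    for (var j = 0; j < children.length; j++) {
      collect(children[j]);
    }
  })(node);

  return summary;
}

// Possible wiring in a browser (assumption: mutation events are supported):
// document.addEventListener('DOMNodeInserted', function (event) {
//   var info = describeNode(event.target);
// }, false);
```

The point of the sketch is that the structure is already there in the DOM, so no string parsing is needed to reach attributes or text.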

jackwootton · Jun 24, 2007

I'm not parsing XML. I'm parsing XHTML.

Martin Honnen · Jun 24, 2007

XHTML is XML, so an XML parser can deal with it and build an XML DOM. If
the parser knows XHTML, it can even build an XHTML DOM for the XHTML
elements in the namespace http://www.w3.org/1999/xhtml. For instance,
with Mozilla you can do

var xmlDoc = new DOMParser().parseFromString([
  '<html xmlns="http://www.w3.org/1999/xhtml">',
  '<head>',
  '<title>Example</title>',
  '</head>',
  '<body>',
  '<p id="p1">Kibology for all.</p>',
  '</body>',
  '</html>'
].join('\r\n'), 'application/xml');

var ps = xmlDoc.getElementsByTagNameNS('http://www.w3.org/1999/xhtml', 'p');
var p = ps[0];
alert(p + ': ' + p.id);

and p is an HTMLParagraphElement with an id property, as the parser
has recognized the namespace of the element and built the XHTML DOM.
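
To check from script whether a parsed node really belongs to the XHTML namespace, something like the following could be used. It is only a sketch: the helper name isXhtmlElement is invented here, and it relies on nothing beyond the standard nodeType, namespaceURI and localName properties.

```javascript
var XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical helper: true if `node` is an element in the XHTML namespace.
// If `localName` is given, the element's local name must match it as well.
function isXhtmlElement(node, localName) {
  if (!node || node.nodeType !== 1) {
    return false; // not an element node
  }
  if (node.namespaceURI !== XHTML_NS) {
    return false; // element in some other namespace (or none)
  }
  return localName === undefined || node.localName === localName;
}
```

With the example above, isXhtmlElement(p, 'p') would be expected to hold in a parser that is namespace-aware.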

jackwootton · Jun 24, 2007

Thank you. That was my original question (can the XML parser cope
with XHTML). But now it seems a bit of an obvious question. Thank you
for your help, I will give it a go.

Jack

Martin Honnen · Jun 24, 2007

It depends on the browser and its XHTML support. IE, for instance, uses
MSXML as its XML parser, which is completely separate from the HTML
parser. Thus if MSXML encounters an element in the XHTML namespace, it
does not do anything special to provide an XHTML DOM element; rather, it
builds an XML DOM element only. So IE with MSXML can parse XHTML, but it
does not build an XHTML DOM. As a result you cannot, for instance,
import XHTML element nodes from an MSXML XML DOM document into the HTML DOM.
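
A script that has to run in both kinds of browser could pick the parser by feature detection, roughly as sketched below. The function name parseXmlString and its host parameter are inventions for this example (host stands in for the browser's window object so the branches can be exercised outside a browser). The MSXML branch uses the 'Microsoft.XMLDOM' ActiveX object and its loadXML method, and, as explained above, yields a plain XML DOM rather than an XHTML DOM.

```javascript
// Hypothetical helper: parse an XML/XHTML string into a DOM document.
// `host` defaults to window when running in a browser.
function parseXmlString(text, host) {
  host = host || (typeof window !== 'undefined' ? window : {});

  // Mozilla, Opera and others: DOMParser.
  if (typeof host.DOMParser !== 'undefined') {
    return new host.DOMParser().parseFromString(text, 'application/xml');
  }

  // IE: MSXML via ActiveX. The result is an XML DOM only.
  if (typeof host.ActiveXObject !== 'undefined') {
    var doc = new host.ActiveXObject('Microsoft.XMLDOM');
    doc.async = false; // load synchronously
    doc.loadXML(text);
    return doc;
  }

  throw new Error('No XML parser available');
}
```

In a page this would simply be called as parseXmlString(xhtmlText); the caller still has to allow for the two branches returning different kinds of DOM.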

scripts.contact · Jun 24, 2007

For instance
with Mozilla you can do

var xmlDoc = new DOMParser().parseFromString

Just FYI, Opera also supports DOMParser, and your example works fine
here.
