Parsing XHTML with JavaScript?


jackwootton

Hello everyone,

I understand that XML can be parsed in JavaScript using the XML
Document object. However, is it possible to parse XHTML using
JavaScript? I currently listen for DOM mutation events; when the
events occur, I access the node which was inserted or removed
(event.target). There are only ever about 5 lines of XHTML nested in
the node, but it would be silly for me to parse it manually using
methods like hasChildNodes and parentNode if there is an object that
could do it for me.

Many thanks,

Jack
 
RobG

You haven't said what it is that you are trying to do, that is, why
you are parsing the returned XML. The whole point of XML is to
provide structure, but you want to ignore the structure and parse it
yourself. So I've got to ask: why are you using XML?

Perhaps you should be using JSON: <URL: http://www.json.org/ >

You might find textContent useful; it essentially returns the
innerHTML with all the tags stripped. Or you might want to get the
actual node that was modified using the event's relatedNode property.
If you are trying to get attributes of the nodes, you really should
use DOM methods.
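
As a rough illustration of the DOM-method approach suggested here, the following sketch reads the text and an attribute from a node without any manual parsing. The helper name describeNode and the choice of the id attribute are made up for this example; the listener in the trailing comment assumes a browser that fires DOMNodeInserted.

```javascript
// Hypothetical helper (not from the thread): summarise a node using
// standard DOM properties instead of parsing its markup by hand.
function describeNode(node) {
  var isElement = node.nodeType === 1; // 1 is Node.ELEMENT_NODE
  return {
    name: node.nodeName,
    // textContent: text of the node and its descendants, tags stripped
    text: node.textContent,
    // Only element nodes have getAttribute
    id: isElement ? node.getAttribute('id') : null
  };
}

// Possible usage in a browser that supports mutation events:
// document.addEventListener('DOMNodeInserted', function (event) {
//   var info = describeNode(event.target);
//   // info.name, info.text and info.id are now available
// }, false);
```

Note that getAttribute returns null (or an empty string in some older browsers) when the attribute is absent, so callers may want to check the result before using it.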
 
jackwootton

I'm not parsing XML. I'm parsing XHTML.
 
Martin Honnen

I'm not parsing XML. I'm parsing XHTML.

XHTML is XML, so an XML parser can deal with it and build an XML DOM. If
the parser knows XHTML, it can even build an XHTML DOM for the XHTML
elements in the namespace http://www.w3.org/1999/xhtml. For instance,
with Mozilla you can do

// Parse an XHTML document held in a string with the XML parser
var xmlDoc = new DOMParser().parseFromString([
  '<html xmlns="http://www.w3.org/1999/xhtml">',
  '<head>',
  '<title>Example</title>',
  '</head>',
  '<body>',
  '<p id="p1">Kibology for all.</p>',
  '</body>',
  '</html>'
].join('\r\n'), 'application/xml');

// Look up the p elements by namespace URI and local name
var ps = xmlDoc.getElementsByTagNameNS('http://www.w3.org/1999/xhtml', 'p');
var p = ps[0];
alert(p + ': ' + p.id);

and p is an HTMLParagraphElement with an id property, because the parser
has recognized the namespace of the element and built the XHTML DOM.
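
Because an XML parser rejects markup that is not well-formed, it may be worth checking the result before using it. The following is only a sketch: it assumes the parser signals failure by returning a document that contains a parsererror element, and the names parseXhtml and ParserCtor are invented for illustration.

```javascript
var XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical helper: parse a string of XHTML and return the document,
// or null if the parser reported a well-formedness error.
// ParserCtor exists only so the parser can be swapped out; in a browser,
// omit it and the built-in DOMParser is used.
function parseXhtml(source, ParserCtor) {
  var Ctor = ParserCtor || DOMParser;
  var doc = new Ctor().parseFromString(source, 'application/xml');
  // Assumption: failures show up as a <parsererror> element in the result.
  if (doc.getElementsByTagName('parsererror').length > 0) {
    return null;
  }
  return doc;
}

// Possible usage in a browser:
// var doc = parseXhtml('<p xmlns="' + XHTML_NS + '" id="p1">Hi</p>');
// if (doc) {
//   var p = doc.getElementsByTagNameNS(XHTML_NS, 'p')[0];
// }
```

Returning null keeps the caller's check short; throwing an exception would be an equally reasonable design.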
 
jackwootton

Thank you. That was my original question (can the XML parser cope
with XHTML?). But now it seems a bit of an obvious question. Thank you
for your help; I will give it a go.

Jack
 
Martin Honnen

That was my original question (can the XML parser cope
with XHTML?). But now it seems a bit of an obvious question.

It depends on the browser and its XHTML support. IE, for instance, uses
MSXML as its XML parser, which is completely separate from the HTML
parser. Thus, if MSXML encounters an element in the XHTML namespace, it
does not do anything special to provide an XHTML DOM element; rather, it
builds an XML DOM element only. So IE with MSXML can parse XHTML, but it
does not build an XHTML DOM. As a result you cannot, for instance,
import XHTML element nodes from an MSXML XML DOM document into the HTML DOM.
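
To make the browser difference concrete, here is a minimal sketch of a parser wrapper that uses DOMParser where it exists and falls back to MSXML otherwise. The function name parseXmlText and its env parameter are hypothetical, and the 'Microsoft.XMLDOM' ProgID is an assumption about which MSXML version is installed.

```javascript
// Hypothetical helper: parse XML (including XHTML) text with whichever
// parser the environment offers. env defaults to the browser's window.
function parseXmlText(text, env) {
  env = env || window;
  if (typeof env.DOMParser !== 'undefined') {
    // DOMParser may build XHTML DOM elements for the XHTML namespace
    return new env.DOMParser().parseFromString(text, 'application/xml');
  }
  if (typeof env.ActiveXObject !== 'undefined') {
    // MSXML builds a plain XML DOM only, as described above
    var doc = new env.ActiveXObject('Microsoft.XMLDOM');
    doc.async = false; // request synchronous loading
    doc.loadXML(text);
    return doc;
  }
  throw new Error('No XML parser available');
}
```

Either branch returns a document that supports the core XML DOM methods, but only the DOMParser branch can be expected to yield elements with HTML-specific properties such as id.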
 
scripts.contact

For instance
with Mozilla you can do

var xmlDoc = new DOMParser().parseFromString


Just FYI, Opera also supports DOMParser, and your example works fine
here.
 
