Creating a new document object?

alex.sherwin

I'm working on a Google homepage module. One of the things I do is
retrieve the results of a search into a string. I want to be able to
access all the anchors in this string (their name, href, and
innerHTML), and using regular expressions for the string parsing would
be tiresome and difficult.

I've searched far and wide, but I can't seem to find out whether it's
possible to create a new document object (one that doesn't refer to the
current document) from a string of text (or from the address of the
search result; either would do).

That way, with this new document object, I could simply use the .anchors
collection for all my needs.

Is this possible?
 
Martin Honnen

alex.sherwin wrote:

> One of the things I do is
> retrieve results from a search into a string. I want to be able to
> access all the anchors in this string, (their name, href, and
> innerHTML) and using regex for string parsing would be tiresome and
> difficult for this.
>
> I've searched far and wide, but I can't seem to find out if it's
> possible to create a new document object (that doesn't refer to the
> current document) with a string of text


Browsers allow you to load a URL in a window or frame, and, depending on
the same-origin policy, you can script the contents once they have been loaded.
Pure HTML parsing of complete HTML document markup in a string is
usually not exposed directly as a method. Of course, there is the good old
document.write approach:
frameOrWin.document.open();
frameOrWin.document.write(stringWithHTMLMarkup);
frameOrWin.document.close();
// now access e.g
frameOrWin.document.links
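
Wrapped up as a function, the document.write approach might look roughly like the sketch below. The name collectLinks is made up for this example, and frameDoc is assumed to be the document of a same-origin frame or iframe that already exists (for instance iframe.contentWindow.document).

```javascript
// Hypothetical helper: writes markup into an existing frame document
// and returns the href of every link found there.
function collectLinks(frameDoc, stringWithHTMLMarkup) {
  frameDoc.open();
  frameDoc.write(stringWithHTMLMarkup);
  frameDoc.close();

  var hrefs = [];
  var links = frameDoc.links; // a and area elements that have an href attribute
  for (var i = 0; i < links.length; i++) {
    hrefs.push(links[i].href);
  }
  return hrefs;
}

// Possible use in a browser (assumes document.body exists already):
// var iframe = document.createElement('iframe');
// iframe.style.display = 'none';
// document.body.appendChild(iframe);
// var hrefs = collectLinks(iframe.contentWindow.document, stringWithHTMLMarkup);
```

The frame document is passed in rather than looked up inside the function, so the same helper works for a frame, an iframe, or a window opened by script.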

If you have only snippets of HTML markup that would fit into a div, then
nowadays you can make use of innerHTML, e.g.
var div = document.createElement('div');
div.innerHTML = stringWithHTMLSnippet;
// now access e.g.
div.getElementsByTagName('a')
But of course relative URLs in href attributes will be resolved against the
base URL of the ownerDocument of the div you created.
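
For the original question (name, href and innerHTML of each anchor), the innerHTML approach could be sketched as follows. The names extractAnchors and doc are made up for this example; doc is assumed to be a browser document. The sketch reads href with getAttribute, which in current browsers returns the attribute value as written in the markup rather than the resolved URL; that behaviour is an assumption worth checking in the browsers you target.

```javascript
// Hypothetical helper: parses an HTML snippet in a detached div and
// returns plain objects describing each anchor found in it.
function extractAnchors(stringWithHTMLSnippet, doc) {
  var div = doc.createElement('div'); // never attached to the page
  div.innerHTML = stringWithHTMLSnippet;

  var anchors = div.getElementsByTagName('a');
  var result = [];
  for (var i = 0; i < anchors.length; i++) {
    var a = anchors[i];
    result.push({
      name: a.getAttribute('name'), // null if the attribute is absent
      href: a.getAttribute('href'), // attribute value, not the href property
      innerHTML: a.innerHTML
    });
  }
  return result;
}

// Possible use in a browser:
// var found = extractAnchors('<a name="top" href="page.html">Next</a>', document);
```

Reading the href property (a.href) instead of the attribute would give the URL resolved against the owner document, which is the behaviour described above.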

By now there is also DOMParser and its parseFromString method in
Mozilla, in Opera 8 and later, and I think in Safari 2.01.
That method takes the content type as its second argument, but
Mozilla only supports XML content types such as text/xml,
application/xml or application/xhtml+xml, and throws an exception if
text/html is passed in.
Opera seems to happily accept text/html as the content type argument, but
I don't get anything parsed according to HTML rules; it looks as if the
XML parser is used and the content type argument is ignored.
I'm not sure what Safari 2.01 does; perhaps someone else can report.

Thus if you have the markup of an XHTML document in a string you can do e.g.

var xmlDocument = new DOMParser().parseFromString([
  '<html xmlns="http://www.w3.org/1999/xhtml">',
  ' <head>',
  ' <title>Example</title>',
  ' </head>',
  ' <body>',
  ' <p>Kibology for all.</p>',
  ' <p>All for Kibology.</p>',
  ' </body>',
  '</html>'
].join('\r\n'), 'application/xhtml+xml');
var paragraphs = xmlDocument.getElementsByTagNameNS(
  xmlDocument.documentElement.namespaceURI, 'p');
alert('Found ' + paragraphs.length + ' paragraph elements.');

but only the Core DOM is available to access elements.
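
With only the Core DOM to work with, the anchors asked about in the original post could be collected with getElementsByTagNameNS and getAttribute instead of the .anchors collection. The following is a rough sketch that assumes the markup is well-formed XHTML and that the browser provides DOMParser. The names anchorsFromXhtml and XHTML_NS are made up for this example, and the optional parser argument exists only so that a different parser object can be supplied.

```javascript
var XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical helper: parses XHTML markup as XML and describes each
// a element using Core DOM calls only.
function anchorsFromXhtml(xhtmlMarkup, parser) {
  parser = parser || new DOMParser();
  var xmlDocument = parser.parseFromString(xhtmlMarkup, 'application/xhtml+xml');

  var anchors = xmlDocument.getElementsByTagNameNS(XHTML_NS, 'a');
  var result = [];
  for (var i = 0; i < anchors.length; i++) {
    result.push({
      name: anchors[i].getAttribute('name'),
      href: anchors[i].getAttribute('href'),
      text: anchors[i].textContent // DOM Level 3 Core: text only, no markup
    });
  }
  return result;
}
```

Search result pages are often not well-formed XML, in which case this route will probably not work and one of the HTML-based approaches is the better fit.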
 
