Creating a new document object?

alex.sherwin · Dec 23, 2005

I'm working on a Google homepage module. One of the things I do is
retrieve results from a search into a string. I want to be able to
access all the anchors in this string (their name, href, and
innerHTML), and using regex for string parsing would be tiresome and
difficult for this.

I've searched far and wide, but I can't seem to find out if it's
possible to create a new document object (one that doesn't refer to the
current document) from a string of text (or from the address of the
search result, it doesn't matter to me).

That way, with this new document object, I can simply use the .anchors
property for all my needs.

Is this possible?

Martin Honnen · Dec 23, 2005

> One of the things I do is
> retrieve results from a search into a string. I want to be able to
> access all the anchors in this string, (their name, href, and
> innerHTML) and using regex for string parsing would be tiresome and
> difficult for this.
>
> I've searched far and wide, but I can't seem to find out if it's
> possible to create a new document object (that doesn't refer to the
> current document) with a string of text

Browsers allow you to load a URL in a window or frame, and depending on
the same origin policy you can script the contents once they have been
loaded. Pure HTML parsing of complete HTML document markup in a string
is usually not exposed directly as a method. Of course, there is the
good old

  frameOrWin.document.open();
  frameOrWin.document.write(stringWithHTMLMarkup);
  frameOrWin.document.close();
  // now access e.g.
  frameOrWin.document.links
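
A minimal sketch of the document.write approach above, assuming a hidden iframe is acceptable as the target. The helper name writeMarkupAndCollectLinks and the example.com markup are made up for illustration; only open, write, close and links come from the snippet above.

```javascript
// Hypothetical helper (not from the original post): writes a complete HTML
// string into a frame's document and returns a plain-array snapshot of its links.
function writeMarkupAndCollectLinks(frameDoc, stringWithHTMLMarkup) {
  frameDoc.open();
  frameDoc.write(stringWithHTMLMarkup);
  frameDoc.close();

  var result = [];
  for (var i = 0; i < frameDoc.links.length; i++) {
    var link = frameDoc.links[i];
    result.push({ href: link.href, html: link.innerHTML });
  }
  return result;
}

// Browser-only usage; skipped where no document exists (e.g. on a server).
if (typeof document !== 'undefined' && document.body) {
  var frame = document.createElement('iframe');
  frame.style.display = 'none';
  document.body.appendChild(frame);
  var found = writeMarkupAndCollectLinks(
    frame.contentWindow.document,
    '<html><body><a href="http://example.com/">Example</a></body></html>'
  );
  // found now holds one entry per link in the written markup.
}
```

Keep in mind that any scripts contained in the written markup will run inside the frame, so this is only reasonable for markup you trust.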

If you have only snippets of HTML markup that would fit into a div, then
nowadays you can make use of innerHTML, e.g.

  var div = document.createElement('div');
  div.innerHTML = stringWithHTMLSnippet;
  // now access e.g.
  div.getElementsByTagName('a')

But of course relative URLs in href attributes will be resolved against
the base URL of the ownerDocument of the div you created.
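
One possible way around the base URL issue, sketched under assumptions: read the raw href with getAttribute and resolve it yourself against the address the search result came from. The helper names resolveHref and extractAnchors and the example.com URLs are invented for this sketch, and the URL constructor it relies on is a later addition to browsers than anything else in this thread.

```javascript
// Hypothetical helper: resolve a raw href against an explicit base URL.
// Falls back to the raw value if the URL constructor rejects the input.
function resolveHref(rawHref, baseUrl) {
  if (rawHref === null) {
    return null;
  }
  try {
    return new URL(rawHref, baseUrl).href;
  } catch (e) {
    return rawHref;
  }
}

// Hypothetical helper: collect name, href and innerHTML of every <a> element
// below the container, resolving href against baseUrl rather than against
// the container's ownerDocument.
function extractAnchors(container, baseUrl) {
  var anchors = container.getElementsByTagName('a');
  var result = [];
  for (var i = 0; i < anchors.length; i++) {
    var a = anchors[i];
    result.push({
      name: a.getAttribute('name'),
      href: resolveHref(a.getAttribute('href'), baseUrl),
      html: a.innerHTML
    });
  }
  return result;
}

// Browser-only usage; skipped where no document exists.
if (typeof document !== 'undefined') {
  var div = document.createElement('div');
  div.innerHTML = '<a name="first" href="page.html">First</a>';
  var list = extractAnchors(div, 'http://example.com/search/');
  // list[0].href is resolved against the search URL, not the current page.
}
```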

By now there is also DOMParser and its parseFromString method in
Mozilla, in Opera 8 and later, and I think in Safari 2.01.
That method takes the content type as the second argument, but
Mozilla only supports XML content types like text/xml,
application/xml, or application/xhtml+xml and throws an exception if
text/html is passed in.
Opera seems to happily accept text/html as the content type argument,
but I don't get anything parsed according to HTML rules; it looks like
the XML parser is used and the content type argument is ignored.
I'm not sure what Safari 2.01 does, perhaps someone else can report.
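
Given those differences, a defensive wrapper might look like the sketch below. The function name tryParseXml and its third parameter are hypothetical (the parameter exists only so a different parser constructor can be supplied); the parsererror check reflects how Mozilla reports XML well-formedness errors, and other browsers may signal them differently.

```javascript
// Hypothetical wrapper: returns a parsed document, or null if no parser is
// available, the content type is rejected, or the markup is not well-formed.
function tryParseXml(markup, contentType, ParserCtor) {
  // Default to the browser's DOMParser when the caller passes nothing.
  if (typeof ParserCtor === 'undefined' && typeof DOMParser !== 'undefined') {
    ParserCtor = DOMParser;
  }
  if (typeof ParserCtor !== 'function') {
    return null;
  }

  var doc;
  try {
    doc = new ParserCtor().parseFromString(markup, contentType);
  } catch (e) {
    // e.g. an unsupported content type such as text/html
    return null;
  }

  // Mozilla returns a document whose root element is <parsererror>
  // when the markup is not well-formed XML.
  if (!doc || !doc.documentElement ||
      doc.documentElement.nodeName === 'parsererror') {
    return null;
  }
  return doc;
}
```

A caller could try 'application/xhtml+xml' first and fall back to the innerHTML or document.write approach when null comes back.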

Thus if you have the markup of an XHTML document in a string you can do e.g.

var xmlDocument = new DOMParser().parseFromString([
  '<html xmlns="http://www.w3.org/1999/xhtml">',
  ' <head>',
  ' <title>Example</title>',
  ' </head>',
  ' <body>',
  ' <p>Kibology for all.</p>',
  ' <p>All for Kibology.</p>',
  ' </body>',
  '</html>'
].join('\r\n'), 'application/xhtml+xml');
var paragraphs = xmlDocument.getElementsByTagNameNS(
  xmlDocument.documentElement.namespaceURI,
  'p'
);
alert('Found ' + paragraphs.length + ' paragraph elements.');

but only the Core DOM is available to access elements.
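
Applied to the original question, a Core-DOM-only way to collect the anchors might look like this sketch. The function name collectAnchorsCoreDom is hypothetical, and it assumes textContent (DOM Level 3 Core) is available on the parsed elements; it deliberately avoids HTML DOM conveniences such as .anchors, .href and innerHTML.

```javascript
var XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical helper: gather anchor data using only Core DOM methods
// (getElementsByTagNameNS, getAttribute, textContent).
function collectAnchorsCoreDom(xmlDocument) {
  var anchors = xmlDocument.getElementsByTagNameNS(XHTML_NS, 'a');
  var result = [];
  for (var i = 0; i < anchors.length; i++) {
    var a = anchors[i];
    result.push({
      name: a.getAttribute('name'),
      href: a.getAttribute('href'), // raw attribute value, not resolved
      text: a.textContent
    });
  }
  return result;
}

// Browser-only usage; skipped where DOMParser does not exist.
if (typeof DOMParser !== 'undefined') {
  var parsed = new DOMParser().parseFromString(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>' +
    '<a name="k" href="kibo.html">Kibology</a></body></html>',
    'application/xhtml+xml'
  );
  var anchorList = collectAnchorsCoreDom(parsed);
}
```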
