Alternative to documentElement.innerHTML?

Kyle · Jan 25, 2004

I am presently making use of documentElement.innerHTML to retrieve
page contents for manipulation, but I've noticed that the string value
returned is not identical to the actual page source. Specifically,
attribute assignments that look like:

height=100 width=100

in the real source, look like:

height="100" width="100"

in the returned value from documentElement.innerHTML.

Further complicating things, forms that begin inside a table in this
manner:

<table><form ...><tr><td...><input...></form></td>...

Are returned as:

<table><form ...></form><tr><td...><input...

If I modify the returned value from documentElement.innerHTML, then
write it back to documentElement.innerHTML, many of the forms are
non-functional.

I am interested in any available alternatives that will function in
recent Mozilla releases. Thank you,

-Kyle

Randy Webb · Jan 25, 2004

Kyle said:
I am presently making use of documentElement.innerHTML to retrieve
page contents for manipulation, but I've noticed that the string value
returned is not identical to the actual page source. Specifically,
attribute assignments that look like:

height=100 width=100

in the real source, look like:

height="100" width="100"

in the returned value from documentElement.innerHTML.

Further complicating things, forms that begin inside a table in this
manner:

<table><form ...><tr><td...><input...></form></td>...

Are returned as:

<table><form ...></form><tr><td...><input...

If I modify the returned value from documentElement.innerHTML, then
write it back to documentElement.innerHTML, many of the forms are
non-functional.

I am interested in any available alternatives that will function in
recent Mozilla releases. Thank you,

Validate your (X)HTML and you will solve a lot of those problems. Along
with dropping tables for layout.

Read the group FAQ, it discusses how to read a text file (2 methods),
which is what you are trying to do.

PeEmm · Jan 25, 2004

Kyle wrote, on 1/25/2004 6:51 AM:

I am presently making use of documentElement.innerHTML to retrieve
page contents for manipulation, but I've noticed that the string value
returned is not identical to the actual page source. Specifically,
attribute assignments that look like:

height=100 width=100

in the real source, look like:

height="100" width="100"

in the returned value from documentElement.innerHTML.

Further complicating things, forms that begin inside a table in this
manner:

<table><form ...><tr><td...><input...></form></td>...

Are returned as:

<table><form ...></form><tr><td...><input...

If I modify the returned value from documentElement.innerHTML, then
write it back to documentElement.innerHTML, many of the forms are
non-functional.

I am interested in any available alternatives that will function in
recent Mozilla releases. Thank you,

-Kyle

The DOM naturally only functions as expected if the HTML source is as
expected, i.e. is valid according to the standards. The examples you give
above are malformed HTML, so the DOM tries to do something about the mishmash.

Kyle · Jan 25, 2004

Randy Webb said:
Validate your (X)HTML and you will solve a lot of those problems. Along
with dropping tables for layout.

This code is resident in a Mozilla extension, not a page that I've
written. It isn't my HTML that I need to parse, so I have no control
over its validity.

Read the group FAQ, it discusses how to read a text file (2 methods),
which is what you are trying to do.

I don't understand what you mean here. As far as I know, the "file"
does not exist anywhere in the filesystem so this is untrue. I assume
this content is somewhere in memory because "View Source" and Sherlock
plugins make use of the real source without accessing the page a 2nd
time.

Thanks for any input.

--Kyle

Kyle · Jan 25, 2004

PeEmm said:
Kyle wrote, on 1/25/2004 6:51 AM:

The DOM naturally only functions as expected if the HTML source is as
expected, i.e. is valid according to the standards. The examples you give
above are malformed HTML, so the DOM tries to do something about the mishmash.

I should have been more clear. This is a Mozilla Chrome extension, so
I assume that I should have access to the same methods that Mozilla
uses to display the source with "View Source" and retrieve the source
for parsing with Sherlock plugins. Thanks,

--Kyle

Randy Webb · Jan 25, 2004

Kyle said:
This code is resident in a Mozilla extension, not a page that I've
written. It isn't my HTML that I need to parse, so I have no control
over its validity.
Ok.

I don't understand what you mean here. As far as I know, the "file"
does not exist anywhere in the filesystem so this is untrue. I assume
this content is somewhere in memory because "View Source" and Sherlock
plugins make use of the real source without accessing the page a 2nd
time.

My response was based on the assumption (which I now see was
incorrect) that you were trying to read the HTML code of an HTML file,
and that you wanted the original code, not the rendered code (they are
different).

If you load a page, and then do
javascript:alert(document.documentElement.innerHTML);
In the address bar, and then view the source of the page, on very very
few occasions will they be the same code.

Example:
When I open IE, it opens to about:blank. (actually, all of my browsers
are set to open to about:blank)
View>Source gives this code:
<HTML></HTML>
And that's it.
javascript:alert(document.documentElement.innerHTML);
alerts this:
<HEAD></HEAD>
<BODY></BODY>

In Mozilla, about:blank view>Source gives this code:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head><title></title></head>
<body></body>
</html>

I added line breaks for readability.

javascript:alert(document.documentElement.innerHTML);
gives this code:

<head><title></title></head><body></body>

Note the missing DTD and HTML tags.

In order to get the original, as-written code of a web page into a
variable that the page's JavaScript can use, you have to read the file
from the server. The only two ways I know of to do that are with an
XMLHttpRequest object or a Java applet, hence my suggestion to consult the FAQ.

Whether any of that helps with what you are trying to do from within a
Mozilla extension, I don't know.
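
For reference, the XMLHttpRequest route mentioned above might look roughly like the sketch below. It is only an illustration: `fetchRawSource` and its parameters are made-up names, and the optional third argument exists only so the function can be exercised without a browser. Note that this asks for the page again (possibly served from cache), which is not the same as reaching the copy "View Source" uses.

```javascript
// Sketch: fetch the unparsed source text of a URL with XMLHttpRequest.
// fetchRawSource, onDone and XhrCtor are hypothetical names for illustration.
function fetchRawSource(url, onDone, XhrCtor) {
  // Allow a constructor to be injected; otherwise use the browser's global.
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open("GET", url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return; // 4 = request finished
    if (xhr.status === 200) {
      // responseText is the body as sent by the server, not a re-serialized DOM.
      onDone(null, xhr.responseText);
    } else {
      onDone(new Error("HTTP status " + xhr.status), null);
    }
  };
  xhr.send(null);
}
```

Usage would be along the lines of `fetchRawSource(location.href, function (err, text) { ... })`; the text handed to the callback keeps the unquoted attributes and the original form/table nesting.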

Lasse Reichstein Nielsen · Jan 25, 2004

Randy Webb said:
If you load a page, and then do
javascript:alert(document.documentElement.innerHTML);
In the address bar, and then view the source of the page, on very very
few occasions will they be the same code.

Yes, browsers build the innerHTML string from the current structure
of the document, whereas the view-source shows the original source code.
That means that innerHTML is "unparsing" the DOM tree structure, and
it would be surprising if it gave exactly the same formatting as the
original source, even if the structure was the same.
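
To make the "unparsing" point concrete, here is a toy serializer over a hand-built tree. It is not how any browser actually implements innerHTML, and the node shape (`tag`, `attrs`, `children`) is invented for the example; it only shows that once the source is a tree, the original spelling of the markup is gone.

```javascript
// Toy serializer: walks a tree and writes markup in one fixed style.
// The node shape { tag, attrs, children } is an assumption of this sketch.
function serialize(node) {
  if (typeof node === "string") return node; // text node, no escaping here
  var out = "<" + node.tag;
  for (var name in node.attrs) {
    // Always quotes attribute values, whatever the source did.
    out += " " + name + '="' + node.attrs[name] + '"';
  }
  out += ">";
  for (var i = 0; i < node.children.length; i++) {
    out += serialize(node.children[i]);
  }
  return out + "</" + node.tag + ">";
}

// What a parser might build from: <td height=100 width=100>x</td>
var td = { tag: "td", attrs: { height: "100", width: "100" }, children: ["x"] };
var html = serialize(td); // '<td height="100" width="100">x</td>'
```

The same reasoning covers the form/table case: the parser has already repaired the invalid nesting by the time the tree exists, so the serializer can only write out the repaired structure.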

....

In Mozilla, about:blank view>Source gives this code:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html> ....

javascript:alert(document.documentElement.innerHTML);
gives this code:

<head><title></title></head><body></body>

Note the missing DTD and HTML tags.

Not surprising, since you ask for the *inner*HTML of the HTML element.
If Mozilla supported the "outerHTML" property, you could also show
the HTML tag. The document type node is even harder to find. It
is the first child of the document node (where the HTML element
is the second).

/L
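
As a rough sketch of the point about the doctype and the HTML tag, the pieces can be stitched back together from the standard DOM properties `document.doctype` (`name`, `publicId`, `systemId`) and `document.documentElement`. The function name `approximateSource` is made up for this example, and the result is still a re-serialized tree, not the original source.

```javascript
// Sketch: rebuild doctype + root tag around innerHTML.
// approximateSource is a hypothetical helper; it ignores attributes on
// the root element and anything outside it (comments, for example).
function approximateSource(doc) {
  var out = "";
  var dt = doc.doctype; // null when the page has no doctype declaration
  if (dt) {
    out += "<!DOCTYPE " + dt.name;
    if (dt.publicId) out += ' PUBLIC "' + dt.publicId + '"';
    if (dt.systemId) out += (dt.publicId ? ' "' : ' SYSTEM "') + dt.systemId + '"';
    out += ">\n";
  }
  var root = doc.documentElement;
  var tag = root.tagName.toLowerCase();
  return out + "<" + tag + ">" + root.innerHTML + "</" + tag + ">";
}
```

In a page this would be called as `approximateSource(document)`.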
