Need something like doc.getElementById().innerHTML but want everything between body tags

  • Thread starter gimme_this_gimme_that

gimme_this_gimme_that

Is there a document command that returns the text between body tags -
something similar to getElementById("myid").innerHTML ?

If you're wondering why I need it: I'm scripting an
InternetExplorer.Application object and parsing the text on a page.

It's not an InternetExplorer.Application question though.

It's a "how do you get all the text between body tags" question.

Thanks.
 

RobG

Is there a document command that returns the text between body tags -
something similar to getElementById("myid").innerHTML ?

If you're wondering why I need it: I'm scripting an
InternetExplorer.Application object and parsing the text on a page.

It's not an InternetExplorer.Application question though.

It's a "how do you get all the text between body tags" question.

Use something like:

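function getText(el) {
    // Prefer the DOM-standard textContent; fall back to IE's innerText.
    if (typeof el.textContent == 'string') return el.textContent;
    if (typeof el.innerText == 'string') return el.innerText;
    return ''; // neither property is supported
}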

Then call it with:

alert(getText(document.body));

Note that it will also return the content of script elements in some
browsers. Also, some browsers support both innerText and textContent
and may return a different value for each.
 

Martin Honnen

Is there a document command that returns the text between body tags -
something similar to getElementById("myid").innerHTML ?

If you're wondering why I need it: I'm scripting an
InternetExplorer.Application object and parsing the text on a page.

You want
document.body.innerHTML
or
document.body.innerText
depending on whether you want the text only (innerText) or the markup
(as serialized by innerHTML).
 

dhtml

kangax said:
RobG wrote:
[...]
Note that it will also return the content of script elements in some
browsers.  Also, some browsers support both innerText and textContent
and may return a different value for each.
I just noticed that Safari 2 - 2.0.4 (and maybe earlier) actually has
`innerText` but that `innerText` is always an empty string : /

My mistake.

Safari *does* return "proper" `innerText` but only if an element is
neither hidden nor orphaned (there was an element with "display:none" in
my test). Considering these 2 "issues", the method, unfortunately,
becomes even more complex (still without proper script element handling).

Safari had issues with hidden elements in version 2 (Safari 2).

I'm not sure if such "forking" is worth the trouble. Recursively
collecting node values would produce more consistent results, albeit
being slower.
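
A minimal sketch of the recursive approach mentioned above, not code from this thread. The function name `collectText` is made up for illustration, and it assumes nodes expose the standard `nodeType`, `nodeName`, `nodeValue` and `childNodes` properties. It also skips script and style elements, which addresses the earlier note about script content being returned.

```javascript
// Hypothetical helper: gathers text by walking child nodes instead of
// relying on innerText or textContent.
function collectText(node) {
    var TEXT_NODE = 3, ELEMENT_NODE = 1;
    var name, parts, i;

    if (node.nodeType === TEXT_NODE) {
        return node.nodeValue;
    }
    if (node.nodeType !== ELEMENT_NODE) {
        return ''; // comments and other non-element nodes contribute nothing
    }

    // Skip elements whose text is code or CSS rather than page content.
    name = String(node.nodeName).toLowerCase();
    if (name === 'script' || name === 'style') {
        return '';
    }

    parts = [];
    for (i = 0; i < node.childNodes.length; i++) {
        parts.push(collectText(node.childNodes[i]));
    }
    return parts.join('');
}
```

In a browser this would be called as collectText(document.body). Unlike innerText, it does not take CSS visibility into account, so hidden elements contribute their text too.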

There are bugs with Safari 2 and hidden elements (including
visibility: hidden and display: none). If this presents a problem,
there are other ways around it.

For example, the element that needed to be hidden could be positioned
outside the viewport, instead of using visibility: hidden. Another
alternative would be to set the element's visibility to "visible" just
prior to reading the innerText.

A saved reference to the property is faster.

var dom = {};

// Detect the supported property name once, then reuse it.
dom.textContent = "textContent" in document.documentElement ?
    "textContent" : "innerText";

alert( el[dom.textContent] );

Note: Posting failed on the news server I usually use:
nntp.motzarella.org

Garrett
 

Garrett Smith

kangax said:
dhtml wrote:
[...]
There are bugs with Safari 2 and hidden elements (including
visibility: hidden and display: none). If this presents a problem,
there are other ways around it.

For example, the element that needed to be hidden could be positioned
outside the viewport, instead of using visibility: hidden. Another

That would confuse screen readers, which skip `display:none` content but
announce the one that's merely positioned outside of the viewport.

I see. So the alternatives are:
visibility: hidden
- problem: innerText is "" in Safari 2
position away from view
- problem: a screen reader will read it
Yes, and not only on the element but on all of its ancestors as well (for
obvious reasons). Now that you mentioned `visibility:hidden` (which, as
I just tested, does indeed prevent proper `innerText`) we would also
need to take care of that while traversing ancestors. Each of the
ancestor's `visibility` style values should be saved and then restored -
just like with `display` style values.
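
A rough sketch of the save-and-restore idea described above, under stated assumptions. The name `readTextWhileShown` is hypothetical, and the code only overrides inline styles; anything hidden by a stylesheet rule would need a computed-style check that is not shown here.

```javascript
// Hypothetical helper: temporarily un-hides el and its ancestors (inline
// styles only), reads innerText, then restores every saved value.
function readTextWhileShown(el) {
    var saved = [];
    var node, i, entry;

    // Walk up the ancestor chain; the document node has no style, so stop there.
    for (node = el; node && node.style; node = node.parentNode) {
        saved.push({
            style: node.style,
            display: node.style.display,
            visibility: node.style.visibility
        });
        if (node.style.display === 'none') {
            node.style.display = ''; // drop the inline display: none
        }
        node.style.visibility = 'visible';
    }

    try {
        return el.innerText;
    } finally {
        // Put every inline value back, even if reading innerText threw.
        for (i = 0; i < saved.length; i++) {
            entry = saved[i];
            entry.style.display = entry.display;
            entry.style.visibility = entry.visibility;
        }
    }
}
```

The amount of bookkeeping here is what the complaint about complexity refers to. Whether it actually works around the Safari 2 behaviour is untested.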

I really don't like how complex this solution becomes.

I don't either.

I would address the problem if and when it arises. It could be
special-cased to first show the div, then grab its innerText.

I understand that if the constraint involved potentially unknown
ancestors, this would pose a problem.

A saved reference to the property is faster.

var dom = {};

// Detect the supported property name once, then reuse it.
dom.textContent = "textContent" in document.documentElement ?
    "textContent" : "innerText";

alert( el[dom.textContent] );

Interesting. I would expect the opposite - after all `dom.textContent`
needs to be resolved to a string before property lookup occurs
(in `el[dom.textContent]`)

Calling a function would be slower than finding dom.textContent.

A string literal "textContent" would be faster, but would fail in IE.

Within a function scope, a local reference could be saved:

(function(){

    var dom = Lib.dom,
        textContent = dom.textContent;
    //...
})();

If the script is minified, and the local var |textContent| is used many
times, the renaming of textContent to a one-letter identifier would also
have the effect of making the script a bit smaller.
 
