finding text on webpage

doug s

How would I access the source of a webpage? I would like to get the source,
then use a regexp to find text on the page, for example a counter or some
other text that changes dynamically. I'm implementing this for Firefox.

Right now I have a web page opening that contains that info, but I would
rather have a pop-up alert() that shows it.

Thanks for any help, and any code snippets would be great.
 
RobG

There are at least three ways of getting the text content of the document.

The standards-compliant method is to use document.body.textContent, but
since it's part of DOM Level 3 it may not be widely supported beyond
Mozilla/Firefox.

The IE-centric way is to use innerText, as IE does not support textContent.

A third, reasonably cross-browser method is to use innerHTML and a
regular expression that strips out the tags. But innerHTML is not a
standard, and different implementations may have small variations in how
they've copied it from IE.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title> blah </title>
<meta http-equiv="Content-Type"
content="text/html; charset=ISO-8859-1">
<script type="text/javascript">

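// Return the text content of el, labelled with the property that supplied it.
function showText( el )
{
  var txt = '';
  // Test the property type so an empty string still counts as supported
  if ( typeof el.textContent == 'string' ) {
    txt = 'textContent\n' + el.textContent;
  } else if ( typeof el.innerText == 'string' ) {
    txt = 'innerText\n' + el.innerText;
  } else if ( typeof el.innerHTML == 'string' ) {
    // Fallback: remove anything that looks like a tag
    txt = 'innerHTML\n' + el.innerHTML.replace(/<[^>]*>/g,'');
  }
  return txt;
}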

</script>
</head>
<body >
<div onclick="alert(showText( document.body ));">
Here is <div>the <span> content. <b>Hi</b></span>
<br>Click me to see the content.
</div>
</div>
</body>
</html>
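
Once you have the text, the regexp step from the original question could look something like the sketch below. The helper name findText, the sample string and the "Visitors:" pattern are all made up for illustration; adjust the pattern to whatever the real page prints.

```javascript
// Hypothetical helper: return the first capture group of `pattern` in `text`,
// or null when the pattern does not match. The pattern should contain one
// capture group and should not use the g flag.
function findText(text, pattern) {
  var match = pattern.exec(text);
  return match ? match[1] : null;
}

// Assumed sample text; a real page's wording will differ.
var sample = 'Welcome back. Visitors: 1,234 since January.';
var counter = findText(sample, /Visitors:\s*([\d,]+)/);
```

In the page above you would probably call it as alert( findText( showText( document.body ), /Visitors:\s*([\d,]+)/ ) ); and check for null before relying on the result.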
 
doug s

Thanks, that helps. Is there a way to fetch a page without actually opening
it, and then use showText(el) on it? Somehow pass in the URL?

Thanks again.
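
One possible approach, offered only as a rough sketch and not something confirmed in this thread, is to request the page in the background with XMLHttpRequest and strip the tags from the response text. The names fetchSource and stripTags are hypothetical, and the third parameter exists only so the sketch can be tried with a stand-in object outside a browser. Keep in mind that ordinary page script is normally limited by the same-origin policy to URLs on the same site.

```javascript
// Hypothetical helper: fetch a URL's source in the background and pass it
// to onDone(error, source). XhrCtor is optional and defaults to the
// browser's XMLHttpRequest.
function fetchSource(url, onDone, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var req = new Ctor();
  req.open('GET', url, true); // true = asynchronous
  req.onreadystatechange = function () {
    if (req.readyState !== 4) { return; } // 4 = request complete
    if (req.status === 200) {
      onDone(null, req.responseText);
    } else {
      onDone(new Error('HTTP status ' + req.status), null);
    }
  };
  req.send(null);
}

// Simple tag-stripping pass, similar in spirit to the innerHTML branch
// of showText().
function stripTags(html) {
  return html.replace(/<[^>]*>/g, '');
}
```

Usage might then be fetchSource( 'counter.html', function ( err, src ) { if ( !err ) alert( stripTags( src ) ); } ); where counter.html is a placeholder URL. Since showText(el) expects an element, the fetched source is handled as a plain string here instead.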