removing text from HTML but keeping HTML intact

Raja Kannan

Is there a way to remove the text portion from HTML while keeping the HTML
tags, using the browser, say JavaScript, a RegEx or something?

I have seen a lot of examples of removing HTML tags to get the text, but how
do I do the reverse?

Any sample code or suggestions would be appreciated.
 
Robert

Raja said:
Is there a way to remove the text portion from HTML while keeping the HTML
tags, using the browser, say JavaScript, a RegEx or something?

I have seen a lot of examples of removing HTML tags to get the text, but how
do I do the reverse?

Any sample code or suggestions would be appreciated.

Here is a way of changing the text. I set each text node to a single
space. This leaves the HTML intact. You will notice that when all the
text is blanked, the paragraph element remains. Maybe you should remove
the text node instead.

The problem is complicated by the fact that you can have embedded HTML
tags in the text, and you cannot address the text node directly. You can
see how I addressed the paragraph element and then scanned for the text.

I am new at this type of coding, so there may be a better way. I
tested this in Netscape 7.1 under Mac OS X.


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>change document text</title>

<script type="text/javascript">

function changeData(node)
{
    alert("node.nodeType = " + node.nodeType +
          " node.tagName = " + node.tagName);
    removeCharacters(node);
}

// Blank out every text node at or below n, leaving the elements in place.
function removeCharacters(n)
{
    // nodeType 3 is a text node; skip text that is really script source.
    if (n.nodeType == 3 &&
        n.parentNode.tagName != "SCRIPT")
    {
        alert("n.nodeType = " + n.nodeType +
              " n.tagName = " + n.tagName);
        var theParent = n.parentNode;
        alert("theParent.nodeType = " + theParent.nodeType +
              " theParent.tagName = " + theParent.tagName);
        alert("length of text node = " + n.length +
              " '" + escape(n.data) + "'");
        n.data = " ";   // overwrite the text with a single space
        return;
    }

    // Otherwise, n may have children whose characters we need to traverse.
    for (var m = n.firstChild; m != null; m = m.nextSibling)
    {
        alert("another. n.nodeType = " + n.nodeType +
              " n.tagName = " + n.tagName);
        removeCharacters(m);
    }
    return;
}

</script>
</head>

<body onload="alert('onload complete');
changeData(document.getElementById('p1'));">


<p id="p1">This <i>is</i> paragraph #1.</p>
<p id="p2">This is paragraph #2.</p>
</body>
</html>
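
Following up on the remark above that it may be better to remove the text node than to blank it, here is a minimal sketch of that variant. It is not the code tested above: `removeTextNodes` is a hypothetical helper name, skipping STYLE as well as SCRIPT is an added assumption, and the `tagName` comparison assumes an HTML document, where tag names are reported in upper case.

```javascript
// Sketch: remove every text node under `root`, keeping the elements.
// `removeTextNodes` is a hypothetical name, not part of the original post.
function removeTextNodes(root) {
  var removed = 0;
  var child = root.firstChild;
  while (child !== null) {
    // Save the next sibling first: a removed node no longer has one.
    var next = child.nextSibling;
    if (child.nodeType === 3) {              // 3 = text node
      root.removeChild(child);
      removed++;
    } else if (child.nodeType === 1 &&       // 1 = element node
               child.tagName !== "SCRIPT" &&
               child.tagName !== "STYLE") {  // assumption: leave CSS alone too
      removed += removeTextNodes(child);     // descend into nested tags
    }
    child = next;
  }
  return removed;                            // number of text nodes removed
}
```

For example, `removeTextNodes(document.getElementById('p1'))` should leave the first paragraph as `<p id="p1"><i></i></p>`, since both the plain text and the text inside the `<i>` element are text nodes.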


Robert
 
DU

Raja said:
Is there a way to remove the text portion from HTML while keeping the HTML
tags, using the browser, say JavaScript, a RegEx or something?


Remove the text portion? That could be done with either of these W3C DOM
Level 2 CharacterData methods:

deleteData(offset, count)

or

replaceData(offset, count, arg), where arg is a DOMString
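
As a rough illustration of how those two methods might be applied to a whole text node, here is a minimal sketch. `clearText` and `replaceText` are hypothetical helper names; the only DOM members used are `length`, `deleteData` and `replaceData` from the CharacterData interface.

```javascript
// Sketch only: `clearText` and `replaceText` are hypothetical helpers.

// Delete every character of a text node. The node itself stays in the
// tree, it just ends up empty.
function clearText(textNode) {
  textNode.deleteData(0, textNode.length);
}

// Swap the whole content of a text node for `replacement` in one call.
function replaceText(textNode, replacement) {
  textNode.replaceData(0, textNode.length, replacement);
}
```

In a page, these would be called on a text node rather than on an element, for example `clearText(document.getElementById('someId').firstChild)`, assuming the first child of that (hypothetical) element is a text node.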

Raja said:
I have seen a lot of examples of removing HTML tags to get the text, but how
do I do the reverse?

Any sample code or suggestions would be appreciated.

I'm not sure I understand what you're asking. But you'll find plenty of
interactive demo examples in these 2 pages which will work in any/all
modern browsers (MSIE 6, Mozilla 1.x, NS 6.2+, Opera 7.x, K-meleon 0.8+,
Galeon 1.2, Safari 1.x, Konqueror 3.x, etc.):

DOM level 2 CharacterData Interface tests
http://www10.brinkster.com/doctorunclear/HTMLJavascriptCSS/DOM2CharacterData.html

innerHTML versus nodeValue performance comparison
http://www10.brinkster.com/doctorunclear/HTMLJavascriptCSS/innerHTMLvsNodeValue.html

DU
 
