removing text from HTML but keeping HTML intact

Raja Kannan · Jul 10, 2004

Is there a way to remove the text portion of an HTML page while keeping
the HTML tags, using the browser, say with JavaScript RegEx or something?

I have seen a lot of examples that remove HTML tags to get the text, but
how about the reverse?

Any sample code or suggestions would be appreciated.

Robert · Jul 10, 2004

Here is a way of changing the text. I set the text to blank. This
leaves the HTML intact. You will notice that when all the text is set to
blank, the paragraph remains. Maybe you should remove the text node
instead.

The problem is complicated by the fact that you can have embedded HTML
tags in the text, and you cannot directly address the text element. You
can see how I addressed the paragraph element and then scanned for the
text.

I am new at this type of coding, so there may be a better way. I
tested this in Netscape 7.1 under Mac OS X.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>change document text</title>

<script type="text/javascript">

// Entry point: report the starting node, then blank the text beneath it.
function changeData(node)
{
    alert("node.nodeType = " + node.nodeType +
          " node.tagName = " + node.tagName);
    removeCharacters(node);
}

// Recursively set every text node under n to a single space,
// leaving the element nodes (the tags) in place.
function removeCharacters(n)
{
    // nodeType 3 is a text node; leave text inside SCRIPT elements alone
    if (n.nodeType == 3 &&
        n.parentNode.tagName != "SCRIPT")
    {
        alert("n.nodeType = " + n.nodeType +
              " n.tagName = " + n.tagName);
        var theParent = n.parentNode;
        alert("theParent.nodeType = " + theParent.nodeType +
              " theParent.tagName = " + theParent.tagName);
        alert("length of text node = " + n.length +
              " '" + escape(n.data) + "'");
        n.data = " ";
        return;
    }

    // Otherwise, n may have children whose characters we need to traverse
    for (var m = n.firstChild; m != null; m = m.nextSibling)
    {
        alert("another. n.nodeType = " + n.nodeType +
              " n.tagName = " + n.tagName);
        removeCharacters(m);
    }
}

</script>
</head>

<body onload="alert('onload complete');
changeData(document.getElementById('p1'));">

<p id="p1">This <i>is</i> paragraph #1.</p>
<p id="p2">This is paragraph #2.</p>
</body>
</html>
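
Following up on the remark above that the text node could be removed
rather than blanked, here is a minimal sketch of that variant. The helper
name stripTextNodes is hypothetical and not part of the page above. It
assumes a W3C DOM browser and relies only on firstChild, nextSibling,
nodeType, tagName and removeChild. Like the original, it leaves text
inside SCRIPT elements alone.

```javascript
// Hypothetical helper (not from the original post): remove every text
// node under root, leaving the element structure untouched.
function stripTextNodes(root) {
    var child = root.firstChild;
    while (child !== null) {
        // Remember the next sibling first, because removing child
        // detaches it from the sibling chain.
        var next = child.nextSibling;
        if (child.nodeType === 3) {
            // Text node: remove it from its parent.
            root.removeChild(child);
        } else if (child.nodeType === 1 && child.tagName !== "SCRIPT") {
            // Element node other than SCRIPT: descend into it.
            stripTextNodes(child);
        }
        child = next;
    }
}

// Possible use in the page above (assumption: called from the onload handler):
//   stripTextNodes(document.getElementById('p1'));
```

Called on the p1 paragraph, this should leave the paragraph and its
nested i element in place with no text inside either of them.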

Robert

DU · Jul 10, 2004

Raja said:
> Is there a way to remove text portion from the HTML keeping the HTML
> Tags using the browser, say javascript RegEx or something ?

Remove the text portion? That could be done with either of these W3C DOM
Level 2 CharacterData methods:

deleteData(offset, count)

or

replaceData(offset, count, arg)

where arg is a DOMString.
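
As a rough illustration of those two methods, the sketch below empties a
single text node either way. The helper names are hypothetical and do not
come from this thread; both expect a DOM text node, for example the first
child of a paragraph, and use only the CharacterData members length,
deleteData and replaceData.

```javascript
// Hypothetical helpers (names are assumptions, not from the thread).
// Both expect a DOM text node, e.g. document.getElementById('p1').firstChild.

// Empty a text node by deleting all of its characters.
function emptyWithDeleteData(textNode) {
    textNode.deleteData(0, textNode.length);
}

// Same effect: replace the whole character range with an empty string.
function emptyWithReplaceData(textNode) {
    textNode.replaceData(0, textNode.length, "");
}
```

Either call leaves the text node in the document with empty data, so the
surrounding tags are untouched.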

> I have seen lot of examples removing HTML tags to get the text but how
> the reverse of it ?
>
> any sample code or any suggestion would be appreciated.

I'm not sure I understand what you're asking, but you'll find plenty of
interactive demo examples on these two pages, which work in all modern
browsers (MSIE 6, Mozilla 1.x, NS 6.2+, Opera 7.x, K-Meleon 0.8+,
Galeon 1.2, Safari 1.x, Konqueror 3.x, etc.):

DOM level 2 CharacterData Interface tests
http://www10.brinkster.com/doctorunclear/HTMLJavascriptCSS/DOM2CharacterData.html

innerHTML versus nodeValue performance comparison
http://www10.brinkster.com/doctorunclear/HTMLJavascriptCSS/innerHTMLvsNodeValue.html

DU
