return match using regex

hendedav

Gang,

I have been working on this for a few hours and am frustrated
beyond all extent. I have tried to research this on the web as well
with no success. I am trying to match certain contents within a
wrapper div. So for example if the inside of the wrapper div was the
following:

<div id="wrapper">
<a href="#">a great link that contain text and symbols</a>
<div> ... </div>
<div> ... </div>
</div>

I would like to strip out all the internal divs. But because there
can be a lot of internal divs, I figured it would be less processor
intensive to just match the first 'a' tag and repopulate the wrapper
div with the match. I am trying to use something like the following
regex:


re = /^<a(.+)<\/a>/;

with the following statement:

$temp = document.getElementById('wrapper').innerHTML.match(re);

but this is returning the entire contents of the wrapper div. I have
tried variations of the regex and either continue to get the entire
contents or get null back. Any help would be greatly appreciated.
BTW, I can't match up to the first \n because the contents may be
touching (i.e. ...</a><div>...).

Thanks,
Dave
 
Geoffrey Summerhayes

I'm not a big fan of using regexps for parsing HTML.
Getting a bulletproof expression is a major pain.

For example, here's one from 'Mastering Regular
Expressions 2nd edition' by J. Friedl (publisher
O'Reilly) for matching HTML tags:

/<("[^"]*"|'[^']*'|[^'">])*>/
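As a rough illustration of why the pattern needs to be that elaborate: a '>' inside a quoted attribute value ends a simple match early, while the quoted alternatives in the pattern above step over it. The sample string and the `naive` pattern below are made up for this sketch and are not taken from the book.

```javascript
// Pattern quoted above, matching one HTML tag.
var tag = /<("[^"]*"|'[^']*'|[^'">])*>/;
// Naive pattern, used only for comparison in this sketch.
var naive = /<[^>]*>/;
// Made-up sample with a '>' inside a quoted attribute value.
var sample = '<a title="1 > 0" href="#">link</a>';

var tagMatch = sample.match(tag)[0];
var naiveMatch = sample.match(naive)[0];

console.log(tagMatch);   // <a title="1 > 0" href="#">
console.log(naiveMatch); // <a title="1 >
```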


How about...

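var container = document.getElementById('wrapper');
var list = [];
// Detach children from last to first, keeping every node that is not a div.
while (container.hasChildNodes())
{
    var node = container.lastChild;
    // Text and comment nodes have no tagName, so they are kept as well.
    if (!('tagName' in node) || node.tagName.match(/^div$/i) == null)
    {
        list.push(node);
    }
    container.removeChild(node);
}
// list holds the kept nodes in reverse order; pop() restores the original order.
while (list.length > 0)
{
    container.appendChild(list.pop());
}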

It does a little extra work to avoid
the nastiness of dealing with indexing
into a list that's being resized.
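
If a regex is still wanted, the likely reason the original pattern returned too much is that `.+` is greedy: it runs to the last `</a>` it can reach rather than the first. The `^` anchor can also produce null when innerHTML starts with whitespace, and match() returns an array (whole match first, then the captured group), not a string. Below is a minimal sketch under those assumptions. The name `firstAnchor` is hypothetical, the `[^>]*` shortcut assumes no '>' inside attribute values (the case the pattern above handles), and nested or malformed markup is not covered.

```javascript
// Hypothetical helper: returns the first <a>...</a> element found in an
// HTML string, or null when there is none.
function firstAnchor(html) {
  // [\s\S]*? is lazy, so it stops at the first closing tag, and unlike '.'
  // it also spans newlines. The slash in the closing tag is escaped so it
  // does not end the regex literal.
  var m = html.match(/<a\b[^>]*>[\s\S]*?<\/a>/i);
  return m ? m[0] : null;
}

// Made-up sample: leading newline, contents touching, a second link inside a div.
var sample = '\n<a href="#">a great link</a><div><a href="#">inner</a></div>\n';
console.log(firstAnchor(sample)); // <a href="#">a great link</a>
```

In a page this could be used as `wrapper.innerHTML = firstAnchor(wrapper.innerHTML) || ''`, where `wrapper` is the element returned by getElementById.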
 
hendedav

Geoff,

Thanks for the reply. I was looking for something less processor
intensive. The inner divs can number in the hundreds. That's why I was
looking at just isolating the first 'a' tag and replacing the entire
contents of the wrapper div with it. Any other thoughts would be appreciated.

Thanks,
Dave
 
hendedav

Gang,

I was getting nowhere trying to use regexes, so I just decided to
use a combination of substring and indexOf calls. Everything seems to
be working beautifully. Thanks anyway.

Dave
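
The code behind that was not posted, so the following is only a guess at what a substring/indexOf version might look like. The function name `firstAnchorBySearch` is hypothetical. It assumes lower-case tag names in the string (some browsers may serialize them in upper case) and that no other tag starting with `<a`, such as `<abbr>`, comes before the link.

```javascript
// Hypothetical helper: cuts the first <a ...>...</a> out of an HTML string
// using plain string searches, or returns null if no complete link is found.
function firstAnchorBySearch(html) {
  var closeTag = '</a>';
  var start = html.indexOf('<a');
  if (start === -1) return null;
  // Search for the closing tag only after the opening tag.
  var end = html.indexOf(closeTag, start);
  if (end === -1) return null;
  return html.substring(start, end + closeTag.length);
}

// Browser-only usage, skipped when no document is available.
if (typeof document !== 'undefined') {
  var wrapper = document.getElementById('wrapper');
  if (wrapper) {
    wrapper.innerHTML = firstAnchorBySearch(wrapper.innerHTML) || '';
  }
}
```

Replacing innerHTML once avoids walking the inner divs individually, which was the stated goal.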
 
