regular expression help...

Ian Richardson

I'm looking to use Javascript to pull apart a page of HTML I have
already fetched. The page contains a table, within which there are rows
containing...

Either:

0000000 - 0000000<some html>00<some html>0

or:

0000000 - 0000000<some html>00 - 00<some html>0

I'm interested in extracting only the numbers (which may not always be
0!), in each case. I bet this can be done using a regular expression (or
two). Can anyone help?

Thanks,

Ian
 
Alberto

One possible way, though not necessarily the best, is:

<script>
var foo="0000000 - 0000000<some html>00<some html>0"
alert(foo.match(/\d+/g))
</script>

You can add attributes to the script tag if you prefer; that is not important here.

I hope that was close to what you may need.
Note that, to be safe, you should first check whether match actually returned
an array or, if there was no match, null.
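
The null check mentioned above might look something like this. It is only a sketch: the helper name extractNumbers is made up for illustration and is not part of the original suggestion.

```javascript
// Hypothetical helper: returns every run of digits in a string,
// or an empty array when the string contains none.
function extractNumbers(html) {
  // With the g flag, match() returns an array of all matches,
  // or null if nothing matched.
  var found = html.match(/\d+/g);
  return found === null ? [] : found;
}

var foo = "0000000 - 0000000<some html>00<some html>0";
var numbers = extractNumbers(foo);                     // digit runs from the sample row
var none = extractNumbers("<td>no digits here</td>");  // empty array rather than null
```

Returning an empty array means the caller can loop over the result without testing for null first.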

Maybe better solutions will follow. Mine is just one.

ciao
Alberto
http://www.unitedscripters.com/
 
Mick White

Ian said:
Either:

0000000 - 0000000<some html>00<some html>0

or:

0000000 - 0000000<some html>00 - 00<some html>0

I'm interested in extracting only the numbers (which may not always be
0!), in each case. I bet this can be done using a regular expression (or
two). Can anyone help?

Thanks,

Ian
You don't necessarily need to use regex, you can use the DOM methods,
for example (without error checking):

var tds=document.getElementsByTagName("TD")

//Then loop through the collection
var numbers=[];
for(var i=0;i<tds.length;i++){
// firstChild is null for an empty cell, so check it before reading .data
var node=tds.item(i).firstChild;
if(node && !isNaN(node.data)){
numbers.push(node.data)
}
}

I'm not sure that the method above is the best at identifying Numbers,
and it will fail if you use tags within the <td> tag, but it is
something to get you thinking.


Mick
 
Ian Richardson

Mick said:
You don't necessarily need to use regex, you can use the DOM methods,
for example (without error checking):

I can't use DOM methods if all I'm dealing with is one long string of
HTML which contains the numeric data I wish to extract...

I can't just search through the string looking for numeric data as I'm
likely to pick up other stuff which I don't need.

I was really hoping for a regular expression to extract multiples of:

7 digits and 7 digits separated by " - "
(HTML I'm not interested in)
Either 2 digits, or 2 digits followed by " - " and another 2 digits
(HTML I'm not interested in)
1 digit

...from within an HTML table, the markup of which can change...

Any other ideas?

Thanks,

Ian
 
Michael Winter

[snip]
I was really hoping for a regular expression to extract multiples of:

7 digits and 7 digits separated by " - "
(HTML I'm not interested in)
Either 2 digits, or 2 digits followed by " - " and another 2 digits
(HTML I'm not interested in)
1 digit

...from within an HTML table, the markup of which can change...

Alberto's suggestion is a good start, but it depends on whether the
mark-up can contain numbers itself. If it can, there needs to be more
structure in the expression. Try:

/(\d{7}) - (\d{7})\x3c.+\x3e(\d{2})( - (\d{2}))?\x3c.+\x3e(\d)/

When used with the exec() method, it will return an array that contains:

element 0 - ignore
1 - first group of seven digits
2 - second group
3 - first group of two digits
4 - ignore
5 - second group of two digits
(undefined if they don't exist)
6 - final digit

This appears to be safe, even if numbers appear in the separating HTML,
but without live data, I can't test properly.
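
As a rough illustration of reading those array elements, here is a sketch using the expression exactly as written above. The helper name parseRow, the property names and the sample strings are invented for the example; they are not live data.

```javascript
// Expression from the post above, unchanged.
var rowPattern = /(\d{7}) - (\d{7})\x3c.+\x3e(\d{2})( - (\d{2}))?\x3c.+\x3e(\d)/;

// Hypothetical helper: returns the captured numbers as an object,
// or null if the row does not match the expression.
function parseRow(row) {
  var m = rowPattern.exec(row);
  if (m === null) {
    return null;
  }
  return {
    first: m[1],       // first group of seven digits
    second: m[2],      // second group of seven digits
    rangeStart: m[3],  // first group of two digits
    rangeEnd: m[5],    // undefined when there is no " - 00" part
    last: m[6]         // final digit
  };
}

var single = parseRow("1234567 - 7654321<some html>12<some html>3");
var ranged = parseRow("1234567 - 7654321<some html>12 - 34<some html>5");
var nothing = parseRow("no numbers in here");
```

Because .+ is greedy, this is best applied to one row at a time; on a string holding several rows it may run across row boundaries.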

Give it a try. If it doesn't work and you can't modify the expression,
show the exact test case you used and we will see what we can do.

Good luck,
Mike
 
Shawn Milo

Ian Richardson said:
I'm looking to use Javascript to pull apart a page of HTML I have
already fetched. The page contains a table, within which there are rows
containing...

Either:

0000000 - 0000000<some html>00<some html>0

or:

0000000 - 0000000<some html>00 - 00<some html>0
<snip>


Try this:

<script type="text/javascript">


var reStrip = new RegExp("([0-9]{7}) - ([0-9]{7})<[^>]+>(([0-9]{2})( - ([0-9]{2}))?)<[^>]+>([0-9]{1})",
'gi');

var strTest = '';

strTest = '0000000 - 0000000<some html>00<some html>0';
if (strTest.match(reStrip)){

alert('Match!');

}else{

alert('No Match!');

}




// The first number will be returned as $1, as in:
// strTest.replace(reStrip, '$1')
//
// Just be aware that in the case of the second set of numbers
// ('00 - 00' instead of '00'), you will have additional values
// (one for each matching set of parentheses).


strTest = '0000000 - 0000000<some html>00 - 00<some html>0';
if (strTest.match(reStrip)){

alert('Match!');

}else{

alert('No Match!');

}



</script>
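
Since the expression above carries the g flag, repeated exec() calls can walk through several rows held in one string. The following is only a sketch of that idea: the function name extractRows and the sample page string are made up, and it assumes exactly one tag sits between the fields, which is all that <[^>]+> allows.

```javascript
// Same pattern as above, written on one line.
var reStrip = new RegExp("([0-9]{7}) - ([0-9]{7})<[^>]+>(([0-9]{2})( - ([0-9]{2}))?)<[^>]+>([0-9]{1})", "g");

// Hypothetical helper: collects the numbers from every matching row.
function extractRows(html) {
  var rows = [];
  var m;
  reStrip.lastIndex = 0;  // a global regex remembers its position, so reset it
  while ((m = reStrip.exec(html)) !== null) {
    // Groups 3 and 5 are the wrapping sets of parentheses, so skip them.
    // m[6] is undefined when the row has '00' rather than '00 - 00'.
    rows.push([m[1], m[2], m[4], m[6], m[7]]);
  }
  return rows;
}

// Invented sample data: two rows, one of each form.
var page = "1111111 - 2222222<td>10<td>3<tr>3333333 - 4444444<td>20 - 30<td>5";
var rows = extractRows(page);
```

If the real markup puts more than one tag between the numbers, the <[^>]+> parts would need loosening, for example to (<[^>]+>)+.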
 
