searching for null in pattern


Brian Andrus

I am trying to parse an html line that contains lots of noise. I have
gotten it down mostly, but am having trouble with exact matches.

The part I am looking at will be
*align="right"><b>##.###</b></td>*

where * is much useless stuff and ##.### is the part I am trying to
extract.
The issue:

examples of what ##.### might be:
9.31
0.17
8.345
12.72

So I need to get something that will return the value whether there
is 1 or 2 digits before the decimal and 2 or 3 digits after.

I have

/^*align="right"><b>*([0-9]\.[0-9][0-9])/

which will work for #.##, but not the other possibilities. How do I
get all the possibilities in a single search? When I try using *
inside the parentheses, I get an error.

Brian Andrus
 

Brian Wakem


\d{1,2}\.\d{2,3}
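
A minimal sketch of how that pattern might be dropped into the match from the question. The sample lines and the helper name extract_value are made up for illustration; only the align="right"><b>...</b> fragment comes from the original post. In a regex, * is a quantifier rather than a shell-style wildcard, so the leading and trailing * from the original attempt are left out here: an unanchored match already skips the surrounding noise.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number found inside
# align="right"><b>...</b>, or nothing if the line has no such number.
sub extract_value {
    my ($line) = @_;
    # 1 or 2 digits, a literal dot, then 2 or 3 digits, captured in $1.
    if ($line =~ /align="right"><b>(\d{1,2}\.\d{2,3})<\/b>/) {
        return $1;
    }
    return;
}

# Made-up sample lines in the shape described in the question.
my @samples = (
    '<td class="a" align="right"><b>9.31</b></td><td>noise</td>',
    '<td class="a" align="right"><b>12.72</b></td><td>noise</td>',
    '<td class="a" align="right"><b>8.345</b></td><td>noise</td>',
);

for my $line (@samples) {
    my $value = extract_value($line);
    print defined $value ? "$value\n" : "no match\n";
}
```

Including the closing </b> in the pattern keeps it from accepting a longer number by matching only part of it.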
 

Brian Andrus

Brian Wakem said:
\d{1,2}\.\d{2,3}

Perfect! Thanks.. Now for the learning part..heh.

I can figure d{x,y} must mean x or y digits.
What do the backslashes do in this case?

Brian Andrus
 

Brad Baxter

Brian Andrus said:
Perfect! Thanks.. Now for the learning part..heh.

I can figure d{x,y} must mean x or y digits.
What do the backslashes do in this case?

\d (including the backslash) means [0-9], i.e., digits

{x,y} means at least x and no more than y of what immediately precedes,
i.e., it's a range

. is a metacharacter that matches any character (with some
qualifications, e.g. it does not match a newline by default).

\. escapes that metacharacter to match the literal character '.'
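
To make the difference between . and \. concrete, here is a small sketch that runs the same pattern with and without the backslash. The variable names and test strings are made up for illustration, and the ^ and $ anchors are added only so each string is tested as a whole.

```perl
use strict;
use warnings;

my $escaped   = qr/^\d{1,2}\.\d{2,3}$/;  # \. matches only a literal dot
my $unescaped = qr/^\d{1,2}.\d{2,3}$/;   # .  matches almost any character

for my $input ('9.31', '9x31', '12.345', '1.2') {
    printf "%-7s escaped=%-3s unescaped=%s\n",
        $input,
        ($input =~ $escaped   ? 'yes' : 'no'),
        ($input =~ $unescaped ? 'yes' : 'no');
}
```

The string 9x31 is accepted only by the unescaped version, which is the kind of false positive the backslash prevents.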

See also perldoc perlre

Regards,

Brad
 

Ben Morrow

Brian Andrus said:
I can figure d{x,y} must mean x or y digits.
What do the backslashes do in this case?

Have you read perldoc perlre? Why not?

'\d' means 'match a digit'. '{x,y}' means 'match the preceding atom
(in this case '\d') between x and y times'.

/d{2,3}/ would match 'dd' or 'ddd', but not '123'.
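
A quick way to check this, as a sketch (the variable names are made up for illustration):

```perl
use strict;
use warnings;

my $literal_d = qr/d{2,3}/;   # two or three letter 'd' characters
my $digit     = qr/\d{2,3}/;  # two or three digits

for my $input ('dd', 'ddd', '123') {
    my $hits_literal = $input =~ $literal_d ? 'yes' : 'no';
    my $hits_digit   = $input =~ $digit     ? 'yes' : 'no';
    print "$input: without backslash $hits_literal, with backslash $hits_digit\n";
}
```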

Ben
 

Ben Morrow

Brad Baxter said:
\d (including the backslash) means [0-9], i.e., digits

As someone pointed out before wrt [[:lower:]], \d is different from
[0-9]. Try

% perl -Mcharnames=viacode -le'for(0x0..0xffff) { print charnames::viacode $_ if chr =~ /\d/ }'

with 5.8.
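
For a single concrete case, here is a sketch using ARABIC-INDIC DIGIT THREE (U+0663) as the example character; the variable names are made up for illustration. The /a modifier in the last test restricts \d to ASCII digits, but it was only added in Perl 5.14, so it is not available in the 5.8 discussed here.

```perl
use strict;
use warnings;
use 5.014;  # needed only for the /a modifier below

my $arabic_three = "\x{0663}";  # ARABIC-INDIC DIGIT THREE

my %matched = (
    '\d'    => ($arabic_three =~ /\d/    ? 1 : 0),  # Unicode-aware digit class
    '[0-9]' => ($arabic_three =~ /[0-9]/ ? 1 : 0),  # ASCII digits only
    '\d/a'  => ($arabic_three =~ /\d/a   ? 1 : 0),  # \d limited to ASCII
);

for my $pattern (sort keys %matched) {
    printf "%-6s %s\n", $pattern,
        ($matched{$pattern} ? 'matches' : 'does not match');
}
```

If the input may contain non-ASCII text and only 0 through 9 should be accepted, writing [0-9] explicitly works on any Perl version.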

Ben
 

Brad Baxter

:) I am more than happy to concede that point. I guess I can only plead
that my reply was in the spirit of:

http://www.perldoc.com/perl5.8.0/pod/perlintro.html

which states, to my mind admirably considering its purpose, "This
introductory document does not aim to be complete. It does not even aim to
be entirely accurate."

I agree that 'match a digit' is more accurate.

Regards,

Brad
 
