Returning substring of regex match

Taras_96

Hi everyone,

Is it possible to ask for a substring, or substrings, of a regular
expression match to be returned in a regular expression?

EG:

Say you want a list of the days in the dates for some input with some
defined prefix:

eg: Give me the days of all dates that are prefixed with the word DATE

DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07

Would return 22 and 25, but not 1 or 26.

Thanks

Taras
 
Gunnar Hjalmarsson

Taras_96 said:
Is it possible to ask for a substring, or substrings, of a regular
expression match to be returned in a regular expression?

Yes. Use capturing parentheses.
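
As a rough illustration of what capturing parentheses give you, here is a minimal sketch. The helper name parse_date and the two sample lines are assumptions made up for this example, not something from the question. In list context the match returns one value per pair of parentheses, so several substrings can be pulled out at once:

```perl
use strict;
use warnings;

# parse_date() is a hypothetical helper name, not part of the thread.
# Each pair of parentheses captures one substring; in list context the
# match returns the captures in order, or an empty list if it fails.
sub parse_date {
    my ($line) = @_;
    return $line =~ m{^DATE(\d+)/(\d+)/(\d+)$};
}

for my $line ('DATE22/5/07', '1/5/07') {
    if ( my ($day, $month, $year) = parse_date($line) ) {
        print "day=$day month=$month year=$year\n";
    }
}
```

Only the first sample line produces output, since the second lacks the DATE prefix.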
 
Paul Lalli

Say you want a list of the days in the dates for some input
with some defined prefix:

eg: Give me the days of all dates that are prefixed with the
word DATE

DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07

Would return 22 and 25, but not 1 or 26.

Is this what you're looking for?

#!/usr/bin/perl
use strict;
use warnings;
while (<DATA>) {
    print "$1\n" if m!DATE(\d+)/\d+/\d+$!;
}
__DATA__
DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07


Take a look at
perldoc perlretut
perldoc perlre
perldoc perlreref

Paul Lalli
 
Mirco Wahab

Taras_96 said:
Is it possible to ask for a substring, or substrings, of a regular
expression match to be returned in a regular expression?
Say you want a list of the days in the dates for some input with some
defined prefix:
eg: Give me the days of all dates that are prefixed with the word DATE

DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07

Would return 22 and 25, but not 1 or 26.

You may "literally" put your question into a
regular expression, like:

...
my @input = qw'
DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07
';

print map /(?<=^DATE)\d+/g, @input;
...


The regular expression asks
for a numerical value (the day)
via \d+, but only *if* it is
immediately preceded by the letters DATE
at the start of the string. The (?<= ... )
construct is called a "positive
lookbehind assertion".

The /g modifier in list context (here, inside map)
returns the matched text itself, \d+
(which is the day), without the need for
capturing parentheses.

The result of the above expression is
a list of the extracted hits
(days), which is printed immediately
here.
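
To make the difference between the two approaches concrete, here is a small sketch that runs the lookbehind form next to a capture-group form on the same data. The variable names and the join separator are choices made for this example only:

```perl
use strict;
use warnings;

my @input = ('DATE22/5/07', '1/5/07', 'DATE25/5/07', 'DATENO 26/5/07');

# Lookbehind: the match itself is only the digits, so /g in list
# context returns them directly.
my @by_lookbehind = map { /(?<=^DATE)\d+/g } @input;

# Capture group: the match includes DATE, but only the parenthesised
# part is returned in list context.
my @by_capture = map { /^DATE(\d+)/ } @input;

print join(', ', @by_lookbehind), "\n";
print join(', ', @by_capture), "\n";
```

Both lists should hold the same two days; the join is there only so the values do not run together when printed.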

Regards

M.
 
Gunnar Hjalmarsson

Mirco said:
...
my @input = qw'
DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07
';

print map /(?<=^DATE)\d+/g, @input;
...

The qw() operator puts 'DATENO' and '26/5/07' in two separate elements.
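
A quick way to see this, sketched with a made-up array name @words: count and print the elements that qw produces from the same text.

```perl
use strict;
use warnings;

my @words = qw'
DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07
';

# qw splits on any whitespace, so the last line becomes two elements.
print scalar(@words), " elements\n";
print "[$_]\n" for @words;
```

With this particular data the lookbehind pattern still finds only 22 and 25, because neither 'DATENO' nor '26/5/07' starts with DATE followed by digits, but the array no longer reflects the intended input lines.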
 
Mirco Wahab

Gunnar said:
The qw() operator puts 'DATENO' and '26/5/07' in two separate elements.

Oh yeah, you are right.

First, I thought of using __DATA__ but
abstained from it because I didn't
want to replicate parts of Paul's code ...

For the OP, the input data needs
to be written like:

...
my @input = (
'DATE22/5/07',
'1/5/07',
'DATE25/5/07',
'DATENO 26/5/07',
);
...

Sorry,

M.
 
Dr.Ruud

Mirco Wahab wrote:
the input data needs
to be written like:

...
my @input = (
'DATE22/5/07',
'1/5/07',
'DATE25/5/07',
'DATENO 26/5/07',
);

Alternative:

# The here-doc terminator below is "-- " (two hyphens plus a trailing space).
my @input = split /\n/, <<'-- ';
DATE22/5/07
1/5/07
DATE25/5/07
DATENO 26/5/07
-- 
