Looping scalar and regex'ing it

Moltar

Hi,


I want to go through a scalar in a loop and find strings that match a certain regex.
For example:

$string = 'Hi there *foo*, some stuff *bar* followed by more stuff and another *foo*.';

I need to loop through and find every occurrence of text between two asterisks and "remember" the text in between
them. I can get the regex working, but I can't figure out how to do it in a loop.


Please help!

TIA!
 
John Bokma

Moltar said:
Hi,


I want to go through a scalar in a loop and find strings that match a certain regex.
For example:

$string = 'Hi there *foo*, some stuff *bar* followed by more stuff and another *foo*.';

I need to loop through and find every occurrence of text between two asterisks and "remember" the text in between
them. I can get the regex working, but I can't figure out how to do it in a loop.


Please help!

my(@stuff) = $string =~ /\*([^*]+)\*/g;

This assumes there is at least one non-* character between the two asterisks.
(Otherwise replace + with *.)
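
For reference, a minimal runnable sketch of the list-context approach above, using the sample string from the question. The names @stuff and $item are only illustrative; once the captures are in an array, an ordinary loop can do whatever else is needed with them.

```perl
use strict;
use warnings;

my $string = 'Hi there *foo*, some stuff *bar* followed by more stuff and another *foo*.';

# In list context, a /g match returns the captures from every match at once.
my @stuff = $string =~ /\*([^*]+)\*/g;

# The captured strings can then be processed one at a time.
for my $item (@stuff) {
    print "found: $item\n";
}
```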
 
Moltar

Thank you John,

Hmm.. I was actually thinking of looping, because I need to do other stuff with it later. I will describe it
in more detail:

I have a text file of the following structure:

*Header 1*
- item 1
- item 2
- item 3

*Header 2*
- item 1
- item 2
- item 3

I want to go through the text file and find a header. Once I have found a header, I store it in a hash as a key.
Then I want to process all the items until the next header and store them in an array reference under
that header's key in the hash. Then I come to the next header, and so on.

Then $headers{'Header 1'}->[0] will have the value 'item 1'.

I can figure out how to do all that. What stalls me is looping through a scalar. Maybe there is
some two-step solution? Going through the scalar twice? Three times?

Thank you!

 
John Bokma

Moltar said:
Thank you John,

Hmm.. I was actually thinking of looping, because I need to do other stuff with it later. I will describe it
in more detail:

I have a text file of the following structure:

*Header 1*
- item 1
- item 2
- item 3

*Header 2*
- item 1
- item 2
- item 3

I want to go through the text file and find a header. Once I have found a header, I store it in a hash as a key.
Then I want to process all the items until the next header and store them in an array reference under
that header's key in the hash. Then I come to the next header, and so on.

Then $headers{'Header 1'}->[0] will have the value 'item 1'.

I can figure out how to do all that. What stalls me is looping through a scalar. Maybe there is
some two-step solution? Going through the scalar twice? Three times?

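my %headers;
my $line;
my $header;
my $value;
while (defined($line = <FILE>)) {
    chomp($line);
    # A header line looks like *Header 1*: capture the text between the asterisks.
    if (($value) = $line =~ /\*([^*]+)\*/) {
        $header = $value;
        next;
    }
    # Any other line is stored under the most recent header.
    push(@{$headers{$header}}, $line);
}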

If your lines really do look like "- item 1" and you only want the item text, change the push
line to:

    if (($value) = $line =~ /- (.*)/) {
        push(@{$headers{$header}}, $value);
        next; # we are done (optional)
    }
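
If the data is already in a scalar rather than in a file, roughly the same loop can run over the lines of that scalar. The following is only a sketch under that assumption: $text and its contents are made-up sample data in the format described above, and lines that appear before the first header are simply skipped.

```perl
use strict;
use warnings;

# Hypothetical sample data in the format described in the thread.
my $text = "*Header 1*\n- item 1\n- item 2\n- item 3\n\n*Header 2*\n- item 1\n- item 2\n";

my %headers;
my $header;

# split breaks the scalar into lines, so no filehandle is needed.
for my $line (split /\n/, $text) {
    # Header line: remember the text between the asterisks.
    if ($line =~ /^\*([^*]+)\*/) {
        $header = $1;
        next;
    }
    # Item line: store the text after "- " under the current header.
    if (defined $header and $line =~ /^- (.*)/) {
        push @{ $headers{$header} }, $1;
    }
}

print "$_: @{ $headers{$_} }\n" for sort keys %headers;
```

On Perl 5.8 and later, another option is to open a filehandle on a reference to the scalar (open my $fh, '<', \$text), which lets the while loop above be reused unchanged.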
 
Michael P. Broida

Moltar said:
Hmm.. I was actually thinking of looping. Because I need to do other stuff with it later.

NOTE: I haven't tested this! But it fits the docs
and posts here that I've read. Posted just to
give you some ideas.

You can use the "/g" on the string match in a loop with a scalar
result instead of an array. Each time you do the string match,
it will remember the previous point and work from there.

In other words, something like: (AGAIN: UNTESTED!)

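while ($string =~ /wha(tev)er/g)
{
    $onematch = $1;    # text captured by the parentheses
    # work with $onematch
}

(Or skip $onematch and work with $1 directly. In scalar context the
match itself only returns true or false, so assigning its result
would not give you the matched text.)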

When there are no further matches in the string, it will return
false and exit the loop (unless I've screwed it up). Anyway,
that fits the docs for "/g" with a scalar result.

NOTE: Doing other regex work inside the loop will (I'm pretty sure)
screw it up: it will lose its place in $string. Or I could be
wrong on that, too. Well, maybe this helps a tiny bit. :)
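
A small runnable sketch of the scalar-context loop described above, applied to the sample string from the original question. The names @seen and $word are illustrative only. Copying $1 into a variable first is a defensive habit, since a later successful match inside the loop would overwrite it.

```perl
use strict;
use warnings;

my $string = 'Hi there *foo*, some stuff *bar* followed by more stuff and another *foo*.';

my @seen;
# In scalar context, each /g match resumes where the previous one ended
# and returns false once there are no more matches.
while ($string =~ /\*([^*]+)\*/g) {
    my $word = $1;          # copy the capture before doing anything else
    push @seen, $word;
    print "matched '$word'\n";
}
```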

Mike
 
Tad McClellan

Michael P. Broida said:
You can use the "/g" on the string match in a loop with a scalar
result instead of an array. Each time you do the string match,
^^^^^^^^^^^^^^^^^^^

( instead of a _list_ )

it will remember the previous point and work from there.


and the pos() function will tell you where it left off.

NOTE: Doing other regex work inside the loop will (I'm pretty sure)
screw it up: it will lose its place in $string. Or I could be
wrong on that, too.


You can do other pattern matching against _other_ strings with
no problem.

If you do pattern matching with $string though, then the match
position is _supposed to_ move.

The position info resides with the string being matched against,
rather than with the pattern to be matched.

You can use the \G anchor in all of the patterns that match
against $string to force it to start off where the last
one left off.
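
To make the pos() and \G behaviour concrete, here is a minimal sketch. The string 'aaa123bbb' and the variable names are made up for illustration. The /c modifier is used so that a failed match would leave the position alone instead of resetting it.

```perl
use strict;
use warnings;

my $string = 'aaa123bbb';

# A scalar-context /g match leaves the position just after the matched text.
my $ok = $string =~ /a+/g;
my $after_letters = pos($string);    # position right after "aaa"

# \G anchors the next match at pos($string), so it has to start exactly there.
my $digits;
if ($string =~ /\G(\d+)/gc) {
    $digits = $1;
}
my $after_digits = pos($string);     # position right after "123"

print "letters ended at $after_letters, digits '$digits' ended at $after_digits\n";
```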
 
Moltar

Hey, your code works fine, but I do not have access to the filehandle anymore %). As I said, I only have a
scalar (string) to work with. :(

Thank you for your help.
 
Michael P. Broida

Tad said:
^^^^^^^^^^^^^^^^^^^

( instead of a _list_ )

Oops, yes. I need to be more precise. :)

and the pos() function will tell you where it left off.

I didn't know that. Good info.

You can do other pattern matching against _other_ strings with
no problem.

If you do pattern matching with $string though, then the match
position is _supposed to_ move.

The position info resides with the string being matched against,
rather than with the pattern to be matched.

You can use the \G anchor in all of the patterns that match
against $string to force it to start off where the last
one left off.

That's what I wasn't straight on. I didn't know the
"scope" of the /g positioning info. Thanks for
clearing that up. Now I'll be able to use /g
without problems. :)

Mike
 
