search "window" pattern matching

Cheez · Jan 11, 2004

Hello, hard to describe my question in a clear way. I want to process
a string that looks like this:

$mystring = "thetextinherewillbefairlyrandom";

I want to capture chunks of text and place them in an array or hash
table. If possible, I want to make a regex that will start at the
first letter and capture letters 1 - 5, in this case $capture =
"thete". Then, I want this window to shift 1 letter so that the next
captured string is letters 2 - 6, or $capture= "hetex" and so on until
the end of the line. Can anyone offer up a sample regex that would
accomplish this task?

Thanks,
Cheez

==============================================

My idea is this (although it doesn't work):

$mystring = "thetextinherewillbefairlyrandom";

$length = scalar ($mystring);

while ($counter < $length) {

$_ =~ /\w[$counter-$counter+4]/; # 'capture' regex

push @newarray; $counter++; # regex capture window increments by 1
# pushing chunks into array
}

foreach (@newarray) { #sample output

print "$newarray";

}

Randal L. Schwartz · Jan 11, 2004

Cheez> Hello, hard to describe my question in a clear way. I want to process
Cheez> a string that looks like this:

Cheez> $mystring = "thetextinherewillbefairlyrandom";

Cheez> I want to capture chunks of text and place them in an array or hash
Cheez> table. If possible, I want to make a regex that will start at the
Cheez> first letter and capture letters 1 - 5, in this case $capture =
Cheez> "thete". Then, I want this window to shift 1 letter so that the next
Cheez> captured string is letters 2 - 6, or $capture= "hetex" and so on until
Cheez> the end of the line. Can anyone offer up a sample regex that would
Cheez> accomplish this task?

Use string lookahead, so they can be overlapping:

while ($mystring =~ /(?=.{5})/sg) {
push @result, $1;
}

print "Just another Perl hacker,"

Toby · Jan 11, 2004

Cheez said:
Hello, hard to describe my question in a clear way. I want to process
a string that looks like this:

$mystring = "thetextinherewillbefairlyrandom";

I want to capture chunks of text and place them in an array or hash

perldoc -f substr

may be what you're looking for.
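
A minimal sketch of what substr does with the string from the original post. The names $first and $second are only illustrative, and offsets count from 0, so "letters 1 - 5" start at offset 0:

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";

# substr EXPR, OFFSET, LENGTH
my $first  = substr $mystring, 0, 5;   # letters 1 - 5
my $second = substr $mystring, 1, 5;   # letters 2 - 6

print "$first\n$second\n";             # prints "thete" then "hetex"
```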

gnari · Jan 11, 2004

Randal L. Schwartz said:
Cheez> Hello, hard to describe my question in a clear way. I want to process
Cheez> a string that looks like this:

Cheez> $mystring = "thetextinherewillbefairlyrandom";

Cheez> I want to capture chunks of text and place them in an array or hash
Cheez> table. If possible, I want to make a regex that will start at the
Cheez> first letter and capture letters 1 - 5, in this case $capture =
Cheez> "thete". Then, I want this window to shift 1 letter so that the next
Cheez> captured string is letters 2 - 6, or $capture= "hetex" and so on until
Cheez> the end of the line. Can anyone offer up a sample regex that would
Cheez> accomplish this task?

Use string lookahead, so they can be overlapping:

while ($mystring =~ /(?=.{5})/sg) {
push @result, $1;
}

or use pos(),
or more likely, use substr()

gnari
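
One possible reading of the pos() suggestion, as a rough sketch rather than anything posted in the thread: match five characters with a real capture anchored at \G, then move the match position back so the next attempt starts one character after the previous start. The @result name follows the loop quoted above.

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";
my @result;

# \G anchors each attempt at the current pos() of $mystring
while ($mystring =~ /\G(.{5})/gs) {
    push @result, $1;
    # $-[0] is the offset where this match began; restart one character later
    pos($mystring) = $-[0] + 1;
}

print "$_\n" for @result;
```

The loop stops on its own once fewer than five characters remain after the current position.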

Marc Bissonnette · Jan 11, 2004

Cheez said:

Hello, hard to describe my question in a clear way. I want to process
a string that looks like this:

$mystring = "thetextinherewillbefairlyrandom";

I want to capture chunks of text and place them in an array or hash
table. If possible, I want to make a regex that will start at the
first letter and capture letters 1 - 5, in this case $capture =
"thete". Then, I want this window to shift 1 letter so that the next
captured string is letters 2 - 6, or $capture= "hetex" and so on until
the end of the line. Can anyone offer up a sample regex that would
accomplish this task?

Thanks,
Cheez

==============================================

My idea is this (although it doesn't work):

$mystring = "thetextinherewillbefairlyrandom";

$length = scalar ($mystring);

while ($counter < $length) {

$_ =~ /\w[$counter-$counter+4]/; # 'capture' regex

push @newarray; $counter++; # regex capture window increments by 1
# pushing chunks into array
}

foreach (@newarray) { #sample output

print "$newarray";

}

Lemme take a crack at it:

#!/usr/bin/perl
use strict;
use warnings;
my $mystring = "thetextinherewillbefairlyrandom";
# get the length of $mystring:
my $length = length $mystring;
# set / declare the counter:
my $counter=0;
# set / declare the array:
my @newarray;
# while the counter is less than the length of $mystring, grab bits of text:
while ($counter < $length) {
    # grab 5 characters from the last position used within $mystring
    my $tempstring = substr $mystring,$counter,5;
    # dump it into @newarray:
    push @newarray,$tempstring;
    # increment the counter and loop again
    ++ $counter;
}
for (@newarray) {
    print "$_\n";
}

output:

thete
hetex
etext
texti
extin
xtinh
tinhe
inher
nhere
herew
erewi
rewil
ewill
willb
illbe
llbef
lbefa
befai
efair
fairl
airly
irlyr
rlyra
lyran
yrand
rando
andom
ndom
dom
om
m
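
The last four entries above are shorter than five characters because the loop runs all the way to the end of the string. If only full five-letter windows are wanted, one way to sketch it (assuming a window width held in an illustrative $width variable) is to stop at the last offset that still leaves room for a whole window:

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";
my $width    = 5;    # window size; the name is illustrative
my @windows;

# the last valid starting offset is length - width;
# a string shorter than $width gives an empty range and no windows
for my $start (0 .. length($mystring) - $width) {
    push @windows, substr($mystring, $start, $width);
}

print scalar(@windows), " windows, last is $windows[-1]\n";
```

For the 31-character sample string this should yield 27 windows, ending at "andom".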

Randal L. Schwartz · Jan 11, 2004

gnari> or use pos(),
gnari> or more likely, use substr()

Uh, why? Any solution with pos and substr is likely to be a lot
more complex than this simple regex.

Or are you of the habit of replacing simple solutions with complex
ones for the helluvit?

print "Just another Perl hacker,"

Tad McClellan · Jan 11, 2004

Marc Bissonnette said:
# get the length of $mystring:
my $length = length $mystring;
# set / declare the counter:
my $counter=0;
# set / declare the array:
my @newarray;

Comments that repeat what is already said in the code are worse
than no comments.

They are distracting, plus you have to remember to change stuff
in 2 places, the code and the comment that repeats the code.
(they have a very good chance of getting out-of-sync)

gnari · Jan 11, 2004

Randal L. Schwartz said:
gnari> or use pos(),
gnari> or more likely, use substr()

Uh, why? Any solution with pos and substr is likely to be a lot
more complex than this simple regex.

Or are you of the habit of replacing simple solutions with complex
ones for the helluvit?

sometimes

I just have the impression that a substr() solution is
easier for a beginner to understand and change, if
necessary.
Also, it is always good to rub in the TMTOWTDI.

On the other hand, maybe the OP really just wanted
to know if there was a *regexp* solution. In that case,
he will just ignore my comment.

gnari

John W. Krahn · Jan 11, 2004

Randal L. Schwartz said:
Cheez> Hello, hard to describe my question in a clear way. I want to process
Cheez> a string that looks like this:

Cheez> $mystring = "thetextinherewillbefairlyrandom";

Cheez> I want to capture chunks of text and place them in an array or hash
Cheez> table. If possible, I want to make a regex that will start at the
Cheez> first letter and capture letters 1 - 5, in this case $capture =
Cheez> "thete". Then, I want this window to shift 1 letter so that the next
Cheez> captured string is letters 2 - 6, or $capture= "hetex" and so on until
Cheez> the end of the line. Can anyone offer up a sample regex that would
Cheez> accomplish this task?

Use string lookahead, so they can be overlapping:

while ($mystring =~ /(?=.{5})/sg) {
push @result, $1;
}

(?=) doesn't capture. You probably meant /(?=(.{5}))/sg

John
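
For reference, a self-contained sketch of the loop with the capture group placed inside the lookahead, as suggested above. The lookahead consumes no characters, so after each zero-length match the regex engine moves on by one position, which is what produces the overlapping windows:

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";
my @result;

# (?=...) only asserts; the inner (.{5}) is what fills $1
while ($mystring =~ /(?=(.{5}))/sg) {
    push @result, $1;
}

print scalar(@result), " chunks: $result[0] .. $result[-1]\n";
```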

Cheez · Jan 12, 2004

Blown away at how useful c.l.p.m is for a newbie perl dude. Thanks
to all again for the replies. I think Gnari made a point about $substr
being easier to understand for newbies... Yes! I have Java
background so it's always nice to see a friendly face (substring)!

God is in the regex's though

Cheers,
Cheez

Hello, hard to describe my question in a clear way. I want to process
a string that looks like this:

[snip]

Randal L. Schwartz · Jan 12, 2004

John> (?=) doesn't capture. You probably meant /(?=(.{5}))/sg

Brainlapse. yes. Thanks.

gnari · Jan 12, 2004

Cheez said:
Blown away at how useful c.l.p.m is for a newbie perl dude. Thanks
to all again for the replies. I think Gnari made a point about $substr

minor nitpick #1: it is substr() not $substr (function, not variable)

being easier to understand for newbies... Yes! I have Java
background so it's always nice to see a friendly face (substring)!

God is in the regex's though
indeed.

Cheez wrote:

Hello, hard to describe my question in a clear way. I want to process
a string that looks like this:

[snip]

minor nitpick #2:
what you did here is called top-posting: you made a follow-up,
and quoted the message you are following-up on below.
this practice is frowned-upon in this newsgroup.
in this case it is not serious, because you did not actually quote the whole
article below.

gnari

Anno Siegel · Jan 12, 2004

Randal L. Schwartz said:
gnari> or use pos(),
gnari> or more likely, use substr()

Uh, why? Any solution with pos and substr is likely to be a lot
more complex than this simple regex.

Or are you of the habit of replacing simple solutions with complex
ones for the helluvit?

Are you? Why loop when list context does the same thing?

my @result2 = $mystring =~ /(?=(.{5}))/sg;

Anno
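
A sketch of the list-context form wrapped in a small helper so the window width can vary. The name windows() and its parameters are hypothetical, not from the thread, and the helper is meant to be called in list context (in scalar context a match only reports success or failure):

```perl
use strict;
use warnings;

# windows() is a hypothetical helper name, not something from the thread
sub windows {
    my ($string, $width) = @_;
    # in list context, a /g match returns every capture from every match
    return $string =~ /(?=(.{$width}))/sg;
}

my @result2 = windows("thetextinherewillbefairlyrandom", 5);
print join(",", @result2[0 .. 2]), "\n";   # prints "thete,hetex,etext"
```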

Marc Bissonnette · Jan 12, 2004

Tad McClellan said:

Comments that repeat what is already said in the code are worse
than no comments.

They are distracting, plus you have to remember to change stuff
in 2 places, the code and the comment that repeats the code.
(they have a very good chance of getting out-of-sync)

Good point; I was trying to be extra-thorough in showing the OP what I was
trying to do (which was, of course, way longer than Randal's one-liner).

I comment my own code usually with only a single comment for each
subroutine, or blocks that I know I'd need a reminder on in the future.

Out of curiosity, is there a resource or guideline on the web for 'proper'
perl commenting ?

A google search for
perl "proper comment" code
didn't seem to turn anything up that was completely relevant.

A. Sinan Unur · Jan 12, 2004

Out of curiosity, is there a resource or guideline on the web for
'proper' perl commenting ?

A google search for
perl "proper comment" code
didn't seem to turn anything up that was completely relevant.

How about perldoc perlstyle?

Marc Bissonnette · Jan 12, 2004

How about perldoc perlstyle?

Thank you - Reading it now

A. Sinan Unur · Jan 12, 2004

Thank you - Reading it now

Well, I must have been confused because it says nothing about comments. I
found the following page the contents of which I thought came from
perldoc perlstyle.

http://www.perl.com/language/style/slide5.html

Marc Bissonnette · Jan 12, 2004

Well, I must have been confused because it says nothing about
comments. I found the following page the contents of which I thought
came from perldoc perlstyle.

http://www.perl.com/language/style/slide5.html

I think that bit is complementary to perldoc perlstyle - or the other way
around. From what I get out of the two - if one follows the advice of
perldoc perlstyle along with decent perl itself, then excessive, or even
frequent, comments should be completely avoidable, as they will be
unnecessary.

Ben Morrow · Jan 12, 2004

[article references removed 'cos it was getting silly]

Marc Bissonnette said:
I think that bit is complementary to perldoc perlstyle - or the other way
around. From what I get out of the two - if one follows the advice of
perldoc perlstyle along with decent perl itself, then excessive, or even
frequent, comments should be completely avoidable, as they will be
unnecessary.

This was written wrt C, not Perl, but I tend to follow this from
/usr/src/linux/Documentation/CodingStyle:
| Chapter 5: Commenting
|
| Comments are good, but there is also a danger of over-commenting.
| NEVER try to explain HOW your code works in a comment: it's much
| better to write the code so that the _working_ is obvious, and it's
| a waste of time to explain badly written code.
|
| Generally, you want your comments to tell WHAT your code does, not
| HOW. Also, try to avoid putting comments inside a function body: if
| the function is so complex that you need to separately comment parts
| of it, you should probably go back to chapter 4 for a while. You
| can make small comments to note or warn about something particularly
| clever (or ugly), but try to avoid excess. Instead, put the
| comments at the head of the function, telling people what it does,
| and possibly WHY it does it.

Ben
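
Applied to the sliding-window problem from earlier in this thread, that guideline might look roughly like the sketch below: one comment at the head of the subroutine saying what it returns, and none inside the body. The name sliding_windows() is made up for illustration.

```perl
use strict;
use warnings;

# Return every overlapping substring of $width characters in $string,
# in order of starting position. Strings shorter than $width yield
# an empty list.
sub sliding_windows {
    my ($string, $width) = @_;
    return map { substr($string, $_, $width) } 0 .. length($string) - $width;
}

print join(" ", sliding_windows("random", 5)), "\n";   # prints "rando andom"
```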

Marc Bissonnette · Jan 12, 2004

Ben Morrow said:
[article references removed 'cos it was getting silly]

Marc Bissonnette said:

I think that bit is complementary to perldoc perlstyle - or the other
way around. From what I get out of the two - if one follows the
advice of perldoc perlstyle along with decent perl itself, then
excessive, or even frequent, comments should be completely avoidable,
as they will be unnecessary.

This was written wrt C, not Perl, but I tend to follow this from
/usr/src/linux/Documentation/CodingStyle:
| Chapter 5: Commenting
|
| Comments are good, but there is also a danger of over-commenting.
| NEVER try to explain HOW your code works in a comment: it's much
| better to write the code so that the _working_ is obvious, and it's
| a waste of time to explain badly written code.
|
| Generally, you want your comments to tell WHAT your code does, not
| HOW. Also, try to avoid putting comments inside a function body: if
| the function is so complex that you need to separately comment parts
| of it, you should probably go back to chapter 4 for a while. You
| can make small comments to note or warn about something particularly
| clever (or ugly), but try to avoid excess. Instead, put the
| comments at the head of the function, telling people what it does,
| and possibly WHY it does it.

That's a good guideline and pretty much what I've been following to date
- i.e. comments at the beginning of subroutines that go into more detail
than what the subroutine name already suggests.

My over-commenting in the NG was my own fault - I should have known better
and simply followed what works best in the real app, too.

I'm going to re-review the perldoc perlstyle, just to see if there's
anything I've been missing. Overall, I think my code is fairly decent - I
can go back to almost all of my stuff over the years and still have
relatively few problems understanding what I was getting at in the
code, even if I've since learned much more efficient manners of doing it.
