regular expression question

Bart Van Loon

Hello,

Is it possible to make a regular expression match for the following
situation:

I have a string, looking like 'foobarbarbar'. I don't know what foo is,
nor what bar is; the only thing I know is that I have a string X,
followed by an unknown number of copies of a string Y. My goal is to
find out how many times this string Y (bar) is repeated, without knowing
exactly what it is.

Something like ^.*(.*)*$ maybe?

greetings,
BBBart
 
Greg Bacon

: I have a string, looking like 'foobarbarbar'. I don't know what foo
: is, nor what bar is, the only thing I know is that I have a string X,
: concatenated an undefined number of times with a string Y. My goal is
: to find out how many times this string Y (bar) is repeated, without
: knowing what it exactly is.

% cat try
#! /usr/local/bin/perl

use warnings;
use strict;

my $X = 'foo';
my $Y = 'bar';

my $re = qr/^ $X ($Y)+ $/x;

for (qw/ foo foobarbarbar foobar barbarfoo /) {
    if (/$re/) {
        print "Match for [$_]\n";
    }
    else {
        print "No match for [$_]\n";
    }
}
% ./try
No match for [foo]
Match for [foobarbarbar]
Match for [foobar]
No match for [barbarfoo]

You may need to use quotemeta, depending on how you want to match.
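
To illustrate the quotemeta remark, here is a minimal sketch for the case where X and Y are known literal strings. The helper name count_known is made up for this example; \Q...\E is the inline form of quotemeta, so characters such as "." or "+" in X or Y match themselves instead of acting as regex operators.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count how many copies of the
# literal string $y follow the literal string $x.
sub count_known {
    my ($string, $x, $y) = @_;
    return 0 if $y eq '';                      # an empty Y cannot be counted
    # \Q...\E quotes regex metacharacters in the interpolated variables.
    return 0 unless $string =~ /^\Q$x\E((?:\Q$y\E)+)$/;
    return length($1) / length($y);
}

print count_known('foobarbarbar', 'foo', 'bar'), "\n";   # 3
print count_known('a.b+c+c',      'a.b', '+c'),  "\n";   # 2
print count_known('axb+c+c',      'a.b', '+c'),  "\n";   # 0: "." is literal
```

Without the quoting, 'a.b' would also match "axb", and '+c' would not compile as a pattern at all.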

Hope this helps,
Greg
 
Sam Holden

: Hello,
:
: is it possible to make a regular expression match for the following
: situation:
:
: I have a string, looking like 'foobarbarbar'. I don't know what foo is,
: nor what bar is, the only thing I know is that I have a string X,
: concatenated an undefined number of times with a string Y. My goal is to
: find out how many times this string Y (bar) is repeated, without knowing
: what it exactly is.
:
: Something like ^.*(.*)*$ maybe?

What's the point of a regular expression which matches all
strings (assuming /s is used)?

And do you use (.*)* because you want some extra added slowness?

My naive approach would be to use something like (assuming "undefined
number" means 2 or more - since otherwise it doesn't make sense):

$_ = 'foobarbarbar';
if (/^(.+?)((.+)\3+)$/s) {
    print "X is $1\n";
    print "Y is $3\n";
    print "There are " . length($2)/length($3) . " repetitions of Y\n";
}

But that fails to match. However, /^(foo)((.+)\3+)$/ matches (and is what
I tested the code with).

Since (.+?) can match "foo", I think the first regular expression should
match (after a fair bit of backtracking - guess the second .+ might be
more efficient as .+?, but that's beside the point).
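
As a cross-check of that reasoning, the split the pattern is expected to find can be enumerated with plain string operations, independent of the regex engine. This is only a sketch: find_xy is a hypothetical name, and it assumes the same reading as above (X non-empty and as short as possible, Y as long as possible, at least two copies of Y).

```perl
use strict;
use warnings;

# Hypothetical cross-check (not from the thread): try splits in the order
# the pattern would, shortest X first and longest Y first.
sub find_xy {
    my ($s) = @_;
    my $len = length $s;
    for my $xlen (1 .. $len - 2) {                    # X is non-empty
        my $rest = substr($s, $xlen);
        my $rlen = length $rest;
        for my $ylen (reverse 1 .. int($rlen / 2)) {  # Y occurs >= 2 times
            next if $rlen % $ylen;                    # must divide evenly
            my $y     = substr($rest, 0, $ylen);
            my $count = $rlen / $ylen;
            return (substr($s, 0, $xlen), $y, $count)
                if $rest eq $y x $count;
        }
    }
    return;                                           # no split found
}

my ($x, $y, $n) = find_xy('foobarbarbar');
print "X is $x\nY is $y\nThere are $n repetitions of Y\n";
```

For 'foobarbarbar' this reports X as foo, Y as bar and 3 repetitions, which is the result the first regular expression would be expected to give.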

echo "foobarbarbar" | egrep '^(.+)((.+)\3+)$'

outputs "foobarbarbar" (I realise that \1 will be "foobar" and \2 "barbar"
so the match is slightly different), so it seems to do what I would expect.

Is this a bug in perl? Or is my knowledge of regexes faulty?

I suspect the second, but (other than embarrassment) it can't hurt to ask.
 
Janek Schleicher

Bart Van Loon wrote at Wed, 30 Jul 2003 10:35:32 +0000:
: is it possible to make a regular expression match for the following
: situation:
:
: I have a string, looking like 'foobarbarbar'. I don't know what foo is,
: nor what bar is, the only thing I know is that I have a string X,
: concatenated an undefined number of times with a string Y. My goal is to
: find out how many times this string Y (bar) is repeated, without knowing
: what it exactly is.
:
: Something like ^.*(.*)*$ maybe?

perl -e '$_ = "foobarbarbar"; /((.*)\2+)$/; print length($1)/length($2)'


Greetings,
Janek
 
Jay Flaherty

Janek said:
: perl -e '$_ = "foobarbarbar"; /((.*)\2+)$/; print length($1)/length($2)'

perl -e '$_ = "foobarbarbarbar"; /((.*)\2+)$/; print length($1)/length($2)'
returns 2 (barbar x 2)
So it seems to work with odd number of concats only.

Jay
 
Janek Schleicher

Jay Flaherty wrote at Thu, 31 Jul 2003 13:14:50 -0400:
: perl -e '$_ = "foobarbarbarbar"; /((.*)\2+)$/; print length($1)/length($2)'
: returns 2 (barbar x 2)
: So it seems to work with odd number of concats only.

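No, it only returns prime numbers.
In fact "barbarbarbar" is "barbar" x 2.
The greedy (.*) grabs the longest repeating unit it can, so the result is
the smallest prime factor of the real count; only a prime number of
occurrences can't be divided any further and is reported as it is.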
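
A small sketch of that behaviour, wrapping the greedy pattern from the one-liner in a helper (greedy_count is a name invented for this example) and feeding it "foo" followed by 2 to 9 copies of "bar":

```perl
use strict;
use warnings;

# Hypothetical wrapper around the greedy pattern /((.*)\2+)$/.
sub greedy_count {
    my ($s) = @_;
    # Guard against an empty $2, which would mean dividing by zero.
    return unless $s =~ /((.*)\2+)$/ && length $2;
    return length($1) / length($2);
}

for my $n (2 .. 9) {
    my $count = greedy_count('foo' . 'bar' x $n);
    print "$n copies of bar -> reported $count\n";
}
```

The reported values are 2, 3, 2, 5, 2, 7, 2, 3: correct when the number of copies is prime, and otherwise its smallest prime factor.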

But if the shortest repeating part is what should be counted,
you can use the non-greedy version instead:

perl -e '$_ = "foobarbarbarbar"; /((.*?)\2+)$/; print length($1)/length($2)'
(The only change is the ? after .* in the inner group.)
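
If the non-greedy pattern is used inside a script, it may be worth guarding the case where nothing repeats: with (.*?) the pattern can still match an empty Y at the end of the string, and length($1)/length($2) then divides by zero. A minimal sketch, assuming Y must be non-empty (count_repeats is a made-up name):

```perl
use strict;
use warnings;

# Hypothetical wrapper around the non-greedy pattern. (.+?) instead of
# (.*?) keeps Y non-empty, so a string without a repeated tail returns 0.
sub count_repeats {
    my ($s) = @_;
    return 0 unless $s =~ /((.+?)\2+)$/;
    return length($1) / length($2);
}

print count_repeats($_), "\n" for qw(foobarbarbar foobarbarbarbar foobar);
```

This prints 3, 4 and 0. A single copy of Y cannot be detected, and the answer is inherently ambiguous when X itself ends in a repetition: count_repeats('foo') returns 2, because "oo" is "o" twice.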

Greetings,
Janek
 
