help with a regex

donebrowsers · Mar 12, 2008

I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

These are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

I currently have this regex (which is close but doesn't quite work):
/(^.*)-(.*)($-\w)?/

It currently gives me:
[zoo] [2.10.1p1]
[mutt-1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd-5.2.3] [no_x11]

As you can see I'm having issues with the flavor (the last part of the
package name which not every package has) part. Any help?

John W. Krahn · Mar 12, 2008

donebrowsers said:
I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

These are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

$ echo "zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11" | \
perl -lne'
print join " ", map $_ ? "[$_]" : (), split /-?(\d[\w.]*\d)-?/, $_, 2
'
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

John

donebrowsers · Mar 12, 2008

While that works and I appreciate it, I was just using the []s as a
placeholder. I'm actually using PHP's preg_match() function
(http://us3.php.net/manual/en/function.preg-match.php), which
uses PERL style regular expressions. I submitted it to this group
because PERL programmers tend to be better with regular expressions
than anyone else.

This function essentially matches parts and adds them to an array with
[0] matching the whole string, [1]... matching the ()s. So what I
really have is:
array(3) {
[0]=>
string(12) "zoo-2.10.1p1"
[1]=>
string(3) "zoo"
[2]=>
string(8) "2.10.1p1"
}
array(3) {
[0]=>
string(23) "mutt-1.4.2.3-compressed"
[1]=>
string(12) "mutt-1.4.2.3"
[2]=>
string(10) "compressed"
}
array(3) {
[0]=>
string(22) "lha-1.14i.ac20050924.1"
[1]=>
string(3) "lha"
[2]=>
string(18) "1.14i.ac20050924.1"
}
array(3) {
[0]=>
string(19) "mysql-server-5.0.45"
[1]=>
string(12) "mysql-server"
[2]=>
string(6) "5.0.45"
}
array(3) {
[0]=>
string(19) "p5-Archive-Tar-1.30"
[1]=>
string(14) "p5-Archive-Tar"
[2]=>
string(4) "1.30"
}
array(3) {
[0]=>
string(20) "php5-gd-5.2.3-no_x11"
[1]=>
string(13) "php5-gd-5.2.3"
[2]=>
string(6) "no_x11"
}

What I want is, e.g.:
array(4) {
[0]=>
string(20) "php5-gd-5.2.3-no_x11"
[1]=>
string(7) "php5-gd"
[2]=>
string(5) "5.2.3"
[3]=>
string(6) "no_x11"
}

Thanks for your suggestion though, sorry for the confusion.

Uri Guttman · Mar 12, 2008

d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was valid
perl and that will likely be all you will get here.

uri

donebrowsers · Mar 12, 2008

Fine. How would I use perl to do what I am trying to do? Strip out
those parts and add them to an array?

Uri Guttman · Mar 12, 2008

d> Fine. How would I use perl to do what I am trying to do? Strip out
d> those parts and add them to an array?

what was wrong with the answer you got? it split the package names into
the parts you wanted. grabbing those and making them into arrays is
trivial. just assign the grabs to an array or put the split into [] to
make an anon array. then you can build up the data structure from that.
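
A minimal sketch of that advice, reusing the split from John's one-liner and storing each result in an anonymous array. The variable names (@names, @parsed) and the shortened input list are illustrative only, not from the thread.

```perl
use strict;
use warnings;

# A few of the package names from the original post.
my @names = qw(
    zoo-2.10.1p1
    mutt-1.4.2.3-compressed
    p5-Archive-Tar-1.30
);

my @parsed;
for my $pkg (@names) {
    # Same split as in John's one-liner; the capture group keeps the
    # version, and grep drops the empty trailing field.
    my @parts = grep { length } split /-?(\d[\w.]*\d)-?/, $pkg, 2;
    push @parsed, [@parts];    # anon array: name, version, optional flavour
}

# Print each record in the bracketed form used in the question.
for my $p (@parsed) {
    print join(' ', map { "[$_]" } @$p), "\n";
}
```

Each element of @parsed is then an array reference, so $parsed[1][2] would be the flavour of the second package. Note that this split needs a version containing at least two digits.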

uri

Martijn Lievaart · Mar 12, 2008

donebrowsers said:
I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

These are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

I currently have this regex (which is close but doesn't quite work):
/(^.*)-(.*)($-\w)?/

It currently gives me:
[zoo] [2.10.1p1]
[mutt-1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd-5.2.3] [no_x11]

As you can see I'm having issues with the flavor (the last part of the
package name which not every package has) part. Any help?

How do you determine what part is what in p5-Archive-Tar-1.30? The easy
solution (non-greedy regexpen, look it up in perldoc perlre) will
misparse this entry, it will give [p5-Archive] [Tar] [1.30].

Assuming the version always starts with a digit, and never has a dash,
you may want to try (untested):
/^(.*?)-(\d[^-]*)(?:-(.*))?$/
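
A quick way to try that pattern against the dataset from the question, as a sketch only. The variable names (@names, @results) are made up for illustration, and the stated assumption still applies: the version starts with a digit and contains no dash.

```perl
use strict;
use warnings;

# The pattern above: lazy name, a version that starts with a digit and
# has no dash, then an optional flavour after one more dash.
my $re = qr/^(.*?)-(\d[^-]*)(?:-(.*))?$/;

my @names = qw(
    zoo-2.10.1p1
    mutt-1.4.2.3-compressed
    lha-1.14i.ac20050924.1
    mysql-server-5.0.45
    p5-Archive-Tar-1.30
    php5-gd-5.2.3-no_x11
);

my @results;
for my $pkg (@names) {
    # On a failed match the list assignment yields 0, so we skip the line.
    my ($name, $version, $flavour) = $pkg =~ $re
        or next;
    push @results, [$name, $version, $flavour];
    # $flavour is undef when the package has no flavour part.
    print join(' ', map { "[$_]" } grep { defined } $name, $version, $flavour), "\n";
}
```

The lazy `(.*?)` keeps extending the name until the next dash is followed by a digit, which is why p5-Archive-Tar stays in one piece.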

HTH,
M4

donebrowsers · Mar 12, 2008

That's it! Thanks Martijn.

Jürgen Exner · Mar 12, 2008

donebrowsers said:
I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

These are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

The examples are nice, but an additional verbal description of what you are
trying to do would help a lot.
Are you trying to split the string at each dash (minus sign) and store the
pieces in an array?

That is trivial:
@pieces = split /-/, $string;

jue

donebrowsers · Mar 12, 2008

Yes and no. There are -s in the name of the package, for example
php5-core. I want the package name php5-core, and the version, 5.2.3; and
if there is a flavor, for example the mutt package, I want that
separate. Martijn's regex works perfectly.

Ben Morrow · Mar 12, 2008

donebrowsers said:
I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

These are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]

Assuming the 'flavour' never contains a dot, and 'version' always does
(otherwise it's impossible to distinguish 'flavour' from 'version'
without more information):

my @pkg = /^(.+?)-([^-]+\.[^-]+)(?:-([^.-]+))?$/;

or, better,

my @pkg = m{
^ (.+?) - # name
( [^-]+ \. [^-]+ ) # version must contain a dot
(?: - ( [^.-]+ ) )? $ # optional flavour mustn't
}x;

It would probably be simpler to do this in multiple steps:

# s/// returns the number of substitutions, so take the capture from $1
my $flavour = s/-([^.-]+)$// ? $1 : undef;
my $version = s/-([^-]+)$//  ? $1 : undef;
my $name    = $_;

(correct for American spellings to taste).

Doesn't OpenBSD provide tools to do this sort of thing, that reference
the package database and know what the right answer is?

Ben

donebrowsers · Mar 12, 2008

Unfortunately as far as I can tell no. I've searched and can't find
anything.

More info on the naming scheme:
The stem part identifies the package. It may contain some dashes, but
its form is mostly conventional. For instance, japanese packages usually
start with a `ja' prefix, e.g., "ja-kterm-6.2.0".

The version part starts at the first digit that follows a `-', and goes
on up to the following `-', or to the end of the package name, if no
flavor modifier is present. It is highly recommended that all packages
have a version number. Normally, the version number directly matches the
original software distribution version number, or release date. In case
there are substantial changes in the OpenBSD package, a patch level
marker should be appended, e.g., `p1', `p2 ...' For example, assuming
that the screen package for release 2.8 was named "screen-2.9.8" and
that an important security patch led to a newer package, the new package
would be called "screen-2.9.8p1". Obviously, these specific markers are
reserved for OpenBSD purposes.

Flavored packages will also contain a list of flavors after the version
identifier, in a canonical order determined by FLAVORS in the
corresponding port's Makefile. For instance, kterm has an xaw3d flavor:
"ja-kterm-xaw3d".

Note that, to uniquely identify the version part, no flavor shall ever
start with a digit. Usually, flavored packages are slightly different
versions of the same package that offer very similar functionalities.

Ben Morrow · Mar 12, 2008

donebrowsers said:
Unfortunately as far as I can tell no. I've searched and can't find
anything.

The version part starts at the first digit that follows a `-',
and goes on up to the following `-', or to the end of the package
name, if no flavor modifier is present.

Flavored packages will also contain a list of flavors after the
version identifier, in a canonical order determined by FLAVORS in
the corresponding port's Makefile.

You didn't say there could be more than one flavour.

Note that, to uniquely identify the version part, no flavor shall
ever start with a digit.

So why didn't you post that the first time? The simple solution now
becomes

my @parts = split /-/, $pkgname;
my @flavours;
unshift @flavours, pop @parts while @parts && $parts[-1] =~ /^\D/;
my $version = pop @parts;
my $name = join '-', @parts;

Translating that into an evil regex I leave as an exercise. Probably you
can just add a * in the right place in one of the ones you have already
been given.
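
One possible answer to that exercise, offered as a sketch and not as anyone's posted solution. It assumes the quoted naming rules hold (the version starts with a digit, no flavour does). The helper name parse_pkgname and the test name foo-1.0-a-b are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (name, version, flavours...) or an empty
# list if the string does not look like a package name.
sub parse_pkgname {
    my ($pkgname) = @_;
    my ($name, $version, $rest) =
        $pkgname =~ /^(.*?)-(\d[^-]*)((?:-[^-\d][^-]*)*)$/
        or return;
    # $rest looks like "-flav1-flav2"; split it and drop the empty
    # leading field.
    my @flavours = grep { length } split /-/, $rest;
    return ($name, $version, @flavours);
}

# foo-1.0-a-b is an invented name with two flavours.
for my $pkg (qw(zoo-2.10.1p1 php5-gd-5.2.3-no_x11 foo-1.0-a-b)) {
    print join(' ', map { "[$_]" } parse_pkgname($pkg)), "\n";
}
```

The starred group collects every trailing part that does not start with a digit, so any number of flavours ends up after the version in the returned list.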

Ben

Steve K. · Mar 16, 2008

Uri said:
d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was
valid perl and that will likely be all you will get here.

1) What right do you have to speak for everyone? You should of said "and
that will likely be all you will get from me" as there are plenty of
people who actually offer help without the attitude people like your
self feel they must attach. You may not like PHP, but it has a place,
just as Perl does.

2) Yes we all know it's Perl. The world will cease to rotate properly on
it's axis and we will die well in advance of 2012 (right...) if someone
says PERL... one could always leave a friendly little note about it if
it really bothers you that much rather than the rude fucktardery your
types like to share.

Uri Guttman · Mar 16, 2008

SK> Uri Guttman said:
"d" == donebrowsers writes:

d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular
SK> 1) What right do you have to speak for everyone? You should of
SK> said "and that will likely be all you will get from me" as there
SK> are plenty of people who actually offer help without the attitude
SK> people like your self feel they must attach. You may not like PHP,
SK> but it has a place, just as Perl does.

not in this group. that is the whole point. this group is about perl and
not php. you can find php help over there --->

SK> 2) Yes we all know it's Perl. The world will cease to rotate
SK> properly on it's axis and we will die well in advance of 2012
SK> (right...) if someone says PERL... one could always leave a
SK> friendly little note about it if it really bothers you that much
SK> rather than the rude fucktardery your types like to share.

i would rather correct it when and how i please. you can uncorrect it as
you wish.

uri

Tad J McClellan · Mar 16, 2008

Steve K. said:
Uri said:

"d" == donebrowsers writes:

d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was
valid perl and that will likely be all you will get here.

1) What right do you have to speak for everyone? You should of said "and
                                                     ^^^^^^^^^

"Who would cross the Bridge of Death must answer me these questions three"

John W. Krahn · Mar 16, 2008

Tad said:
Steve K. said:

Uri said:

d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was
valid perl and that will likely be all you will get here.

1) What right do you have to speak for everyone? You should of said "and
                                                     ^^^^^^^^^

"Who would cross the Bridge of Death must answer me these questions three"

Blue. No yel-- Auuuuuuuugh!

John
