help with a regex

donebrowsers

I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

These are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

I currently have this regex (which is close but doesn't quite work):
/(^.*)-(.*)($-\w)?/

It currently gives me:
[zoo] [2.10.1p1]
[mutt-1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd-5.2.3] [no_x11]

As you can see, I'm having issues with the flavor part (the last part of
the package name, which not every package has). Any help?
 
John W. Krahn

donebrowsers said:
I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

There are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

$ echo "zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11" | \
perl -lne'
print join " ", map $_ ? "[$_]" : (), split /-?(\d[\w.]*\d)-?/, $_, 2
'
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]



John
 
donebrowsers

While that works and I appreciate it, I was just using the []s as a
placeholder. I'm actually using PHP's preg_match() function
(http://us3.php.net/manual/en/function.preg-match.php), which
uses PERL style regular expressions. I submitted it to this group
because PERL programmers tend to be better with regular expressions
than anyone else.

This function essentially matches parts and adds them to an array with
[0] matching the whole string, [1]... matching the ()s. So what I
really have is:
array(3) {
[0]=>
string(12) "zoo-2.10.1p1"
[1]=>
string(3) "zoo"
[2]=>
string(8) "2.10.1p1"
}
array(3) {
[0]=>
string(23) "mutt-1.4.2.3-compressed"
[1]=>
string(12) "mutt-1.4.2.3"
[2]=>
string(10) "compressed"
}
array(3) {
[0]=>
string(22) "lha-1.14i.ac20050924.1"
[1]=>
string(3) "lha"
[2]=>
string(18) "1.14i.ac20050924.1"
}
array(3) {
[0]=>
string(19) "mysql-server-5.0.45"
[1]=>
string(12) "mysql-server"
[2]=>
string(6) "5.0.45"
}
array(3) {
[0]=>
string(19) "p5-Archive-Tar-1.30"
[1]=>
string(14) "p5-Archive-Tar"
[2]=>
string(4) "1.30"
}
array(3) {
[0]=>
string(20) "php5-gd-5.2.3-no_x11"
[1]=>
string(13) "php5-gd-5.2.3"
[2]=>
string(6) "no_x11"
}

What I want is, e.g.:
array(4) {
[0]=>
string(20) "php5-gd-5.2.3-no_x11"
[1]=>
string(7) "php5-gd"
[2]=>
string(5) "5.2.3"
[3]=>
string(6) "no_x11"
}

Thanks for your suggestion though, sorry for the confusion.
 
Uri Guttman

d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was valid
perl and that will likely be all you will get here.

uri
 
donebrowsers

Fine. How would I use perl to do what I am trying to do? Strip out
those parts and add them to an array?
 
Uri Guttman

d> Fine. How would I use perl to do what I am trying to do? Strip out
d> those parts and add them to an array?

what was wrong with the answer you got? it split the package names into
the parts you wanted. grabbing those and making them into arrays is
trivial. just assign the grabs to an array or put the split into [] to
make an anon array. then you can build up the data structure from that.
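For illustration, here is a minimal sketch of that idea, reusing the split expression from the earlier reply. The sub name parse_pkg and the hash layout are made up for this example and are not something anyone in the thread posted.

```perl
use strict;
use warnings;

# Hypothetical helper: split one package string with the expression from
# the earlier reply and return the pieces as an anonymous array.
sub parse_pkg {
    my ($pkg) = @_;
    # The capture group makes split return the version as a field too;
    # the limit of 2 stops after the first version-looking separator.
    my @parts = grep { defined($_) && length($_) }
                split /-?(\d[\w.]*\d)-?/, $pkg, 2;
    return \@parts;
}

# Build up a data structure keyed by package name (an assumed layout).
my %packages;
for my $pkg (qw(zoo-2.10.1p1 mutt-1.4.2.3-compressed php5-gd-5.2.3-no_x11)) {
    my ($name, $version, $flavor) = @{ parse_pkg($pkg) };
    $packages{$name} = { version => $version, flavor => $flavor };
}

for my $name (sort keys %packages) {
    my $p = $packages{$name};
    my $flavor = defined $p->{flavor} ? $p->{flavor} : '(no flavor)';
    print "$name $p->{version} $flavor\n";
}
```

Packages without a flavor simply end up with an undefined flavor entry, so callers have to check for that.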

uri
 
Martijn Lievaart

I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

There are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

I currently have this regex (which is close but doesn't quite work):
/(^.*)-(.*)($-\w)?/

It currently gives me:
[zoo] [2.10.1p1]
[mutt-1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd-5.2.3] [no_x11]

As you can see I'm having issues with the flavor (the last part of the
package name which not every package has) part. Any help?

How do you determine what part is what in p5-Archive-Tar-1.30? The easy
solution (non-greedy regexpen, look it up in perldoc perlre) will
misparse this entry, it will give [p5-Archive] [Tar] [1.30].

Assuming the version always starts with a digit, and never has a dash,
you may want to try (untested):
/^(.*?)-(\d[^-]*)(?:-(.*))?$/
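
A quick way to check that expression against the sample data is sketched below. The sub name parse_name is hypothetical, and the sketch relies on the same assumptions stated above (the version starts with a digit and contains no dash).

```perl
use strict;
use warnings;

# The regex as posted: lazy name, a version that starts with a digit and
# contains no dash, then an optional flavor after one more dash.
my $re = qr/^(.*?)-(\d[^-]*)(?:-(.*))?$/;

# Hypothetical helper: returns (name, version) or (name, version, flavor),
# or an empty list when the string does not match at all.
sub parse_name {
    my ($pkg) = @_;
    my @m = $pkg =~ $re or return;
    # An optional group that did not take part in the match is undef.
    return grep { defined $_ } @m;
}

for my $pkg (qw(zoo-2.10.1p1 mutt-1.4.2.3-compressed lha-1.14i.ac20050924.1
                mysql-server-5.0.45 p5-Archive-Tar-1.30 php5-gd-5.2.3-no_x11)) {
    print join(' ', map { "[$_]" } parse_name($pkg)), "\n";
}
```

The lazy (.*?) keeps extending the name until the text after a dash starts with a digit, which is why p5-Archive-Tar stays in one piece.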

HTH,
M4
 
Jürgen Exner

donebrowsers said:
I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

There are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]
[mysql-server] [5.0.45]
[p5-Archive-Tar] [1.30]
[php5-gd] [5.2.3] [no_x11]

The examples are nice, but an additional verbal description of what you are
trying to do would help a lot.
Are you trying to split the string at each dash (minus sign) and store the
pieces in an array?

That is trivial:
@pieces = split /-/, $string;

jue
 
donebrowsers

Yes and no. There are -s in the name of the package, for example
php5-core. I want the package name php5-core, and the version, 5.2.3; and
if there is a flavor, for example the mutt package, I want that
separate. Martijn's regex works perfectly.
 
Ben Morrow

Quoth donebrowsers said:
I have the following dataset:
zoo-2.10.1p1
mutt-1.4.2.3-compressed
lha-1.14i.ac20050924.1
mysql-server-5.0.45
p5-Archive-Tar-1.30
php5-gd-5.2.3-no_x11

There are package listings on an OpenBSD machine. I want to parse the
package name, the version, and if there is a flavor, that too. I want
the following:
[zoo] [2.10.1p1]
[mutt] [1.4.2.3] [compressed]
[lha] [1.14i.ac20050924.1]

Assuming the 'flavour' never contains a dot, and 'version' always does
(otherwise it's impossible to distinguish 'flavour' from 'version'
without more information)

my @pkg = /^(.+?)-([^-]+\.[^-]+)(?:-([^.-]+))?$/;

or, better,

my @pkg = m{
^ (.+?) - # name
( [^-]+ \. [^-]+ ) # version must contain a dot
(?: - ( [^.-]+ ) )? $ # optional flavour must not
}x;

It would probably be simpler to do this in multiple steps:

my $flavour = s/-([^.-]+)$// ? $1 : undef; # s/// returns a count, so take $1
my $version = s/-([^-]+)$// ? $1 : undef;
my $name = $_;

(correct for American spellings to taste :) ).

Doesn't OpenBSD provide tools to do this sort of thing, that reference
the package database and know what the right answer is?

Ben
 
donebrowsers

Unfortunately as far as I can tell no. I've searched and can't find
anything.

More info on the naming scheme:
The stem part identifies the package. It may contain some dashes, but
its form is mostly conventional. For instance, japanese packages usually
start with a `ja' prefix, e.g., "ja-kterm-6.2.0".

The version part starts at the first digit that follows a `-', and goes
on up to the following `-', or to the end of the package name, if no
flavor modifier is present. It is highly recommended that all packages
have a version number. Normally, the version number directly matches the
original software distribution version number, or release date. In case
there are substantial changes in the OpenBSD package, a patch level
marker should be appended, e.g., `p1', `p2 ...' For example, assuming
that the screen package for release 2.8 was named "screen-2.9.8" and
that an important security patch led to a newer package, the new package
would be called "screen-2.9.8p1". Obviously, these specific markers are
reserved for OpenBSD purposes.

Flavored packages will also contain a list of flavors after the version
identifier, in a canonical order determined by FLAVORS in the
corresponding port's Makefile. For instance, kterm has an xaw3d flavor:
"ja-kterm-xaw3d".

Note that, to uniquely identify the version part, no flavor shall ever
start with a digit. Usually, flavored packages are slightly different
versions of the same package that offer very similar functionalities.
 
Ben Morrow

Quoth donebrowsers said:
Unfortunately as far as I can tell no. I've searched and can't find
anything.
The version part starts at the first digit that follows a `-',
and goes on up to the following `-', or to the end of the package
name, if no flavor modifier is present.
Flavored packages will also contain a list of flavors after the
version identifier, in a canonical order determined by FLAVORS in
the corresponding port's Makefile.

You didn't say there could be more than one flavour.

Note that, to uniquely identify the version part, no flavor shall
ever start with a digit.

So why didn't you post that the first time? The simple solution now
becomes

my @parts = split /-/, $pkgname;
my @flavours;
# $parts[-1] is the last element; @parts[-1] would be a one-element slice
unshift @flavours, pop @parts while $parts[-1] =~ /^\D/;
my $version = pop @parts;
my $name = join '-', @parts;

Translating that into an evil regex I leave as an exercise. Probably you
can just add a * in the right place in one of the ones you have already
been given.
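
One possible way to do that exercise is sketched below. It is not the only answer: instead of repeating a capture group with * (which would only keep the last flavour), it captures everything after the version and splits it. The sub name parse_pkgname and the multi-flavour sample name in the usage line are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper following the quoted naming rules: the version
# starts at the first digit after a '-', and no flavour starts with a digit.
sub parse_pkgname {
    my ($pkgname) = @_;
    my ($name, $version, $rest) =
        $pkgname =~ /^(.*?)-(\d[^-]*)(?:-(.*))?$/
        or return;
    # Everything after the version is a dash-separated list of flavours.
    my @flavours = defined $rest ? split(/-/, $rest) : ();
    return ($name, $version, \@flavours);
}

# 'vim-7.1.33-gtk2-perl' is an invented name used only to show two flavours.
for my $pkg (qw(zoo-2.10.1p1 php5-gd-5.2.3-no_x11 vim-7.1.33-gtk2-perl)) {
    my ($name, $version, $flavours) = parse_pkgname($pkg);
    print "$name | $version | @$flavours\n";
}
```

Returning the flavours as an array reference keeps the return list the same length whether a package has zero, one, or several flavours.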

Ben
 
Steve K.

Uri said:
d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was
valid perl and that will likely be all you will get here.

1) What right do you have to speak for everyone? You should of said "and
that will likely be all you will get from me" as there are plenty of
people who actually offer help without the attitude people like your
self feel they must attach. You may not like PHP, but it has a place,
just as Perl does.

2) Yes we all know it's Perl. The world will cease to rotate properly on
it's axis and we will die well in advance of 2012 (right...) if someone
says PERL... one could always leave a friendly little note about it if
it really bothers you that much rather than the rude fucktardery your
types like to share.
 
Uri Guttman

SK> Uri Guttman said:
"d" == donebrowsers writes:
d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular
SK> 1) What right do you have to speak for everyone? You should of
SK> said "and that will likely be all you will get from me" as there
SK> are plenty of people who actually offer help without the attitude
SK> people like your self feel they must attach. You may not like PHP,
SK> but it has a place, just as Perl does.

not in this group. that is the whole point. this group is about perl and
not php. you can find php help over there --->

SK> 2) Yes we all know it's Perl. The world will cease to rotate
SK> properly on it's axis and we will die well in advance of 2012
SK> (right...) if someone says PERL... one could always leave a
SK> friendly little note about it if it really bothers you that much
SK> rather than the rude fucktardery your types like to share.

i would rather correct it when and how i please. you can uncorrect it as
you wish.

uri
 
Tad J McClellan

Steve K. said:
Uri said:
"d" == donebrowsers writes:

d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was
valid perl and that will likely be all you will get here.

1) What right do you have to speak for everyone? You should of said "and
^^^^^^^^^
^^^^^^^^^

"Who would cross the Bridge of Death must answer me these questions three"
 
John W. Krahn

Tad said:
Steve K. said:
Uri said:
d> While that works and I appreciate it, I was just using the []s as a
d> placeholder. I'm actually using PHP's preg_match() function
d> (http://us3.php.net/manual/en/function.preg-match.php), which
d> uses PERL style regular expressions. I submitted it to this group
d> because PERL programmers tend to be better with regular expressions
d> than anyone else.

it is Perl, never PERL. preg is NOT perl, nor is it compatible with
perl. that is why we use Perl and not php. the answer you got was
valid perl and that will likely be all you will get here.
1) What right do you have to speak for everyone? You should of said "and
^^^^^^^^^
^^^^^^^^^

"Who would cross the Bridge of Death must answer me these questions three"

Blue. No yel-- Auuuuuuuugh!


John
 
