Newbie: How do I extract a word

Mav

Hi there,

I have a string like:
$string = "------ Build started: Project: Myproject, Config: Debug ABC ------";

I would like to print out only what is between "Project:" and ",",
which in this case is "Myproject", in Perl.

Any ideas?

Thanks,
M
 
GM

The following assumes you have no whitespace in your project names:

my $project = $string =~ /Project: (\S+),/;
 
Dale Henderson

GM> The following assumes you have no whitespace in your project
GM> names:

GM> my $project = $string =~ /Project: (\S+),/;

The following makes no such assumption:

my $project = $string =~ /Project: ([^,]+),/;

:)
 
Josef Moellers

Dale Henderson said:
GM> The following assumes you have no whitespace in your project
GM> names:

GM> my $project = $string =~ /Project: (\S+),/;

DH> The following makes no such assumption:

DH> my $project = $string =~ /Project: ([^,]+),/;

Nitpicking: That's not what Mav wrote and it's not even correct:

my $project;
($project = $string) =~ s/.*Project:([^,]+),.*/$1/;

At least with Perl v5.8.1:
- without the parentheses around the assignment, the value of $project is 1;
- with the "my", I get: Can't declare scalar assignment in "my" at - line 2, near ") =~";
- without the substitution, I get the entire string;
- without the .*'s, I get another string.

You can also specify "non-greediness":
($project = $string) =~ s/.*Project:(.*?),.*/$1/;
 
Mav

I tried it, and it doesn't seem to work. I had to do the following (the silly way):

# original string
$string = "------ Build started: Project: Myproject, Config: Debug ABC ------";

# find the boundaries of "Myproject"
$w = rindex($string, ",");    # position of the comma after the name
$b = rindex($string, "t: ");  # position of the "t: " that ends "Project: "

if ($w > $b) {
    # skip the 3 characters of "t: " to get "Myproject" out
    $projName = substr($string, $b + 3, $w - $b - 3);
    print "$projName\n";
}

Is there a better way?
Thanks,
Mav
 
Dale Henderson

JM> Dale Henderson said:
The following makes no such assumption:
my $project = $string =~ /Project: ([^,]+),/;

JM> Nitpicking: That's not what Mav wrote and it's not even
JM> correct:

What isn't what Mav wrote?

As to the correctness, I know better than to post untested code.
The correct version is:

my ($project) = $string =~ /Project: ([^,]+),/;
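The difference between the two forms comes down to context. A minimal sketch, assuming the sample string from the original post ($flag is just an illustrative name):

```perl
use strict;
use warnings;

my $string = "------ Build started: Project: Myproject, Config: Debug ABC ------";

# Scalar assignment: the match only reports success (1) or failure.
my $flag = $string =~ /Project: ([^,]+),/;

# List assignment: the match returns the captured groups.
my ($project) = $string =~ /Project: ([^,]+),/;

print "scalar: $flag\n";     # prints "scalar: 1"
print "list:   $project\n";  # prints "list:   Myproject"
```

The parentheses around $project are what put the match in list context.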


JM> my $project;
JM> ($project = $string) =~s/.*Project:([^,]+),.*/$1/;

JM> At least with Perl v5.8.1 without the parentheses around the
JM> assignment, the value of $project is 1

Yes. The problem is that without the parentheses, the substitution
binds to $string, and $project is assigned its return value: the
number of substitutions made, not the captured text you want.

An equivalent way to do this is:

(my $project = $string) =~ s/.*Project:([^,]+),.*/$1/;

However, this makes $project pointless since the replacement
modifies $string to be the $1 and then $project is assigned the
value of $string.

Note also that your solution leaves a leading space, returning
" Myproject" rather than "Myproject". The OP asked for "Myproject"
but also for everything between "Project:" and ",", so we have a
specification error. I suspect the OP wanted to eliminate leading
spaces (and possibly trailing ones). Leading spaces can be removed
with:

my ($project) = $string =~ /Project:\s*([^,]+),/;

JM> with the "my", I get "Can't declare scalar assignment in "my"
JM> at - line 2, near ") =~"" without the substitution, I get the
JM> entire string,

That's because you're assigning $string to $project and ignoring the
match.

JM> without the .*'s, I get another string.

You need the .*'s to delete the rest of the string. You're
essentially replacing the whole string with $1.


JM> You can also specify "non-greedyness": ($project = $string) =~
JM> s/.*Project:(.*?),.*/$1/;

A negated character class is the "right" answer in this case. For
one thing, it's more efficient. For a discussion of why you should
choose a negated character class over a non-greedy regex, see
"Mastering Regular Expressions" (Owl), 1st edition, pp. 226-227.
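For reference, a small sketch of the two styles side by side, assuming the sample string from the original post. It only shows that both capture the same text here; it does not measure the efficiency difference. The variable names are illustrative.

```perl
use strict;
use warnings;

my $string = "------ Build started: Project: Myproject, Config: Debug ABC ------";

# Negated character class: consume everything that is not a comma.
my ($by_class) = $string =~ /Project:\s*([^,]+),/;

# Non-greedy quantifier: consume as little as possible before a comma.
my ($by_lazy) = $string =~ /Project:\s*(.*?),/;

print "class: $by_class\n";  # prints "class: Myproject"
print "lazy:  $by_lazy\n";   # prints "lazy:  Myproject"
```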


The way I would normally implement this is something like:

my $project;

if ($string =~ /Project:\s+([^,]+),/) {
    $project = $1;
} else {
    print "Bad string: $string\n";
}

But this may not be necessary in this case.
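If the check is needed in more than one place, the same idea could be wrapped in a small sub. This is only a sketch: extract_project is a hypothetical name, and the second test line is made up to exercise the failure branch.

```perl
use strict;
use warnings;

# extract_project is a hypothetical helper, not something from the thread.
# It returns the project name, or undef when the line does not match.
sub extract_project {
    my ($line) = @_;
    return $line =~ /Project:\s*([^,]+),/ ? $1 : undef;
}

my @lines = (
    "------ Build started: Project: My project, Config: Debug ABC ------",
    "no project marker in this line",   # invented input for the failure case
);

for my $line (@lines) {
    my $project = extract_project($line);
    if (defined $project) {
        print "Found: $project\n";
    } else {
        print "Bad string: $line\n";
    }
}
```

Returning undef lets the caller decide how to report a bad line instead of printing from inside the sub.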
 
Joe Smith

Dale said:
An equivalent way to do this is:

(my $project = $string) =~ s/.*Project:([^,]+),.*/$1/;

However, this makes $project pointless since the replacement
modifies $string to be the $1 and then $project is assigned the
value of $string.

No, it does not.

It copies the value of $string to the variable $project first, and then
performs the substitution on $project, leaving $string untouched.
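A quick way to check this, sketched with the sample string from the original post and the regex quoted above:

```perl
use strict;
use warnings;

my $string = "------ Build started: Project: Myproject, Config: Debug ABC ------";

# Copy $string into $project, then run the substitution on the copy.
(my $project = $string) =~ s/.*Project:([^,]+),.*/$1/;

# Brackets make the leading space left by this regex visible.
print "project: [$project]\n";  # prints "project: [ Myproject]"
print "string:  $string\n";     # still the full original line
```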

-Joe
 
Dale Henderson

JS> Dale Henderson said:
An equivalent way to do this is: (my $project = $string) =~
s/.*Project:([^,]+),.*/$1/; However, this makes $project
pointless since the replacement modifies $string to be the $1
and then $project is assigned the value of $string.

JS> No, it does not.

JS> It copies the value of $string to the variable $project first,
JS> and then performs the substitution on $project, leaving
JS> $string untouched.

Guess I should have tested that too.

Thanks for the correction.
 
Mav

Thanks all. I think I should have said at the beginning that I wanted
to print the string out, rather than extract it.
if ($line =~ /Project:(\s\w*)\,/) {
    print "HERE:$1\n";
}
Thanks all,
Mav

 
