Newbie question on split, and also awk.

Colin B. · May 16, 2006

Hey all;

After years of avoiding perl in favour of ksh+sed+awk+grep+etc., I'm
finally digging into it. So far everything I've mucked with, I've managed
to get working after a certain amount of self-abuse, with the exception of
two things.

1) I'm pulling data lists from the OS, and parsing them in perl.
The minimum required code to illustrate my problem is this:

$group = shift(@ARGV) || die;
@nislist = split(/[:,]/,`nismatch $group group.org_dir`);
chomp @nislist;
print scalar @nislist, "\n";

And the data for two different groups is:
group1:*:109:
group2:*:20:user1

The problem is that despite reading that split will discard blank trailing
fields, I get four fields from BOTH of the above groups. There are no
spaces at the end of the first group, so I'm assuming that the fourth field
in group1 is a newline. That also leads me to believe that chomp $str where
$str="\n" results in $str="", but not $str=<undef>. Is this correct?

So I came up with a workaround:

my $nislist = `nismatch $group group.org_dir`;
chomp $nislist;
@nislist = split(/[:,]/,$nislist);

Is there anything less ugly than this? I would have liked to do something
like:
chomp (@nislist = split...);

but it complains and doesn't work.

Thanks,
Colin

A. Sinan Unur · May 16, 2006

Re: Newbie question on split, and also awk.

Please read the posting guidelines for this group. Also, I do not see what
awk has to do with Perl or what your post has to do with awk.

Your example is missing these two lines:

use strict;
use warnings;

$group = shift(@ARGV) || die;
@nislist = split(/[:,]/,`nismatch $group group.org_dir`);
chomp @nislist;
print scalar @nislist, "\n";

And the data for two different groups is:
group1:*:109:
group2:*:20:user1

The problem is that despite reading that split will discard blank
trailing fields, I get four fields from BOTH of the above groups.

I cannot reproduce that.

Of course, I do not have a program called nismatch on my computer, so the
script below reads the two sample lines from __DATA__ instead.

#!/usr/bin/perl

use strict;
use warnings;

while ( my $nismatch = <DATA> ) {
    chomp $nismatch;
    my @nislist = split /[:,]/, $nismatch;
    local $" = ' + ';
    print "@nislist\n";
}

__DATA__
group1:*:109:
group2:*:20:user1

Sinan

--
A. Sinan Unur <[email protected]>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

John W. Krahn · May 16, 2006

Colin said:
After years of avoiding perl in favour of ksh+sed+awk+grep+etc., I'm
finally digging into it. So far everything I've mucked with, I've managed
to get working after a certain amount of self-abuse, with the exception of
two things.

1) I'm pulling data lists from the OS, and parsing them in perl.
The minimum required code to illustrate my problem is this:

$group = shift(@ARGV) || die;
@nislist = split(/[:,]/,`nismatch $group group.org_dir`);
chomp @nislist;
print scalar @nislist, "\n";

And the data for two different groups is:
group1:*:109:
group2:*:20:user1

The problem is that despite reading that split will discard blank trailing
fields, I get four fields from BOTH of the above groups. There are no
spaces at the end of the first group, so I'm assuming that the fourth field
in group1 is a newline.

You could just include the newline in the character class:

my @nislist = split /[:,\n]+/, `nismatch $group group.org_dir`;
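One thing to keep in mind with this pattern: the `+` makes a run of adjacent delimiters count as a single separator, so an empty field in the middle of a record disappears and later fields shift position. The sketch below compares the pattern with and without the `+`. The `group3` record is a made-up example, not data from the original post.

```perl
use strict;
use warnings;

# The sample line from the post, newline included: the trailing ":\n"
# is one separator, and the empty field after it is discarded.
my @with_plus = split /[:,\n]+/, "group1:*:109:\n";
print scalar(@with_plus), "\n";    # 3

# Hypothetical record with an empty password field.
my $record = "group3::30:user1\n";
my @merged = split /[:,\n]+/, $record;    # ("group3", "30", "user1")
my @kept   = split /[:,\n]/,  $record;    # ("group3", "", "30", "user1")
print scalar(@merged), " vs ", scalar(@kept), "\n";    # 3 vs 4
```

If field positions matter, dropping the `+` keeps empty middle fields while still removing the newline.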

That also leads me to believe that chomp $str where
$str="\n" results in $str="", but not $str=<undef>. Is this correct?

Yes.

John

Juha Laiho · May 18, 2006

Colin B. said:
After years of avoiding perl in favour of ksh+sed+awk+grep+etc., I'm
finally digging into it.

Sounds familiar. I did the same.

So far everything I've mucked with, I've managed to get working after
a certain amount of self-abuse, with the exception of two things.

1) I'm pulling data lists from the OS, and parsing them in perl.
The minimum required code to illustrate my problem is this:

$group = shift(@ARGV) || die;
@nislist = split(/[:,]/,`nismatch $group group.org_dir`);
chomp @nislist;
print scalar @nislist, "\n";

And the data for two different groups is:
group1:*:109:
group2:*:20:user1

The problem is that despite reading that split will discard blank trailing
fields, I get four fields from BOTH of the above groups. There are no
spaces at the end of the first group, so I'm assuming that the fourth field
in group1 is a newline.

Correct.
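The effect can be reproduced without nismatch by handing split the same strings with the newline still attached. This is only a sketch: the sample lines are the ones from the post, and the helper name `count_fields` is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: number of fields split produces for one raw line.
sub count_fields {
    my ($raw) = @_;
    my @fields = split /[:,]/, $raw;
    return scalar @fields;
}

# With the newline attached, the last field of group1 is "\n".
# That is not an empty string, so split keeps it.
print count_fields("group1:*:109:\n"), "\n";        # 4
print count_fields("group2:*:20:user1\n"), "\n";    # 4

# Without the newline, the trailing empty field is dropped.
print count_fields("group1:*:109:"), "\n";          # 3
```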

That also leads me to believe that chomp $str where
$str="\n" results in $str="", but not $str=<undef>. Is this correct?

Why didn't you test? One way to test whether something is a newline or not
is to print it between two colons (or other characters; colons seem to
be my preference), like:

perl -e '$str="\n"; print ":$str:\n"; chomp $str; print ":$str:\n";'
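Printing between colons shows that the string is empty, but not whether it is still defined. A slightly longer sketch that checks both, using only built-ins:

```perl
use strict;
use warnings;

my $str = "\n";
my $removed = chomp $str;    # chomp returns the number of characters removed

print "still defined\n" if defined $str;
print "length: ", length($str), "\n";    # 0
print "removed: $removed\n";             # 1
```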

So I came up with a workaround:

my $nislist = `nismatch $group group.org_dir`;
chomp $nislist;
@nislist = split(/[:,]/,$nislist);

Is there anything less ugly than this?

I would consider that readable, not ugly, provided I remember correctly that
nismatch returns only one line per execution.

You chomp that one line (i.e. remove the trailing newline from that line).
After that, you split the result. Now, if you consider it ugly that you
have one extra variable here, get over that. You can even restrict the
variable scope to just the relevant area:

(@nislist declared somewhere outside the block)
....
{
    my $nislist = `nismatch $group group.org_dir`;
    chomp $nislist;
    @nislist = split(/[:,]/,$nislist);
}
....
($nislist declared only within the brace-enclosed block)
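A more compact variant is to chomp the assignment itself, which removes the newline before split ever sees it. The sketch below uses a stand-in subroutine, `fake_nismatch`, in place of the backticks so that it runs anywhere; that name is an assumption for illustration only. It also shows why chomping the result of split cannot help: by then the extra field already exists.

```perl
use strict;
use warnings;

# Stand-in for `nismatch $group group.org_dir` (hypothetical; returns one line).
sub fake_nismatch { return "group1:*:109:\n" }

# chomp the scalar first, then split.
chomp(my $line = fake_nismatch());
my @nislist = split /[:,]/, $line;
print scalar(@nislist), "\n";    # 3

# chomp after split is too late: the "\n" field becomes "", but it stays.
chomp(my @late = split /[:,]/, fake_nismatch());
print scalar(@late), "\n";       # 4
```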

Depending on what you're going to do with the results from
nismatch, you might want to put the processing into a small
separate subroutine in any case. I'd be tempted to write the
subroutine to return a (hash) structure separating
- group name
- group password field
- group id
- group member list (perhaps as a hash, if the order of members
  is not significant; this would make it easy to query whether or
  not some given user is a member of the group)
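A minimal sketch of such a subroutine, assuming the line format shown in the original post (name, password field, gid, comma-separated members). The name `parse_group`, the hash keys, and the `user2` member are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one group line into a hash reference.
sub parse_group {
    my ($line) = @_;
    chomp $line;
    # A limit of 4 keeps the member list in one piece; it is split below.
    my ($name, $passwd, $gid, $members) = split /:/, $line, 4;
    $members = '' unless defined $members;
    return {
        name    => $name,
        passwd  => $passwd,
        gid     => $gid,
        # Members as hash keys, so membership is a simple lookup.
        members => { map { $_ => 1 } split /,/, $members },
    };
}

my $group = parse_group("group2:*:20:user1,user2\n");
print "gid $group->{gid}\n";
print "user1 is a member\n" if $group->{members}{user1};
```

Splitting on ":" first and on "," afterwards also keeps the member list apart from the fixed fields, which the single `[:,]` pattern does not.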

Peter J. Holzer · May 18, 2006

Colin said:
After years of avoiding perl in favour of ksh+sed+awk+grep+etc., I'm
finally digging into it.

I did the same 10 years ago. But I still write shell-scripts sometimes.

So far everything I've mucked with, I've managed to get working after
a certain amount of self-abuse, with the exception of two things.

1) I'm pulling data lists from the OS, and parsing them in perl.
The minimum required code to illustrate my problem is this:

$group = shift(@ARGV) || die;
@nislist = split(/[:,]/,`nismatch $group group.org_dir`);
chomp @nislist;

Others have already answered your question, but depending on what you
are trying to do you may want to replace the split and chomp lines with

@nislist = getgrnam($group);
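Note that getgrnam does not return the same list as the split above: in list context it gives the name, password field, gid, and the members as one space-separated string. A small sketch of using it, with a hypothetical wrapper named `group_members`; whether the lookup actually consults NIS depends on how the system's name service is configured.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the built-in getgrnam.
sub group_members {
    my ($group) = @_;
    my ($name, $passwd, $gid, $members) = getgrnam($group);
    return unless defined $name;    # no such group: empty list
    return split ' ', $members;     # members arrive as one space-separated string
}

# 'group1' is only the sample name from the thread, used as a default.
my $group = shift(@ARGV) || 'group1';
my @members = group_members($group);
print scalar(@members), " member(s) found for $group\n";
```

As written, an unknown group and a group without members both yield an empty list; a caller that needs to tell them apart would have to check getgrnam's return value directly.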

hp
