Don't understand this syntax -

Larry

I'm a novice trying to learn some perl, and am trying to understand some
syntax used in a formmail script to separate a list of names.

The use is - (split /\s*,\s*/, $recipient)

I understand split, but can't see why the \s*,\s* is used as the marker. It
seems to evaluate to s,s so why the escapes and asterisks?

I'm sure it's quite simple, but I can't see it??

Thanks much,
Larry L
 
Sherm Pendley

Larry said:
I'm a novice trying to learn some perl, and am trying to understand some
syntax used in a formmail script to separate a list of names.

First off, understand this: If you're referring to Matt Wright's formmail,
don't even bother trying to understand it. It's horribly written, and the
only useful purpose it has is to serve as an example of how *not* to write
Perl.

Larry said:
The use is - (split /\s*,\s*/, $recipient)

I understand split, but can't see why the \s*,\s* is used as the marker.
It seems to evaluate to s,s so why the escapes and asterisks?

Check the docs for the function - "perldoc -f split" and see what it says.
Notice how the first argument is listed as /PATTERN/? That means it's a
regex, so the escapes and asterisks have the same meaning they do in other
regexes. Let's break it down:

/      # Begin the pattern
\s*    # Match any number (including zero) of whitespace (\s) characters
,      # Match a comma
\s*    # Again, any number of whitespace characters
/      # End pattern

So "," " ," " , " would all match the pattern.
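
To see the pattern at work, here is a minimal sketch. The addresses in $recipient are made up for illustration; only the split call itself comes from the script being asked about.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical recipient list with inconsistent spacing around the commas.
my $recipient = 'larry@example.com,sherm@example.com ,  john@example.com , wana@example.com';

# The comma and any whitespace touching it are consumed as the separator,
# so the resulting fields need no further trimming.
my @names = split /\s*,\s*/, $recipient;

print "[$_]\n" for @names;
```

Note that whitespace at the very start or end of the whole string is not next to a comma, so this pattern would leave it in place.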

Have a look at the following perldocs for more about regexes:

perldoc perlrequick
perldoc perlretut
perldoc perlre

sherm--
 
ioneabu

Sherm said:
Check the docs for the function - "perldoc -f split" and see what it says.
Notice how the first argument is listed as /PATTERN/? That means it's a
regex, so the escapes and asterisks have the same meaning they do in other
regexes. Let's break it down:

if I have a string:

my $string = "hello,   how   are   you?";
my @string = split ' ', $string;

The extra white spaces are all removed as if I had used /\s+/, except,
the docs say that they are not quite the same because leading white
space is removed with ' '. By 'leading white space' do they mean
before each word or just at the beginning of the string? Because this
produces completely different results:

my $string = "hello::::how::::are:::::::you?";
my @string = split ':', $string;

now you get a bunch of null fields throughout the array.

There seems to be more to the special properties of ' ' than the docs
state.
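
The difference is easy to reproduce. In this sketch (the array names are only for illustration) ':' is treated as the regex /:/, which matches exactly one colon, so every extra colon yields an empty field. The counterpart of /\s+/ would be /:+/.

```perl
use strict;
use warnings;

my $string = "hello::::how::::are:::::::you?";

my @single = split ':',  $string;   # same as split /:/, one colon per separator
my @run    = split /:+/, $string;   # a run of colons counts as one separator

printf "%d fields with ':'\n",  scalar @single;
printf "%d fields with /:+/\n", scalar @run;
```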

wana
 
Sherm Pendley

ioneabu said:
if I have a string:

my $string = "hello,   how   are   you?";
my @string = split ' ', $string;

The extra white spaces are all removed as if I had used /\s+/, except,
the docs say that they are not quite the same because leading white
space is removed with ' '. By 'leading white space' do they mean
before each word or just at the beginning of the string?

The beginning of the string. Try it and see:

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my $string = ' Hello, I am fine, thanks.';

my @string = split ' ', $string;
print Dumper(\@string);

my @string2 = split /\s+/, $string;
print Dumper(\@string2);

Notice the difference. Splitting on ' ' doesn't give an empty element at the
beginning of the list, whereas /\s+/ does give one.

ioneabu said:
There seems to be more to the special properties of ' ' than the docs
state.

Not at all. The docs state:

... If PATTERN is also omitted, splits on whitespace (after skipping
any leading whitespace). ...

Later on, they say:

As a special case, specifying a PATTERN of space (' ') will
split on white space just as "split" with no arguments does.
Thus, "split(' ')" can be used to emulate awk's default
behavior, whereas "split(/ /)" will give you as many null initial
fields as there are leading spaces.
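
A small sketch of the three cases side by side, assuming a test string with exactly two leading spaces (the variable names are made up for the example):

```perl
use strict;
use warnings;

my $string = "  a b";   # two leading spaces

my @awk    = split ' ',   $string;   # special case: leading whitespace skipped
my @regex  = split /\s+/, $string;   # the leading run gives one empty field
my @spaces = split / /,   $string;   # one empty field per leading space

print scalar(@awk), " ", scalar(@regex), " ", scalar(@spaces), "\n";
```

The three counts differ only because of how each form treats the whitespace at the start of the string.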

sherm--
 
ioneabu

Sherm said:
Not at all. The docs state:

... If PATTERN is also omitted, splits on whitespace (after skipping
any leading whitespace). ...

Later on, they say:

As a special case, specifying a PATTERN of space (' ') will
split on white space just as "split" with no arguments does.
Thus, "split(' ')" can be used to emulate awk's default
behavior, whereas "split(/ /)" will give you as many null initial
fields as there are leading spaces.

so

my @string = split ' ', $string;

is like

$string =~ s/^\s*//;
my @string = split /\s+/, $string;

which is two special properties that make it different from 'c' where c
is any single character other than a single white space character. The
first special property is the trimming of leading white space. The
second is to match one or more white space characters as the delimiter.
I just thought that maybe this wasn't clear in the doc's explanation.


The comparison to awk is wasted on me because I don't have experience
with it. I only started learning how to use a Unix-type system
relatively recently. It's too bad it took Apple so long to adopt Jobs'
concept of a Unix OS for the Mac or I might have had an additional 15
years or so experience with it. I know, there's Linux, but hardware
support was poor until now.

wana
 
Sherm Pendley

ioneabu said:
so

my @string = split ' ', $string;

is like

$string =~ s/^\s*//;
my @string = split /\s+/, $string;

Except that it doesn't change $string, so it's more like:

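# Capture everything after the leading whitespace; /s lets . match newlines.
my ($trimmed) = $string =~ m/^\s*(.*)/s;
# Splitting an empty string yields an empty list, so no special case is needed.
my @string = split /\s+/, $trimmed;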
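
One way to check an emulation like this against the built-in is to run both over a few awkward inputs. This is only a sketch: the helper name split_like_awk and the test strings are made up here.

```perl
use strict;
use warnings;

# Hypothetical helper: emulate split ' ' without modifying the caller's string.
sub split_like_awk {
    my ($text) = @_;            # work on a copy
    $text =~ s/^\s+//;          # drop leading whitespace only
    return split /\s+/, $text;  # then split on runs of whitespace
}

for my $string ("  a  b ", "0", " \t\n", "x\ny") {
    my @builtin  = split ' ', $string;
    my @emulated = split_like_awk($string);
    print "match\n" if @builtin == @emulated && "@builtin" eq "@emulated";
}
```

The inputs cover a string that is just "0", one that is all whitespace, and one containing a newline, which are the cases where a hand-rolled version most easily goes wrong.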

ioneabu said:
It's too bad it took Apple so long to adopt Jobs'
concept of a Unix OS for the Mac or I might have had an additional 15
years or so experience with it.

There was MacPerl. There was also MachTen, although that wasn't all that
great an environment to work in.

But anyway - I don't see the reference to awk as being all that important.
"Splits on whitespace (after skipping any leading whitespace)" is pretty
clear all by itself, and you don't have to be familiar with awk or any
other Unix-isms to understand that.

sherm--
 
Larry

Sherm,

Thanks much. It was, of course "casual to the most obvious observer". I had
actually looked at numerous lists of the common escape characters, but none
had the \s on them. I've used lots of others, but apparently just never needed
a space before, so didn't know that one.

Back to studying regexes a little more closely.

Larry L
 
Sherm Pendley

Larry said:
had actually looked at numerous lists of the common escape characters, but
none had the \s on them. I've used lots of others, but apparently just
never needed a space before, so didn't know that one.

A bit of a nit-pick: "\s" doesn't just match "a space", it matches *any*
whitespace character. That includes spaces, tabs, carriage returns,
newlines, and possibly other whitespace characters I'm forgetting at the
moment.

sherm--
 
John W. Krahn

Sherm said:
A bit of a nit-pick: "\s" doesn't just match "a space", it matches *any*
whitespace character. That includes spaces, tabs, carriage returns,
newlines, and possibly other whitespace characters I'm forgetting at the
moment.

\s covers those four and the formfeed character. On perls older than 5.18 it
does not match the vertical tab; if you want that as well, you have to use the
POSIX character class [[:space:]]. From 5.18 on, \s matches the vertical tab too.
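
If in doubt, it is easy to ask your own perl. The sketch below probes six ASCII whitespace characters against both classes; the labels and the %matches hash are just names chosen for the example. The vertical tab row for \s depends on the perl version (it matches on 5.18 and later).

```perl
use strict;
use warnings;

# Named whitespace characters to probe; \x0B is the vertical tab.
my @chars = (
    [ 'space'           => ' '    ],
    [ 'tab'             => "\t"   ],
    [ 'newline'         => "\n"   ],
    [ 'carriage return' => "\r"   ],
    [ 'formfeed'        => "\f"   ],
    [ 'vertical tab'    => "\x0B" ],
);

my %matches;
for my $pair (@chars) {
    my ($name, $char) = @$pair;
    $matches{$name} = {
        backslash_s => ($char =~ /\s/          ? 1 : 0),
        posix_space => ($char =~ /[[:space:]]/ ? 1 : 0),
    };
    printf "%-16s \\s=%d  [[:space:]]=%d\n",
        $name, $matches{$name}{backslash_s}, $matches{$name}{posix_space};
}
```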


John
 
