print LIST vs print join "", LIST


Brian McCauley

> In a recent article by Dan Sugalski there's this code[*]:
>
>     foreach my $node (@nodes) {
>         my (@lines) = process_node(@$node);
>         print join("", @lines);
>     }
>
> Is that print line a matter of style, or is it a better choice than
> the much simpler
>
>     print @lines;

That would depend on whether or not it is safe to assume that $, = ''.

(A lot of commonly used modules _do_ assume this so IMHO it's best never
to muck with $,)
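
Concretely (the sample data is mine), the two forms only stay
equivalent while $, is untouched:

    use strict;
    use warnings;

    my @lines = ("foo", "bar", "baz");

    # With the default $, (undef), both forms print the same thing:
    print @lines;            # foobarbaz
    print "\n";
    print join("", @lines);  # foobarbaz
    print "\n";

    # ... but if some other code has set $, :
    $, = ", ";
    print @lines;            # foo, bar, baz -- $, slips in between elements
    print "\n";
    print join("", @lines);  # foobarbaz     -- immune to $,
    print "\n";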

 

James Willmore

> In a recent article by Dan Sugalski there's this code[*]:
>
>     foreach my $node (@nodes) {
>         my (@lines) = process_node(@$node);
>         print join("", @lines);
>     }
>
> Is that print line a matter of style, or is it a better choice than
> the much simpler
>
>     print @lines;

IMHO, if you just want the list, use the latter code. If you want to print
the list in some format (like using newlines to separate the elements of
the list), use the former.

    # print newlines after each element of the list
    print join("\n", @list), "\n";

    # print ':' between each element of the list
    print join(':', @list), "\n";

    # print the list
    print @list, "\n";

Notice the newlines at the end of each example. I find it "nice" to put
that in ... try them without to see why it's "nice" to do this :)
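
For instance (the sample list is mine), with my @list = qw(alpha beta gamma):

    print join("\n", @list), "\n";   # alpha
                                     # beta
                                     # gamma

    print join(':', @list), "\n";    # alpha:beta:gamma

    print @list, "\n";               # alphabetagamma

Drop the trailing "\n" and your shell prompt lands right at the end of
the last line of output.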

HTH

--
Jim

Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.

 

Xavier Noria

Brian McCauley said:
> That would depend on whether or not it is safe to assume that $, = ''.
>
> (A lot of commonly used modules _do_ assume this so IMHO it's best never
> to muck with $,)

I think that's unlikely to be the point: he controls the code he is
explaining, no mention of $, is made, and it doesn't look like it's
modified. Moreover, the canonical way to deal with global variables
like that is not to touch them globally, but to use local() if you
need to in some exceptional place. I wouldn't bet that's the reason
(though I could lose, of course).

Doing that as defensive programming, in case you later modify $,
accidentally, looks too convoluted to me as well.
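
For the record, the local() idiom looks like this (the sub is a made-up
example):

    use strict;
    use warnings;

    sub print_csv {
        # Confine the change to this dynamic scope; $, and $\ are
        # restored automatically on exit, so no other code sees them.
        local $, = ",";
        local $\ = "\n";
        print @_;
    }

    my @fields = qw(a b c);
    print_csv(@fields);   # a,b,c (followed by a newline)
    print @fields;        # abc -- the globals are untouched out here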

-- fxn
 

Xavier Noria

> In a recent article by Dan Sugalski there's this code[*]:
>
>     foreach my $node (@nodes) {
>         my (@lines) = process_node(@$node);
>         print join("", @lines);
>     }
>
> Is that print line a matter of style, or is it a better choice than
> the much simpler
>
>     print @lines;
>
> ?

Now that I think about it, I kind of recall that Uri commented in a side
note during his talk at the last YAPC::Europe in Paris that

    print "string1" . "string2" . ... . "stringn";

is more efficient than

    print "string1", "string2", ..., "stringn";

I am not 100% sure though, is that right? Does that translate to the
join() idiom as well? If that were the case, I would bet that's the
reason.

-- fxn
 

Tassilo v. Parseval

Also sprach Xavier Noria:
>> In a recent article by Dan Sugalski there's this code[*]:
>>
>>     foreach my $node (@nodes) {
>>         my (@lines) = process_node(@$node);
>>         print join("", @lines);
>>     }
>>
>> Is that print line a matter of style, or is it a better choice than
>> the much simpler
>>
>>     print @lines;
>>
>> ?
>
> Now that I think about it, I kind of recall that Uri commented in a
> side note during his talk at the last YAPC::Europe in Paris that
>
>     print "string1" . "string2" . ... . "stringn";
>
> is more efficient than
>
>     print "string1", "string2", ..., "stringn";
>
> I am not 100% sure though, is that right? Does that translate to the
> join() idiom as well? If that were the case, I would bet that's the
> reason.

It appears to be right. My intuitive assumption would have been that the
list-print is more efficient, but apparently not. This benchmark:

    use Benchmark qw/cmpthese/;

    my @ary = ("foo") x 5;

    cmpthese(-2, {
        ary => sub {
            print STDERR @ary;
        },
        concat => sub {
            print STDERR $ary[0] . $ary[1] . $ary[2] . $ary[3] . $ary[4];
        },
        joined => sub {
            print STDERR join '', @ary;
        },
        list => sub {
            print STDERR $ary[0], $ary[1], $ary[2], $ary[3], $ary[4];
        },
    });

with STDERR piped to /dev/null, results in:

               Rate   list    ary concat joined
    list   170559/s     --   -12%   -48%   -58%
    ary    194564/s    14%     --   -40%   -52%
    concat 325517/s    91%    67%     --   -20%
    joined 407659/s   139%   110%    25%     --

By the looks of it, the join approach is the least costly one. Another
slight surprise for me.

Tassilo
 

Anno Siegel

Xavier Noria said:
>> In a recent article by Dan Sugalski there's this code[*]:
>>
>>     foreach my $node (@nodes) {
>>         my (@lines) = process_node(@$node);
>>         print join("", @lines);
>>     }
>>
>> Is that print line a matter of style, or is it a better choice than
>> the much simpler
>>
>>     print @lines;
>>
>> ?
>
> Now that I think about it, I kind of recall that Uri commented in a
> side note during his talk at the last YAPC::Europe in Paris that
>
>     print "string1" . "string2" . ... . "stringn";
>
> is more efficient than
>
>     print "string1", "string2", ..., "stringn";
>
> I am not 100% sure though, is that right? Does that translate to the
> join() idiom as well? If that were the case, I would bet that's the
> reason.

Oh dear... I'm afraid we're looking for more reason than there is in
a casual choice of idioms.

"join '', @lines" is *the* way in Perl to put array elements together
in a string. That you can get a printout that looks the same without
(explicit) join is something else.

If anything, there is a difference of purpose. If the purpose of the
program is to print out the result, "print @lines" is the idiom of choice.
If its purpose is to return the string, as in a sub, "join '', @lines" is
the operative code, and "print" is a courtesy. The second view is the
broader one.
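
A sketch of that broader view (the sub and data are made up): build the
string with join and leave the printing to the outermost caller.

    use strict;
    use warnings;

    # Build and return the result; printing is the caller's business.
    sub render_report {
        my @lines = @_;
        return join "", @lines;
    }

    my $report = render_report("header\n", "body\n");

    # The decision to print is made as late as possible, at top level.
    # A caller who doesn't want output can just as easily write the
    # string to a file, send it down a socket, or test it.
    print $report;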

Here we come back to Uri's argument, though not for efficiency reasons
but on the general principle of "Print rarely, print late". If a piece of
code prints something, you can't take it back, and you have to bend over
backwards to make the printout invisible if you don't want it. So don't
do it, delay the decision to print to the latest possible time. The
same goes for warnings (of course) and dying in general-purpose subs.

"Die rarely, die late" is a recommendable principle too.

Anno
 

Tassilo v. Parseval

Also sprach Anno Siegel:
> Here we come back to Uri's argument, though not for efficiency reasons
> but on the general principle of "Print rarely, print late". If a piece
> of code prints something, you can't take it back, and you have to bend
> over backwards to make the printout invisible if you don't want it. So
> don't do it, delay the decision to print to the latest possible time.
> The same goes for warnings (of course) and dying in general-purpose subs.
>
> "Die rarely, die late" is a recommendable principle too.

Incidentally, a couple of hours ago I browsed through
http://www.extremeperl.org/, a book about Extreme Programming in Perl.

A paragraph there reminded me of XP's principle of dying as early as
possible. So there appear to be at least two schools of dying now.

Tassilo
 

Anno Siegel

Tassilo v. Parseval said:
> Incidentally, a couple of hours ago I browsed through
> http://www.extremeperl.org/, a book about Extreme Programming in Perl.
>
> A paragraph there reminded me of XP's principle of dying as early as
> possible. So there appear to be at least two schools of dying now.

What is "XP's principle of dying early"? Does "XP" mean what I think it
does? And that's supposed to be an argument? :)

Seriously, I know that dying early is sometimes promoted, arguing that
it is useless and potentially harmful to carry on after something
essential went wrong. However, whether it's useless to continue is for
the user to decide; there may be alternatives outside the scope of the
program that dies (and of the programmer who wrote the "die").

This reflects a bit the program-centric view that a coder necessarily
assumes while working on a problem. "When *this* fails, we can't do
anything for the user. Better bail out." Indeed, it does save the user
the effort of checking for errors and thus is a service to the user, or
can be seen as such.

But doing too much is a design error when it takes a responsibility
from the user they would rather keep, and premature dying belongs in
this category. Programmers who design on-the-fly are particularly
prone to this error. I know this from undisclosed sources :)

In reality, I don't remember many cases where failure of some component
resulted in serious corruption of anything. I mean failure in a
recognizable sense where it *could* have died, not silent malfunction.
I do remember cases where a job wasn't completed because some minor
component chose to "bail out" over a trivial error, and that's a major
nuisance.

So I'm for dying rarely and late.

Anno



Uri Guttman

TvP> It appears to be right. My intuitive assumption would have been that the
TvP> list-print is more efficient, but apparently not. This benchmark:

of course i am right!!

the lesson from that section of the talk is

print rarely, print late.

you can see the slides from that section of my talk at:

http://stemsystems.com/slides/common_sense/slides/slide-0401.html

i don't show a benchmark but i did them like tassilo did. print is very
slow. now, i don't fuss over it in small scripts or in things where it
doesn't matter. but in larger scripts, daemons, high efficiency things,
socket stuff, i avoid print as much as possible. for whole files,
file::slurp is faster and cleaner. for sockets, sysread/syswrite is
better.
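
A sketch of that single-buffer pattern (host, port, and payload are made
up; syswrite can write less than asked, hence the loop):

    use strict;
    use warnings;
    use IO::Socket::INET;

    my $sock = IO::Socket::INET->new(
        PeerAddr => 'localhost',
        PeerPort => 12345,
    ) or die "connect: $!";

    my @lines = ("line one\n", "line two\n");
    my $buf   = join '', @lines;

    # One joined buffer, a handful of system calls at most --
    # instead of one write per list element.
    my $written = 0;
    while ($written < length $buf) {
        my $n = syswrite($sock, $buf, length($buf) - $written, $written);
        die "syswrite failed: $!" unless defined $n;
        $written += $n;
    }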

uri
 

Tassilo v. Parseval

Also sprach Anno Siegel:
> What is "XP's principle of dying early"? Does "XP" mean what I think it
> does? And that's supposed to be an argument? :)

XP being Extreme Programming. Say, Anno, you didn't really think that it
might have been something Redmondish? ;-)

Meanwhile, I have found the exact spot where I picked up this statement
for everyone to check:

http://www.extremeperl.org/bk/coding-style

Grep for 'fails fast'.

> Seriously, I know that dying early is sometimes promoted, arguing that
> it is useless and potentially harmful to carry on after something
> essential went wrong. However, whether it's useless to continue is for
> the user to decide; there may be alternatives outside the scope of the
> program that dies (and of the programmer who wrote the "die").
>
> This reflects a bit the program-centric view that a coder necessarily
> assumes while working on a problem. "When *this* fails, we can't do
> anything for the user. Better bail out." Indeed, it does save the user
> the effort of checking for errors and thus is a service to the user, or
> can be seen as such.

I was thinking of libraries rather than programs. If you have a
function/method that is supposed to do destructive I/O and is passed
garbage, it is certainly better to abort immediately than to carry on.
Otherwise it might happen that the function does half of the I/O and
then eventually dies, leaving things in an inconsistent state.
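
A sketch of that fail-fast shape (the function and its checks are my
example, not from the thread):

    use strict;
    use warnings;
    use Carp qw(croak);

    # Validate everything *before* touching the file, so a bad call
    # dies with nothing half-written.
    sub overwrite_file {
        my ($path, $lines) = @_;
        croak "no path given"             unless defined $path && length $path;
        croak "lines must be an arrayref" unless ref $lines eq 'ARRAY';

        open my $fh, '>', $path or croak "can't write '$path': $!";
        print {$fh} join '', @$lines;
        close $fh or croak "close '$path' failed: $!";
    }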
> But doing too much is a design error when it takes a responsibility
> from the user they would rather keep, and premature dying belongs in
> this category. Programmers who design on-the-fly are particularly
> prone to this error. I know this from undisclosed sources :)

I am always in favour of design on-the-fly. I still have vivid memories
of the kind of planning-hoops I had to go through at university. There
was one lab in particular, where we had to program a rather largish Java
application involving a GUI. It all began with writing boring
requirement specifications that were afterwards modelled into UML. As I
recall it, the class diagram I came up with impressed almost everyone.
However, it turned out to be unimplementable in Swing because it assumed
an event-model that simply didn't exist in this widget set.

So I had to change everything. In the end, I secretly changed my UML
diagrams to fit my final program and hoped that no one would notice
(they didn't, of course). Had I designed the thing on-the-fly, the
result would have been very similar, without the many hours of overhead
needed to create a UML diagram that no one cared about or wanted to see.

So I prefer the much more practical XP approach which essentially says:
"Implement the first thing that works and that you can think of and then
refactor". Perl is a wonderful language for that.

> So I'm for dying rarely and late.

There's another thing to take into account: A program dying early sure
shows the best runtime performance of them all. ;-)

Tassilo
 

Tassilo v. Parseval

Also sprach Uri Guttman:
> TvP> It appears to be right. My intuitive assumption would have been that the
> TvP> list-print is more efficient, but apparently not. This benchmark:
>
> of course i am right!!
>
> the lesson from that section of the talk is
>
> print rarely, print late.
>
> you can see the slides from that section of my talk at:
>
> http://stemsystems.com/slides/common_sense/slides/slide-0401.html
>
> i don't show a benchmark but i did them like tassilo did. print is very
> slow. now, i don't fuss over it in small scripts or in things where it
> doesn't matter. but in larger scripts, daemons, high efficiency things,
> socket stuff, i avoid print as much as possible. for whole files,
> file::slurp is faster and cleaner. for sockets, sysread/syswrite is
> better.

The rule is probably to keep away from system calls as long as possible
anyway. Especially the I/O-ish ones are slow, simply because I/O is slow
compared with the calculations a CPU can carry out by itself.

In the context of this discussion, however, we only ever had one
print... on the surface, at least.

Now, looking at pp_print's source, I can see why

    print LIST;

is slower than

    print join '', LIST;

Perl calls Perl_do_print() for every item in the list, resulting in one
output call per list element. This is even true for arrays, so perl will
not translate

    print @ARY;

into

    print join $,, @ARY;

In fact, it will print the array in a particularly wasteful way, namely:

    for (0 .. $#ARY) {
        print $ARY[$_];
        print $, if $_ < $#ARY;
    }
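
You can see that element-by-element behaviour from Perl-land (sample
data mine):

    use strict;
    use warnings;

    my @ARY = qw(a b c);
    $, = '-';
    print @ARY;            # a-b-c : one do_print() per element, $, between
    print "\n";
    print join($,, @ARY);  # a-b-c : same text, but a single string to print
    print "\n";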

Good that you made me aware of that. I had made some assumptions about
the way perl might handle these cases that turn out to be very wrong.

Tassilo
 

Anno Siegel

Tassilo v. Parseval said:
> XP being Extreme Programming. Say, Anno, you didn't really think that it
> might have been something Redmondish? ;-)

Oh. Okay, sorry :) Extreme Programming didn't come to mind.

> Meanwhile, I have found the exact spot where I picked up this statement
> for everyone to check:
>
> http://www.extremeperl.org/bk/coding-style
>
> Grep for 'fails fast'.

I have taken a look. I wouldn't necessarily conclude from that remark
that "failing fast" is a tenet of Extreme Programming. It simply states
the old argument that it is better to die than to risk doing damage.

In the particular case I tend to agree. The function that "fails
fast" here is the plan() function of Test.pm, whose only purpose is
to prepare for a number of tests to be run. In that capacity, deciding
not to run the tests when the plan is in error is entirely reasonable.
If someone really wants to run plan() for some other purpose, they can
wrap eval() around it.

> I was thinking of libraries rather than programs. If you have a
> function/method that is supposed to do destructive I/O and is passed
> garbage, it is certainly better to abort immediately than to carry on.
> Otherwise it might happen that the function does half of the I/O and
> then eventually dies, leaving things in an inconsistent state.

To elaborate: when I say "die late" I don't mean to return an incomplete
or defective result with no indication. "Die late" implies that the user
*can* decide to die later, meaning that there must be an error
indicator. So the user can avoid the possible damage, but the
responsibility is his.

Someone could have decided to let system calls in Perl die in case of
errors, instead of returning a boolean success indicator. That would
save clpm a lot of admonitions that begin with "Always, yes *always*...",
reminding people of the responsibility they fail to take. We would also
see a whole lot of "eval { open(...) }; if ( $@ ) { ... }". I like it
the way it is.
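
For contrast (the file name is my example), the two worlds look like this:

    use strict;
    use warnings;

    # The way it is: system calls return false on failure, and the
    # *caller* decides whether that is fatal.
    open my $fh, '<', 'config.txt'
        or warn "no config, using defaults: $!";

    # The hypothetical always-die world would push every recoverable
    # failure through eval:
    #
    #     eval { open my $fh2, '<', 'config.txt' };
    #     if ($@) { ... fall back to defaults ... }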

> I am always in favour of design on-the-fly. I still have vivid memories
> of the kind of planning-hoops I had to go through at university. There
> was one lab in particular, where we had to program a rather largish Java
> application involving a GUI. It all began with writing boring
> requirement specifications that were afterwards modelled into UML. As I
> recall it, the class diagram I came up with impressed almost everyone.
> However, it turned out to be unimplementable in Swing because it assumed
> an event-model that simply didn't exist in this widget set.
>
> So I had to change everything. In the end, I secretly changed my UML
> diagrams to fit my final program and hoped that no one would notice
> (they didn't, of course). Had I designed the thing on-the-fly, the
> result would have been very similar, without the many hours of overhead
> needed to create a UML diagram that no one cared about or wanted to see.

I don't have a formal education in CS, and anecdotes like this tend to
ease my regret about that. From afar I have watched some diagramming
techniques come and go (flowcharts were discarded in the 70s, Nassi-
Shneiderman in the 80s). UML seems to be an OO equivalent -- I haven't
looked closer yet. They have always been put forward as a planning
tool, and there has always been a tendency to use them as an analysis
tool, i.e. draw the diagram for an already written program to understand
its structure.

What happened to you seems quite typical. What use is a perfect plan
when it turns out that there is no reasonable implementation under
practical constraints?

> So I prefer the much more practical XP approach which essentially says:
> "Implement the first thing that works and that you can think of and then
> refactor". Perl is a wonderful language for that.

Indeed. However, it doesn't hurt to be aware of the pitfalls *this*
approach has. One of them is cramming too much functionality into a
routine you're writing, just because it's a good occasion (and also
because the ways the routine will be used aren't quite clear yet).
Programmers like to implement; it's the designer's job to decide what is
worth implementing. When the designer and the coder are the same person,
at the same time, that job becomes harder.

Anno
 

Uri Guttman

AS> Someone could have decided to let system calls in Perl die in case of
AS> errors, instead of returning a boolean success indicator. That would
AS> save clpm a lot of admonitions that begin with "Always, yes *always*...",
AS> reminding people of the responsibility they fail to take. We would also
AS> see a whole lot of "eval { open(...) }; if ( $@ ) { ... }". I like it
AS> the way it is.

and there is a module for that IIRC. it traps system calls and dies on
failure. so you have to wrap the calls in eval{} to trap them. not worth
the effort IMO unless you are an exception style addict.
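
The module is presumably Fatal.pm, which has shipped with Perl for years
(exact usage from memory, so check perldoc Fatal):

    use strict;
    use warnings;
    use Fatal qw(open close);   # make these builtins die on failure

    # A failed open now throws instead of returning false ...
    eval {
        open my $fh, '<', '/no/such/file';
    };
    print "trapped: $@" if $@;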

uri
 
