subroutines, prototyping, and pass by reference question

Eric

Hello,

I have a Perl application that I am working on optimizing. One thing I
am trying to do is implement subroutine calls by reference when
possible. However, after reading perlref, perlreftut, perlsub, and lots
more, I am still unclear on how to proceed.

Here is an example of things I am trying to do...some suggestions on
where to implement references would be greatly appreciated.

-------------------------------------------------------------------------------

use strict;
use warnings;

# Open the specified file and return the contents
sub open_template($) {
my $file = shift;
open(TMP, $file) || die("cannot open $file: $!");
my @code = <TMP>;
close(TMP);
return @code;
}

# Check the template for expected values
sub check_template(@) {
my @code = @_;
my $valid;
my $value = "value to find";
foreach my $line (@code) {
# check for match in $line (not in $_, which is unset here)
$valid = 1 if $line =~ /$value/;
}
die("value not found in template") if !$valid;
}

# Replace tags with content
sub parse_template(@) {
my @code = @_;
my $html;
foreach (@code) {
# check for match in $line
# ex: s/<myCustomTag>/$myCustomValue/g;
$html .= "$_\n";
}
return $html;
}

# Path to the template
my $template = "path to file";

# Get the contents of the template
my @html = open_template($template);

# Validate template
check_template(@html);

# Print the parsed template
print parse_template(@html);

-------------------------------------------------------------------------------

It would seem to me that I would want open_template to return a
reference to @code. Then I would pass that reference to
check_template(). Then that reference would be passed to
parse_template(). Does that sound right? It seems like I am currently
creating multiple copies of the same data. Anyway, enough
rambling...comments and code suggestions welcomed =)

Thanks,
Eric
 
Gunnar Hjalmarsson

Eric said:
I have a Perl application that I am working on optimizing. One thing I
am trying to do is implement subroutine calls by reference when
possible. However, after reading perlref, perlreftut, perlsub, and lots
more, I am still unclear on how to proceed.

Did you read all those, and are still not able to give it a try?
Honestly, I find that hard to believe. Anyway, please find some comments
below.

Here is an example of things I am trying to do...some suggestions on
where to implement references would be greatly appreciated.

-------------------------------------------------------------------------------

use strict;
use warnings;

# Open the specified file and return the contents
sub open_template($) {
my $file = shift;
open(TMP, $file) || die("cannot open $file: $!");
my @code = <TMP>;

Better to slurp the file into a scalar variable:

my $code = do { local $/; <TMP> };

close(TMP);
return @code;

return $code;
}

# Check the template for expected values
sub check_template(@) {
my @code = @_;
my $valid;
my $value = "value to find";
foreach my $line (@code) {
# check for match in $line (not in $_, which is unset here)
$valid = 1 if $line =~ /$value/;
}
die("value not found in template") if !$valid;
}

Assuming a scalar ref, that sub can be simplified to e.g.

sub check_template {
my $ref = shift;
my $value = 'value to find';
die 'value not found in template'
unless index($$ref, $value) >= 0;
}

# Replace tags with content
sub parse_template(@) {
my @code = @_;
my $html;
foreach (@code) {
# check for match in $line
# ex: s/<myCustomTag>/$myCustomValue/g;
$html .= "$_\n";
}
return $html;
}

Since you appear to be dealing with HTML, it would not be
very smart to use < and > as template variable delimiters.
See also comments below.

sub parse_template {
my ($html, $vars) = @_;
$$html =~ s/%(\w+)%/$vars->{$1}/g;
}

# Path to the template
my $template = "path to file";

# Get the contents of the template
my @html = open_template($template);

my $html = open_template($template);

# Validate template
check_template(@html);

check_template(\$html);

# Print the parsed template
print parse_template(@html);

Pass hash with template variables, too.
Don't print HTML to STDOUT without converting certain characters.

my %vars = (
myCustomTag1 => 'myCustomValue1',
myCustomTag2 => 'myCustomValue2',
);

parse_template(\$html, \%vars);

for ($html) {
s/&/&amp;/g;
s/"/&quot;/g;
s/</&lt;/g;
s/>/&gt;/g;
print;
}

It would seem to me that I would want open_template to return a
reference to @code. Then I would pass that reference to
check_template(). Then that reference would be passed to
parse_template(). Does that sound right? It seems like I am currently
creating multiple copies of the same data.

Yes, avoiding multiple instances of the same data is one reason to pass
by reference.

Finally, note that for simple templates, or as an exercise, a simple
templating 'system' such as the one above may do. For more advanced
applications, there are a lot of CPAN modules available that may be useful.
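
For reference, the pieces above can be pulled together into one small runnable sketch. This is only an illustration under a few assumptions that are not in the original post: open_template returns a reference to the slurped text, uses a lexical filehandle with three-argument open, check_template takes the search value as a second argument, and unknown %name% placeholders are left untouched instead of being replaced by an empty string.

```perl
use strict;
use warnings;

# Slurp a whole file into one scalar and return a reference to it.
sub open_template {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "cannot open $file: $!";
    my $code = do { local $/; <$fh> };
    close($fh);
    return \$code;
}

# Die unless the referenced text contains the expected value.
sub check_template {
    my ($ref, $value) = @_;
    die "value not found in template\n" unless index($$ref, $value) >= 0;
    return 1;
}

# Replace %name% placeholders in place, through the reference.
# Assumption: placeholders with no entry in the hash are kept as they are.
sub parse_template {
    my ($ref, $vars) = @_;
    $$ref =~ s/%(\w+)%/exists $vars->{$1} ? $vars->{$1} : "%$1%"/ge;
    return;
}

# Demo on an in-memory string, so no file is needed to try it.
my $page = "<p>Hello, %name%!</p>\n";
check_template(\$page, '%name%');
parse_template(\$page, { name => 'World' });
print $page;   # prints <p>Hello, World!</p>
```

Only one copy of the template text exists throughout; each sub receives a reference to it and parse_template edits it in place.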
 
Gunnar Hjalmarsson

Gunnar said:
Don't print HTML to STDOUT without converting certain characters.

Well, that's true for data, but of course not for the HTML markup. Sorry
for the confusion.

parse_template(\$html, \%vars);

for ($html) {
s/&/&amp;/g;
s/"/&quot;/g;
s/</&lt;/g;
s/>/&gt;/g;
print;
}

This is an attempt to correct myself, assuming that the template
variables only contain data, and not markup:

for ( values %vars ) {
s/&/&amp;/g;
s/"/&quot;/g;
s/</&lt;/g;
s/>/&gt;/g;
}

parse_template(\$html, \%vars);

print $html;
 
Eric

Perhaps I should have been more clear. I have tried working with
references and all of the documentation does great on showing me *how*
to do it. The reason for my post was more to figure out the *why* and
*when*. I provided a sample so that any comments would make more sense
in the context of something I have tried as opposed to an example in
the documentation.

I appreciate your comments and will continue to research.

Thanks,
Eric
 
Gunnar Hjalmarsson

Eric said:
Perhaps I should have been more clear. I have tried working with
references and all of the documentation does great on showing me *how*
to do it. The reason for my post was more to figure out the *why* and
*when*. I provided a sample so that any comments would make more sense
in the context of something I have tried as opposed to an example in
the documentation.

I appreciate your comments and will continue to research.

As regards why, preventing copying of data is one reason, as you
mentioned. Another why/when is when you want changes to affect the
referenced variable.

More experienced programmers may be able to give other examples.
Nevertheless, there are no firm rules on when to pass by reference. As
usual it's a trade-off between various goals, e.g. efficiency,
maintainability, readability...
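
To make the second reason concrete, here is a minimal sketch contrasting a sub that works on a copy with one that works through a reference. The names trim_copy and trim_in_place are made up for this illustration.

```perl
use strict;
use warnings;

# Works on a private copy: the caller's variable is untouched.
sub trim_copy {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

# Works through a reference: the caller's variable is changed in place.
sub trim_in_place {
    my ($text_ref) = @_;
    $$text_ref =~ s/^\s+|\s+$//g;
    return;
}

my $title   = "  Hello  ";
my $trimmed = trim_copy($title);   # $title still has its spaces
trim_in_place(\$title);            # $title is now "Hello"
print "[$trimmed] [$title]\n";     # prints [Hello] [Hello]
```

Which style reads better depends on the caller: returning a new value is easier to follow, while the in-place version avoids a second copy of large data.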
 
Paul Lalli

Eric said:
Perhaps I should have been more clear. I have tried working with
references and all of the documentation does great on showing me *how*
to do it. The reason for my post was more to figure out the *why* and
*when*. I provided a sample so that any comments would make more sense
in the context of something I have tried as opposed to an example in
the documentation.

There are three main reasons to use references:

1) To avoid copying a large piece of data when passing into a
subroutine:

my $string = 'sometext' x 1_000_000;
#we now have an 8 million character string;
process_string($string);
sub process_string {
my $arg = shift;
#we have now copied our 8 million character string
# ... do stuff with the $arg;
}

Using references:
my $string = 'sometext' x 1_000_000;
#we now have an 8 million character string;
process_string(\$string);
sub process_string {
my $arg_ref = shift;
#we now have a reference to our one and only 8 million character string
# ... do stuff with $$arg_ref;
}

The difference, of course, is that in the first example, any changes
you make to $arg are independent of the original $string. In the
second block, any changes you make to $$arg_ref are made to $string
itself, so the caller sees them as soon as they happen.

2) Passing multiple arrays/hashes and keeping them distinct.

my @foo = (1..5);
my @bar = ('a' .. 'e');
process_arrays(@foo, @bar);
sub process_arrays {
my @args = @_;
#all ten values are now in @args. There is no way of knowing
#where 'foo' stopped and 'bar' began in the original. We have one
#flat array which contains 10 values
}

Using references:
my @foo = (1..5);
my @bar = ('a' .. 'e');
process_arrays(\@foo, \@bar);
sub process_arrays {
my ($arr1, $arr2) = @_;
#We now have the five 'foo' values in @$arr1, and the five 'bar'
#values in @$arr2. We can use these two sets of values distinctly
#from each other.
}

3) Building multi-dimensional structures.
There is no way to create, for example, a two-dimensional array without
using references. Arrays hold scalars. Arrays cannot hold arrays. It
simply doesn't work:
my @not_two_d = (
( 'x', 'x', 'x', 'x', 'x') ,
( 'x', 'x', 'x', 'x', 'x') ,
( 'x', 'x', 'x', 'x', 'x') ,
( 'x', 'x', 'x', 'x', 'x') ,
( 'x', 'x', 'x', 'x', 'x') ,
);
#much like the above example, @not_two_d is one array containing 25
#elements. No way of knowing where one "group" ended and the next
#began.

Using references:
my @two_d = (
[ 'x', 'x', 'x', 'x', 'x'] ,
[ 'x', 'x', 'x', 'x', 'x'] ,
[ 'x', 'x', 'x', 'x', 'x'] ,
[ 'x', 'x', 'x', 'x', 'x'] ,
[ 'x', 'x', 'x', 'x', 'x'] ,
);
#@two_d is now one array that contains five elements. Each element is
#a reference to an array that contains five elements. We can access
#individual "cells" as in $two_d[2][4] to access the 5th element of the
#array referenced by the third element of @two_d
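
A smaller version of the same comparison can be run directly to check the element counts. This is just a sketch; the helper dimensions and the names @flat and @grid are invented for the example, and it assumes every row has the same length as the first.

```perl
use strict;
use warnings;

# Return (rows, columns) for an array of array references.
sub dimensions {
    my ($grid) = @_;
    my $rows = @$grid;
    my $cols = $rows ? scalar @{ $grid->[0] } : 0;
    return ($rows, $cols);
}

my @flat = ( ('a', 'b'), ('c', 'd') );   # inner parentheses vanish: 4 scalars
my @grid = ( ['a', 'b'], ['c', 'd'] );   # 2 references to 2-element arrays

print scalar(@flat), "\n";                   # prints 4
print join('x', dimensions(\@grid)), "\n";   # prints 2x2
print $grid[1][0], "\n";                     # prints c

# Each element is a reference, so dereference it to get the inner list.
print join(',', @$_), "\n" for @grid;
```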


I hope this explanation is helpful to you. If you have not already,
please take a look at
perldoc perllol
perldoc perldsc
and
perldoc perlsub
for more information

Paul Lalli
 
Eric

Thanks Paul...your examples and descriptions are very helpful!

The only thing I am still unclear on is prototyping and how to define
the subroutine. For example, on:
process_arrays(\@foo, \@bar);
sub process_arrays {
my ($arr1, $arr2) = @_;
#We now have the five 'foo' values in @$arr1, and the five 'bar'
#values in @$arr2. We can use these two sets of values distinctly
#from each other.
}

would I have:
sub process_arrays(\@\@) {
# code
}

On a side note...does Prototyping have any advantages other than
enforcing calls to subroutines? Should I bother using them? Sorry for
all the questions, just trying to figure this all out.

Thanks,
Eric
 
Paul Lalli

Eric said:
The only thing I am still unclear on is prototyping and how to define
the subroutine. For example, on:
process_arrays(\@foo, \@bar);
sub process_arrays {
my ($arr1, $arr2) = @_;
#We now have the five 'foo' values in @$arr1, and the five 'bar'
#values in @$arr2. We can use these two sets of values distinctly
#from each other.
}

would I have:
sub process_arrays(\@\@) {
# code
}

You can do that. However, if you do that, the subroutine call syntax
changes. If you use a prototype to tell Perl that the subroutine will
take two actual arrays, then you just pass the arrays themselves, and
Perl automatically creates the references:

sub process_arrays(\@\@) {
my ($arr1, $arr2) = @_;
#process the arrays...
}
process_arrays(@foo, @bar);

On a side note...does Prototyping have any advantages other than
enforcing calls to subroutines?

As you see above, prototypes also do automatic conversions for you in
certain cases (creating a reference to an array or hash with a
prototype char of \@ or \%, or passing the size of an array with a
prototype char of $).
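
Both conversions can be seen in a short sketch. The subs longer_of and describe are hypothetical names used only for illustration; note that they are defined before they are called, so the prototypes are known at the call sites.

```perl
use strict;
use warnings;

# \@\@ makes Perl pass a reference to each array named in the call.
sub longer_of(\@\@) {
    my ($first, $second) = @_;
    return @$first >= @$second ? $first : $second;
}

# $ puts the argument in scalar context, so an array yields its count.
sub describe($) {
    my ($value) = @_;
    return "got $value";
}

my @foo = (1 .. 5);
my @bar = ('a' .. 'c');

my $longest = longer_of(@foo, @bar);   # called with plain arrays
print scalar(@$longest), "\n";         # prints 5
print describe(@bar), "\n";            # prints got 3
```

The second call is the kind of surprise that makes some people avoid prototypes: the caller writes an array and the sub quietly receives a number.
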
Should I bother using them?

I rarely do. It's a matter of personal preference. If you are sure
your routines should only be called with exactly the given number,
type, and order of arguments, go ahead and use them. If you want your
subroutines to be more flexible, don't.

Four important things to keep in mind:
(1) Calling a subroutine using the & completely disables prototype
checking:
sub my_fctn($\%\@) { ... }
my_fctn(@bar); #compile time error
&my_fctn(@bar); #no error

(2) Prototypes have no effect on method calls
sub my_method($\%\@) { ... }
$obj->my_method(@bar); #no error

(3) Empty parentheses are an empty prototype, and should not be used
unless you want that prototype:
sub my_fctn () { . . . }
my_fctn('foo', 'bar'); #compile time error

(4) In order for a prototype to be used, the subroutine cannot be
called before the subroutine is declared (it can still be defined
later, if you prefer):
my_fctn('foo', 'bar'); #no error, but a warning if warnings are enabled.
sub my_fctn(\@) { . . . }

Hope this helps,
Paul Lalli
 
Eric

Very helpful...thanks again.

-Eric
 
xhoster

Paul Lalli said:
There are three main reasons to use references:

1) To avoid copying a large piece of data when passing into a
subroutine:

my $string = 'sometext' x 1_000_000;
#we now have an 8 million character string;
process_string($string);
sub process_string {
my $arg = shift;
#we have now copied our 8 million character string
# ... do stuff with the $arg;
}

Using references:
my $string = 'sometext' x 1_000_000;
#we now have an 8 million character string;
process_string(\$string);
sub process_string {
my $arg_ref = shift;
#we now have a reference to our one and only 8 million character string
# ... do stuff with $$arg_ref;
}

It is what you do inside the subroutine, not the actual passing to the
sub, that determines whether the data is copied or not.

Using references a different way:

my $string = 'sometext' x 1_000_000;
#we now have an 8 million character string;
process_string($string); # note, no reference here
sub process_string {
my $arg_ref = \$_[0];
#we now have a reference to our one and only 8 million character string
# ... do stuff with $$arg_ref;
}
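
The reason this works is that the elements of @_ are aliases for the caller's arguments. A short sketch that checks this, with same_scalar and shout as invented names:

```perl
use strict;
use warnings;

# \$_[0] refers to the caller's own scalar, so it compares equal
# to a reference the caller took itself.
sub same_scalar {
    my $other_ref = $_[1];
    return \$_[0] == $other_ref;
}

# Assigning to $_[0] writes straight into the caller's variable.
sub shout {
    $_[0] = uc $_[0];
    return;
}

my $string = 'sometext' x 3;
print same_scalar($string, \$string) ? "same\n" : "copy\n";   # prints same
shout($string);
print "$string\n";   # prints SOMETEXTSOMETEXTSOMETEXT
```

The copy in the usual idiom happens at "my $arg = shift;", not at the call itself.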


Xho
 
Eric

Tad said:
I can't find Tom Christiansen's "Prototypes Considered Harmful"
paper on any of the "usual" sites.

Anybody know where it is on perl.com or perl.org or useperl.org or...?


I found this archived copy though:

http://library.n0i.net/programming/perl/articles/fm_prototypes/

Thanks for the link Tad...interesting reading. I'm going to not use
prototypes...if nothing else, just to eliminate any confusion for me =)

As for references, I've gone through my code and made some adjustments.
Everything still works, so thanks for everyone's input.

-Eric
 
Anno Siegel

Tad McClellan said:
I can't find Tom Christiansen's "Prototypes Considered Harmful"
paper on any of the "usual" sites.

Anybody know where it is on perl.com or perl.org or useperl.org or...?

The actual title is "Far More Than Everything You've Ever Wanted to
Know about Prototypes in Perl". A quick search doesn't find that
on the major Perl sites either, but lots of other hits.

Anno
 
Bart Lateur

Tad said:
I can't find Tom Christiansen's "Prototypes Considered Harmful"
paper on any of the "usual" sites.

Anybody know where it is on perl.com or perl.org or useperl.org or...?

It used to be on perl.com, but it has been removed because it's too old
(it dates from 1999). All that's left is an empty stub.

It used to be available from CPAN as well, but it has been removed there
too for the same reason.

Sometimes blind rules are just stupid.
 
