Performance & Perl

robic0

Or C?

(Disclaimer: I have not delved into the Perl source)

How closely does Perl follow C as far as primitives go?
Can I rely on Perl language constructs to follow C when
writing performance code?

In a runtime race will a Perl for loop time exactly the same as a
C for loop? When the race begins, if not, why not?

Any divergence in stack processing Perl to C?

Is there such a thing as a Perl "temporary" on the stack?

Why would I need to care about any of this on a "higher level"
language, or is Perl a higher level language?
 
Gregory Toomey

Or C?

(Disclaimer: I have not delved into the Perl source)

How closely does Perl follow C as far as primitives go?
Can I rely on Perl language constructs to follow C when
writing performance code?

Like comparing apples & oranges.

C is compiled and is not garbage collected; Perl 5 "interprets" a parse tree
and is garbage collected.

gtoomey
 
xhoster

Or C?

(Disclaimer: I have not delved into the Perl source)

How closely does Perl follow C as far as primitives go?

Um, not at all. Perl primitives are scalars, self-resizing arrays,
and hashes. C primitives are floats, doubles, ints, bytes, etc., and
primitive arrays.

Can I rely on Perl language constructs to follow C when
writing performance code?

I have no idea what that means.

In a runtime race will a Perl for loop time exactly the same as a
C for loop?

No. A C for loop won't even time *exactly* the same as a C for loop.

When the race begins, if not, why not?

Why would they?

Any divergence in stack processing Perl to C?

I'm not exactly sure what that means, but I'm pretty sure the
answer is yes.

Is there such a thing as a Perl "temporary" on the stack?

Why would I need to care about any of this on a "higher level"
language,

Um, I don't know. Why is purple my favorite color?

or is Perl a higher level language?

Higher than what? C? Yes.

Xho
 
Tassilo v. Parseval

Also sprach (e-mail address removed):
Or C?

(Disclaimer: I have not delved into the Perl source)

How closely does Perl follow C as far as primitives go?

Sometimes closely, but more often it doesn't follow C at all.

Some operators are implemented in terms of their C counterparts (most
notably all the bitwise operators).

The biggest differences are probably with arrays (that resize
dynamically) and strings. Perl strings are implemented on top of C
char-pointers and yet are infinitely smarter than C strings (e.g. they
transparently morph from one encoding into the other, ideally with the
programmer not even noticing it).

And there are of course hash tables, something that C doesn't have at
all.

Can I rely on Perl language constructs to follow C when
writing performance code?

Some generic wisdom applies to writing both C and Perl code. One such
rule is to avoid copying data around, although it's sometimes not
obvious which Perl code will result in a copy.

One difference, for example, is how C and Perl handle structures. A
common consideration in C is the order of struct members, which affects
both memory consumption and performance because of memory alignment:

#include <stdint.h>

struct A {
    char    a;
    int32_t b;
    char    c[3];
}; /* 12 bytes on machines that align int32_t on 4-byte boundaries */

struct B {
    char    a;
    char    c[3];
    int32_t b;
}; /* only 8 bytes */

In Perl, hashes are ordinarily used for modelling structs. They have
very different characteristics. One is that access to the members is
resolved at runtime whereas the members of a C structure have a fixed
memory offset from the base address which the compiler knows about.
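
As a rough illustration of the point above (the record and its field names are made up for this sketch), a hash used as a struct has its members looked up by key while the program runs, so a member name can come from a variable and members can be added later:

```perl
use strict;
use warnings;

# A hash standing in for a C struct; the field names are hypothetical.
my %point = ( x => 3, y => 4 );

# Member access is a key lookup performed at runtime, so the member
# name can itself come from a variable.
my $field = 'x';
my $value = $point{$field};

# Unlike a C struct, members can be added after the fact ...
$point{z} = 5;

# ... and a missing member can only be detected at runtime.
my $has_w = exists $point{w} ? 1 : 0;

print "x=$value, fields=", join( ',', sort keys %point ), ", has w=$has_w\n";
# prints: x=3, fields=x,y,z, has w=0
```

The flexibility is what costs time: every access hashes the key, where a C compiler would emit a fixed offset.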

On the other hand, certain C constructs might end up being faster in
Perl. Perl's index() is likely to be faster than C's strstr() as it uses
Boyer-Moore on the inside and attaches this information to the
variable.

Likewise length() versus strlen(): Perl's length() happens in constant
time as the length of each string is stored in the underlying
data-structure. C on the other hand has to loop over each character to
find the terminating '\0'.
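
A small sketch of one visible consequence: because the length is stored with the data instead of being inferred from a terminator, a Perl string can contain NUL bytes and length() still counts all of them. The sample string is arbitrary.

```perl
use strict;
use warnings;

# Five characters, the third of which is a NUL byte. C's strlen() would
# stop at the NUL and report 2.
my $data = "ab\0cd";

my $len = length $data;         # reads the stored length, no scan
my $pos = index $data, "cd";    # searching past the NUL works too

print "length=$len, index of 'cd'=$pos\n";
# prints: length=5, index of 'cd'=3
```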

Another thing to keep in mind is that function calls in Perl are slow.
In C they are also slower than inlined code, but with Perl the penalty
is a few orders of magnitude more severe. It's even worse with
method calls as the resolution of every method happens at runtime
(although perl does cache the information). In C++ method dispatch can
often be figured out at compile-time unless it's dynamic dispatch. Perl
only has dynamic dispatch.
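
To make the runtime resolution concrete, here is a minimal sketch with two made-up classes (Animal and Dog are hypothetical names). The method is found by name when the call executes, and can() exposes the same lookup as a code reference:

```perl
use strict;
use warnings;

package Animal;
sub new   { my ($class) = @_; return bless {}, $class; }
sub speak { return "..." }

package Dog;
our @ISA = ('Animal');
sub speak { return "Woof" }

package main;

# The method name is an ordinary string that is looked up in the
# object's class (and its @ISA chain) when the call executes.
my $method = 'speak';
my @said   = map { $_->$method() } ( Animal->new, Dog->new );

# can() performs the same runtime lookup and returns a code reference,
# which can then be called as a plain function.
my $speak = Dog->can('speak');
push @said, $speak->( Dog->new );

print join( ', ', @said ), "\n";
# prints: ..., Woof, Woof
```

Whether holding on to the code reference is worth it in a hot loop is something to measure rather than assume.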

I believe the best way to improve the performance of Perl programs is by
using the Benchmark module to find out which of the conceivable versions
performs better.
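
A minimal sketch of that workflow with the core Benchmark module follows. The two candidate subs are hypothetical stand-ins for "conceivable versions" of the same task; the timings depend on the machine and the perl build, so no expected output is shown.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @words = ('perl') x 1_000;

# Two hypothetical candidates that build the same string.
sub with_join   { return join '', @words }
sub with_concat { my $s = ''; $s .= $_ for @words; return $s }

# A negative count asks Benchmark to run each candidate for at least
# that many CPU seconds, then print a table of rates and relative speeds.
cmpthese( -1, {
    join   => \&with_join,
    concat => \&with_concat,
} );
```

It is worth checking first that the candidates really return the same result, otherwise the comparison says little.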

In a runtime race will a Perl for loop time exactly the same as a
C for loop? When the race begins, if not, why not?

The C loop wins. But then the Perl for-loop is more than just a loop:
It's a generic iterator construct that iterates over any list-like
thing.
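
A short sketch of what "generic iterator" means in practice (the sample data is invented): the same construct walks an array, a range, and a sorted list of hash keys, and its loop variable is an alias to each element and not a copy.

```perl
use strict;
use warnings;

my @nums = ( 1, 2, 3 );
my %age  = ( alice => 30, bob => 25 );

my @seen;

# Iterate over an array, a range, and a sorted key list, all with the
# same construct.
push @seen, $_            for @nums;
push @seen, $_            for 10 .. 12;
push @seen, "$_=$age{$_}" for sort keys %age;

# The loop variable aliases each element, so assigning to it changes
# the array itself.
$_ *= 2 for @nums;

print "@seen\n";    # 1 2 3 10 11 12 alice=30 bob=25
print "@nums\n";    # 2 4 6
```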

Any divergence in stack processing Perl to C?

Perl only uses the stack for passing arguments to and from functions and
methods. Unlike in C, it has a variable size. That's why Perl functions
don't need a prototype. And if they have a prototype, it's for an
entirely different purpose.
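
Seen from the Perl side, this is simply @_ and list return values. In the sketch below, evens() is a hypothetical helper that takes any number of arguments and returns a list whose length depends on the input, with no prototype declared:

```perl
use strict;
use warnings;

# A hypothetical helper: it accepts any number of arguments via @_
# and returns however many of them are even.
sub evens {
    return grep { $_ % 2 == 0 } @_;
}

my @none = evens();           # zero arguments, zero results
my @some = evens( 1 .. 10 );  # ten arguments, five results

print scalar(@none), " then ", scalar(@some), ": @some\n";
# prints: 0 then 5: 2 4 6 8 10
```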

Is there such a thing as a Perl "temporary" on the stack?

Perl knows about temporaries but they are not created on the stack but
instead on a separate thing called a scratchpad of which each block gets
its own. These temporaries aren't blindly destroyed when the block is
left. Instead, their reference-count is decremented. In the code below,
$temporary isn't freed because set_tmp() and get_tmp() still reference
it in their bodies. That means Perl has real closures:

{
    my $temporary;    # visible only inside this block
    sub set_tmp { $temporary = shift; }
    sub get_tmp { return $temporary; }
}
set_tmp(42);
print get_tmp(), "\n";    # prints 42

Why would I need to care about any of this on a "higher level"
language, or is Perl a higher level language?

Perl is a higher level language because you never have to worry about
any of those menial tasks that a C programmer constantly needs to be
aware of. That comes at a price so a Perl program can easily be 50 times
slower than a well-written C equivalent.

But there are ways to couple the convenience of Perl with the
performance of C. Many modules on the CPAN were written in C or C++ (or
rather in a dialect called XS which facilitates the passing of data from
Perl to C and vice versa). There is also Inline::C.

Toying around with any of these is an excellent and fun way to learn
about the inner workings of Perl because you'll inevitably learn how the
Perl stack works, how Perl's reference-counting is used to do
garbage-collection, how Perl's primitive data-types work and so on and
so forth.

Tassilo
 
robic0

But there are ways to couple the convenience of Perl with the
performance of C. Many modules on the CPAN were written in C or C++ (or
rather in a dialect called XS which facilitates the passing of data from
Perl to C and vice versa). There is also Inline::C.

I use XML::Xerces, which wraps Apache's xerces-c_2_3_0.dll.
The author used an XSbot (I think) to auto-generate the Perl
DLL and package. Like you say, it's just for passing parameters.
Xerces is very powerful; however, I had to go to the C prototype
site and pick out functionality, as the Perl interface writer
just auto-generated the Perl wrappers. I had to write custom callbacks
and hook them into the performance SAX parser I use. Currently it's
being used to validate schemas, 100% accurately. I will be using it to
write out XML soon; that functionality has to be picked out too. The
Apache Perl interface writer does not want to provide this, but it can
be done through trial and error given the C prototypes. It's all there,
maybe a thousand functions (not sure). The schema validation I do,
though, is W3C 2 compliant, 100% accurate. Xerces has to be manually
installed, which is kind of a pain, and there is almost NO documentation
(as it's just an auto-generated blind wrapper). That's too bad, as I
personally think the XML Perl interface should be a lot more robust;
it's actually extremely lacking in both functionality and ease of use.
OK, it's basically schizoid because the mainstream XML C community (W3C)
is in this nodal/SAX dichotomy disarray. Because of anonymous
(unnamed) arrays in Perl, it has a hard time reading and writing
XML. Workarounds are there, but it's not for the "faint of heart".
Oh well, big thanks, Tassilo, for the info. I wish I had more time
to get deeper into this.
 
