Inside-out objects are slow! (or how to accelerate OO Perl?)


Koszalek Opalek

I was about to redesign my class to use inside-out objects.
I was hoping that, besides a cleaner design, it would also give
some performance boost.
(I thought that 10 hashes with approx. 1000 keys would be faster than
1000 hashes with 10 keys.)

However, a simple experiment reveals that the opposite is true.
Inside-out objects are approximately 3 times slower in this example --
and it gets worse as the number of objects grows.


bash-3.2$ time ./makeregularobj.pl
real    0m0.156s
user    0m0.093s
sys     0m0.000s

bash-3.2$ time ./makeinsideoutobj.pl
real    0m0.437s
user    0m0.358s
sys     0m0.015s

I attach the two files below. Any comments?

Apart from inside-out objects, what other techniques could be used to
accelerate OO Perl?
I looked at the fields module but it has been removed from Perl 5.10.



#------ makeregularobj.pl

#!/usr/bin/perl
use strict;

my $no_obj = $ARGV[0] || 10_000;

{
    package P;

    sub new
    {
        my $_class = shift;

        my %self;

        $self{field0}++;
        $self{field1}++;
        $self{field2}++;
        $self{field3}++;
        $self{field4}++;
        $self{field5}++;
        $self{field6}++;
        $self{field7}++;
        $self{field8}++;
        $self{field9}++;
        bless \%self, $_class;
    }
};

my @objs;
for (1 .. $no_obj) {
    push @objs, P->new();
};

print "Created $no_obj objects (blessed hashes), data stored in ten fields inside a hash.\n";



#------ makeinsideoutobj.pl
#!/usr/bin/perl
use strict;

my $no_obj = $ARGV[0] || 10_000;

{
    my %field0;
    my %field1;
    my %field2;
    my %field3;
    my %field4;
    my %field5;
    my %field6;
    my %field7;
    my %field8;
    my %field9;
    {
        package P;

        sub new
        {
            my $_class = shift;

            my $self = 1;

            $field0{\$self}++;
            $field1{\$self}++;
            $field2{\$self}++;
            $field3{\$self}++;
            $field4{\$self}++;
            $field5{\$self}++;
            $field6{\$self}++;
            $field7{\$self}++;
            $field8{\$self}++;
            $field9{\$self}++;
            bless \$self, $_class;
        };
    };
};

P->import();

my @objs;
for (1 .. $no_obj) {
    push @objs, P->new();
};

print "Created $no_obj objects (blessed scalars), data stored in ten inside-out hashes\n";
 

david


I think that the point of inside-out objects is to make OO safer,
not faster. The real question is why you need to make Perl OO
faster. In most cases it is fast enough. There are pathological cases
where you have to construct billions of objects; in that case you may
choose a non-OO solution (this is also true for compiled languages).
The most important lesson I learned as a programmer is "don't optimize,
profile". The bottleneck may well be in a place you never
thought of.

Best regards,
David

P.S. With inside-out objects you have to write a destructor to clean
up the field hashes.
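
For reference, a minimal sketch of such a destructor. It assumes (unlike the
exact script above) that new() keys every field hash with
Scalar::Util::refaddr($self) -- the usual inside-out convention -- so that
the same key can be recomputed here after blessing:

use Scalar::Util qw(refaddr);

# Hypothetical destructor for the inside-out class P above; it relies on
# new() having used refaddr() as the key for every field hash.
sub DESTROY {
    my $self = shift;
    my $id   = refaddr($self);
    delete $_->{$id}
        for \(%field0, %field1, %field2, %field3, %field4,
              %field5, %field6, %field7, %field8, %field9);
}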
 

xhoster

Koszalek Opalek said:
I was about to redesign my class to use inside-out objects.
I was hoping that, besides a cleaner design, it would also give
some performance boost.
(I thought that 10 hashes with approx. 1000 keys would be faster than
1000 hashes with 10 keys.)

10 hashes with 100_000 keys might be more memory efficient than
100_000 hashes with 10 keys each, but I see no reason to think they
would be faster.

Xho

 

Koszalek Opalek

I think that the point of inside-out objects is to make OO safer,
not faster. The real question is why you need to make Perl OO
faster. In most cases it is fast enough.

So far, it has always been so for me... :)
There are pathological cases
where you have to construct billions of objects; in that case you may
choose a non-OO solution (this is also true for compiled languages).

The thing is, what counts as pathological for Perl?
It looks like it is thousands, not billions.

My module takes approx. 1 sec to execute (on
the largest input). It has to create a few
thousand objects in the process. There is

1) a Tree (only one),
2) Nodes (approximately 2000 in the tree, and
then ~2000 outside the Tree),
3) Streams (basically arrays holding references to Nodes,
with some convenience methods; also a few thousand).

1 second is OK, 2 seconds would also be OK
but only just; so I am afraid that gives
me very little safety margin for the future.

The thing is, I never expected performance to be
a problem in the first place!
The most important lesson I learned as a programmer is "don't optimize,
profile". The bottleneck may well be in a place you never
thought of.

Sure, I ran the code through the profiler and fixed some
brain-damage in the algorithm. That cut the execution
time in half. I can gain more by doing a few ugly things,
like replacing $node->getproto() with $node->{proto}, but
it looks like I'll never be able to get down to, say, 0.2 sec.

K.
 

xhoster

Koszalek Opalek said:
Yes, that's what I read here:
http://www.perlfoundation.org/perl5/index.cgi?inside_out_object


That was just a shot in the dark, I was hoping for some
less-memory -> fewer-cache-misses -> better-overall-speed
effect.

You might be seeing the opposite. With regular objects, you access
one hash ten times, and then do that 100_000 times with a different hash
each time. With your inside-out objects, you access ten different hashes
once each, then repeat that 100_000 times. It may be less memory, but it has
worse locality of reference. (But Perl has so much indirection that I
rarely consider locality of reference to be achievable with it.)

Although it looks to me like another big time sink is hashing
the scalar reference. The regular method uses compile-time literal
strings, and their hashing is done just once.

I don't see a way to get away from the locality-of-reference issue while
using inside-out objects. For the hashing, you could get much (but not
complete) benefit by doing:

my $self=1;
my $rself=\$self;
my $rrself="$rself";

and then using $rrself instead of retaking the reference to $self each time
you set a hash. (There is an optimization that lets $rrself keep its hashed
value cached for faster access. This optimization is not
implemented/possible for $self, nor apparently for $rself.)
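
To make that concrete, here is a minimal sketch of new() along those lines,
reusing the %field0 .. %field9 hashes from makeinsideoutobj.pl (the $rrself
name is just an illustration):

sub new
{
    my $_class = shift;

    my $self   = 1;
    my $rself  = \$self;
    my $rrself = "$rself";   # stringify once; every lookup below reuses this plain string

    $field0{$rrself}++;
    $field1{$rrself}++;
    # ... and so on for field2 .. field8 ...
    $field9{$rrself}++;

    bless $rself, $_class;
}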

But this is just for fun. For serious work, I'd probably drop OO
altogether before I spent time micro-optimizing it.

Xho

 

Koszalek Opalek

(..)
So, if speed is that important to you: DO NOT USE OBJECTS. (In fact, DO NOT
USE PERL AT ALL).


Wow, I would never have expected that in perl.misc (at least not
in capital letters ;-)

Actually, I've been entertaining the idea of rewriting the module
in C/C++ for some time. Chances are it would make it blindingly
fast, but the ease and convenience of Perl (and the ability to
hack/extend it by anyone) is not something I want to give up
lightly.

Anyway, I did some C/C++ vs Perl comparisons, and this is what I
got for
- an empty for loop
- calling an object method (a simple accessor)
- modeling a C struct with a hash

It looks like what really hurts is the last case, i.e. modeling a C
struct with a hash (0.5 s vs 175 sec). Sure, comparing an STL map to a
Perl hash would be more apples-to-apples (but hey, I would not need a
map in C++).



-=-=- loop.exe -=-=-

real 0m3.666s
user 0m3.588s
sys 0m0.000s

-=-=- loop.pl -=-=-

real 1m29.700s
user 1m28.967s
sys 0m0.078s

-=-=- objmeth.exe -=-=-

real 0m54.974s
user 0m54.412s
sys 0m0.031s

-=-=- objmeth.pl -=-=-

real 8m42.731s
user 8m38.921s
sys 0m0.405s

-=-=- struct.exe -=-=-

real 0m0.562s
user 0m0.452s
sys 0m0.015s

-=-=- struct.pl -=-=-

real 2m54.990s
user 2m53.129s
sys 0m0.093s




===== loop.c =====

int main()
{
    volatile unsigned i;

    for (i = 0; i < (1 << 30); i++);
}



=========================
===== loop.pl =====

#!/usr/bin/perl
my $i;

for ($i=0; $i< (1 << 30); $i++) {};



=========================
===== objmeth.cpp =====

class C {
public:

    C (int i)
    {
        data = i;
    }

    int get ()
    {
        return data;
    }

    void set (int i)
    {
        data = i;
    }

private:
    volatile int data;
};



int main()
{
    C c0( 0 );
    C c1( 1 );
    C c2( 2 );
    C ct( 0 );
    volatile int i;

    for (i = 0; i < 100000; i++) {
        ct.set( c0.get() );
        c0.set( c1.get() );
        c1.set( c2.get() );
        c2.set( ct.get() );
.....



=========================
===== objmeth.pl =====

#!/usr/bin/perl
use strict;

{
    package P;

    sub new {
        my $self = { data => $_[1] };
        bless $self, $_[0];
    };

    sub get {
        return $_[0]->{data};
    }

    sub set {
        $_[0]->{data} = $_[1];
    };
};

my $o0 = P->new(0);
my $o1 = P->new(1);
my $o2 = P->new(2);
my $ot = P->new(0);
my $i;

for ($i = 0; $i < 100_000; $i++) {

    $ot->set( $o0->get() );
    $o0->set( $o1->get() );
    $o1->set( $o2->get() );
    $o2->set( $ot->get() );
.....


=========================
===== struct.c =====

int main()
{
    typedef struct {
        unsigned a;
        unsigned b;
        unsigned c;
        unsigned d;
        unsigned f;
    } s_t;
    volatile s_t s;

    unsigned i;


    s.a = 1;
    s.b = 2;
    s.c = 3;
    s.d = 4;
    s.f = 0;


    for (i = 0; i < 10000; i++) {
        s.f = s.a;
        s.a = s.b;
        s.b = s.c;
        s.c = s.d;
        s.a = s.f;
....


=========================
===== struct.pl =====

#!/usr/bin/perl
use strict;

my %h = ( a => 0, b => 0, c => 0, d => 0, f => 0 );
my $i;

$h{a} = 1;
$h{b} = 2;
$h{c} = 3;
$h{d} = 4;

for ($i = 0; $i < 10000; $i++)
{
    $h{f} = $h{a};
    $h{a} = $h{b};
    $h{b} = $h{c};
    $h{c} = $h{d};
    $h{a} = $h{f};
....
 

Koszalek Opalek

Although it looks to me like another big time sink is hashing
the scalar reference. The regular method uses compile-time literal
strings, and their hashing is done just once.

How nice that I found this out before actually reworking the class :)

Koszalek
 

xhoster

Koszalek Opalek said:
(..)

Wow, I would never have expected that in perl.misc (at least not
in capital letters ;-)

I think you must be new here. People here very frequently point out when
they feel Perl just isn't the right tool for the job. (Although we
sometimes disagree and argue about when exactly that is.) Maybe groups
dedicated to other languages are more religious.
Actually, I've been entertaining the idea of rewriting the module
in C/C++ for some time. Chances are it would make it blindingly
fast, but the ease and convenience of Perl (and the ability to
hack/extend it by anyone) is not something I want to give up
lightly.

I wouldn't consider Perl a good choice if the criterion is that anyone be
able to hack/extend code written in it. I would think Java would be more
along those lines.
Anyway, I did some C/C++ vs Perl comparisons, and this is what I
got for
- an empty for loop
- calling an object method (a simple accessor)
- modeling a C struct with a hash

It looks like what really hurts is the last case, i.e. modeling a C
struct with a hash (0.5 s vs 175 sec).

I think you might be missing the point. Yes, a struct-element assignment
in C is about as fast as an atomic variable assignment in C, while in Perl
a hash-element assignment is substantially slower (~2.5x, it looks like)
than a pure-scalar assignment. But the real issue is that any assignment
in Perl, even if not to a hash element, is ~100x slower than any assignment
in C.
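
If you want to check ratios like that on your own machine, the core
Benchmark module makes it easy; a quick sketch (the labels are arbitrary):

use strict;
use warnings;
use Benchmark qw(cmpthese);

my ($x, %h);

# Compare a plain scalar assignment with a hash-element assignment.
cmpthese(-2, {
    scalar_assign => sub { $x      = 42 },
    hash_assign   => sub { $h{key} = 42 },
});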

Xho

 

Joost Diepenmaat

Koszalek Opalek said:
(..)


Wow, I would never have expected that in perl.misc (at least not
in capital letters ;-)

Well, it's true. :) Method calls and object instantiation are
especially slow compared to C++. If you're working with more than a few
hundred thousand objects, and you can capture the functionality in a
small XS interface (for instance, don't iterate over objects in perl
space) you can expect a very dramatic increase in speed (and also
probably a pretty good reduction in memory use). Usually you don't need
it, but when you do...
Actually, I've been entertaining the idea of rewriting the module
in C/C++ for some time. Chances are it would make it blindingly
fast, but the ease and convenience of Perl (and the ability to
hack/extend it by anyone) is not something I want to give up
lightly.

You should only put those parts that really matter in C/C++ space. All
the annoying stuff (loading databases, parsing files, user interface,
etc.) is much better handled in Perl. XS takes some getting used to, but
once you're familiar with it, it's really pretty easy to move parts of
your Perl code to XS (except for those damn reference counts).
Anyway, I did some C/C++ vs Perl comparisons, and this is what I
got for
- an empty for loop
- calling an object method (a simple accessor)
- modeling a C struct with a hash

Those kinds of tests are *usually* not very relevant unless your objects
are so simple that you'd be much better off not using objects at all
(even if you go for C++).
It looks like what really hurts is the last case, i.e. modeling a C
struct with a hash (0.5 s vs 175 sec).

C structs are actually more like arrays. You could try modelling the data
with an array instead:

# this should inline the field names
use constant {
    FIELDA => 0,
    FIELDB => 1,
};

my @o = ('val1', 'val2');

print $o[FIELDA];
print $o[FIELDB];

I don't expect that to be *much* faster, but at least it gives you
some typo-checking.
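
Applied to an object, that might look something like the following minimal
sketch (the class name Item and its two fields are made up for
illustration):

{
    package Item;

    use strict;
    use warnings;

    # Field indices inlined as constants, as suggested above.
    use constant {
        DATA  => 0,
        COUNT => 1,
    };

    sub new {
        my ($class, $data) = @_;
        return bless [ $data, 0 ], $class;   # array-based object instead of a hash
    }

    sub get_data { return $_[0][DATA] }

    sub bump     { $_[0][COUNT]++ }
}

my $item = Item->new(42);
$item->bump;
print $item->get_data, "\n";   # prints 42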
(but hey, I would not need a map in C++).

Don't be too sure. :)
 

Ted Zlatanov

KO> Actually, I've been entertaining the idea of rewriting the module in
KO> C/C++ for some time. Chances are it would make it blindingly fast,
KO> but the ease and convenience of Perl (and the ability to hack/extend
KO> it by anyone) is not something I want to give up lightly.

Try Inline::C. It's a very good compromise between C and Perl.
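
For instance, a minimal sketch of what using it looks like (the sum_to
function is made up for illustration; Inline compiles the C the first time
the script runs and caches the result):

#!/usr/bin/perl
use strict;
use warnings;

use Inline C => <<'END_C';
/* A tiny C function callable directly from Perl. */
int sum_to(int n) {
    int i, total = 0;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
END_C

print sum_to(10_000), "\n";   # prints 50005000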

Ted
 

xhoster

Ted Zlatanov said:
On Mon, 17 Mar 2008 11:18:47 -0700 (PDT) Koszalek Opalek

KO> Actually, I've been entertaining the idea of rewriting the module in
KO> C/C++ for some time. Chances are it would make it blindingly fast,
KO> but the ease and convenience of Perl (and the ability to hack/extend
KO> it by anyone) is not something I want to give up lightly.

Try Inline::C. It's a very good compromise between C and Perl.

I'm a fan of Inline::C, but I don't see it being very good for this type of
OO, where you have a very large number of very light-weight objects. To
get good benefits, you would probably have to push the container of the
objects--not just the objects themselves--plus all operations operating on
the container, down into C.


Xho

 

Koszalek Opalek

However, a simple experiment reveals that the opposite is true.
Inside-out objects are approximately 3 times slower in this example --
and it gets worse as the number of objects grows.

bash-3.2$ time ./makeregularobj.pl
real    0m0.156s
user    0m0.093s
sys     0m0.000s

bash-3.2$ time ./makeinsideoutobj.pl
real    0m0.437s
user    0m0.358s
sys     0m0.015s

Wooow, look at this:

bash-3.2$ time perl ./makeregularobj.pl 100000
Created 100000 objects (blessed hashes), data stored in ten fields
inside a hash.

real 0m0.734s
user 0m0.655s
sys 0m0.015s

bash-3.2$ time perl ./makeinsideoutobj.pl 100000
Created 100000 inside-out objects (blessed scalars).

real 0m0.436s
user 0m0.343s
sys 0m0.031s

That should whet your appetite. I will send the details in the evening.
Until then ;-)

Koszalek
 

Ted Zlatanov

KO> Actually, I've been entertaining the idea of rewriting the module in
KO> C/C++ for some time. Chances are it would make it blindingly fast,
KO> but the ease and convenience of Perl (and the ability to hack/extend
KO> it by anyone) is not something I want to give up lightly.
x> I'm a fan of Inline::C, but I don't see it being very good for this type of
x> OO, where you have a very large number of very light-weight objects. To
x> get good benefits, you would probably have to push the container of the
x> objects--not just the objects themselves--plus all operations operating on
x> the container, down into C.

It really depends on the application, but generally I agree (except I
didn't recommend Inline::C for OO work, and I wouldn't).

In this case these operations and containers may be small enough (in
terms of lines of code) that Inline::C makes sense. Regardless, OO
should probably be avoided in a performance-critical application at the
deeper layers. It's fine at the upper layers, wherever you don't have
to worry about algorithmic complexity, e.g. to handle configuration
management.

Ted
 

Koszalek Opalek

On 17 Mar 2008 21:28:09 GMT (e-mail address removed) wrote:



KO> Actually, I've been entertaining the idea of rewriting the module in
KO> C/C++ for some time. Chances are it would make it blindingly fast,
KO> but the ease and convenience of Perl (and the ability to hack/extend
KO> it by anyone) is not something I want to give up lightly.


x> I'm a fan of Inline::C, but I don't see it being very good for this type of
x> OO, where you have a very large number of very light-weight objects. To
x> get good benefits, you would probably have to push the container of the
x> objects--not just the objects themselves--plus all operations operating on
x> the container, down into C.

It really depends on the application, but generally I agree (except I
didn't recommend Inline::C for OO work, and I wouldn't).

I had a look at it today and had "blah" working immediately (somehow
I never bother to printf hello world; I printf blah). Unfortunately
it is not part of the default installation, it requires a C compiler
(not an issue on *nix but more so on Windows; even people with VS
installed usually do not have the environment - vsvars32.bat - set up),
and it does not work on paths containing spaces (at least it did not
for me).

Overall, it looks pretty cool though. Maybe I'll use it one day, but
as Xho pointed out, with so many light-weight objects doing little
things and communicating with each other, I would probably end up
hiding the whole thing in C/C++.

Koszalek
 

Koszalek Opalek

Wooow, look at this:

bash-3.2$ time perl ./makeregularobj.pl 100000
Created 100000 objects (blessed hashes), data stored in ten fields
inside a hash.

real 0m0.734s
user 0m0.655s
sys 0m0.015s

bash-3.2$ time perl ./makeinsideoutobj.pl 100000
Created 100000 inside-out objects (blessed scalars).

real 0m0.436s
user 0m0.343s
sys 0m0.031s

That should whet your appetite. I will send the details in the evening.
Until then ;-)

Instead of using the object address to index the inside-out
fields, I used an ID.


#!/usr/bin/perl
use strict;

my $no_obj = $ARGV[0] || 10_000;

{
    # inside-out arrays
    my @field0;
    my @field1;
    # ...
    # ...
    my @field9;
    {
        package P;

        my $cnt = 0;

        sub new
        {
            my $_class = shift;

            my $self = $cnt++;
            bless \$self, $_class;

            $field0[$self] = 2;
            $field1[$self]++;
            # ....
            # ....
            $field9[$self]++;

            return \$self;
        };

        sub getf0
        {
            return $field0[${$_[0]}];
        };
    };
};

You can do the same thing using inside-out hashes instead of arrays,
but arrays are somewhat faster -- both for creation (the larger the
number of objects, the better the ratio) and for data access.

One benchmark I ran (reading data from one field) gave:
1m42.289s for hashes
1m23.724s for arrays
I guess most of that time is spent in calling the methods,
not in accessing the data inside an array/a hash.

The problem with arrays is garbage collection.

For me it does not really matter; I can do Class->cleanup() after
I've done the processing (cleanup() would empty all the arrays).

I was thinking I could also collect the numbers (IDs) of
DESTROY-ed objects and then splice() the arrays when the number
reaches a certain threshold (e.g. 1000). Then the existing
objects would have to be renumbered. The problem is that
I cannot store the object references in some array in the
class, because if I do, they will never be DESTROY-ed. So it
is a catch-22, unless it can be worked around with some low-level
magic.
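
For what it's worth, a minimal sketch of the simple variant described above
(a cleanup() plus a DESTROY that just records the freed IDs; the
renumbering/splicing part is left out). These subs would sit next to new()
and getf0() inside the package P block, so that they can see the lexical
@field0 .. @field9 arrays:

my @free_ids;    # IDs of objects that have been DESTROY-ed

# Remember the slot so that it could later be reused or spliced away.
sub DESTROY {
    my $self = shift;
    push @free_ids, $$self;
}

# Throw away all object data in one go, once the processing is done.
sub cleanup {
    @field0 = ();
    @field1 = ();
    # ....
    @field9 = ();
    @free_ids = ();
}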

Anyway, given the 4x performance gain (compared to 'regular' inside-out
objects) and the thread-safety, I think this might be an
interesting solution.

Koszalek
 

Koszalek Opalek

Instead of using the object address to index the inside-out
fields, I used an ID.

Actually, I had to do the very same thing in a somewhat different
scenario. I overloaded the "" operator, and after that I could no
longer compare objects using ==; I had to overload == as well (look
for "Side-effects of overloading quotes" elsewhere in the group).
Someone pointed that out (i.e. pointed out that I have to overload ==)
and suggested using refaddr for the comparison.

That worked, but the drop in performance was visible immediately.
So I resorted to comparing IDs instead of refaddr (I already
had IDs in my objects). And that was so much quicker...
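
For illustration, a minimal sketch of that arrangement (the class name Node
is made up; the point is that == compares the cheap numeric IDs instead of
calling refaddr):

{
    package Node;

    use overload
        '""' => sub { "Node #" . ${ $_[0] } },
        '==' => sub { ${ $_[0] } == ${ $_[1] } },
        '!=' => sub { ${ $_[0] } != ${ $_[1] } };

    my $next_id = 0;

    sub new {
        my ($class) = @_;
        my $id = $next_id++;
        return bless \$id, $class;    # blessed scalar holding the numeric ID
    }
}

my $n1 = Node->new();
my $n2 = Node->new();
print "$n1 and $n2 are ", ($n1 == $n2 ? "the same" : "different"), " nodes\n";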

Koszalek
 
