Fastest Perl Interpreter


Michael J. Astrauskas

I'm attempting to build a system that needs to run a perl script very
quickly. There will be no disk access, but the system will be doing an
intense amount of network and processor work. Is there any particular
processor architecture (32-bit AMD/Intel, Opteron, Sun, Apple, etc.),
operating system, and interpreter that are known to be the most powerful
combination?

My current idea is to use a dual Xeon with a gigabyte ethernet port and
at least 2 GB of RAM.

Thank you for any advice, recommendations, and pointers!
 

Darin McBride

Michael said:
I'm attempting to build a system that needs to run a perl script very
quickly. There will be no disk access, but the system will be doing an
intense amount of network and processor work. Is there any particular
processor architecture (32-bit AMD/Intel, Opteron, Sun, Apple, etc.),
operating system, and interpreter that are known to be the most powerful
combination?

My guess, then, is that you want something that has a big network pipe
and a fast processor. Perl is perl is perl - the only difference
between the platforms is going to be the C optimiser that you compile
perl itself with. Well, maybe not *only*, but that will have the
largest effect other than the hardware itself.

For example, Intel promises a percentage boost simply by compiling with
their Proton or Electron compilers over using MSVC or gcc on the same
platform. If you compile perl 5.8.1 with the Electron compiler on
Linux/ia32, you'll probably be doing alright for your system.
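Roughly, assuming Intel's Linux compiler is installed and on your PATH
as icc, pointing perl's own build at it looks like this (untested
sketch):

    sh Configure -des -Dcc=icc -Doptimize='-O3'
    make
    make test
    make install

Here -des just accepts Configure's defaults, while -Dcc and -Doptimize
override the C compiler and its optimisation flags.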
My current idea is to use a dual Xeon with a gigabyte ethernet port and
at least 2 GB of RAM.

As soon as you have multiple CPUs, you'll only see any benefits if you
actually have a multi-threaded or multi-process system running. That
is, at least one thread of activity per CPU. If your program can only
do one thing at a time, that second CPU will be 90%+ wasted.

If, however, your program can run multithreaded (dangerous in perl,
last I heard) or multiprocess (i.e. fork() and do stuff in the child
process), then by all means go with the SMP setup. Note that perl
likes fork()'s, and Windows doesn't, so you might be best off here
going with Linux instead of Windows.
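The skeleton for the fork() route is tiny. Something like this
(untested sketch; the CPU count and the worker body are placeholders):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $cpus = 2;               # e.g. a dual-CPU box
    my @kids;
    for my $n (1 .. $cpus) {
        my $pid = fork;
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {        # child: do the real work, then exit
            do_work($n);
            exit 0;
        }
        push @kids, $pid;       # parent: remember the child pid
    }
    waitpid($_, 0) for @kids;   # parent waits for all children

    sub do_work {
        my ($n) = @_;
        print "worker $n (pid $$) running\n";
    }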
Thank you for any advice, recommendations, and pointers!

As long as you're going with commodity hardware, dual P4s will probably
saturate your gigabit ethernet. Depending on what you're doing, you
may want dual gigabit ethernet cards, and go with 4 or more Pentium4's
for better throughput. This only helps, of course, if the data you're
processing can come in from multiple networks or subnets. It can
eliminate a hop for people coming in from more than one router away.

But that doesn't change much if you go away from commodity hardware.
In general, commercial-unix hardware (RS6000, current SPARCs, etc.),
even if not running commercial unix (e.g. Linux on PPC hardware rather
than AIX on the same hardware), may get you better performance, and
possibly less downtime. But you'll still want gigabit ethernet, and
possibly more than one. And you'll still want SMP, if your program can
handle it.

There are so many possibilities to keep aware of that your vague
question can only get you vague answers ... but they may point you in
the right direction for your situation.
 

William Herrera

I'm attempting to build a system that needs to run a perl script very
quickly. There will be no disk access, but the system will be doing an
intense amount of network and processor work.

In most cases, unless there is a large amount of calculation or disk
searching involved, this will result in an i/o-bound system, where most
of the CPU cycles are spent waiting for network i/o to complete. In such
a case, CPU speed will matter less to overall system speed than network
latency will. This may guide your selection of hardware: spend more on
speeding up your I/O, not necessarily your CPU.

OTOH, if your app is really CPU bound, not network bound, maybe you should not
use Perl. Use something compiled.
 

Juha Laiho

Michael J. Astrauskas said:
I'm attempting to build a system that needs to run a perl script very
quickly. There will be no disk access, but the system will be doing an
intense amount of network and processor work. Is there any particular
processor architecture (32-bit AMD/Intel, Opteron, Sun, Apple, etc.),
operating system, and interpreter that are known to be the most powerful
combination?

My current idea is to use a dual Xeon with a gigabyte ethernet port and
at least 2 GB of RAM.

If you're really going to need that amount of RAM for your processing,
then the memory latency _and_ bandwidth may play big parts in overall
performance. Try to find this info for the choices you're considering.
When reading the info, try to calculate ratios between different speeds
rather than absolute differences - ratios often tell more. For this there
are two factors: the speeds supported by the motherboard, and the speeds
supported by the memory modules.

As for raw CPU performance, the current Intel/AMD setups bring more
"bang for buck" than traditional Unix platforms (IBM Power CPUs,
Solaris Sparc, HP PA-RISC), but may be more limited in I/O bandwidth
and latency. However, for just one 1Gbit/s network adapter, the "PC"
solutions should still be ok. If you were asking for multiple 1Gbit/s
channels and multiple fast disk channels, then I'd rather recommend
something other than "PC" hardware -- or at least would direct you
towards the high-end PC servers.

Then, do you really mean a gigabyte ethernet port, i.e. roughly 8
gigabits/s, or just one gigabit/s? Pay attention: the difference is huge.

When you're talking about 1Gbit/s network speeds, you'll need to look at
the bus architecture of the system: the traditional 32bit/33MHz PCI bus
has a theoretical bandwidth of just over 1Gbit/s (32 bits x 33 MHz =
1056 Mbit/s, about 132 MB/s, shared by every device on the bus), so in
practice it's not enough to feed a gigabit ethernet channel - you'll
need something faster. A 64bit/66MHz PCI bus will do nicely, but will
limit the possible hardware choices (and you can forget the prices of
commodity motherboards).
 

Peter Cooper

Michael J. Astrauskas said:
I'm attempting to build a system that needs to run a perl script very
quickly. There will be no disk access, but the system will be doing an
intense amount of network and processor work. Is there any particular
processor architecture (32-bit AMD/Intel, Opteron, Sun, Apple, etc.),
operating system, and interpreter that are known to be the most powerful
combination?

My current idea is to use a dual Xeon with a gigabyte ethernet port and
at least 2 GB of RAM.

From a pure 'server speed' point of view (you've already received a few
good opinions on the specifics), I'd recommend dual Opterons, without a
doubt. They're faster than the Xeons in UNIX/Linux server scenarios (but
not workstation scenarios), and cheaper to set up too. You may also get
a benefit from the 64-bit compatibility (Perl can be compiled to a
64-bit executable - see perl.64bit on nntp.perl.org)

http://www.tomshardware.com/cpu/20030422/index.html

Also see these (mostly workstation-oriented) benchmarks for
G5/Opteron/P4 comparisons:

http://www.pcworld.com/news/article/0,aid,112749,pg,8,00.asp

The Opteron wins almost all tests. The Firing Squad have also produced an
article on building an Opteron workstation, which may be of some ancillary
interest:

http://firingsquad.com/hardware/building_gaming_opteron_2003_Part1/default.asp

If you could elaborate on your 'intense amount of network and processor
work' we might be able to narrow things down a bit. For example, your
process might not be memory intensive, meaning money saved there could go
into other areas of the system architecture.

I'd also echo the comments of another respondent, and recommend you look
into producing a compiled version of whatever it is you want to do. The
increased performance would, in most cases, pay for itself if the
project runs long enough, unless the application is too complex.

Pete
 

sk

Michael said:
I'm attempting to build a system that needs to run a perl script very
quickly. There will be no disk access, but the system will be doing an
intense amount of network and processor work. Is there any particular
processor architecture (32-bit AMD/Intel, Opteron, Sun, Apple, etc.),
operating system, and interpreter that are known to be the most powerful
combination?

Is it going to be running a large number of short tasks or a small
number of long ones? Starting up a Perl interpreter each time will be an
issue if it's the former. If that's the case, you probably want to write
your Perl program as a server that launches once and handles repeated
requests or tasks without exiting. As long as you're not spawning perl
interpreters over and over again, your performance should be good, and
there will be no need to output "compiled" Perl, which is just a
snapshot of what really gets executed once the interpreter chews through
the raw code.
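A skeleton for that might look like this (untested sketch; the port
number and the request handling are placeholders):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use IO::Socket::INET;

    # Pay the interpreter startup cost once, then loop forever.
    my $server = IO::Socket::INET->new(
        LocalPort => 5000,      # placeholder port
        Listen    => 10,
        Reuse     => 1,
    ) or die "listen: $!";

    while (my $client = $server->accept) {
        while (defined(my $line = <$client>)) {
            print $client "got: $line";   # placeholder request handling
        }
        close $client;
    }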

Apart from that, your question is really vague. "Intense amount of
network and processor work" doesn't tell us anything, much less whether
Perl is the right language for the job at all. Is it doing one linear
task or lots of small tasks in parallel? Is it handling small chunks of
data or huge ones? All of this points more to the choice of language and
the compiler used to build it than to the shininess of a particular
brand of CPU or a bus speed.
 

Tom

One option may be to split the processing between multiple machines.
That's cheap if you have some oldish systems around, but it would
require considerable work to enable.

Cheers,
Tom
 

W K

Do you think Concorde passengers write perl?
Easier and cheaper to not bother with trying to optimise speed and just go
on a 747 - or even an A330 (!).
You have forgotten about space shuttles and orbiting space labs,
all of which travel at relative speeds of eighteen thousand miles
per hour or greater.

Space shuttles?
 

Michael J. Astrauskas

Peter said:
If you could elaborate on your 'intense amount of network and processor
work' we might be able to narrow things down a bit. For example, your
process might not be memory intensive, meaning money saved there could go
into other areas of the system architecture.

The task is basically a front-end NNTP server. It will be balancing
incoming load across various other servers on the network. Apparently
CPU isn't as important as network i/o, then.
I'd also echo the comments of another respondent, and recommend you look
into producing a compiled version of whatever it is you want to do. The
increased performance would, in most cases, pay for itself if the
project runs long enough, unless the application is too complex.

It's for a contest, more or less, so I can't choose what the task is.
It's a Perl script I didn't write myself.
 

Michael J. Astrauskas

William said:
In most cases, unless there is a large amount of calculation or disk
searching involved, this will result in an i/o-bound system, where most
of the CPU cycles are spent waiting for network i/o to complete. In such
a case, CPU speed will matter less to overall system speed than network
latency will. This may guide your selection of hardware: spend more on
speeding up your I/O, not necessarily your CPU.

It'll basically be a front-end NNTP server. Network I/O will be intense.
About 300 copies of the script will be running at any time.
OTOH, if your app is really CPU bound, not network bound, maybe you should not
use Perl. Use something compiled.

Sadly, it's for a contest and the language and script are not of my choice.
 

Michael J. Astrauskas

Darin said:
My guess, then, is that you want something that has a big network pipe
and a fast processor. Perl is perl is perl - the only difference
between the platforms is going to be the C optimiser that you compile
perl itself with. Well, maybe not *only*, but that will have the
largest effect other than the hardware itself.

It's for a contest and the requirement is actually dual gigabit Ethernet
ports.
For example, Intel promises a percentage boost simply by compiling with
their Proton or Electron compilers over using MSVC or gcc on the same
platform. If you compile perl 5.8.1 with the Electron compiler on
Linux/ia32, you'll probably be doing alright for your system.

I'm definitely going to keep this in mind. Thank you.
As soon as you have multiple CPUs, you'll only see any benefits if you
actually have a multi-threaded or multi-process system running. That
is, at least one thread of activity per CPU. If your program can only
do one thing at a time, that second CPU will be 90%+ wasted.

If, however, your program can run multithreaded (dangerous in perl,
last I heard) or multiprocess (i.e. fork() and do stuff in the child
process), then by all means go with the SMP setup. Note that perl
likes fork()'s, and Windows doesn't, so you might be best off here
going with Linux instead of Windows.

The script is single-threaded but there will be many instances of it
running.
As long as you're going with commodity hardware, dual P4s will probably
saturate your gigabit ethernet. Depending on what you're doing, you
may want dual gigabit ethernet cards, and go with 4 or more Pentium4's
for better throughput. This only helps, of course, if the data you're
processing can come in from multiple networks or subnets. It can
eliminate a hop for people coming in from more than one router away.

I was looking at some benchmarks and P4s did quite well, but dual
Opterons did better in most regards (over dual Xeons, as well). Not all,
though, which makes my decision all that much harder.
 

Darin McBride

Michael said:
It's for a contest and the requirement is actually dual gigabit Ethernet
ports.

I see I nailed that one already by suggesting it. ;->
I'm definitely going to keep this in mind. Thank you.

Just note that these compilers cost $$$, whereas MSVC only costs $, and
gcc, of course, has no cost. ;->
The script is single-threaded but there will be many instances of it
running.

That is multiprocess. Obviously, I should have used "e.g." rather than
"i.e.". Thus, SMP is helpful. If you can change the code slightly to
fork after loading all the modules, and you can use Linux, you should
get a small speed boost: the script only gets compiled once for all the
processes, and the OS may be able to share things in memory for further
speed boosts at runtime (fewer cache misses, possibly).
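In code, the load-then-fork shape is something like this (untested
sketch; Net::NNTP stands in for whatever modules the real script loads,
and the worker body is a placeholder):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Net::NNTP;      # heavy modules compiled once, here, in the parent

    my $workers = 4;    # placeholder count
    for (1 .. $workers) {
        my $pid = fork;
        die "fork: $!" unless defined $pid;
        next if $pid;           # parent keeps spawning
        serve();                # child starts with everything pre-compiled
        exit 0;
    }
    1 while wait != -1;         # parent reaps the children

    sub serve {
        print "child $$ ready\n";
    }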
I was looking at some benchmarks and P4s did quite well, but dual
Opterons did better in most regards (over dual Xeons, as well). Not all,
though, which makes my decision all that much harder.

Yes - unless you can get the hardware side-by-side running your
application, it'll be a fair bit of guesswork.

Best of luck!
 

Ben Morrow

Darin McBride said:
Just note that these compilers cost $$$, whereas MSVC only costs $, and
gcc, of course, has no cost. ;->

Is there a significant advantage to using Intel's free icc over gcc?

Ben
 

Tassilo v. Parseval

Also sprach Ben Morrow:
Is there a significant advantage to using Intel's free icc over gcc?

It's said to produce faster binaries than gcc. I, however, was not able
to confirm this when comparing its results with those of gcc 3 or later.
But then again, I don't have an Intel processor.

One thing, however, could be significant. Intel's icc enforces stricter
programming habits. My experience with gcc is that it makes you lax, and
you could end up with code that won't compile on some legacy compilers
without modifications. In particular, icc doesn't have all those gcc
extensions that can be quite a spoiler when you intend to be portable.

So it's become a habit of mine to also test my stuff with icc. It's a
good indication as to whether the code will work with some of the
stricter compilers (like Sun's or Microsoft's).

Tassilo
 

Steve Koppelman

Michael said:
It'll basically be a front-end NNTP server. Network I/O will be intense.
About 300 copies of the script will be running at any time.

NNTP? Sounds like a strange contest. Is this a coding contest or a
contest contest? SMTP, POP, IMAP, HTTP and maybe an IM protocol I can
see, but NNTP, huh? All right. Are there even 300 regular users of NNTP
left? :) Okay, okay. No matter.

Regardless of what protocol you're building this on, "running 300
copies of the script" simultaneously is an incredibly bad idea. Doing so
with an interpreted language like Perl, Python, Tcl, VBScript, LISP,
whatever, will certainly need heavy-duty hardware, but not for the
actual work of your script. The CPU and memory, and probably the disks
too, will all be occupied initializing and tearing down all those
interpreters. The actual meat of what your script does will use only a
tiny fraction of the resources, and you'll need many times the memory
and CPU that you would otherwise.

Perl is fine for what you want to do, but you want to do it
*multithreaded*, with one instance of the script running per box as a
daemon. That ought to be plenty fast given adequate hardware, and your
choice of CPU architecture, compiler and compiler optimizations should
be mostly irrelevant. Throw an adequate number of MIPS at it, and focus
more on writing properly optimized Perl and making sure your disk and
network I/O are adequate.
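To sketch the shape (untested; it assumes a perl built with ithreads,
which you can check with perl -V:useithreads; the thread count and the
jobs are placeholders):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $queue   = Thread::Queue->new;     # shared work queue
    my @workers = map { threads->create(\&worker) } 1 .. 4;

    $queue->enqueue($_) for 1 .. 20;        # placeholder "requests"
    $queue->enqueue(undef) for @workers;    # one shutdown marker each
    $_->join for @workers;

    sub worker {
        while (defined(my $job = $queue->dequeue)) {
            print "thread ", threads->tid, " handled job $job\n";
        }
    }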

If you or someone else insists on implementing this with hundreds of
separate processes, then it's hard to imagine a worse design decision
than spawning hundreds of interpreters simultaneously with tens or
hundreds of thousands of launches and teardowns per day. If you want
separate processes, you should be using C/C++ or something else that
compiles to tight binaries, period.

Splitting hairs over Opterons vs. Xeons vs. supercooled Athlons or
mastodons is pointless. You should have hardware capable of handling
your largest realistically-projected traffic spikes with a good amount
of headroom remaining, period. You can do it with Celerons if you really
want to: just pick hardware that gives you the SPECmarks, the I/O and
the uptime you need at a good price. If you're running it on one box
rather than load-balancing a few smaller ones, and have nothing
"collecting" requests during downtime, then redundant and/or
hot-swappable components (especially disk and power supplies) become
more important. Trying to get a "perfect fit" with little or no headroom
is completely wrong.

Indeed, if availability is critical, you should either design the system
to be load balanced across 1..n machines with no single point of
failure, or you should be holding your contest over a protocol that can
store-and-forward gracefully, unlike private NNTP. With something like
HTTP you can load-balance straightforwardly and run a few smaller and
more disposable servers instead of one big high-availability one. With
an email-based contest, on the other hand, you can have dumb
store-and-forward MX hosts specified in DNS that will catch inbound
entries and pass them along automatically in the event of downtime on
your "collector", effectively offloading the job of inbound-message
fault tolerance to your ISP.
 
