Clint Olsen
I've been working on a parser project to translate from a proprietary
hardware description language into something more industry standard, and
I've found some peculiar performance characteristics in Perl. Hopefully
someone can shed a bit more light on the subject. I've been using Perl
5.8.2, so my findings should be pretty relevant to current users.
Since this project is fairly complex, I'm using Parse::Yapp to handle the
parsing functionality. Perl's natural pattern matching ability makes it
pretty simple to code up a lexical analyzer by hand.
After running the profiler on the tool, I found that over 70% of the
runtime was being taken up in the scanner (the function actually scanning
tokens from the file). So, I decided to start there.
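(For anyone who wants to reproduce the measurement: I won't swear to the
exact invocation, but with the Devel::DProf profiler that ships with 5.8 it
looks something like the following; the script and input names here are just
placeholders.)

    $ perl -d:DProf translate.pl design.hdl    # writes tmon.out
    $ dprofpp tmon.out                         # report of time spent per sub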
The scanner is pretty straightforward. I precompiled patterns using qr//o
with all the reserved keywords and special patterns for tokens in the
language. I reused these patterns in m// like so:
#
# Precompile patterns.
#
my $INTEGER = qr/(-?\d+)/o;
....
....
....
elsif (/\G$INTEGER/ogc) {
    return [ 'INTEGER', $1 ];
}
There are many such patterns like this, some of them quite a bit more
complicated than integers. I used /gc and the assertion \G in all the
patterns since this was mentioned in the Perl documentation for writing
lex-like scanners.
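To make that concrete, here is a stripped-down sketch of the scanner loop I
mean. The identifier rule and the error handling are invented for
illustration; the real grammar has many more patterns, but the shape is the
same:

    sub next_token {
        for ($_[0]) {            # alias $_ to the caller's input scalar
            /\G\s+/gc;           # skip whitespace; /c preserves pos() on failure
            my $p = pos($_) || 0;
            return undef if $p >= length($_);            # end of input
            return [ 'INTEGER', $1 ] if /\G$INTEGER/gc;
            return [ 'IDENT',   $1 ] if /\G(\w+)/gc;     # made-up identifier rule
            die "lex error at offset $p\n";
        }
    }

Every pattern is anchored with \G, and because of /c a failed match leaves
pos() alone, so the next alternative gets to try from the same spot.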
For file I/O, I snarfed the entire contents into a scalar by adjusting $/:
sub read_file {
    if (open FILE, '<', $_[0]) {
        local $/ = undef;        # slurp mode: read the whole file at once
        my $data = <FILE>;
        \$data;                  # return a reference to avoid copying the data
    }
}
This approach seems popular among Perl users, and it hides the file I/O
details from the programmer entirely.
Unfortunately, the result of this combination was pretty sluggish
performance. Reading a nominal file, which involves considerable file
inclusion, takes slightly over a minute of runtime. The time(1) command also
registers quite a bit of system time.
The first thing I tried was to eliminate the reliance on the zero-width
assertion \G and /gc. I changed all the calls from m// to s// and simply
stripped off the leading text as I matched it. That bought a fair amount of
savings: around a 35% reduction in runtime.
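In other words, each rule went from advancing pos() to physically chopping
the matched text off the front of the buffer. Roughly, with the same toy
token set as in the sketch above:

    sub next_token {
        for ($_[0]) {            # alias $_ to the caller's input scalar
            s/\A\s+//;           # strip leading whitespace
            return undef unless length($_);              # buffer exhausted
            return [ 'INTEGER', $1 ] if s/\A$INTEGER//;
            return [ 'IDENT',   $1 ] if s/\A(\w+)//;     # made-up identifier rule
            die 'lex error near: ' . substr($_, 0, 20) . "\n";
        }
    }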
Encouraged by that result, I decided to take it one step further and
experiment with more traditional lexer file I/O techniques. I had assumed Perl
just uses mmap() to read entire files into a scalar, but I'm not entirely
sure what happens behind the scenes without diving into the source.
I tried just opening the file and using Perl's read() call to snarf the
data in 8K chunks, using the following fill function:
sub fill {
    my ($buf) = @_;
    my $bufin;
    # A short read means we hit end-of-file; record that in the EOF flag.
    $$buf[2] = (read($$buf[0], $bufin, 8192) != 8192);
    $$buf[1] .= $bufin;
}
The buf object is an array triplet: a filehandle, a scalar holding the file
data read so far, and an EOF indicator. fill() isn't called until the length
of the current buffer falls below 256 characters, so in essence this just
assumes that no token is longer than 256 characters.
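Putting it together, the scanner's entry point tops the buffer up before
trying any patterns. A rough sketch (the structure is mine; only fill() is
the real code from above, and note this sketch also assumes whitespace runs
stay under the threshold):

    sub next_token {
        my ($buf) = @_;
        for ($$buf[1]) {         # alias $_ to the buffer scalar
            # Top up while we're below the assumed maximum token length
            # and there is still file left to read.
            fill($buf) while !$$buf[2] && length($_) < 256;
            s/\A\s+//;
            return undef unless length($_);   # nothing left and at EOF
            return [ 'INTEGER', $1 ] if s/\A$INTEGER//;
            # ... remaining token rules, as before ...
            die 'lex error near: ' . substr($_, 0, 20) . "\n";
        }
    }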
This change produced dramatically better results: the runtime dropped to no
more than 15 seconds! To get a sense of how much system call overhead each
version incurred, I ran strace on each of them:
-rw-r--r-- 1 clint users 8994621 Apr 16 17:00 /tmp/parse1.trc
-rw-r--r-- 1 clint users 3625534 Apr 16 17:00 /tmp/parse2.trc
-rw-r--r-- 1 clint users 976800 Apr 17 11:13 /tmp/parse3.trc
The first is the original program reading the entire file(s) into scalars and
using the zero-width assertion \G, the second is the version substituting s//
for m//gc, and the last is the manual buffering. I'm pretty shocked at the
speed differences between these versions. I wouldn't call my manual buffering
very sophisticated, either, but it is certainly worth it, since it yields
over a 75% speed improvement on average.
-Clint