Fastest Hex to Ascii routine

Mark H · Feb 8, 2006

I have been beating myself over the head looking for a faster hex to
ascii routine. I've scoured the Internet for 3 hours now and have
found nothing that even remotely holds up on megabytes of hex to ascii
conversion. Here's what I have so far:
for (my $i = 0; $i < length($file_raw_hex); $i += 2)
{
    $file_raw .= pack('H2', substr($file_raw_hex, $i, 2));
}

This is the slowest, coming in at about 2 seconds per meg on a 2.0 GHz
P4.

Then this is slightly faster:
$file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;

Comes in at 1.5 seconds per meg.

But there's got to be something that can do better than this. This is
a modern CPU, on a modern OS (Linux) with fast SCSI disks.... there is
no other bottleneck here. This code is dog slow.

Does anyone have any suggestions? I have been trying to figure out if
Bit::Vector could help but to no avail (Bit::Vector has no ascii
abilities as far as I know - it only converts between
decimal/hex/octal). I would love if someone has a module to suggest
that uses XS code.

Thanks
Mark

A. Sinan Unur · Feb 8, 2006

Mark H said:
I have been beating myself over the head looking for a faster hex to
ascii routine. I've scoured the Internet for 3 hours now and have
found nothing that even remotely holds up on megabytes of hex to ascii
conversion. Here's what I have so far:
for (my $i = 0; $i < length($file_raw_hex); $i += 2)
{
    $file_raw .= pack('H2', substr($file_raw_hex, $i, 2));
}

This is the slowest, coming in at about 2 seconds per meg on a 2.0 GHz
P4.

Then this is slightly faster:
$file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;

Comes in at 1.5 seconds per meg.

But there's got to be something that can do better than this. This is
a modern CPU, on a modern OS (Linux) with fast SCSI disks.... there is
no other bottleneck here. This code is dog slow.

How about line-by-line or block-by-block processing?
Here is something quick'n'dirty:

#!/usr/bin/perl

use strict;
use warnings;

open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";

my ($data, $buffer);

{
    local $,;
    while (sysread $in, $buffer, 4096) {
        my @lines = split /\n/, $buffer;
        @lines = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @lines;
        $data .= "@lines";
    }
}

__END__

D:\Home\asu1\UseNet\clpmisc\hex> tail -n 3 hexfile
EAFC3885140E9010FFD505127FC20C62F47202C403B9B66F8DC88EC542A0D0888A7522911128B559
BF7E364E624A0651D01BBD4ACFAC813686AF489AC0246DC9CBDFC7D43662AB9D41C3EDEE34AE6DFC
7D402B3CC7D47DF8DF785689AE243A970963E458A6981C20FB81D13F511DF287CDB11F66C0F2A8FE

D:\Home\asu1\UseNet\clpmisc\hex> dir hexfile

02/08/2006 03:52 PM 2,050,000 hexfile

D:\Home\asu1\UseNet\clpmisc\hex> timethis read.pl hexfile

TimeThis : Command Line : read.pl hexfile
TimeThis : Start Time : Wed Feb 08 16:23:35 2006
TimeThis : End Time : Wed Feb 08 16:23:37 2006
TimeThis : Elapsed Time : 00:00:01.859

which translates to a little less than a second per megabyte on my
AMD64 1.8 GHz laptop (running at 800 MHz on batteries) with Win XP SP2.

See what results you get on your system.

And, please, next time post a complete program that we can run
by copying and pasting.

Mark H · Feb 8, 2006

Hi Sinan,

Thank you for throwing in your hat here to help!

Your program doesn't do what I assume you think it does. Yes, it seems
very fast. But it doesn't actually output in ASCII. It turned my hex
into a series of numbers that made no sense.

Best,
Mark
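The numbers Mark reports are consistent with how s/// behaves inside map: the substitution edits $_ in place but returns the count of replacements, so the list assigned back to @lines holds counts rather than converted text. A small sketch, using made-up sample data, that contrasts this with a for loop relying on the in-place edit:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data (assumed for illustration): hex for "Hi" and "Yes"
my @lines = ('4869', '596573');

# map collects the return value of s///, i.e. the number of substitutions
my @copy   = @lines;
my @counts = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @copy;
print "@counts\n";       # prints "2 3"

# Modifying each element in place keeps the converted text
my @converted = @lines;
s{([[:xdigit:]]{2})}{chr(hex $1)}eg for @converted;
print "@converted\n";    # prints "Hi Yes"
```

Note also that interpolating "@lines" joins the elements with $" (a space by default), not $, so localizing $, does not stop separators from being inserted between the elements.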

Mark H · Feb 8, 2006

Not sure if this would make things any faster but our hex data is
already in memory in a $variable with no \n's in it. So splitting
isn't necessary (it's not line-by-line data)... it's just megs of solid
hex.

Mark

Mark H · Feb 8, 2006

Somehow I am having a hard time believing that no XS module exists for
this. It's so simple to write hex to ascii conversion in C and I would
be surprised that no one has invented a simple module to handle this
with great speed...

Best,
Mark

ednotover · Feb 8, 2006

Mark said:
Here's what I have so far:
for (my $i = 0; $i < length($file_raw_hex); $i += 2)
{
    $file_raw .= pack('H2', substr($file_raw_hex, $i, 2));
}

$file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;

Why not just pack it all in one fell swoop?

$file_raw = pack 'H*', $file_raw_hex;

Ed
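A minimal, self-contained sketch of the one-shot approach Ed describes. The sample string and the $round_trip variable are assumptions for illustration; $file_raw_hex and $file_raw follow the names used in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input (assumed for illustration): hex for "This is a test ..."
my $file_raw_hex = '5468697320697320612074657374202E2E2E';

# pack 'H*' consumes the entire hex string in a single call
my $file_raw = pack 'H*', $file_raw_hex;
print "$file_raw\n";

# The reverse direction uses unpack with the same template;
# unpack produces lowercase digits, hence the uc()
my $round_trip = uc(unpack('H*', $file_raw));
print "round trip ok\n" if $round_trip eq $file_raw_hex;
```

If the hex text contains line breaks or other separators, as in a file of hex lines, they would need to be stripped first (for example with tr/\r\n//d), since pack 'H*' does not validate its input.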

A. Sinan Unur · Feb 8, 2006

[ Please quote an appropriate amount of context when replying ]

....

$file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;

....

#!/usr/bin/perl

use strict;
use warnings;

open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";

my ($data, $buffer);

{
    local $,;
    while (sysread $in, $buffer, 4096) {
        my @lines = split /\n/, $buffer;
        @lines = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @lines;
        $data .= "@lines";
    }
}

__END__

Your program doesn't do what I assume you think it does.
Yes, it seems very fast. But it doesn't actually output in ASCII.

Well, it depends on what is in your input file. I copied the chr(hex $1)
straight from your code.

Is it possible that you are actually reading a binary file, and what
you are looking for is

perldoc -f ord

I did, however, notice a couple of unintentional bugs in the code I
posted above.

Please post a couple of sample lines from the input file.

Here is what I have (repeated 25,000 times) in the file that I am using:

5468697320697320612074657374202E2E2E205468697320697320612074657374202E2E2E205468

That is, this is a text file, consisting of hex digits. This is consistent
with what you posted.

#!/usr/bin/perl

use strict;
use warnings;

open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";

my ($data, $buffer);

my $crlf = '\015\012';

while (sysread $in, $buffer, 4096) {
    my @lines = split /$crlf/, $buffer;
    s{([[:xdigit:]]{2})}{chr(hex $1)}eg for @lines;
    $data .= join('', @lines);
}

close $in or die $!;

open my $out, '>', 'ascii' or die $!;
print $out $data, "\n";
close $out or die $!;

__END__

Mark H · Feb 8, 2006

Ed takes the prize on this one. THANK YOU! I don't know why when I go
searching for hex to ascii converters, people for years have been
suggesting all of this other code when Ed's does everything you need it
to do at 100 times the speed (literally!). The processing time per
meg went from 2 seconds to 0.02 seconds.

Is there something I missed about why so many do it other ways?

Best,
Mark
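A rough way to check the relative speeds reported here is the core Benchmark module. This is only a sketch with synthetic data: the labels in %methods and the input size are assumptions, and the ratios will vary with the machine and the Perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input (assumed): roughly 100 KB of hex digits
my $file_raw_hex = '5468697320697320612074657374202E2E2E' x 2800;

my %methods = (
    # Mark's first version: one pack call per byte
    loop_pack_h2 => sub {
        my $file_raw = '';
        for (my $i = 0; $i < length($file_raw_hex); $i += 2) {
            $file_raw .= pack('H2', substr($file_raw_hex, $i, 2));
        }
        return $file_raw;
    },
    # Mark's second version: regex substitution on a copy of the input
    regex_chr_hex => sub {
        (my $copy = $file_raw_hex) =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
        return $copy;
    },
    # Ed's version: a single pack call
    pack_h_star => sub {
        return pack 'H*', $file_raw_hex;
    },
);

# Run each method for at least 1 CPU second and print a comparison table
cmpthese(-1, \%methods);
```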

Mark H · Feb 8, 2006

Thank you Jim for your detailed reply on this. I do see some of your
points about this not being a typical operation. But this is what
Perl's best at: The Atypical. I doubted her for a while, convinced
we'd be coding sections in C but she pulled through in the end, as
usual.

Thanks to everyone who helped on this. It's my hope that when the
next person comes along to search for "hex to ascii" perl fastest, this
result will now come up with help.

Mark

Anno Siegel · Feb 9, 2006

Mark H said:
Somehow I am having a hard time believing that no XS module exists for
this. It's so simple to write hex to ascii conversion in C and I would
be surprised that no one has invented a simple module to handle this
with great speed...

"pack 'H*'" is that code, right in the Perl core.

The slowness of your solution comes from splitting the data into
one-byte pieces. Use a reasonable chunk size and it will be fast
enough.

Anno
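One way to read Anno's advice: if the data has to be processed piecewise (say, to bound memory), convert large even-length chunks rather than single bytes. A sketch in which the helper name hex_to_bytes_chunked, the default chunk size, and the sample data are all assumptions, not anything from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert a hex string in large chunks.
# The chunk size must be even so that no hex pair straddles two chunks.
sub hex_to_bytes_chunked {
    my ($hex, $chunk) = @_;
    $chunk ||= 65536;
    die "chunk size must be even\n" if $chunk % 2;

    my $out = '';
    for (my $pos = 0; $pos < length($hex); $pos += $chunk) {
        # One pack call per chunk instead of one per byte
        $out .= pack('H*', substr($hex, $pos, $chunk));
    }
    return $out;
}

# Sample input (assumed): 28 hex digits repeated 1000 times
my $hex   = '5468697320697320612074657374' x 1000;
my $bytes = hex_to_bytes_chunked($hex, 4096);
print length($bytes), " bytes\n";    # prints "14000 bytes"
```

When the whole string already fits in memory, as in Mark's case, a single pack 'H*' call is simpler still.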

Tad McClellan · Feb 9, 2006

Mark H said:
It's so simple to write hex to ascii conversion in C

Then write a hex to ascii conversion in C, and your problem is solved!

Unless there is some compelling reason to use a particular
programming language.

Is there such a reason?
