Fastest Hex to Ascii routine

Discussion in 'Perl Misc' started by Mark H, Feb 8, 2006.

  1. Mark H

    Mark H Guest

    I have been beating myself over the head looking for a faster hex to
    ascii routine. I've scoured the Internet for 3 hours now and have
    found nothing that even remotely holds up on megabytes of hex to ascii
    conversion. Here's what I have so far:
    for (my $i = 0; $i < length($file_raw_hex); $i += 2)
    {
        $file_raw .= pack('H2', substr($file_raw_hex, $i, 2));
    }

    This is the slowest, coming in at about 2 seconds per meg on a 2.0 GHz
    P4.

    Then this is slightly faster:
    $file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;

    Comes in at 1.5 seconds per meg.

    But there's got to be something that can do better than this. This is
    a modern CPU, on a modern OS (Linux) with fast SCSI disks.... there is
    no other bottleneck here. This code is dog slow.

    Does anyone have any suggestions? I have been trying to figure out if
    Bit::Vector could help but to no avail (Bit::Vector has no ascii
    abilities as far as I know - it only converts between
    decimal/hex/octal). I would love if someone has a module to suggest
    that uses XS code.

    Thanks
    Mark
     
    Mark H, Feb 8, 2006
    #1

  2. A. Sinan Unur

    A. Sinan Unur Guest

    "Mark H" wrote:

    > I have been beating myself over the head looking for a faster hex to
    > ascii routine. I've scoured the Internet for 3 hours now and have
    > found nothing that even remotely holds up on megabytes of hex to ascii
    > conversion. Here's what I have so far:
    > for (my $i = 0; $i < length($file_raw_hex); $i += 2)
    > {
    > $file_raw.=pack('H2', substr($file_raw_hex, $i, 2));
    > }
    >
    > This is the slowest, coming in at about 2 seconds per meg on a 2.0 GHz
    > P4.
    >
    > Then this is slightly faster:
    > $file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
    >
    > Comes in at 1.5 seconds per meg.
    >
    > But there's got to be something that can do better than this. This is
    > a modern CPU, on a modern OS (Linux) with fast SCSI disks.... there is
    > no other bottleneck here. This code is dog slow.


    How about line-by-line or block-by-block processing?
    Here is something quick'n'dirty:

    #!/usr/bin/perl

    use strict;
    use warnings;

    open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";

    my ($data, $buffer);

    {
        local $,;
        while (sysread $in, $buffer, 4096) {
            my @lines = split /\n/, $buffer;
            @lines = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @lines;
            $data .= "@lines";
        }
    }

    __END__

    D:\Home\asu1\UseNet\clpmisc\hex> tail -n 3 hexfile
    EAFC3885140E9010FFD505127FC20C62F47202C403B9B66F8DC88EC542A0D0888A7522911128B559
    BF7E364E624A0651D01BBD4ACFAC813686AF489AC0246DC9CBDFC7D43662AB9D41C3EDEE34AE6DFC
    7D402B3CC7D47DF8DF785689AE243A970963E458A6981C20FB81D13F511DF287CDB11F66C0F2A8FE

    D:\Home\asu1\UseNet\clpmisc\hex> dir hexfile

    02/08/2006 03:52 PM 2,050,000 hexfile

    D:\Home\asu1\UseNet\clpmisc\hex> timethis read.pl hexfile

    TimeThis : Command Line : read.pl hexfile
    TimeThis : Start Time : Wed Feb 08 16:23:35 2006
    TimeThis : End Time : Wed Feb 08 16:23:37 2006
    TimeThis : Elapsed Time : 00:00:01.859

    which translates to a little less than a second per megabyte on my
    AMD64 1.8 GHz laptop (running at 800 MHz on battery) with Windows XP SP2.

    See what results you get on your system.

    And, please, next time post a complete program that we can run
    by copying and pasting.

    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Feb 8, 2006
    #2

  3. Mark H

    Mark H Guest

    Hi Sinan,

    Thank you for throwing in your hat here to help!

    Your program doesn't do what I assume you think it does. Yes, it seems
    very fast. But it doesn't actually output in ASCII. It turned my hex
    into a series of numbers that made no sense.

    Best,
    Mark
     
    Mark H, Feb 8, 2006
    #3
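
A note for readers wondering where the "series of numbers" came from: a likely cause is that `s///` returns the number of substitutions it made, not the modified string, so assigning the result of `map { s/.../.../eg }` back to the array keeps only those counts. The sketch below illustrates this with made-up input (the two hex strings are assumptions for illustration, not data from the thread) and shows a `for` loop that keeps the converted text instead.

```perl
use strict;
use warnings;

# Illustrative input (hex for "Hi" and "ok"); not taken from the thread.
my @lines = ('4869', '6f6b');

# s/// returns the number of substitutions it made, so map collects
# those counts. The strings in @copy are converted in place as a side effect.
my @copy   = @lines;
my @counts = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @copy;
print "@counts\n";    # 2 2

# Looping with "for" modifies each element in place and keeps the text.
my @fixed = @lines;
s{([[:xdigit:]]{2})}{chr(hex $1)}eg for @fixed;
print "@fixed\n";     # Hi ok
```
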
  4. Mark H

    Mark H Guest

    Not sure if this would make things any faster but our hex data is
    already in memory in a $variable with no \n's in it. So splitting
    isn't necessary (it's not line-by-line data)... it's just megs of solid
    hex.

    Mark
     
    Mark H, Feb 8, 2006
    #4
  5. Mark H

    Mark H Guest

    Somehow I am having a hard time believing that no XS module exists for
    this. It's so simple to write hex to ascii conversion in C and I would
    be surprised that no one has invented a simple module to handle this
    with great speed...

    Best,
    Mark
     
    Mark H, Feb 8, 2006
    #5
  6. Mark H

    Guest

    Mark H wrote:

    > Here's what I have so far:
    > for (my $i = 0; $i < length($file_raw_hex); $i += 2)
    > {
    > $file_raw.=pack('H2', substr($file_raw_hex, $i, 2));
    > }


    > $file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;


    Why not just pack it all in one fell swoop?

    $file_raw = pack 'H*', $file_raw_hex;

    Ed
     
    , Feb 8, 2006
    #6
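
A minimal sketch of the one-call approach with some input checking added. `pack 'H*'` does not validate its input: an odd number of digits is padded with a zero nibble and non-hex characters are converted silently, so the helper below checks first. The name `hex_to_bytes` is hypothetical, and the sample string is the start of the test line quoted later in the thread.

```perl
use strict;
use warnings;

# hex_to_bytes is a hypothetical helper name, not something from the thread.
sub hex_to_bytes {
    my ($hex) = @_;
    # Reject input that pack 'H*' would otherwise convert without complaint.
    die "odd number of hex digits\n"   if length($hex) % 2;
    die "non-hex character in input\n" if $hex =~ /[^0-9A-Fa-f]/;
    return pack 'H*', $hex;
}

my $hex   = '5468697320697320612074657374202E2E2E';
my $bytes = hex_to_bytes($hex);
print "$bytes\n";                          # This is a test ...

# unpack 'H*' is the inverse; it emits lowercase digits, hence the uc().
print uc(unpack('H*', $bytes)), "\n";
```

Whether the checks are worth their cost depends on how much the input is trusted; on clean data the bare `pack 'H*'` call is all that is needed.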
  7. A. Sinan Unur

    A. Sinan Unur Guest

    "Mark H" wrote:

    [ Please quote an appropriate amount of context when replying ]

    > "Mark H" <> wrote in
    > news::
    >

    ....
    >> $file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;


    ....

    > #!/usr/bin/perl
    >
    > use strict;
    > use warnings;
    >
    > open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";
    >
    > my ($data, $buffer);
    >
    > {
    > local $,;
    > while (sysread $in, $buffer, 4096) {
    > my @lines = split /\n/, $buffer;
    > @lines = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @lines;
    > $data .= "@lines";
    > }
    > }
    >
    > __END__


    > Your program doesn't do what I assume you think it does.
    > Yes, it seems very fast. But it doesn't actually output in ASCII.


    Well, it depends on what is in your input file. I copied the chr(hex $1)
    straight from your code.

    Is it possible that you are actually reading a binary file, and what
    you are looking for is

    perldoc -f ord

    I did, however, notice a couple of unintentional bugs in the code I
    posted above.

    Please post a couple of sample lines from the input file.

    Here is what I have (repeated 25,000 times) in the file that I am using:

    5468697320697320612074657374202E2E2E205468697320697320612074657374202E2E2E205468

    That is, this is a text file, consisting of hex digits. This is consistent
    with what you posted.

    #!/usr/bin/perl

    use strict;
    use warnings;

    open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";

    my ($data, $buffer);

    my $crlf = '\015\012';

    while (sysread $in, $buffer, 4096) {
        my @lines = split /$crlf/, $buffer;
        s{([[:xdigit:]]{2})}{chr(hex $1)}eg for @lines;
        $data .= join('', @lines);
    }

    close $in or die $!;

    open my $out, '>', 'ascii' or die $!;
    print $out $data, "\n";
    close $out or die $!;

    __END__


    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Feb 8, 2006
    #7
  8. Mark H

    Mark H Guest

    Ed takes the prize on this one. THANK YOU! I don't know why, when I go
    searching for hex to ascii converters, people have for years been
    suggesting all of this other code when Ed's does everything you need
    at 100 times the speed (literally!). The processing time per meg went
    from 2 seconds to 0.02 seconds.

    Is there something I missed about why so many do it other ways?

    Best,
    Mark
     
    Mark H, Feb 8, 2006
    #8
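
For anyone who wants to reproduce the comparison on their own machine, here is a rough sketch using the core Benchmark module. The input is synthetic (about 180 KB of hex built by repeating a short string, which is an assumption for illustration), and the sub names are made up; the three bodies follow the snippets posted in the thread. The numbers will differ from those reported above, so treat the output as indicative only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input: about 180 KB of hex digits (illustrative, not the
# thread's data).
my $hex = '5468697320697320612074657374202E2E2E' x 5_000;

sub by_substr {                 # first snippet from the opening post
    my $out = '';
    for (my $i = 0; $i < length($hex); $i += 2) {
        $out .= pack('H2', substr($hex, $i, 2));
    }
    return $out;
}

sub by_regex {                  # second snippet, applied to a copy
    (my $copy = $hex) =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
    return $copy;
}

sub by_pack {                   # Ed's one-call version
    return pack 'H*', $hex;
}

# All three should agree before their timings are compared.
die "results differ\n"
    unless by_substr() eq by_regex() and by_regex() eq by_pack();

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    substr_loop => \&by_substr,
    regex       => \&by_regex,
    pack_all    => \&by_pack,
});
```
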
  9. Mark H

    Mark H Guest

    Thank you Jim for your detailed reply on this. I do see some of your
    points about this not being a typical operation. But this is what
    Perl's best at: The Atypical. I doubted her for a while, convinced
    we'd be coding sections in C but she pulled through in the end, as
    usual.

    Thanks to everyone who helped on this. It's my hope that when the
    next person comes along and searches for "hex to ascii" perl fastest,
    this thread will come up and help.

    Mark
     
    Mark H, Feb 8, 2006
    #9
  10. Anno Siegel

    Anno Siegel Guest

    Mark H wrote in comp.lang.perl.misc:
    > Somehow I am having a hard time believing that no XS module exists for
    > this. It's so simple to write hex to ascii conversion in C and I would
    > be surprised that no one has invented a simple module to handle this
    > with great speed...


    "pack 'H*'" is that code, right in the Perl core.

    The slowness of your solution comes from splitting the data into
    one-byte pieces. Use a reasonable chunk size and it will be fast
    enough.

    Anno
    --
    $_='Just another Perl hacker'; print +( join( '', map { eval $_; $@ }
    'use warnings FATAL => "all"; printf "%-1s", "\n"', 'use strict; a',
    'use warnings FATAL => "all"; "@x"', '1->m') =~
    m|${ s/(.)/($1).*/g; \ $_ }|is),',';
     
    Anno Siegel, Feb 9, 2006
    #10
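
Anno's advice about chunk size could be sketched like this for data that arrives through a filehandle rather than sitting in one variable. A fixed-size read can end in the middle of a hex pair, so the sketch carries an unpaired trailing digit over to the next block. The helper name `convert_hex_stream` and the 64 KB default block size are assumptions for illustration; the demo uses in-memory filehandles and a deliberately tiny block size to exercise the carry-over.

```perl
use strict;
use warnings;

# convert_hex_stream is a hypothetical helper; the default block size is
# an arbitrary choice, not a tuned value.
sub convert_hex_stream {
    my ($in, $out, $block_size) = @_;
    $block_size ||= 64 * 1024;
    my $pending = '';                   # unpaired digit left from the previous block
    while (read($in, my $buffer, $block_size)) {
        $buffer =~ tr/\r\n//d;          # drop line endings, if any
        $buffer = $pending . $buffer;
        # Hold back an unpaired trailing digit for the next round.
        $pending = length($buffer) % 2 ? chop($buffer) : '';
        print {$out} pack('H*', $buffer);
    }
    die "input ended with an unpaired hex digit\n" if length $pending;
    return;
}

# Demo on in-memory filehandles.
my $hex_text = "54686973\n20697320\n61207465\n7374\n";
open my $in, '<', \$hex_text or die "Cannot open input: $!";
my $result = '';
open my $out, '>', \$result or die "Cannot open output: $!";
convert_hex_stream($in, $out, 5);
close $out or die $!;
print "$result\n";                      # This is a test
```

When writing to a real file, calling `binmode` on the output handle first would be a sensible precaution, since the converted data may be arbitrary bytes.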
  11. Tad McClellan

    Tad McClellan Guest

    Mark H wrote:

    > It's so simple to write hex to ascii conversion in C



    Then write a hex to ascii conversion in C, and your problem is solved!

    Unless there is some compelling reason to use a particular
    programming language.

    Is there such a reason?


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Feb 9, 2006
    #11
