How to get offset position from unpack()?

Discussion in 'Perl Misc' started by jl_post@hotmail.com, Feb 15, 2010.

  1. Guest

    Hi,

    The unpack() function is very, very useful for me, as I regularly
    do a lot of unpacking of non-Perl-created data strings to see what
    information they hold. If I didn't have the use of the unpack()
    function, certain tasks would be much more difficult.

    However, there's something I want to do with unpack() that I
    haven't figured out how to do: I'd like to unpack part of a string,
    but keep track of where the unpacking ended, so I can resume unpacking
    the string (at a later time) where I left off.

    Here's a trivial example:

    Let's say I have a data string that holds lists of strings, like
    this:

    " 2 5hello 5world 2 2hi 5there"

    The first number (" 2") signifies that the first list holds two
    strings. The next number (" 5") signifies that the first encoded
    string is 5 characters long. The next number (also a " 5") signifies
    the same for the next encoded string.

    So I could write a format string for unpack() to be: "a2/(a2/a)"

    So the lines of code:

    my $dataString = ' 2 5hello 5world extra data';
    my @a = unpack 'a2/(a2/a)', $dataString;
    print "$_\n" foreach @a;

    would output:

    hello
    world

    My question becomes: What if I want to parse out the extra data
    later with a different pack string? It would be nice if there were a
    way to return the current offset somehow with unpack(), so that I
    could unpack again with something like this:

    my @b = unpack "\@$offset $newPackString", $dataString;

    Now, I could calculate this offset myself by examining what was
    placed in @a, but this gets tricky fast with packstrings that use "Z",
    "A", and 'a' (and combinations).

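    As a small illustration of why the manual bookkeeping gets tricky,
    here is a minimal sketch. The $record string and the 'A5 Z5' template
    are made up for this example. "A" strips trailing spaces and "Z"
    stops at the first null, so the lengths of the returned values no
    longer add up to the number of characters consumed:

```perl
use strict;
use warnings;

# Hypothetical record: a 5-char space-padded field, a 5-char
# null-padded field, then more data.
my $record = "hi   abc\0\0rest";

# 'A5' consumes 5 chars but strips the trailing spaces;
# 'Z5' consumes 5 chars but returns only the text before the first null.
my ($name, $tag) = unpack 'A5 Z5', $record;

# Naive offset guess based on what came back from unpack().
my $guess = length($name) + length($tag);

print "name='$name' tag='$tag'\n";
print "guessed offset: $guess (the template really consumed 10)\n";
```
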
    (Incidentally, C's sscanf() function has a little-known "n" format
    character that returns the number of characters consumed. I'm hoping
    that unpack() has a similar feature.)

    I posted a similar question back in 2004, and Anno Siegel responded
    with the suggestion of adding "a*" to my first packstring, and then
    using the length() of the last element to calculate the offset, like
    this:

    my $dataString = ' 2 5hello 5world extra data';
    my @a = unpack 'a2/(a2/a) a*', $dataString;
    my $offset = length($dataString) - length( pop(@a) );
    print "$_\n" foreach @a;
    my @b = unpack "\@$offset $newPackString", $dataString;

    While this approach technically works, repeatedly using "a*" at the
    end of a packstring in a loop creates an O(n^2) algorithm, since every
    pass copies the entire unparsed remainder of the string. This isn't a
    problem for short $dataStrings, but it is a significant problem when
    $dataStrings are long and/or have no limit in length.
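
    To make the cost concrete, here is a rough sketch of the kind of
    loop I mean. The $data string and the @lists variable are made up for
    illustration. Each pass pops off a fresh copy of everything that
    hasn't been parsed yet, just to measure its length:

```perl
use strict;
use warnings;

# Hypothetical input: three lists encoded back to back.
my $data = ' 1 2hi 1 5there 2 3foo 3bar';

my @lists;
my $offset = 0;
while ($offset < length $data) {
    # Parse one list starting at $offset; 'a*' grabs the remainder.
    my @items = unpack "\@$offset a2/(a2/a) a*", $data;

    # The remainder is a copy of the whole unparsed tail, so the
    # total copying grows quadratically with the length of $data.
    my $rest = pop @items;
    $offset = length($data) - length($rest);

    push @lists, [@items];
}

print join(',', @$_), "\n" foreach @lists;
```
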

    I've noticed that Perl 5.10 added lots of convenient new features
    to pack() and unpack() (such as the ability to pack floats and doubles
    in an endian-ness different than your own), so I'm hoping that
    unpack() now has a way to return the $dataString offset. However,
    I've read both "perldoc -f unpack" and "perldoc -f pack" but I can't
    seem to find this behavior documented, if it exists at all.

    So does anyone know if I can get unpack() to return an offset?

    Thanks!

    -- Jean-Luc
    Feb 15, 2010

  2. Guest

    > In my original post I wrote:
    >
    > However, there's something I want to do with unpack() that I
    > haven't figured out how to do:  I'd like to unpack part of a
    > string, but keep track of where the unpacking ended, so I can
    > resume unpacking the string (at a later time) where I left off.



    On Feb 15, 1:48 pm, Ben Morrow replied:
    >     ~% perl -E'my $x = "aaa"; say for unpack "a2.", $x'
    >     aa
    >     2
    >     ~%



    Wow, thanks! The '.' character was exactly what I was looking for!

    (I notice it's new in Perl 5.10, so if I'm working on platforms
    that have an older version of Perl I'll just have to use the old "a*"
    trick.)
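
    For reference, here is roughly how I'd expect my original example
    to look with '.', assuming Perl 5.10 or later. This is only a sketch:
    the 'a*' assigned to $newPackString is a stand-in for whatever the
    second packstring would really be.

```perl
use strict;
use warnings;

my $dataString = ' 2 5hello 5world extra data';

# A trailing '.' makes unpack() return the current offset
# (requires Perl 5.10+) as the last value in the list.
my @a = unpack 'a2/(a2/a) .', $dataString;
my $offset = pop @a;

print "$_\n" foreach @a;
print "offset: $offset\n";

# Resume where the first unpack() stopped. 'a*' is only a
# placeholder for the real second packstring.
my $newPackString = 'a*';
my @b = unpack "\@$offset $newPackString", $dataString;
print "rest: '$b[0]'\n";
```

    Nothing here copies the unparsed tail just to measure it, so a
    loop built this way shouldn't have the O(n^2) problem of the "a*"
    trick.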

    I tried searching for "."'s behavior in "perldoc -f unpack",
    "perldoc -f pack", and even "perldoc perlpacktut", but I couldn't find
    where it mentions that it returns the offset when used with unpack().
    Is there a place that explains this with a little more depth?

    Anyway, thanks for your help, Ben. I really appreciate it.

    -- Jean-Luc
    Feb 16, 2010
