substr() hassle, *n*x vs. Win32

Discussion in 'Perl Misc' started by Mirco Wahab, Jun 14, 2006.

  1. Mirco Wahab

    Mirco Wahab Guest

    While trying to get along with some
    "C to Perl" interfacing, I stumbled
    over substr() when using it as an lvalue
    to assign a packed structure to a portion
    of a scalar's string, like

    substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;

    which works fine (in the context I use it)
    under Perl 5.8.7 (Inline::C 0.44) on Linux.

    In a Win32 environment (ActivePerl 5.8.7,
    nmake, cl) this seems to fail sometimes.
    Maybe I made a mistake that I'm not
    aware of.

    I'll include a short example where this
    problem occurs. A C function is called,
    which allocates an SV* and writes
    geometrical data (double) to it.
    Back in Perl, this data is unpacked
    and printed (checked visually).

    Then this data is accessed by Perl's
    substr(), which works, as said, under
    Linux but not under Windows. The data
    in $data gets messed up, as if Perl
    had cut out a portion of the string
    and inserted a new one of a different size.

    Thanks in advance,

    Mirco

    ==>
    #!/usr/lib/perl
    use strict;
    use warnings;

    my $bsize = P3Dsize(); # sizeof(struct) from C
    my $strct = "d3i"; # (actual) format of the above
    my $vlen = 6; # number of elements to work with
    my ($data, $x, $y, $z, $id, $blk); # some declarations

    makesomevector_of($vlen, $data); # call into inline c

    for my $idx (0 .. $vlen-1 ) {
        # first: access structures and print (always fine!)
        $blk = substr $data, $idx*$bsize, $bsize; # take out structure
        ($x, $y, $z, $id) = unpack $strct, $blk; # unpack it
        print " $x\t$y\t$z\t[$id]\n"; # print it (fine)

        # second: access structures from Perl, !seems to fail in Win32!
        substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+1, $y+1, $z+1, $id+1;

        # third: access structures again and print (fails badly under W32/nmake/cl)
        $blk = substr $data, $idx*$bsize, $bsize; # re-take structure
        ($x, $y, $z, $id) = unpack $strct, $blk; # unpack it
        print "+$x\t$y\t$z\t[$id]+\n"; # print it (prints garbage!)
    }

    use Inline C => <<'END_OF_C_CODE';

    typedef struct {
        double x, y, z;
        int id;
    } P3D;

    int P3Dsize() { return sizeof(P3D); }

    int makesomevector_of(int Count, SV* perl_sv)
    {
        int id, blocksize = Count * P3Dsize();
        P3D* pvec = (P3D*) malloc(blocksize);

        if(pvec) {
            for(id=0; id<Count; id++){
                double val = id+1;
                pvec[id].id = id;
                pvec[id].x = val;
                pvec[id].y = val*val;
                pvec[id].z = val*val*val;
            }
            sv_setpvn( perl_sv, (char *)pvec, blocksize ); /* copies the bytes into the SV */
            free( pvec ); /* the temporary buffer is no longer needed */
            return 1;
        }
        return 0;
    }
    END_OF_C_CODE
     
    Mirco Wahab, Jun 14, 2006
    #1

  2. Mirco Wahab

    Mirco Wahab Guest

    Thus spoke Mirco Wahab (on 2006-06-14 12:48):

    >
    > substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
    > [Problem]


    After some fiddling I could spot the problem,
    and I'm wondering whether there are any recipes
    for this one ...

    The problem is, I had a C struct type

    struct {
    double x, y, z;
    int n;
    };

    and assumed that it is 28 bytes in size,
    corresponding to a "dddi" pack format.

    But each compiler has its own ideas
    about this; it is, of course, the
    structure alignment problem that
    shows through here.

    In the case above, my gcc 3.3.3 reports
    out of the box: "C struct size is 28 bytes"
    whereas my VC6/cl insists:
    "C struct size is 32 bytes"

    I could fix the Win32 problem simply by
    changing the pack format from (28 byte) "dddi"
    to (32 byte) "dddii". This of course leads
    to phenomenal failures in the Linux version ;-)

    It looks like one could have a cheap and fast
    vector interface by writing stuff directly
    into scalars (if $n gets larger than 10^5),
    if only the compiler-specific alignment
    problem did not get in the way.

    How do I solve this?

    Another question: if I return a C-generated
    SV* back to perl via return(SV*), do I have
    to 'mortalize' anything - or does perl take
    care of that?

    (Source example attached)

    Regards

    Mirco

    ==>
    #!/usr/lib/perl
    use strict;
    use warnings;

    my $vlen = 6; # number of elements to work with
    my $bsize = P3Dsize(); # sizeof(struct) from C
    my $strct = "dddi"; # (actual) format, padded to sizeof(struct)
    my ($blk, $x, $y, $z, $n); # some declarations
    print "C struct size is: $bsize bytes\n";

    # call into inline c
    my $data = makesomevector_of( $vlen ) or die "couldn't allocate\n";

    for my $idx (0 .. $vlen-1 ) {
        # first: access structures and print (always fine!)
        $blk = substr $data, $idx*$bsize, $bsize; # take out structure
        ($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
        print " $x\t$y\t$z\t[$n]\n"; # print it (fine)

        # second: access structures from Perl (simply increment by 9)
        substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+9, $y+9, $z+9, $n+9;

        # third: look into structures again and print them
        $blk = substr $data, $idx*$bsize, $bsize; # re-take structure
        ($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
        print "+$x\t$y\t$z\t[$n]+\n"; # print it
    }

    use Inline C => <<'END_OF_C_CODE';

    typedef struct {
        double x, y, z;
        int n; // always check structure alignment by
    } P3D; // sizeof(struct) vs. sizeof(all members)

    int P3Dsize() { return sizeof(P3D); }

    SV* makesomevector_of(int Count)
    {
        int i, blocksize = Count * P3Dsize();
        P3D* pvec = (P3D*) malloc(blocksize);
        SV* perl_sv = newSV(0);

        if(pvec) {
            for(i=0; i<Count; i++){
                double val = i+1;
                pvec[i].n = i;
                pvec[i].x = val;
                pvec[i].y = val*val;
                pvec[i].z = val*val*val;
            }
            sv_setpvn( perl_sv, (char *)pvec, blocksize );
            free( pvec );
        }
        return perl_sv;
    }
    END_OF_C_CODE
     
    Mirco Wahab, Jun 14, 2006
    #2

  3. Mirco Wahab

    Ben Morrow Guest

    Quoth Mirco Wahab:
    > Thus spoke Mirco Wahab (on 2006-06-14 12:48):
    >
    > >
    > > substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
    > > [Problem]

    >
    > After some fiddling I could spot the problem,
    > and I'm wondering whether there are any recipes
    > for this one ...
    >
    > The problem is, I had a C struct type
    >
    > struct {
    > double x, y, z;
    > int n;
    > };
    >
    > and assumed that it is 28 bytes in size,
    > corresponding to a "dddi" pack format.
    >
    > But each compiler has its own ideas
    > about this; it is, of course, the
    > structure alignment problem that
    > shows through here.
    >
    > In the case above, my gcc 3.3.3 reports
    > out of the box: "C struct size is 28 bytes"
    > whereas my VC6/cl insists:
    > "C struct size is 32 bytes"
    >
    > I could fix the Win32 problem simply by
    > changing the pack format from (28 byte) "dddi"
    > to (32 byte) "dddii". This of course leads


    Better would be one of "dddix4" or "dddx4i" (four pad bytes), depending
    on where the space is. You can use the offsetof() C macro to find out.
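
A minimal sketch of that offsetof() check, in plain C outside of Inline::C. The struct is the one from the thread; the two helper names are made up for illustration. With the sizes reported above one would expect both gaps to be 0 for the 28-byte gcc layout and a 4-byte gap after n for the 32-byte VC6 layout.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* Same members as the struct discussed in the thread. */
typedef struct {
    double x, y, z;
    int n;
} P3D;

/* Padding bytes the compiler put between the end of z and the start of n. */
size_t p3d_gap_before_n(void)
{
    size_t end_of_z = offsetof(P3D, z) + sizeof(double);
    assert(offsetof(P3D, n) >= end_of_z);
    return offsetof(P3D, n) - end_of_z;
}

/* Padding bytes between the end of n and the end of the struct. */
size_t p3d_gap_after_n(void)
{
    size_t end_of_n = offsetof(P3D, n) + sizeof(int);
    assert(sizeof(P3D) >= end_of_n);
    return sizeof(P3D) - end_of_n;
}
```

Whichever gap is non-zero tells you where the "x" filler belongs in the pack template, and its value is the repeat count.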

    > to phenomenal failures in the Linux version ;-)
    >
    > It looks like one could have a cheap and fast
    > vector interface by writing stuff directly
    > into scalars (if $n gets larger than 10^5),
    > if only the compiler-specific alignment
    > problem did not get in the way.


    You may want to look at Bit::Vector.

    > How do I solve this?


    One way is to use Inline::Struct.
    Another is to have your Makefile.PL compile a little test program that
    uses offsetof to work out the right template and prints it out. Then you
    can substitute this into your .pm.
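
The core of such a "little test program" could look roughly like this. It is a sketch under assumptions: the three doubles sit back to back from offset 0, and perl was built with the same compiler, so that pack's "d" and "i" match the C double and int. The names pad_token and p3d_pack_template are hypothetical, not part of Inline::C or MakeMaker.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>    /* snprintf */

typedef struct {
    double x, y, z;
    int n;
} P3D;

/* Write "x<count>" (pack's null-byte filler) into out, or "" if count is 0. */
static void pad_token(char *out, size_t cap, size_t count)
{
    if (count > 0)
        snprintf(out, cap, "x%lu", (unsigned long)count);
    else
        out[0] = '\0';
}

/* Build a pack/unpack template for this compiler's P3D layout.
   Returns the template length, or -1 if buf is too small. */
int p3d_pack_template(char *buf, size_t cap)
{
    size_t end_of_z = offsetof(P3D, z) + sizeof(double);
    size_t end_of_n = offsetof(P3D, n) + sizeof(int);
    char before[24], after[24];
    int len;

    /* Assumption: x, y, z are contiguous, so "d3" covers them. */
    assert(offsetof(P3D, z) == 2 * sizeof(double));

    pad_token(before, sizeof before, offsetof(P3D, n) - end_of_z);
    pad_token(after, sizeof after, sizeof(P3D) - end_of_n);

    len = snprintf(buf, cap, "d3%si%s", before, after);
    return (len < 0 || (size_t)len >= cap) ? -1 : len;
}
```

A Makefile.PL step would wrap this in a main() that prints the buffer and paste the result into the .pm in place of the hard-coded "dddi".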

    > Another question: if I return a C-generated
    > SV* back to perl via return(SV*), do I have
    > to 'mortalize' anything - or does perl take
    > care of that?


    Have you read Inline::C-Cookbook? If you return an SV*, Inline::C will
    mortalize it for you. Otherwise, you must do it yourself.

    > (Source example attached)
    >
    > ==>
    > #!/usr/lib/perl
    > use strict;
    > use warnings;
    >
    > my $vlen = 6; # number of elements to work with
    > my $bsize = P3Dsize(); # sizeof(struct) from C
    > my $strct = "dddi"; # (actual) format, padded to sizeof(struct)
    > my ($blk, $x, $y, $z, $n); # some declarations
    > print "C struct size is: $bsize bytes\n";
    >
    > # call into inline c
    > my $data = makesomevector_of( $vlen ) or die "couldn't allocate\n";
    >
    > for my $idx (0 .. $vlen-1 ) {
    >     # first: access structures and print (always fine!)
    >     $blk = substr $data, $idx*$bsize, $bsize; # take out structure
    >     ($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
    >     print " $x\t$y\t$z\t[$n]\n"; # print it (fine)
    >
    >     # second: access structures from Perl (simply increment by 9)
    >     substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+9, $y+9, $z+9, $n+9;
    >
    >     # third: look into structures again and print them
    >     $blk = substr $data, $idx*$bsize, $bsize; # re-take structure
    >     ($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
    >     print "+$x\t$y\t$z\t[$n]+\n"; # print it
    > }
    >
    > use Inline C => <<'END_OF_C_CODE';
    >
    > typedef struct {
    >     double x, y, z;
    >     int n; // always check structure alignment by
    > } P3D; // sizeof(struct) vs. sizeof(all members)
    >
    > int P3Dsize() { return sizeof(P3D); }
    >
    > SV* makesomevector_of(int Count)
    > {
    >     int i, blocksize = Count * P3Dsize();
    >     P3D* pvec = (P3D*) malloc(blocksize);


    Don't use malloc! Use New/Safefree, which is the perl interface to the
    allocator.

    > SV* perl_sv = newSV(0);


    You should probably use NEWSV instead, and it would be more efficient to
    allocate string space straight away.

    Ben

    --
    Joy and Woe are woven fine,
    A Clothing for the Soul divine William Blake
    Under every grief and pine 'Auguries of Innocence'
    Runs a joy with silken twine.
     
    Ben Morrow, Jun 14, 2006
    #3
  4. Mirco Wahab

    Guest

    Mirco Wahab wrote:
    > Thus spoke Mirco Wahab (on 2006-06-14 12:48):
    >
    > >
    > > substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
    > > [Problem]

    >
    > After some fiddling I could spot the problem,
    > and I'm wondering whether there are any recipes
    > for this one ...
    >
    > The problem is, I had a C struct type
    >
    > struct {
    > double x, y, z;
    > int n;
    > };
    >
    > and assumed that it is 28 bytes in size,
    > corresponding to a "dddi" pack format.
    >
    > But each compiler has its own ideas
    > about this; it is, of course, the
    > structure alignment problem that
    > shows through here.
    >

    ....
    >
    > It looks like one could have a cheap and fast
    > vector interface by writing stuff directly
    > into scalars (if $n gets larger than 10^5),
    > if only the compiler-specific alignment
    > problem did not get in the way.
    >
    > How do I solve this?


    I think the simple answer is that you don't solve this. You can't
    take on the performance power of C without taking its liabilities, one
    of which is the nonportability of structs. So you can circumvent it
    in several ways, but they depend on what you are trying to do. You
    could use four independent arrays (3 for doubles and one for int), although
    there may be alignment problems there as well. Or you could just fiddle
    with it until it works on your machine and then accept that it will not be
    portable.
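
The four-array variant might be sketched like this in plain C (fill_parallel is a made-up name; the test values are the ones from the posted makesomevector_of). An array of a single primitive type has no padding between elements, so on the Perl side each array would presumably map to a plain "d*" or "i*" template. The remaining alignment worry is where each buffer starts, which this sketch leaves to the caller.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */

/* Fill four parallel arrays instead of one array of structs:
   x = i+1, y = x*x, z = x*x*x, n = i, as in the posted example. */
void fill_parallel(double *x, double *y, double *z, int *n, size_t count)
{
    size_t i;

    assert(x != NULL && y != NULL && z != NULL && n != NULL);
    for (i = 0; i < count; i++) {
        double val = (double)(i + 1);
        x[i] = val;
        y[i] = val * val;
        z[i] = val * val * val;
        n[i] = (int)i;
    }
}
```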

    >
    > Another question: if I return a C-generated
    > SV* back to perl via return(SV*), do I have
    > to 'mortalize' anything - or does perl take
    > care of that?


    In this case, Perl takes care of it. I know this for two reasons: I ran
    your code in a loop and noticed no memory leak, and when I added a
    sv_2mortal(perl_sv) and ran your code I got "Attempt to free unreferenced
    scalar" errors. Alas, I don't know how you figure these things out from
    first principles.

    I generally sidestep these issues by making a string of the right length
    in Perl, and then passing that string into the Inline code for the C to
    fill in and use.
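
The C half of that approach could look something like the following sketch. It is plain C with a hypothetical fill_points function. In the real thing the pointer and length would come from the scalar's string buffer, and the Perl side would first create the string, for instance with "\0" x ($vlen * $bsize).

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy, memset */

typedef struct {
    double x, y, z;
    int n;
} P3D;

/* Fill a caller-owned byte buffer with `count` P3D records.
   Returns the number of bytes written, or 0 if the buffer is too small. */
size_t fill_points(char *buf, size_t buflen, size_t count)
{
    size_t i, need = count * sizeof(P3D);

    assert(buf != NULL);
    if (buflen < need)
        return 0;

    for (i = 0; i < count; i++) {
        P3D p;
        double val = (double)(i + 1);

        memset(&p, 0, sizeof p);   /* start from zeroed bytes, padding included */
        p.x = val;
        p.y = val * val;
        p.z = val * val * val;
        p.n = (int)i;

        /* memcpy makes no assumption about how buf is aligned */
        memcpy(buf + i * sizeof(P3D), &p, sizeof p);
    }
    return need;
}
```

Since the caller owns the buffer, the C code allocates nothing and the mortalizing question does not come up. The pack template still has to match sizeof(P3D), though.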

    Xho

     
    , Jun 14, 2006
    #4
  5. Mirco Wahab

    Guest

    wrote:
    > ...
    > >
    > > It looks like one could have a cheap and fast
    > > vector interface by writing stuff directly
    > > into scalars (if $n gets larger than 10^5),
    > > if only the compiler-specific alignment
    > > problem did not get in the way.
    > >
    > > How do I solve this?

    >
    > I think the simple answer is that you don't solve this. You can't
    > take on the performance power of C without taking its liabilities, one
    > of which is the nonportability of structs. So you can circumvent it
    > in several ways, but they depend on what you are trying to do. You
    > could use four independent arrays (3 for doubles and one for int),
    > although there may be alignment problems there as well. Or you could
    > just fiddle with it until it works on your machine and then accept that
    > it will not be portable.
    >
    > >
    > > Another question: if I return a C-generated
    > > SV* back to perl via return(SV*), do I have
    > > to 'mortalize' anything - or does perl take
    > > care of that?

    >
    > In this case, Perl takes care of it. I know this for two reasons: I ran
    > your code in a loop and noticed no memory leak, and when I added a
    > sv_2mortal(perl_sv) and ran your code I got "Attempt to free
    > unreferenced scalar" errors. Alas, I don't know how you figure these
    > things out from first principles.


    On both of these, never mind me. Listen to Ben. He really knows what he
    is doing. If I'd seen his post before I started composing my own, I
    wouldn't have responded.

    > I generally sidestep these issues by making a string of the right length
    > in Perl, and then passing that string into the Inline code for the C to
    > fill in and use.


    Well, but I do still like doing this.

    Xho

     
    , Jun 14, 2006
    #5
