substr() hassle, *n*x vs. Win32

Mirco Wahab

While trying to get along with some
"C to perl" interfacing, I stumbled
upon substr when using it to 'lvalue'
a packed scalar structure to a portion
of its string like

substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;

which works fine (in the context I use it)
under Perl-587 (Inline::C 0.44) in Linux.

In a Win32 environment (ActivePerl 587,
nmake, cl) this seems to fail sometimes.
Maybe I made a mistake that I'm not
aware of.

I'll include a short example where this
problem occurs. A C-function is called,
which allocates an SV* and writes
geometrical data (double) to it.
Back in perl, this data is unpacked
and printed (checked visually).

Then this data is accessed by Perl's
substr(), which works, as said, under
Linux but not under Windows. The data
in $data gets messed up, as if perl
would cut out a portion of the string
and insert a new one of different size.

Thanks in advance,

Mirco

==>
#!/usr/lib/perl
use strict;
use warnings;

my $bsize = P3Dsize(); # sizeof(struct) from C
my $strct = "d3i"; # (actual) format of the above
my $vlen = 6; # number of elements to work with
my ($data, $x, $y, $z, $id, $blk); # some declarations

makesomevector_of($vlen, $data); # call into inline c

for my $idx (0 .. $vlen-1 ) {
# first: access structures and print (always fine!)
$blk = substr $data, $idx*$bsize, $bsize; # take out structure
($x, $y, $z, $id) = unpack $strct, $blk ; # unpack it
print " $x\t$y\t$z\t[$id]\n"; # print it (fine)

# second: access structures from Perl, !seems to fail in Win32!
substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+1, $y+1, $z+1, $id+1;

# third: access structures again and print (fails badly under W32/nmake/cl)
$blk = substr $data, $idx*$bsize, $bsize; # re-take structure
($x, $y, $z, $id) = unpack $strct, $blk; # unpack it
print "+$x\t$y\t$z\t[$id]+\n"; # print it (prints garbage!)
}

use Inline C => <<'END_OF_C_CODE';

typedef struct {
double x, y, z;
int id;
} P3D;

int P3Dsize() { return sizeof(P3D); }

int makesomevector_of(int Count, SV* perl_sv)
{
int id, blocksize = Count * P3Dsize();
P3D* pvec = (P3D*) malloc(blocksize);

if(pvec) {
for(id=0; id<Count; id++){
double val = id+1;
pvec[id].id = id;
pvec[id].x = val;
pvec[id].y = val*val;
pvec[id].z = val*val*val;
}
sv_setpvn( perl_sv, (char *)pvec, blocksize );
return 1;
}
return 0;
}
END_OF_C_CODE
 
Mirco Wahab

Thus spoke Mirco Wahab (on 2006-06-14 12:48):
substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
[Problem]

I could - after some fiddling - spot the problem
and I'm wondering if there aren't some recipes for
that one ...

The problem is, I had a C structural type

struct {
double x, y, z;
int n;
};

and would tell that it is 28 bytes in size,
corresponding to a "dddi" pack format.

But actually each compiler has its
own opinion about this and, of course,
it's the structure alignment problem which
shines through here.

In the case above, my gcc 3.3.3 would tell
out of the box: "C struct size is 28 bytes"
whereas my VC6/cl would insist:
"C struct size is 32 bytes"

I could fix the Win32-problem simply by
changing the pack-format from (28 byte) "dddi"
to (32 byte) "dddii". This leads of course
to phenomenal failures in the Linux-version ;-)

It looks like one could have a cheap and fast
vector interface by writing stuff directly
into scalars (if $n gets larger than 10^5),
were it not for the compiler-specific
alignment problem that hit me.

How do I solve this?

Another question: if I return a C-generated
SV* back to perl via return(SV*), do I have
to 'mortalize' anything - or does perl take
care of that?

(Source example attached)

Regards

Mirco

==>
#!/usr/lib/perl
use strict;
use warnings;

my $vlen = 6; # number of elements to work with
my $bsize = P3Dsize(); # sizeof(struct) from C
my $strct = "dddi"; # (actual) format, padded to sizeof(struct)
my ($blk, $x, $y, $z, $n); # some declarations
print "C struct size is: $bsize bytes\n";

# call into inline c
my $data = makesomevector_of( $vlen ) or die "couldn't allocate\n";

for my $idx (0 .. $vlen-1 ) {
# first: access structures and print (always fine!)
$blk = substr $data, $idx*$bsize, $bsize; # take out structure
($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
print " $x\t$y\t$z\t[$n]\n"; # print it (fine)

# second: access structures from Perl (simply increment by 9)
substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+9, $y+9, $z+9, $n+9;

# third: look into structures again and print them
$blk = substr $data, $idx*$bsize, $bsize; # re-take structure
($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
print "+$x\t$y\t$z\t[$n]+\n"; # print it
}

use Inline C => <<'END_OF_C_CODE';

typedef struct {
double x, y, z;
int n; // always check structure alignment by
} P3D; // sizeof(struct) vs. sizeof(all members)

int P3Dsize() { return sizeof(P3D); }

SV* makesomevector_of(int Count)
{
int i, blocksize = Count * P3Dsize();
P3D* pvec = (P3D*) malloc(blocksize);
SV* perl_sv = newSV(0);

if(pvec) {
for(i=0; i<Count; i++){
double val = i+1;
pvec[i].n = i;
pvec[i].x = val;
pvec[i].y = val*val;
pvec[i].z = val*val*val;
}
sv_setpvn( perl_sv, (char *)pvec, blocksize );
free( pvec );
}
return perl_sv;
}
END_OF_C_CODE
 
Ben Morrow

Quoth Mirco Wahab said:
Thus spoke Mirco Wahab (on 2006-06-14 12:48):
substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
[Problem]

I could - after some fiddling - spot the problem
and I'm wondering if there aren't some recipes for
that one ...

The problem is, I had a C structural type

struct {
double x, y, z;
int n;
};

and would tell that it is 28 bytes in size,
corresponding to a "dddi" pack format.

But actually each compiler has its
own opinion about this and, of course,
it's the structure alignment problem which
shines through here.

In the case above, my gcc 3.3.3 would tell
out of the box: "C struct size is 28 bytes"
whereas my VC6/cl would insist:
"C struct size is 32 bytes"

I could fix the Win32-problem simply by
changing the pack-format from (28 byte) "dddi"
to (32 byte) "dddii". This leads of course
to phenomenal failures in the Linux-version ;-)

Better would be one of "dddixx" or "dddxxi" depending on where the
space is. You can use the offsetof() C macro to find out.

It looks like one could have a cheap and fast
vector interface by writing stuff directly
into scalars (if $n gets larger than 10^5),
were it not for the compiler-specific
alignment problem that hit me.

You may want to look at Bit::Vector.

How do I solve this?

One way is to use Inline::Struct.
Another is to have your Makefile.PL compile a little test program that
uses offsetof to work out the right template and prints it out. Then you
can substitute this into your .pm.
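
The offsetof approach could look roughly like the sketch below. It is only an illustration under stated assumptions: the Field helper type, the function name p3d_pack_template and the P3D_PROBE_MAIN guard are invented here, and the struct is the P3D from the posted example. It walks the members in order, emits one 'x' (a pad byte in pack) for every byte of padding it finds, and pads the tail up to sizeof(P3D).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    double x, y, z;
    int    n;
} P3D;

/* Hypothetical helper: one struct member as pack sees it. */
typedef struct { char code; size_t offset; size_t size; } Field;

/* Append one character, keeping room for the terminating NUL. */
static int put(char *out, size_t outlen, size_t *used, char c)
{
    if (*used + 1 >= outlen) return -1;
    out[(*used)++] = c;
    out[*used] = '\0';
    return 0;
}

/* Build a pack template for P3D in out; returns 0, or -1 if out is too small. */
int p3d_pack_template(char *out, size_t outlen)
{
    static const Field fields[4] = {
        { 'd', offsetof(P3D, x), sizeof(double) },
        { 'd', offsetof(P3D, y), sizeof(double) },
        { 'd', offsetof(P3D, z), sizeof(double) },
        { 'i', offsetof(P3D, n), sizeof(int)    }
    };
    size_t pos = 0, used = 0, i;

    assert(out != NULL);
    if (outlen == 0) return -1;
    out[0] = '\0';
    for (i = 0; i < 4; i++) {
        for (; pos < fields[i].offset; pos++)       /* padding before member */
            if (put(out, outlen, &used, 'x')) return -1;
        if (put(out, outlen, &used, fields[i].code)) return -1;
        pos += fields[i].size;
    }
    for (; pos < sizeof(P3D); pos++)                /* padding after last member */
        if (put(out, outlen, &used, 'x')) return -1;
    return 0;
}

#ifdef P3D_PROBE_MAIN   /* hypothetical guard: define it to build the probe program */
int main(void)
{
    char tmpl[64];
    if (p3d_pack_template(tmpl, sizeof tmpl) != 0) return 1;
    printf("%s\n", tmpl);
    return 0;
}
#endif
```

Built as a standalone program, the probe prints the template on stdout, where a build script can capture it. For the 32-byte layout reported for cl above, the tail padding is four bytes, so the template would come out as "dddi" followed by four 'x'; for the 28-byte layout it would stay plain "dddi".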

Another question: if I return a C-generated
SV* back to perl via return(SV*), do I have
to 'mortalize' anything - or does perl take
care of that?

Have you read Inline::C-Cookbook? If you return a SV*, Inline::C will
mortalize it for you. Otherwise, you must do it yourself.

(Source example attached)

==>
#!/usr/lib/perl
use strict;
use warnings;

my $vlen = 6; # number of elements to work with
my $bsize = P3Dsize(); # sizeof(struct) from C
my $strct = "dddi"; # (actual) format, padded to sizeof(struct)
my ($blk, $x, $y, $z, $n); # some declarations
print "C struct size is: $bsize bytes\n";

# call into inline c
my $data = makesomevector_of( $vlen ) or die "couldn't allocate\n";

for my $idx (0 .. $vlen-1 ) {
# first: access structures and print (always fine!)
$blk = substr $data, $idx*$bsize, $bsize; # take out structure
($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
print " $x\t$y\t$z\t[$n]\n"; # print it (fine)

# second: access structures from Perl (simply increment by 9)
substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+9, $y+9, $z+9, $n+9;

# third: look into structures again and print them
$blk = substr $data, $idx*$bsize, $bsize; # re-take structure
($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
print "+$x\t$y\t$z\t[$n]+\n"; # print it
}

use Inline C => <<'END_OF_C_CODE';

typedef struct {
double x, y, z;
int n; // always check structure alignment by
} P3D; // sizeof(struct) vs. sizeof(all members)

int P3Dsize() { return sizeof(P3D); }

SV* makesomevector_of(int Count)
{
int i, blocksize = Count * P3Dsize();
P3D* pvec = (P3D*) malloc(blocksize);

Don't use malloc! Use New/Safefree, which is the perl interface to the
allocator.

SV* perl_sv = newSV(0);

You should probably use NEWSV instead, and it would be more efficient to
allocate string space straight away.

Ben
 
xhoster

Mirco Wahab said:
Thus spoke Mirco Wahab (on 2006-06-14 12:48):
substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
[Problem]

I could - after some fiddling - spot the problem
and I'm wondering if there aren't some recipes for
that one ...

The problem is, I had a C structural type

struct {
double x, y, z;
int n;
};

and would tell that it is 28 bytes in size,
corresponding to a "dddi" pack format.

But actually each compiler has its
own opinion about this and, of course,
it's the structure alignment problem which
shines through here.
....

It looks like one could have a cheap and fast
vector interface by writing stuff directly
into scalars (if $n gets larger than 10^5),
were it not for the compiler-specific
alignment problem that hit me.

How do I solve this?

I think the simple answer is that you don't solve this. You can't
take on the performance power of C without taking its liabilities, one
of which is the nonportability of structs. So you can circumvent it
in several ways, but they depend on what you are trying to do. You
could use four independent arrays (3 for doubles and one for int), although
there may be alignment problems there as well. Or you could just fiddle
with it until it works on your machine and then accept that it will not be
portable.
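
To make the independent-arrays idea concrete, here is a small sketch. The function name fill_columns and the P3D_DEMO_MAIN guard are made up for illustration; the values mirror the ones in the posted example. Each member gets its own plain array. C guarantees that array elements are contiguous, so no padding is inserted between them, and each array could be handed to Perl as its own string and read with a "d*" or "i*" template.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fill four separate arrays instead of one array of structs. */
void fill_columns(double *x, double *y, double *z, int *n, size_t count)
{
    size_t i;

    assert(x != NULL && y != NULL && z != NULL && n != NULL);
    for (i = 0; i < count; i++) {
        double val = (double) (i + 1);
        x[i] = val;
        y[i] = val * val;
        z[i] = val * val * val;
        n[i] = (int) i;
    }
}

#ifdef P3D_DEMO_MAIN   /* hypothetical guard: define it to get a standalone program */
int main(void)
{
    double x[6], y[6], z[6];
    int    n[6];
    size_t i;

    fill_columns(x, y, z, n, 6);
    for (i = 0; i < 6; i++)
        printf("%g\t%g\t%g\t[%d]\n", x[i], y[i], z[i], n[i]);
    return 0;
}
#endif
```

The price is four buffers to keep in sync instead of one, and the sizes of double and int still depend on the platform; only the struct padding question goes away.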

Another question: if I return a C-generated
SV* back to perl via return(SV*), do I have
to 'mortalize' anything - or does perl take
care of that?

In this case, Perl takes care of it. I know this for two reasons: I ran
your code in a loop and noticed no memory leak, and when I added a
sv_2mortal(perl_sv) and ran your code I got "Attempt to free unreferenced
scalar" errors. Alas, I don't know how you figure these things out from
first principles.

I generally sidestep these issues by making a string of the right length
in perl, and then passing that string into the Inline code for the C to
fill in and use.
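
As a rough sketch of that pattern, seen from the C side only: the function below fills a buffer that the caller owns and never allocates anything itself. The name fill_points and the P3D_DEMO_MAIN guard are invented for this example. In real Inline::C code the buffer would be the string buffer of the scalar passed in from Perl, created there with something like "\0" x ($vlen * $bsize); that glue is assumed here and not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    double x, y, z;
    int    n;
} P3D;

/* Write up to count records into buf; returns how many actually fit. */
size_t fill_points(void *buf, size_t buflen, size_t count)
{
    size_t i, fit = buflen / sizeof(P3D);

    assert(buf != NULL);
    if (count > fit) count = fit;        /* never write past the caller's buffer */
    for (i = 0; i < count; i++) {
        P3D    p;
        double val = (double) (i + 1);

        memset(&p, 0, sizeof p);         /* start from a zeroed record */
        p.x = val;
        p.y = val * val;
        p.z = val * val * val;
        p.n = (int) i;
        /* memcpy, because a string buffer need not be aligned for P3D */
        memcpy((char *) buf + i * sizeof(P3D), &p, sizeof p);
    }
    return count;
}

#ifdef P3D_DEMO_MAIN   /* hypothetical guard: define it to get a standalone program */
int main(void)
{
    char raw[6 * sizeof(P3D)];
    printf("records written: %lu\n",
           (unsigned long) fill_points(raw, sizeof raw, 6));
    return 0;
}
#endif
```

Since the buffer belongs to the caller, the question of who frees or mortalizes what does not come up on the C side. The struct layout issue remains, though: the Perl side still needs sizeof(P3D) and a matching pack template.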

Xho
 
xhoster

...

I think the simple answer is that you don't solve this. You can't
take on the performance power of C without taking its liabilities, one
of which is the nonportability of structs. So you can circumvent it
in several ways, but they depend on what you are trying to do. You
could use four independent arrays (3 for doubles and one for int),
although there may be alignment problems there as well. Or you could
just fiddle with it until it works on your machine and then accept that
it will not be portable.


In this case, Perl takes care of it. I know this for two reasons: I ran
your code in a loop and noticed no memory leak, and when I added a
sv_2mortal(perl_sv) and ran your code I got "Attempt to free
unreferenced scalar" errors. Alas, I don't know how you figure these
things out from first principles.

On both of these, never mind me. Listen to Ben. He really knows what he
is doing. If I'd seen his post before I started composing my own, I
wouldn't have responded.

I generally sidestep these issues by making a string of the right length
in perl, and then passing that string into the Inline code for the C to
fill in and use.

Well, but I do still like doing this.

Xho
 
