Q: re Inline and Benchmark

Michele Dondi · Nov 4, 2004

Me too. And this is why I asked in the first place.

That's an artifact of Benchmark. Inline checks parameters and gives
the "usage" message you've seen when (for instance) a function requiring
no arguments is called with arguments. Benchmark calls a given
coderef in the form "&$subref;" (line 646 in v 1.051), so if @_ happens to
be non-empty at this point, the error occurs.

In fact I made a local copy of Benchmark.pm and hacked it using
"$subref->()" instead and it works as expected.

Prototyped Perl subs would have a similar problem. Time for a bug report,
I guess... And another lesson why one shouldn't call arbitrary subs
with &.

Definitely!

TY,
Michele

Michele Dondi · Nov 4, 2004

{ map {
no strict 'refs';
$_ => \&{$_};
} qw/sol1 sol2 .../ };

Except that this won't work! When I try it in actual code I get

"no" not allowed in expression at foo.pl line 12, at end of line
BEGIN not safe after errors--compilation aborted at foo.pl line 12.

Of course I can do

{
no strict 'refs';
cmpthese 500, { map { $_ => \&{$_} }
qw/sol1 sol2 etc/ };
}

instead, but IMHO the no-strict-'refs'-ization logically best applies
to the map()'s code block. Now I know that there are easy workarounds,
e.g. using the "EXPR-form" of map() with a do(); still I wonder why
the block of the "BLOCK-form" of map() doesn't behave like a "normal"
block.

TIA,
Michele

Sisyphus · Nov 5, 2004

Michele said:
fact that it leaks. What I'm now wondering is whether the leak was not
mentioned because:
a) no-one noticed it;
or because:
b) code that leaks is of little concern, and generally not considered
worth mentioning.

Click to expand...

Certainly *for me* and *in this case* it's not necessarily worth
mentioning that it leaks. However since I could understand Abigail's
code[1], but I can't follow your discussion anymore and yet I'd like
to include the non-leaking code in the comparison too (for
completeness) once you all agree on "the best/correct way to do it",
for some reasonably sensible meaning of "best/correct", may you supply
a complete, working, function? You can write me privately at
<ti tod nfni tod im tod mcl ta razalb> if needed.

[1] I've not written a line in C in ten years or so, but I conserve
some reminiscence of it. ;-)

Michele

use warnings;

use Inline C => Config =>
BUILD_NOISY => 1;

use Inline C => <<'EOC';

char * abigail () {
char * str;
int i;
int l;

l = 20;
srand (time ((time_t) NULL));
if ((str = (char *) malloc (l * sizeof (char))) == (char *)
NULL) {
perror (malloc);
exit (-1);
}
for (i = 0; i < l; i ++) {
str = rand () % 255;
}
return (str);
}

SV * sisyphus () {
char * str;
SV * outsv;
int i;
int l;

l = 20;
srand (time ((time_t) NULL));
New(1, str, l * sizeof(char), char);
if(str == NULL)
croak("Failed to allocate memory in r_string()");
for (i = 0; i < l; i ++) {
str = rand () % 255;
}
outsv = newSVpv(str, l);
Safefree(str);
return outsv;
}

SV * sisyphus2 () {
SV * outsv;
char c;
int i;
int l = 20;

outsv = NEWSV(0, l);
srand (time ((time_t) NULL));

for (i = 0; i < l; i ++) {
c = rand () % 255;
sv_insert(outsv, i, 1, &c, 1);
}
SvPOK_on(outsv);
SvCUR_set(outsv, l);
return outsv;
}

EOC

$s = abigail();
print $s, " ", length($s), "\n";

$s = sisyphus();
print $s, " ", length($s), "\n";

$s = sisyphus2();
print $s, " ", length($s), "\n";

__END__

'abigail()' is the function as he originally wrote it. The only
complaint I originally had about it is that it leaks memory.

'sisyphus()' is the version that I posted in my first post (with a typo
or two fixed). It doesn't leak memory - but it's a little wasteful in
that it uses 2 memory buffers, whereas one would suffice.

'sisyphus2()' is the same as 'sisyphus()' except that it attends to the
"wastefulness" issue (hopefully . You can probably convert the 2
lines inside the for loop into a single line, and thus dispense with any
need for declaring a 'char' .... I'm not going to worry about *that*
wastefulness right now What does puzzle me a little is that it
produces a "Use of uninitialized value in subroutine entry" warning and
I haven't yet worked out why. What on earth is perl on about there ??

The above has been tested, but not benchmarked. I don't think you'll
detect any difference in speed of the 3 subs, but it's not hard to find out.

Note that (for the purposes of my test) I changed the length of the
random string from 20000 to 20.

For some reason the length returned by 'abigail()' is usually 21. I
suspect you're also getting the trailing NULL character in strings that
it returns. However, on Linux, that 21st character often shows up as a
'1'. I can't explain that. On a couple of occasions it returned a string
that was less than 20 characters long - perhaps because of an embedded
NULL character in the string, so it probably pays to check that you get
the full quota returned.

Cheers,
Rob

Anno Siegel · Nov 5, 2004

Michele Dondi said:
Except that this won't work! When I try it in actual code I get

"no" not allowed in expression at foo.pl line 12, at end of line
BEGIN not safe after errors--compilation aborted at foo.pl line 12.

Of course I can do

{
no strict 'refs';
cmpthese 500, { map { $_ => \&{$_} }
qw/sol1 sol2 etc/ };
}

instead, but IMHO the no-strict-'refs'-ization logically best applies
to the map()'s code block. Now I know that there are easy workarounds,
e.g. using the "EXPR-form" of map() with a do(); still I wonder why
the block of the "BLOCK-form" of map() doesn't behave like a "normal"
block.

Well, there are blocks in Perl, and then there are blocks. Bare blocks,
loop blocks and sub blocks all have slightly different rules about what
can legally appear inside, so I'm unsurprised that "use" (and "no")
can go in some but not in others.

The wording of the message is a bit misleading -- the immediate
environment of "no" is definitely a block, and "no ..." is a statement,
not part of an expression. I think I'll be able to live with it...

Anno

Michele Dondi · Nov 5, 2004

'sisyphus2()' is the same as 'sisyphus()' except that it attends to the
"wastefulness" issue (hopefully . You can probably convert the 2
lines inside the for loop into a single line, and thus dispense with any
need for declaring a 'char' .... I'm not going to worry about *that*
wastefulness right now What does puzzle me a little is that it
produces a "Use of uninitialized value in subroutine entry" warning and
I haven't yet worked out why. What on earth is perl on about there ??

I don't think that that "wastefulness" would be a major issue, so I
just wouldn't bother. OTOH I'm amazed at how even a seemingly simple
task like this, analized sufficiently in depth is revealing so many
facets and hints to learn new things...

Note that (for the purposes of my test) I changed the length of the
random string from 20000 to 20.

For some reason the length returned by 'abigail()' is usually 21. I

Ditto as above. Good point changing the length to 20, for probably one
wouldn't have noticed otherwise. I suspect you didn't actually check
the length first, but printed to STDOUT (with some filtering,
obviously) and noticed something strange, didn't you?

suspect you're also getting the trailing NULL character in strings that
it returns. However, on Linux, that 21st character often shows up as a
'1'. I can't explain that. On a couple of occasions it returned a string
Huh?!?

that was less than 20 characters long - perhaps because of an embedded
NULL character in the string, so it probably pays to check that you get
the full quota returned.

Ditto as above! Many good points IMHO make all this stuff an
interesting case study...

TY,
Michele

Michele Dondi · Nov 5, 2004

The wording of the message is a bit misleading -- the immediate
environment of "no" is definitely a block, and "no ..." is a statement,
not part of an expression. I think I'll be able to live with it...

Well, I guess I can live with it too. Especially since I had never
even noticed up until now! However, out of curiosity: do you think
that it "ought" to be allowed, hypothetically speaking?

Michele

Sisyphus · Nov 5, 2004

Michele said:
Huh?!

Let's just say that when you 'return(str)' you return up until the first
NULL is reached. In this case the 21st character is a '1' and the 22nd
character is a NULL. When I started running this on Win32 I sometimes
got 25 characters back before a NULL was reached.

There's no "perhaps" about it - it's definitely what was happening. And
the longer the string, the greater the chance that you'll strike a NULL
before you get that full quota back.

Ditto as above! Many good points IMHO make all this stuff an
interesting case study...

'abigail()' is buggy, non-portable (Win32 doesn't like
'perror(malloc)'), and leaks - though the bugs and portability issue are
easily fixed. (See revised code below.)

'sisyphus2()' despite addressing the "buffer duplication" issue, is
about 4 times slower than 'sisyphus()'. This is because 'sv_insert()' is
slow. Maybe there's a more efficient way of setting the characters of
a SV to random values. If there is, then I couldn't find it, and we'll
have to wait for someone else to tell us about it.

The only changes I've made to 'sisyphus()' and 'sisyphus2()' is to take
values mod 256 (not 255). This means that those 2 subs produce
characters in the range chr(0)..chr(255), whereas 'abigail()' produces
characters in the range chr(1)..chr(255). You can suit yourself in
regard to this - the important thing is that you don't want to let
'abigail()' produce NULL characters.

I've also restored 'l' to the original value of 20000.

Just use the 'sisyphus()' subroutine - which is essentially as I posted
in the first place - unless someone can convince you to do otherwise.

use warnings;
use Benchmark;

use Inline C => Config =>
BUILD_NOISY => 1;

use Inline C => <<'EOC';

char * abigail () {
char * str;
int i;
int l;

l = 20000;
srand (time ((time_t) NULL));
if ((str = (char *) malloc ((l + 1) * sizeof (char))) ==
(char *) NULL) {
croak("Failed to allocate memory in abigail()");
}
for (i = 0; i < l; i ++) {
str = (rand () % 255) + 1;
}
str[l] = 0;
return (str);
}

SV * sisyphus () {
char * str;
SV * outsv;
int i;
int l;

l = 20000;
srand (time ((time_t) NULL));
New(1, str, l * sizeof(char), char);
if(str == NULL)
croak("Failed to allocate memory in sisyphus()");
for (i = 0; i < l; i ++) {
str = rand () % 256;
}
outsv = newSVpv(str, l);
Safefree(str);
return outsv;
}

SV * sisyphus2 () {
SV * outsv;
char c;
int i;
int l = 20000;

outsv = NEWSV(0, l);
srand (time ((time_t) NULL));

for (i = 0; i < l; i ++) {
c = rand () % 256;
sv_insert(outsv, i, 1, &c, 1);
}
SvPOK_on(outsv);
SvCUR_set(outsv, l);
return outsv;
}

EOC

$s1 = abigail();
print length($s1), "\n";

$s2 = sisyphus();
print length($s2), "\n";

$s3 = sisyphus2();
print length($s3), "\n";

timethese(500, {
's1' => '$s1 = abigail();',
's2' => '$s2 = sisyphus();',
's3' => '$s3 = sisyphus2();',
}),

__END__

Cheers,
Rob

Tassilo v. Parseval · Nov 5, 2004

Also sprach Sisyphus:

'sisyphus2()' despite addressing the "buffer duplication" issue, is
about 4 times slower than 'sisyphus()'. This is because 'sv_insert()' is
slow. Maybe there's a more efficient way of setting the characters of
a SV to random values. If there is, then I couldn't find it, and we'll
have to wait for someone else to tell us about it.

Why not just SvGROW the SV to the desired length and afterwards

i = 0;
while (i < l)
SvPVX(sv)[i++] = rand() % 256;

SvPVX returns the address of the C-string stored in the SV. Once you
know that the SV is at least a PV you can do it.

Tassilo

Ilya Zakharevich · Nov 5, 2004

[A complimentary Cc of this posting was sent to
Tassilo v. Parseval

Since SvGROW will upgrade a fresh SV to a PV, one could get away with
just using SvPVX and thus save the ternary operation in SvPV_nolen. And
again half a nanosecond is gained. ;-)

Right, this is what I do in tightmost loops. However, I think I
forgot to put SvCUR_set(outsv,l). (I did not check, but my vague
recolection is that SvGROW does not update SvCUR()...)

Yours,
Ilya

Sisyphus · Nov 6, 2004

Tassilo said:
Also sprach Sisyphus:

'sisyphus2()' despite addressing the "buffer duplication" issue, is
about 4 times slower than 'sisyphus()'. This is because 'sv_insert()' is
slow. Maybe there's a more efficient way of setting the characters of
a SV to random values. If there is, then I couldn't find it, and we'll
have to wait for someone else to tell us about it.

Click to expand...

Why not just SvGROW the SV to the desired length and afterwards

i = 0;
while (i < l)
SvPVX(sv)[i++] = rand() % 256;

SvPVX returns the address of the C-string stored in the SV. Once you
know that the SV is at least a PV you can do it.

Tassilo

NEWSV() creates the string of the desired length - so, unless I'm
missing the point, there's no need for SvGROW(). I've replaced each 'l'
with 'len' for readability ... something I should have done earlier.

SV * fastest () {
SV * outsv;
int i;
int len = 20000;

outsv = NEWSV(0, len);
srand (time ((time_t) NULL));

for (i = 0; i < len; i ++) {
SvPVX(outsv)[i++] = rand() % 256;
}
SvPOK_on(outsv);
SvCUR_set(outsv, len);
return outsv;
}

That's clearly the fastest way of doing it. It's twice as fast as
sisyphus(). That damned warning about the uninitialised subroutine
entry, which seems somehow connected with my use of sv_insert(), also
disappears. (I'm going to have to ask about that crazy warning on the XS
list. I know why it normally appears, and how to deliberately create it,
but I can't for the life of me work out why the sisyphus() code produces
it.)

So ... is the above as you intended ? It produces strings of the correct
length and they seem to be random. But whereas sisyphus() and
sisyphus2() produce the same random string, fastest() produces a
different string - though there's a significant (offset) section that
*does* match.

The answer is probably staring me in the face. My perlapi docs claim
that the SV (ie 'outsv') must contain a string - and I wonder if there's
any guarantee that it does (initially) and what happens when it contains
more than one NULL character.

There's something I've not done quite right. With luck, things will be
quiet at work tonight and I might get a chance to work it out myself ....

Cheers,
Rob

Michele Dondi · Nov 6, 2004

Just use the 'sisyphus()' subroutine - which is essentially as I posted
in the first place - unless someone can convince you to do otherwise.

That's what I was thinking of doing anyway.

TY,
Michele

Anno Siegel · Nov 6, 2004

Michele Dondi said:
Well, I guess I can live with it too. Especially since I had never
even noticed up until now! However, out of curiosity: do you think
that it "ought" to be allowed, hypothetically speaking?

It ought to work if we expected orthogonal design: Never mind what kind
of block, "use" and "no" can go into it. But this is Perl...

Anno

Sisyphus · Nov 6, 2004

Sisyphus said:
for (i = 0; i < len; i ++) {
SvPVX(outsv)[i++] = rand() % 256;
}

Try maybe:
for (i = 0; i < len; i ++) {
SvPVX(outsv) = rand() % 256;
}

It dawned on me a few hours ago. My face is still glowing bright red

No wonder it was twice as fast.

Thanks Tassilo - sorry for the noise.

Cheers,
Rob

Uri Guttman · Nov 7, 2004

MD> On Wed, 03 Nov 2004 00:23:13 +0100, Michele Dondi

MD> Except that this won't work! When I try it in actual code I get

MD> "no" not allowed in expression at foo.pl line 12, at end of line
MD> BEGIN not safe after errors--compilation aborted at foo.pl line 12.

MD> Of course I can do

MD> {
MD> no strict 'refs';
MD> cmpthese 500, { map { $_ => \&{$_} }
MD> qw/sol1 sol2 etc/ };
MD> }

or use a do{} block inside the map block. that should handle any stuff
you want to put in there.

uri

Anno Siegel · Nov 7, 2004

Uri Guttman said:
MD> On Wed, 03 Nov 2004 00:23:13 +0100, Michele Dondi

MD> Except that this won't work! When I try it in actual code I get

MD> "no" not allowed in expression at foo.pl line 12, at end of line
MD> BEGIN not safe after errors--compilation aborted at foo.pl line 12.

MD> Of course I can do

MD> {
MD> no strict 'refs';
MD> cmpthese 500, { map { $_ => \&{$_} }
MD> qw/sol1 sol2 etc/ };
MD> }

or use a do{} block inside the map block. that should handle any stuff
you want to put in there.

Inside, or instead. In the latter case, map needs a comma after the
block -- "do{}" is a block seen from inside, but an expression when
looked at from outside.

Anno

Michele Dondi · Nov 7, 2004

MD> On Wed, 03 Nov 2004 00:23:13 +0100, Michele Dondi
MD> Of course I can do

MD> {
MD> no strict 'refs';
MD> cmpthese 500, { map { $_ => \&{$_} }
MD> qw/sol1 sol2 etc/ };
MD> }

or use a do{} block inside the map block. that should handle any stuff
you want to put in there.

Dear Uri,

Of course I can do

{
no strict 'refs';
cmpthese 500, { map { $_ => \&{$_} }
qw/sol1 sol2 etc/ };
}

instead, but IMHO the no-strict-'refs'-ization logically best applies
to the map()'s code block. Now I know that there are easy workarounds,
e.g. using the "EXPR-form" of map() with a do(); still I wonder why ^^^^
^^^^
the block of the "BLOCK-form" of map() doesn't behave like a "normal"
block.

Michele

Ilya Zakharevich · Nov 8, 2004

[A complimentary Cc of this posting was sent to
Sisyphus

NEWSV() creates the string of the desired length - so, unless I'm
missing the point, there's no need for SvGROW().

People usually try to avoid it since it is not documented. Fat chance
that it will suddently go, though...

SvPOK_on(outsv);
SvCUR_set(outsv, len);
return outsv;

This is not enough.

*SvEND(outsv) = 0; /* Terminate by \0 */

(otherwise calling Perl functions which pass the guy to syscalls will
fail).

Hope this helps,
Ilya

Sisyphus · Nov 8, 2004

Ilya said:
[A complimentary Cc of this posting was sent to
Sisyphus

NEWSV() creates the string of the desired length - so, unless I'm
missing the point, there's no need for SvGROW().

Click to expand...

People usually try to avoid it since it is not documented. Fat chance
that it will suddently go, though...

SvPOK_on(outsv);
SvCUR_set(outsv, len);
return outsv;

Click to expand...

This is not enough.

*SvEND(outsv) = 0; /* Terminate by \0 */

Aahh!! I had done a Devel:

eek:

ump of the various return values and
noticed that the \0 was missing - though CUR and LEN were set correctly.
I couldn't work out how to remedy that. Every successful attempt to add
the \0 seemed to screw up either LEN or CUR (or both). Now I know how to
achieve it. Thanks Ilya.

So the code finally looks like this:

SV * fastest () {
SV * outsv;
int i;
int len = 20000;

outsv = NEWSV(0, len);
srand (time ((time_t) NULL));

for (i = 0; i < len; i ++)
SvPVX(outsv) = rand() % 256;
SvPOK_on(outsv);
SvCUR_set(outsv, len);
*SvEND(outsv) = 0;
return outsv;
}

Which is slightly (negligibly) *slower* than the first code that I
posted (which doubles up on the string buffers).

Cheers,
Rob

Ilya Zakharevich · Nov 8, 2004

[A complimentary Cc of this posting was sent to
Sisyphus

And, of course, the initial NEWSV() should have len+1 to accomodate
this 0...

Aahh!! I had done a Devel:eek:ump of the various return values and
noticed that the \0 was missing - though CUR and LEN were set correctly.
I couldn't work out how to remedy that. Every successful attempt to add
the \0 seemed to screw up either LEN or CUR (or both). Now I know how to
achieve it. Thanks Ilya.

Sorry for giving help in so small increments... Usually I just
cut&paste from my old XS code...

Yours,
Ilya

Sisyphus · Nov 9, 2004

Ilya said:
[A complimentary Cc of this posting was sent to
Sisyphus

And, of course, the initial NEWSV() should have len+1 to accomodate
this 0...

It doesn't seem to need it. If I change it to len+1 (and that's the only
change I make) then the Devel:

eek:

ump of the returned SV does not
change.

Also, on Win32 at least, my 'perldoc perlapi' documentation for NEWSV()
claims that "... A non-zero 'len' parameter indicates the number of
bytes of preallocated string space the SV should have. An extra byte for
a tailing NUL is also reserved. ...."

I interpret that to mean that I don't need to include that extra byte
for the tailing NUL in 'len'.

However, including that extra byte in 'len' doesn't seem to hurt - so
I've no real objection to doing it.

Sorry for giving help in so small increments... Usually I just
cut&paste from my old XS code...

Oh ... that's ok ... my brain can only assimilate information in "small
increments"

Cheers,
Rob

Q re sysopen() and more...	6	Oct 13, 2004
[Proposal] Q: RE lexical scopes and sub declarations	5	Dec 1, 2004
'Needless flexibilities' and structured records [very long]	10	Mar 15, 2013
Counting the repetition count of repeated substrings	0	Nov 15, 2007
initialize a hash	9	Aug 15, 2006
differences between find2perl and find	4	Aug 28, 2009
[PerlMonks] About List::Util's pure Perl shuffle()	5	Jul 13, 2007
Win32:OLE and properties enumeration	1	Nov 24, 2004

Q: re Inline and Benchmark

Michele Dondi

Michele Dondi

Sisyphus

Anno Siegel

Michele Dondi

Michele Dondi

Sisyphus

Tassilo v. Parseval

Ilya Zakharevich

Sisyphus

Michele Dondi

Anno Siegel

Sisyphus

Uri Guttman

Anno Siegel

Michele Dondi

Ilya Zakharevich

Sisyphus

Ilya Zakharevich

Sisyphus

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads