Perl array bug?

Malcolm Hoar

I believe I may have found a bug in ActivePerl 5.12.1.1201 (64-bit).

This problem does *not* arise with ActivePerl 5.10.1.1007 (64-bit).

Both running on Windows 7 64-bit with all current hot fixes.

Unfortunately, I have not yet been able to isolate the problem
code from a large and proprietary program.

However, in short, we clear a large array:

@Big1 = ();

Specific elements of the array are then populated:

$Big1[$index] = $value;

However, the array contains many "holes" (undefined elements).
So, this amounts to something like:

$Big1[538] = 0;
$Big1[53487] = 1;
$Big1[35306] = 2;
etc.

Later, we iterate over the array:

foreach $key (@Big1) {
$len = length ($key);
if ($len) {
if ($key eq '' && $len == 4) {print "You're kidding me!\n"; }

Where we encounter a bizarre situation whereby an element
is undefined but has a positive length.

I have tried to create a small standalone program that
demonstrates the problem but without any success.

I did try logging the values used to populate @Big1 and
wrote a standalone that would fill an array with the
same data. However, that did not exhibit the problem.

Is anyone aware of any similar issues or have any
suggestions that might help me create a standalone
program to demonstrate the problem?

--
|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
| Malcolm Hoar "The more I practice, the luckier I get". |
| (e-mail address removed) Gary Player. |
| http://www.malch.com/ Shpx gur PQN. |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
Uri Guttman

MH> $Big1[$index] = $value;

MH> However, the array contains many "holes" (undefined elements).
MH> So, this amounts to something like:

MH> $Big1[538] = 0;
MH> $Big1[53487] = 1;
MH> $Big1[35306] = 2;
MH> etc.

MH> Later, we iterate over the array:

MH> foreach $key (@Big1) {
MH> $len = length ($key);
MH> if ($len) {
MH> if ($key eq '' && $len == 4) {print "You're kidding me!\n"; }

you are obviously not using warnings or you would get several
there. length and eq will warn on undef.

MH> Where we encounter a bizarre situation whereby an element
MH> is undefined but has a positive length.

i doubt it is a bug in perl. i doubt your logic is clean given the
warnings that should be emitted. maybe there is something else that is
going on we can't see.

MH> I have tried to create a small standalone program that
MH> demonstrates the problem but without any success.

that is important. you can't claim a bug unless you can isolate and show
it to others with code.

MH> I did try logging the values used to populate @Big1 and
MH> wrote a standalone that would fill an array with the
MH> same data. However, that did not exhibit the problem.

that means it isn't just an array with empty slots problem. it is likely
elsewhere in the code.

uri
 
Steve C

Malcolm said:
I believe I may have found a bug in ActivePerl 5.12.1.1201 (64-bit).

This problem does *not* arise with ActivePerl 5.10.1.1007 (64-bit).

Many people have claimed "This doesn't happen with X" as
proof that Y must somehow be at fault. It's a fallacy.
 
Malcolm Hoar

you are obviously not using warnings or you would get several
there. length and eq will warn on undef.

No, I'm not. This code is quite old. But it's also been
running on hundreds of systems for almost 10 years without
any problem!
MH> Where we encounter a bizarre situation whereby an element
MH> is undefined but has a positive length.

i doubt it is a bug in perl. i doubt your logic is clean given the
warnings that should be emitted. maybe there is something else that is
going on we can't see.

Well, yeah, there's lots going on but not with these vars.
MH> I have tried to create a small standalone program that
MH> demonstrates the problem but without any success.

that is important. you can't claim a bug unless you can isolate and show
it to others with code.

I understand the need for a reproducible example. That
looks like it's going to be very tough. It may be
impossible if, as I suspect, the problem is deep down
with the malloc() calls.

This same code did reveal some other malloc() problems with
ActivePerl versions with build numbers around 600, I think.
But they were resolved a long time ago.
MH> I did try logging the values used to populate @Big1 and
MH> wrote a standalone that would fill an array with the
MH> same data. However, that did not exhibit the problem.

that means it isn't just an array with empty slots problem. it is likely
elsewhere in the code.

That doesn't follow if the problem is with Perl's internal
memory management.

It looks like I can work around the problem with an
explicit check for undef versus relying upon the
returned length(). I will pursue this in order to
get my users up and running first. Then I will continue
trying to create a standalone demonstration of the
issue.

 
Malcolm Hoar

Many people have claimed "This doesn't happen with X" as
proof that Y must somehow be at fault. It's a fallacy.

I made no such claim. It's "relevant evidence". But it's
not proof, and I didn't claim it was.

 
Uri Guttman

MH> It looks like I can work around the problem with an
MH> explicit check for undef versus relying upon the
MH> returned length(). I will pursue this in order to
MH> get my users up and running first. Then I will continue
MH> trying to create a standalone demonstration of the
MH> issue.

that was what i should have suggested. defined is a better test than
length as it makes more logical sense and won't be confused.

there are some subtle things with arrays and undefining entries. are you
doing that? i never like to see undef used like a function.

uri
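
The explicit defined() test discussed above can be sketched as follows. This is a minimal illustration, not code from the original program: the array name @big, the @filled list, and the sample indexes are assumptions borrowed from the thread's small examples.

```perl
use strict;
use warnings;

# Sparse array: only three slots are ever assigned (values are illustrative).
my @big;
$big[4] = 0;
$big[9] = 1;
$big[6] = 2;

# Collect the populated indexes by testing defined() rather than length(),
# so unassigned slots are skipped without touching length() at all.
my @filled;
for my $i (0 .. $#big) {
    next unless defined $big[$i];
    push @filled, $i;
}

print "populated indexes: @filled\n";   # prints "populated indexes: 4 6 9"
```

Because the test no longer depends on what length() returns for an undefined value, it should behave the same way on both Perl versions mentioned in the thread.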
 
Malcolm Hoar

that was what i should have suggested. defined is a better test than
length as it makes more logical sense and won't be confused.

there are some subtle things with arrays and undefining entries. are you
doing that? i never like to see undef used like a function.

No. This array name is reused a lot (which helps memory usage).
But it's explicitly cleared at the start of the problem module:

@Big1 = ();

Elements are set, modified, and used but never deleted/undef'ed.

Before the module returns, the array is completely cleared
once again.


 
Uri Guttman

MH> No. This array name is reused a lot (which helps memory usage).
MH> But it's explicitly cleared at the start of the problem module:

MH> @Big1 = ();

MH> Elements are set, modified, and used but never deleted/undef'ed.

MH> Before the module returns, the array is completely cleared once
MH> again.

maybe instead of clearing it, let it exit scope? or since it is sparse
as you claim, change it to a hash and you will save space. maybe the
problem will go away too. it may be as simple as changing all [] to {}
where the array is used plus some other things like keys for loops.

uri
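
The array-to-hash change suggested above might look roughly like this. It is a sketch under assumptions: the hash name %big and the @report list are hypothetical, and the indexes are the sample values from the first post. Only assigned keys exist in a hash, so there are no holes to iterate over.

```perl
use strict;
use warnings;

# Hash keyed by the former array index; unassigned indexes simply do not exist.
my %big;
$big{538}   = 0;
$big{53487} = 1;
$big{35306} = 2;

# Iterate in numeric index order; every value seen here was explicitly stored.
my @report;
for my $index (sort { $a <=> $b } keys %big) {
    my $value = $big{$index};
    push @report, "index $index holds $value";
}

print "$_\n" for @report;
```

The trade-off is that ordered traversal needs an explicit numeric sort of the keys, which a plain array gives for free.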
 
Ilya Zakharevich

MH> But it's explicitly cleared at the start of the problem module:

MH> @Big1 = ();

MH> Elements are set, modified, and used but never deleted/undef'ed.

MH> Before the module returns, the array is completely cleared once
MH> again.

maybe instead of clearing it, let it exit scope?

AFAICR, exiting the scope is equivalent to

undef @Big1;

(But maybe I'm mixing scalars and arrays; I worked on the code of
undef() so long ago, and did not look into it for some time.) Anyway,
this should be visible with Devel::Peek...
problem will go away too.

The problem is a bug in Perl. It won't go away by changing a user's code.

Ilya
 
Ilya Zakharevich

Many people have claimed "This doesn't happen with X" as
proof that Y must somehow be at fault. It's a fallacy.

Do not be silly. For a piece of code

a) which uses constructs taught in the first hour of Perl tutorials;
b) which works differently between two versions of Perl

there must be a documented easy-findable explanation. If not, it is a
bug in Perl.

Yours,
Ilya
 
Xho Jingleheimerschmidt

Uri said:
MH> $Big1[$index] = $value;

MH> However, the array contains many "holes" (undefined elements).
MH> So, this amounts to something like:

MH> $Big1[538] = 0;
MH> $Big1[53487] = 1;
MH> $Big1[35306] = 2;
MH> etc.

MH> Later, we iterate over the array:

MH> foreach $key (@Big1) {
MH> $len = length ($key);
MH> if ($len) {
MH> if ($key eq '' && $len == 4) {print "You're kidding me!\n"; }

Instead of printing "You're kidding me", try dumping the value using
Devel::Peek. That might give a clue for further debugging.

you are obviously not using warnings or you would get several
there. length and eq will warn on undef.

Maybe he does. Whether he gets warnings or not changes the issue at
issue not one little bit.
MH> I have tried to create a small standalone program that
MH> demonstrates the problem but without any success.

that is important. you can't claim a bug unless you can isolate and show
it to others with code.

Of course he can.

Xho
 
Xho Jingleheimerschmidt

Steve said:
Many people have claimed "This doesn't happen with X" as
proof that Y must somehow be at fault. It's a fallacy.

It may not be proof, but it is pretty good evidence. Some people, when they
have nothing useful to say, just resort to gratuitous insults.

Xho
 
Malcolm Hoar

If you can reproduce the problem at will, try inserting

use Devel::Peek ();
Devel::Peek::Dump $key;

Thank you, sir.

Yes, it's a bit of hassle but I can reproduce the problem
at will.

So here's the offending part of the code:

foreach $key (@Big1) {
$len = length ($key);
if ($len) {
if ($key eq '') {
print "Here's our bad boy\n";
use Devel::Peek ();
Devel::Peek::Dump $key;
}

and the output:

Here's our bad boy
SV = PVLV(0x2c4488) at 0x32979a8
REFCNT = 2
FLAGS = (GMG,SMG)
IV = 0
NV = 0
PV = 0
MAGIC = 0x34932d8
MG_VIRTUAL = &PL_vtbl_defelem
MG_TYPE = PERL_MAGIC_defelem(y)
TYPE = y
TARGOFF = 1601
TARGLEN = -1
TARG = 0x24d760
SV = PVAV(0x2e5d378) at 0x24d760
REFCNT = 3
FLAGS = ()
ARRAY = 0x34b23e8
FILL = 63273
MAX = 65533
ARYLEN = 0x0
FLAGS = (REAL)
Elt No. 0
Elt No. 1
Elt No. 2
Elt No. 3

Unfortunately, I don't understand any of that. If you or anyone
else sees any clues in there, I would be most happy to hear
from you!

I think the array is too large to dump the whole thing.

And, yes, I fully appreciate your point about code changes
and debugging modes causing the problem to disappear.
Been there and done that a few times before with malloc()
related issues!

 
Xho Jingleheimerschmidt

Malcolm said:
So here's the offending part of the code:

foreach $key (@Big1) {
$len = length ($key);
if ($len) {
if ($key eq '') {
print "Here's our bad boy\n";
use Devel::Peek ();
Devel::Peek::Dump $key;
}

and the output:

Here's our bad boy
SV = PVLV(0x2c4488) at 0x32979a8
REFCNT = 2
FLAGS = (GMG,SMG)
IV = 0
NV = 0
PV = 0
MAGIC = 0x34932d8
MG_VIRTUAL = &PL_vtbl_defelem
MG_TYPE = PERL_MAGIC_defelem(y)

This looks like the magic undef that Perl uses in sparse arrays. All
(more or less) of the sparse undefs point to the same structure, to
save memory. That structure knows that, if assigned to, it has to
mutate the thing pointing to it into a real scalar.

.....
Unfortunately, I don't understand any of that. If you or anyone
else sees any clues in there, I would be most happy to hear
from you!

I don't understand much of it either. But your format differs from mine
shortly after where I trimmed it, but I'm using a different version and
different OS, so that might be expected.

Could you also Dump out another undefined element (one that reports a
length of zero) for comparison?

Xho
 
Malcolm Hoar

Could you also Dump out another undefined element (one that reports a
length of zero) for comparison?

Hmmmm, well I just tried Dumping some known elements
before the foreach loop:

use Devel::Peek ();
print "Big1[1599] should be undef\n";
Devel::Peek::Dump $Big1[1599];
print "Big1[1600] first populated element of this array\n";
Devel::Peek::Dump $Big1[1600];
print "Big1[1601] this is the bad boy with length > 0 and no value\n";
Devel::Peek::Dump $Big1[1601];

This gave very different looking output:

Big1[1599] should be undef
SV = NULL(0x0) at 0x31566c0
REFCNT = 1
FLAGS = ()
Big1[1600] first populated element of this array
SV = PV(0x3308370) at 0x3154cf8
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x32e8ac8 "\0\0\0Y"\0
CUR = 4
LEN = 8
Big1[1601] this is the bad boy with length > 0 and no value
SV = NULL(0x0) at 0x31559e8
REFCNT = 1
FLAGS = ()

So the foreach loop and $key iterator seem to be playing
a role here.

Let me explore that and compare Dumps directly from the
array with those from $key and $len inside the loop.

 
Malcolm Hoar

If you can reproduce the problem at will, try inserting

Here's the original code snippet:

foreach $key (@Big1) {
$len = length ($key);
if ($len) {
if ($key eq '' && $len == 4) {print "You're kidding me!\n"; }

When I examined the array elements outside of the foreach
loop, the data seemed clean as best as I could tell.

So, I decided to try a different loop structure and reworked
it in the form:

for ($i = 0; $i < @Big1 - 1; $i++) {
$key = $Big1[$i];
$len = length ($key);
if ($len) {
...

That didn't seem to make any difference; the problem still
arose.

Since the bad data seemed to be associated with the value
of $len rather than the value of $key, I decided to attack
the length() call and reworked the code thus:


foreach $key (@Big1) {
$len = &mylen ($key);
if ($len) {
...

sub mylen {
my ($var) = @_;
$sss = length ($var);
return $sss;
}

And it fixed the problem.

At this point, I am fairly certain it is a bug. Rather less
sure that it is a malloc() related issue. I am wondering if
Perl is trying to optimize the code within the loop and taking
some kind of (inappropriate) shortcut when evaluating the
call to the length() intrinsic?

To summarize, I reduced the salient bit of code to:

foreach $key(@Big1){
#$len=length($key);
$len=&mylen($key);
if ($len && $key eq '') {print "Got the bad boy\n"; exit;}
}
exit;

And this works fine. If I flip the comment to activate the
regular call to length() I get the "bad boy" error.

Obviously, this small snippet is surrounded by gobs of other
code and data which clearly have a bearing on the problem.
I will seek to reduce that to something which would be
reasonable to post for others to study.
 
Malcolm Hoar

Here's some isolated code that demonstrates my problem:

@Big1 = ();
$Big1[4] = 0;
$Big1[9] = 1;
$Big1[6] = 2;
my $len = 0; # This appears to be significant
foreach $key (@Big1) {
    $len = length ($key);
    print "key = $key, len=$len\n";
}
exit;

Operating system is 64-bit Windows 7.

Using ActivePerl version 5.12.1.1201 64-bit I get:

C:\ZIP>c:\perl512\bin\perl.exe test.pl
key = , len=0
key = , len=0
key = , len=0
key = , len=0
key = 0, len=1
key = , len=1 <==== ???
key = 2, len=1
key = , len=1 <==== ???
key = , len=1 <==== ???
key = 1, len=1

Using ActivePerl version 5.10.1.1007 64-bit I get:

C:\ZIP>C:\perl510\bin\perl.exe test.pl
key = , len=0
key = , len=0
key = , len=0
key = , len=0
key = 0, len=1
key = , len=0
key = 2, len=1
key = , len=0
key = , len=0
key = 1, len=1

So, what say you guys? Bug or user error?

 
Jürgen Exner

Here's some isolated code that demonstrates my problem:

@Big1 = ();
$Big1[4] = 0;
$Big1[9] = 1;
$Big1[6] = 2;
my $len = 0; # This appears to be significant
foreach $key (@Big1) {
$len = length ($key);
print "key = $key, len=$len\n";
}
exit;

Operating system is 64-bit Windows 7.

Using ActivePerl version 5.12.1.1201 64-bit I get:

C:\ZIP>c:\perl512\bin\perl.exe test.pl
key = , len=0
key = , len=0
key = , len=0
key = , len=0
key = 0, len=1
key = , len=1 <==== ???
key = 2, len=1
key = , len=1 <==== ???
key = , len=1 <==== ???
key = 1, len=1
So, what say you guys? Bug or user error?

Clearly programmer error. Neither printing the value of nor computing
the length of undef is well defined. It is like division by zero.
It is an illegal operation, therefore the system can do whatever it
pleases to do. If it is nice then it may fail with an error message like
perl would have done if you had used warnings and strict.
But returning some garbage is just as legitimate. And it just happens
that 5.10 and 5.12 are returning different garbage.

jue
 
Malcolm Hoar

Clearly programmer error. Neither printing the value of nor computing
the length of undef is well defined. It is like division by zero.
It is an illegal operation, therefore the system can do whatever it
pleases to do. If it is nice then it may fail with an error message like
perl would have done if you had used warnings and strict.
But returning some garbage is just as legitimate. And it just happens
that 5.10 and 5.12 are returning different garbage.

Well, that's a point of view.

I do not agree that it is like division by zero which clearly
has no meaning (at least in ordinary arithmetic).

In my view, an undef must have a zero length. And, as far as
I know, Perl 5.10.1 and every major version going back 10+
years have reflected that quite consistently.

An undef is rather more like a null pointer, and other
languages (that I have used) will typically handle those
in a consistent (not random) manner.

I know not whether the change in behavior at 5.12.1 was by
design or happenstance. But I for one consider it somewhat
unfortunate and probably misguided.

Sadly, I'm going to have to review a great deal of code and
I suspect that I am far from alone.

In addition, the role of the "my $len" statement in this
situation troubles me a little and leaves me wondering
whether the changes made at 5.12.1 have other possibly
ugly manifestations.

On balance, I'm inclined to disagree with your position
and it's certainly inconvenient for me. But I accept that
it has some merit worthy of consideration and I thank you
for making it.
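
For code that relies on an undefined value having zero length, one possible way to make that assumption explicit is a small wrapper. This is only a sketch: safe_length is a hypothetical helper name, not part of Perl or of the original program, and the sample data is taken from the isolated test case earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: treat undef as having length 0, and never hand
# an undefined value to the built-in length() at all.
sub safe_length {
    my ($value) = @_;
    return defined $value ? length($value) : 0;
}

my @big;
$big[4] = 0;
$big[9] = 1;
$big[6] = 2;

my @lines;
for my $key (@big) {
    my $len   = safe_length($key);          # fresh lexical on every pass
    my $shown = defined $key ? $key : '';   # avoid interpolating undef
    push @lines, "key = $shown, len=$len";
}

print "$_\n" for @lines;
```

With this wrapper the holes should report len=0 and the three assigned elements len=1, matching the 5.10.1 output shown earlier, and the script runs cleanly under warnings.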

 
