Variable length arrays

  • Thread starter Erwin Lindemann

Erwin Lindemann

If a VLA appears within a loop body, it seems the behavior is
different with two different compilers I tried. I looked at the
standard text, but couldn't find a definite answer there either.

Consider the following test program

/* begin foo.c */
#include <stdio.h>
#include <string.h>

void test(int n, size_t size)
{
    int i;

    for(i = 0; i < n; i++) {
        unsigned char vla[size];
        memset(vla, (i & 255), size);
        printf("step %d: vla=%p\n", i, &vla[0]);
    }
}

int main(void)
{
    test(10, 256*1024L);
    return 0;
}
/* end foo.c */

With gcc, 'vla' is reused in every iteration, i.e., the address
of 'vla[0]' is identical in every step.

However, with lcc-win32, output is as follows...

step 0: vla=0x002ffea0
step 1: vla=0x002bfea0
step 2: vla=0x0027fea0
step 3: vla=0x0023fea0
[*CRASH*]

, meaning, new storage is allocated for 'vla' at every iteration,
eventually exhausting all available auto storage.

Now, is this just implementation dependent, so that this kind of construct
should be avoided, or is one of these compilers not working correctly?

Should a bug report be filed?

Thanks
 
Jack Klein

> If a VLA appears within a loop body, it seems the behavior is
> different with two different compilers I tried. I looked at the
> standard text, but couldn't find a definite answer there either.

There is no definitive answer. All the standard says about VLAs that
might be relevant is this:

"For such an object that does have a variable length array type, its
lifetime extends from the declaration of the object until execution of
the program leaves the scope of the declaration. If the scope is
entered recursively, a new instance of the object is created each
time. The initial value of the object is indeterminate."

It says nothing at all about whether each creation must, may, or may
not have the same address. So it is entirely a QOI issue.

> Consider the following test program
>
> /* begin foo.c */
> #include <stdio.h>
> #include <string.h>
>
> void test(int n, size_t size)
> {
>     int i;
>
>     for(i = 0; i < n; i++) {
>         unsigned char vla[size];
>         memset(vla, (i & 255), size);
>         printf("step %d: vla=%p\n", i, &vla[0]);
>     }
> }
>
> int main(void)
> {
>     test(10, 256*1024L);
>     return 0;
> }
> /* end foo.c */
>
> With gcc, 'vla' is reused in every iteration, i.e., the address
> of 'vla[0]' is identical in every step.
>
> However, with lcc-win32, output is as follows...
>
> step 0: vla=0x002ffea0
> step 1: vla=0x002bfea0
> step 2: vla=0x0027fea0
> step 3: vla=0x0023fea0
> [*CRASH*]
>
> , meaning, new storage is allocated for 'vla' at every iteration,
> eventually exhausting all available auto storage.
>
> Now, is this just implementation dependent and this kind of construct
> should be avoided, or is one of these compilers not working correctly?

Since the standard does not require either behavior, it is not a
conformance defect that I can see. It is a QOI issue, and I suggest
you contact the offending compiler's implementer, either directly or
on his support group.

> Should a bug report be filed?

Since the standard does not guarantee that the creation of any array
of automatic duration will succeed, even if it is not a VLA, it's hard
to see a complaint on standard conformance grounds, but it's a huge
QOI issue.

I'd complain if I used this feature of his compiler, although I have
plonked him and don't use his compiler at all anymore.

Consider a similar case:

#include <stdio.h>

void func(int x)
{
printf("%p\n", (void *)&x;
}

int main(void)
{
int x = 0;
func();
func();
func();
return x;
}

I don't think I have ever used an implementation where the three calls
to func() would output different values.

Do you think the C standard requires it to be the same? If not, do
you think it should?

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
 
Bartc

Erwin Lindemann said:
> If a VLA appears within a loop body, it seems the behavior is
> different with two different compilers I tried. I looked at the
> standard text, but couldn't find a definite answer there either.
>
> Consider the following test program
>
> /* begin foo.c */
> #include <stdio.h>
> #include <string.h>
>
> void test(int n, size_t size)
> {
>     int i;
>
>     for(i = 0; i < n; i++) {
>         unsigned char vla[size];
>         memset(vla, (i & 255), size);
>         printf("step %d: vla=%p\n", i, &vla[0]);
>     }
> }
>
> int main(void)
> {
>     test(10, 256*1024L);
>     return 0;
> }
> /* end foo.c */
>
> With gcc, 'vla' is reused in every iteration, i.e., the address
> of 'vla[0]' is identical in every step.
>
> However, with lcc-win32, output is as follows...
>
> step 0: vla=0x002ffea0
> step 1: vla=0x002bfea0
> step 2: vla=0x0027fea0
> step 3: vla=0x0023fea0
> [*CRASH*]
>
> , meaning, new storage is allocated for 'vla' at every iteration,
> eventually exhausting all available auto storage.

I've never used VLAs, but: as you can't access all the previous incarnations
of vla[] inside the loop, only the last one, then they are wasting memory.

Also you may expect changes to vla[] to be valid from one loop iteration
to the next, and they're not; you get a fresh array each time.

Any ordinary variable declared in the loop body, however, just gets the one
address.

> Now, is this just implementation dependent and this kind of construct
> should be avoided, or is one of these compilers not working correctly?
>
> Should a bug report be filed?

I would say this behaviour (of lccwin32) is undesirable. It's certainly not
useful.
 
Micah Cowan

Erwin Lindemann said:
> With gcc, 'vla' is reused in every iteration, i.e., the address
> of 'vla[0]' is identical in every step.
>
> However, with lcc-win32, output is as follows...
>
> step 0: vla=0x002ffea0
> step 1: vla=0x002bfea0
> step 2: vla=0x0027fea0
> step 3: vla=0x0023fea0
> [*CRASH*]
>
> , meaning, new storage is allocated for 'vla' at every iteration,
> eventually exhausting all available auto storage.
>
> Now, is this just implementation dependent and this kind of construct
> should be avoided, or is one of these compilers not working correctly?
>
> Should a bug report be filed?

I'd file it, yeah.

6.2.4#6, with "such an object" and "does have" expanded in brackets:

For [an object with automatic storage duration] that [has] a variable
length array type, its lifetime extends from the declaration of the
object until execution of the program leaves the scope of the
declaration.
 
jacob navia

Erwin said:
> If a VLA appears within a loop body, it seems the behavior is
> different with two different compilers I tried. I looked at the
> standard text, but couldn't find a definite answer there either.
>

There is no definite answer. It is implementation dependent.

> Consider the following test program
>
> /* begin foo.c */
> #include <stdio.h>
> #include <string.h>
>
> void test(int n, size_t size)
> {
> int i;
>
> for(i = 0; i < n; i++) {
> unsigned char vla[size]; //<<<<<<<<<<<<<<<<<
> memset(vla, (i & 255), size);
> printf("step %d: vla=%p\n", i, &vla[0]);
> }
> }
>
> int main(void)
> {
> test(10, 256*1024L);
> return 0;
> }
> /* end foo.c */
>
> With gcc, 'vla' is reused in every iteration, i.e., the address
> of 'vla[0]' is identical in every step.

If I change the marked line above to

unsigned char vla[size*i+1];

and change the size expression in the memset call to
the same expression, gcc will produce the following
output in my linux box:
step 0: vla=0xbffff990
step 1: vla=0xbffbf990
step 2: vla=0xbff7f990
step 3: vla=0xbff3f990
step 4: vla=0xbfeff990
step 5: vla=0xbfebf990
step 6: vla=0xbfe7f990
step 7: vla=0xbfe3f990
step 8: vla=0xbfdff990
step 9: vla=0xbfdbf990

As you can see, the addresses are different each time.

gcc is doing simply a constant propagation. It notices that the
expression in the VLA size is a constant expression within the
affected block, and optimizes the case you presented above
to reuse always the same block. Fine, maybe I could do such an
optimization too, so that the clique of comp.lang.c (where you
belong) can have a harder time, but what for? This optimization
would only optimize a special case.

I will do it some day when I introduce strength reduction into
loop bodies. But I am skeptical of those classic optimizations.
I am at 80-90% of the speed of gcc in many benchmarks, but the
compilation speed of lcc-win is 600-800% better than gcc. I
prefer this situation.
>
> However, with lcc-win32, output is as follows...
>
> step 0: vla=0x002ffea0
> step 1: vla=0x002bfea0
> step 2: vla=0x0027fea0
> step 3: vla=0x0023fea0
> [*CRASH*]
>
> , meaning, new storage is allocated for 'vla' at every iteration,
> eventually exhausting all available auto storage.
>
> Now, is this just implementation dependant and this kind of construct
> should be avoided, or is one of these compilers not working correctly?
>
> Should a bug report be filed?
>

You can only file a bug report if you buy maintenance at premium rates.
I have a special price for clique members sorry!

You are welcome.
 
jacob navia

Jack said:
> I'd complain if I used this feature of his compiler, although I have
> plonked him and don't use his compiler at all anymore.
>
> Consider a similar case:
>
> #include <stdio.h>
>
> void func(int x)
> {
> printf("%p\n", (void *)&x); // missing ")" fixed
> }
>
> int main(void)
> {
> int x = 0;
> func();
> func();
> func();
> return x;
> }
>
> I don't think I have ever used an implementation where the three calls
> to func() would output different values.
>
> Do you think the C standard requires it to be the same? If not, do
> you think it should?

I am happy you do not use my compiler system since your
buggy code will not even compile:

Error tbb.c: 11 insufficient number of arguments to `func'
Error tbb.c: 12 insufficient number of arguments to `func'
Error tbb.c: 13 insufficient number of arguments to `func'
3 errors, 0 warnings
 
CBFalconer

jacob said:
.... snip ...

> You can only file a bug report if you buy maintenance at premium
> rates. I have a special price for clique members sorry!

Horrible attitude. Also reflected in the apparent bug density.
 
jacob navia

CBFalconer said:
> jacob navia wrote:
> ... snip ...
>
> Horrible attitude. Also reflected in the apparent bug density.

What bug?

Of course if you just answer to my demonstration with

[snip]

and you are unable to forward ANY arguments you can
say that it is a bug without any fear isn't it?

I explained in a long discussion why that is NOT a bug.

You just snip all arguments. Is that a fair way of discussing?
 
Keith Thompson

jacob navia said:
> Erwin said:
>> If a VLA appears within a loop body, it seems the behavior is
>> different with two different compilers I tried. I looked at the
>> standard text, but couldn't find a definite answer there either.
>
> There is no definite answer. It is implementation dependent.
>
>> Consider the following test program
>>
>> /* begin foo.c */
>> #include <stdio.h>
>> #include <string.h>
>>
>> void test(int n, size_t size)
>> {
>>     int i;
>>
>>     for(i = 0; i < n; i++) {
>>         unsigned char vla[size]; //<<<<<<<<<<<<<<<<<
>>         memset(vla, (i & 255), size);
>>         printf("step %d: vla=%p\n", i, &vla[0]);
>>     }
>> }
>>
>> int main(void)
>> {
>>     test(10, 256*1024L);
>>     return 0;
>> }
>> /* end foo.c */
>>
>> With gcc, 'vla' is reused in every iteration, i.e., the address
>> of 'vla[0]' is identical in every step.

The standard doesn't say that "vla" must have the same address at
every iteration of the loop, and it shouldn't matter whether it does
or not. In fact, a strictly conforming program can't tell whether the
same memory is re-used, since the address becomes invalid at the end
of the block, when the lifetime of "vla" has ended.

*However*, the lifetime of "vla" begins at its declaration and ends at
the end of the block (the loop body). There's a distinct array object
for each iteration of the loop, but the lifetimes of these objects do
not overlap. Since the program apparently dies with an out-of-memory
condition, it appears that the generated code is allocating space for
"vla" at the point of declaration *and not deallocating it* at the end
of the array's lifetime.

An argument could probably be made that this behavior doesn't violate
the standard, but it's certainly a bug. I don't care whether "vla"
has the same address each time; I care that the program crashes.

Consider:

    for (i = 0; i < n; i ++) {
        unsigned char arr[2000];
        /* ... */
    }

Would it be acceptable for n copies of "arr" to have memory allocated
for them simultaneously, causing the program to die? If not, why
would it be acceptable for a VLA?

Or perhaps I've misunderstood what's going on here. I don't have
lcc-win, so I can't test it myself. jacob, can you explain why the
program dies?

[...]

IMHO, yes.

> You can only file a bug report if you buy maintenance at premium rates.
> I have a special price for clique members sorry!

*Yawn*.
 
SM Ryan

#
# If a VLA appears within a loop body, it seems the behavior is
# different with two different compilers I tried. I looked at the
# standard text, but couldn't find a definite answer there either.

This is a 48 year old issue. Answer is both block and procedure
level allocation have their benefits so both will continue to be
used. Program to cope with either. In particular don't allocate
arrays in loops unless you're prepared to have the entire amount
allocated.

If you must allocate in a loop and you must have it released
at the end of the loop, put the loop body in a separate function.
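
A minimal sketch of that suggestion, assuming a compiler with C99 VLA support. The loop body moves into a helper, so the array belongs to the helper's call, and a typical stack-based implementation gives the storage back on each return. The names `fill_and_sum` and `run_steps` are hypothetical, and the byte sum is there only so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the work of one loop iteration.
   The VLA is local to this call, not to the caller's loop. */
static unsigned long fill_and_sum(size_t size, int step)
{
    assert(size > 0);                 /* a VLA must have a positive length */
    unsigned char vla[size];
    unsigned long sum = 0;
    size_t k;

    memset(vla, step & 255, size);
    for (k = 0; k < size; k++)
        sum += vla[k];
    return sum;
}

/* Hypothetical driver: same shape as test() in the original post,
   but no array is declared inside the loop itself. */
unsigned long run_steps(int n, size_t size)
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += fill_and_sum(size, i);
    return total;
}
```

Calling `run_steps(10, 256 * 1024)` from `main` mirrors the original `test(10, 256*1024L)`. This is a mitigation for implementations that allocate per function rather than per block, not something the standard spells out.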
 
Micah Cowan

SM Ryan said:
> #
> # If a VLA appears within a loop body, it seems the behavior is
> # different with two different compilers I tried. I looked at the
> # standard text, but couldn't find a definite answer there either.
>
> This is a 48 year old issue. Answer is both block and procedure
> level allocation have their benefits so both will continue to be
> used. Program to cope with either. In particular don't allocate
> arrays in loops unless you're prepared to have the entire amount
> allocated.

Regardless of whether block or procedure-level allocation is used,
though, there must always be exactly one instance of the object.

On (non-broken) implementations with block-level allocation, the
following:

void foo(int a)
{
    int i;

    for (i=0; i!=10000; ++i) {
        int v;
        /* do something with v; */
    }
}

would not result in 10,000 "v" objects being simultaneously allocated;
neither should an "int v[a];" result in such (and the standard
requires that it does not).
 
jacob navia

Keith said:
> Or perhaps I've misunderstood what's going on here. I don't have
> lcc-win, so I can't test it myself. jacob, can you explain why the
> program dies?

I replied with a lengthy explanation that you apparently
did not bother to READ.

The program dies because for a VLA I do not make the
optimization that the variable "size" is a loop invariant,
i.e. does not change within the loop.
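
When the size really is loop invariant, the same hoisting can be done by hand, which sidesteps the question of what a given compiler does with a VLA declared inside the loop. The following is only a sketch under that assumption; `test_hoisted` is a hypothetical name, and the byte sum replaces the original `printf` so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rewrite of test(): the VLA is declared once,
   before the loop, instead of once per iteration. */
unsigned long test_hoisted(int n, size_t size)
{
    assert(size > 0);                 /* a VLA must have a positive length */
    unsigned char vla[size];          /* one array for the whole call */
    unsigned long total = 0;
    size_t k;
    int i;

    for (i = 0; i < n; i++) {
        memset(vla, i & 255, size);   /* same storage reused every step */
        for (k = 0; k < size; k++)
            total += vla[k];
    }
    return total;
}
```

The trade-off is that the array now lives for the whole function call and keeps its contents between iterations, so code that relied on a fresh, indeterminate array at each step has to reinitialise it explicitly.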

If you replace
int tab[size];

with

int tab[size*i+1];

gcc will ALSO produce different arrays for each iteration.


As I explained (and repeat here again) this is an optimization
that I do not do and gcc does.

Please read my posts before replying

thanks
 
Micah Cowan

jacob navia said:
> I replied with a lengthy explanation that you apparently
> did not bother to READ.

No, you didn't. You explained why new arrays are created, which is
fine. You have failed to explain why the old ones aren't destroyed,
as they are required to have been.

> The program dies because for a VLA I do not make the
> optimization that the variable "size" is a loop invariant,
> i.e. does not change within the loop.
>
> If you replace
> int tab[size];
>
> with
>
> int tab[size*i+1];
>
> gcc will ALSO produce different arrays for each iteration.

Sure. Hopefully, though, it will also continue to destroy the old ones
first, just as it's required to.
 
Joachim Schmitz

jacob said:
> I replied with a lengthy explanation that you apparently
> did not bother to READ.
>
> The program dies because for a VLA I do not make the
> optimization that the variable "size" is a loop invariant,
> i.e. does not change within the loop.

So you know exactly where the bug in your compiler is and how to fix it.
Good.
But why then don't you say so explicitly?

> If you replace
> int tab[size];
>
> with
>
> int tab[size*i+1];
>
> gcc will ALSO produce different arrays for each iteration.

Irrelevant, as others pointed out.

> As I explained (and repeat here again) this is an optimization
> that I do not do and gcc does.

And it doesn't die. This _is_ relevant.

> Please read my posts before replying

Please be more clear in what you want to say.

Bye, Jojo
 
Harald van Dijk

> If you replace
> int tab[size];
>
> with
>
> int tab[size*i+1];
>
> gcc will ALSO produce different arrays for each iteration.

If you print a pointer to the end of the array, you'll find it has the
same address every time. It happens to be a system with a stack that grows
downwards, so it's not practical to give the start of the array a fixed
location, but there's no reason why the end should not be given one.
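
One way to check this on a given implementation is to print both bounds of the array at each step. The sketch below assumes C99 VLA support; `print_vla_bounds` is a hypothetical name, and the length expression follows the `size*i+1` variant discussed above. Whether the end addresses coincide depends on the compiler and platform, so no expected output is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints the start and one-past-the-end address
   of a VLA whose length changes on every iteration.
   Returns the number of lines printed. */
int print_vla_bounds(int n, size_t size)
{
    int i;

    for (i = 0; i < n; i++) {
        size_t len = size * (size_t)i + 1;
        unsigned char vla[len];

        assert(sizeof vla == len);    /* sizeof of a VLA is evaluated at run time */
        printf("step %d: start=%p end=%p\n",
               i, (void *)vla, (void *)(vla + sizeof vla));
    }
    return i;
}
```

If the end address stays fixed while the start address moves, each iteration's array is being carved out of the same region, which means the previous one was released first.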
 
jacob navia

Micah said:
> No, you didn't. You explained why new arrays are created, which is
> fine. You have failed to explain why the old ones aren't destroyed,
> as they are required to have been.

Required by whom?

you?

Thompson?
 
jacob navia

Joachim said:
> So you know exactly where the bug in your compiler is and how to fix it.
> Good.
> But why then don't you say so explicitly?

because there is no bug.

Nowhere in the standard it is written that I must free those
arrays.

>> If you replace
>> int tab[size];
>>
>> with
>>
>> int tab[size*i+1];
>>
>> gcc will ALSO produce different arrays for each iteration.
>
> Irrelevant, as others pointed out
>
>> As I explained (and repeat here again) this is an optimization
>> that I do not do and gcc does.
>
> And it doesn't die. This _is_ relevant.

not to me

The standard does not specify that those arrays should be destroyed.
They are no longer available, that's all.
 
Joachim Schmitz

jacob said:
> because there is no bug.

OK, see how unclear your wording was?

> Nowhere in the standard it is written that I must free those
> arrays

And what does common sense say? Isn't it you who constantly wants to improve
the standard?

>>> If you replace
>>> int tab[size];
>>>
>>> with
>>>
>>> int tab[size*i+1];
>>>
>>> gcc will ALSO produce different arrays for each iteration.
>>
>> Irrelevant, as others pointed out
>>
>>> As I explained (and repeat here again) this is an optimization
>>> that I do not do and gcc does.
>>
>> And it doesn't die. This _is_ relevant.
>
> not to me
>
> The standard does not specify that those arrays should be destroyed.
> They are no longer available, that's all.

Even if it's not required by a standard, this is a Quality of Implementation
issue.
Destroying those arrays would surely not violate the standard, would it?
And I'd find such a "feature" much more useful than a printf format
specifier for complex numbers...

Bye, Jojo
 
