memcpy() where assignment would do?

kuyper · Aug 23, 2007

I've run across some rather peculiar code; here are the relevant lines
that left me confused:

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.
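
A minimal sketch of the equivalence described above, assuming the zero-length declaration really does behave like a one-element array (a legal `num_char[1]` stands in for it here). The function and macro names are hypothetical and do not come from the inherited program:

```c
#include <string.h>

#define HEADER_SIZE  384  /* size of cr_file in the code quoted above */
#define FIELD_OFFSET 147  /* offset used in the code quoted above */

/* The style found in the inherited code, but with a legal 1-element array
   (assumption: the zero-length declaration acts like length 1). */
static void set_field_memcpy(unsigned char *header)
{
    unsigned char num_char[1];

    num_char[0] = 1;
    memcpy(&header[FIELD_OFFSET], num_char, 1);
}

/* The plain assignment that is expected to have the same effect. */
static void set_field_assign(unsigned char *header)
{
    header[FIELD_OFFSET] = 1;
}

/* Returns 1 if both approaches leave identical headers, 0 otherwise. */
static int headers_match(void)
{
    unsigned char a[HEADER_SIZE] = {0};
    unsigned char b[HEADER_SIZE] = {0};

    set_field_memcpy(a);
    set_field_assign(b);
    return memcmp(a, b, HEADER_SIZE) == 0;
}
```

This only checks the resulting bytes on a conforming compiler; it says nothing about what a zero-length array does under a vendor extension.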

Jensen Somers · Aug 23, 2007

Hi,

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

IIRC, this is a GCC extension. I had problems with this when using the
Microsoft Visual C compiler (which still follows some variant of the C89
standard). After further investigation it turned out to be an extension,
and GCC should issue a warning or error for it when compiling with -pedantic.

Jensen.

Eric Sosman · Aug 23, 2007

kuyper wrote on 08/23/07 11:27:

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.

A guess: The bogus definition of num_char[0] may
actually allocate memory as would num_char[1], but has
some other bizarre effect as well. If it didn't do
something unusual the programmer would have written [1]
in the first place, instead of writing [0] and then
going to the extra work of figuring out how to turn the
error message off. The use of memcpy() instead of
`cr_file[147] = num_char[0]' or `cr_file[147] = 1' may
have something to do with whatever that weird effect is.

Guess #2: Does the code call "the" memcpy(), or some
out-of-the-blue substitute? Writing your own substitutes
for Standard library functions is a no-no, but we've
already seen that the author didn't feel bound to
respect the Standard at all times ...

Guess #3: Somewhere in the dusty annals of the code's
ancestry you will find the word IOCCC -- or was it XYZZY?

Mark Bluemel · Aug 23, 2007

kuyper said:
I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

I'd be inclined to investigate what my compiler generated for each of
these constructs and look at what the differences might imply...

As you have given us very little context - platform, compiler, etc -
unless someone here has seen exactly this, it's unlikely we can comment
much more.

CBFalconer · Aug 23, 2007

Jensen Somers said:
kuyper said:

I've run across some rather peculiar code; here are the relevant
lines that left me confused :

unsigned char cr_file[384];


This defines an array of 384 unsigned chars, indices 0 through 383.

unsigned char num_char[0];


This is illegal. 0 size arrays cannot be declared.

I am piggybacking this reply.

kuyper · Aug 23, 2007

Eric Sosman wrote:
....

A guess: The bogus definition of num_char[0] may
actually allocate memory as would num_char[1], but has
some other bizarre effect as well. If it didn't do
something unusual the programmer would have written [1]
in the first place, instead of writing [0] and then
going to the extra work of figuring out how to turn the
error message off. The use of memcpy() instead of
`cr_file[147] = num_char[0]' or `cr_file[147] = 1' may
have something to do with whatever that weird effect is.

That would make sense; but it seems very unlikely. On the other hand,
up until yesterday, I would have said that code like this was very
unlikely. :-}

It will be easy to test for this. I intend to replace the odd code
with more conventional code. If your first guess is correct, the
resulting output files won't match those created with the original
code. I'll be performing that test sometime today or tomorrow.

Guess #2: Does the code call "the" memcpy(), or some
out-of-the-blue substitute? Writing your own substitutes
for Standard library functions is a no-no, but we've
already seen that the author didn't feel bound to
respect the Standard at all times ...

There's no alternative definition of memcpy() in the source code, and
it doesn't link to any libraries that might contain one.

kuyper · Aug 23, 2007

Mark Bluemel wrote:
....

As you have given us very little context - platform, compiler, etc -
unless someone here has seen exactly this, it's unlikely we can comment
much more.

Platform: SGI Origin 300 running IRIX 6.5. The compiler is the SGI C
compiler distributed with that version of IRIX. Compiler options:
-O2 -mips4 -xansi -fullwarn. I first noticed this code when I changed
-xansi to -ansi, which apparently turns off an SGI extension supporting
0-length arrays.

Keith Thompson · Aug 23, 2007

kuyper said:
Mark Bluemel wrote:
...

Platform: SGI Origin 300 running IRIX 6.5. The compiler is the SGI C
compiler distributed with that version of IRIX. Compiler options:
-O2 -mips4 -xansi -fullwarn. I first noticed this code when I changed
-xansi to -ansi, which apparently turns off an SGI extension supporting
0-length arrays.

Can you find SGI's documentation for that extension?

kuyper · Aug 23, 2007

Keith Thompson said:
Can you find SGI's documentation for that extension?

No. I've downloaded their C manual, and wandered around their website,
without finding anything. I've found mentions of the fact that they
have extensions, but no comprehensive list of the extensions, and no
mention of this specific extension. However, when I use -xansi, the
compiler tolerates declaration of a zero-length array without comment,
and the program works as if the array has a non-zero length; when I
use -ansi, compilation fails. The distinction between those two
options is supposed to be that -xansi enables SGI-specific extensions
to ANSI C.

Old Wolf · Aug 23, 2007

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Do these lines occur inside a structure definition?

kuyper · Aug 24, 2007

Old Wolf said:
I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];


Do these lines occur inside a structure definition?

No - they occur at block scope.

Peter J. Holzer · Aug 24, 2007

kuyper said:
Old Wolf said:

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];


Do these lines occur inside a structure definition?


No - they occur at block scope.

That's strange. Before C89 many compilers accepted zero-sized arrays and
it was a common idiom to define a structure like this:

struct foo {
    size_t size; /* more likely int at the time */
    short whatever;
    double data[0];
};

and use it like this:

struct foo *p = malloc(sizeof(struct foo) + sizeof(double) * nelems);

p->size = nelems;
p->whatever = 42;
for (i = 0; i < nelems; i++) {
    p->data[i] = get_some_data();
}

/* do some more processing */

free(p);

data didn't actually use any space in the struct, but enforced proper
alignment and padding, so the single malloc would allocate the exact
amount of memory needed.

C89 didn't standardize zero-sized arrays (presumably because they
didn't fit with the "pointer arithmetic only defined within an object"
model) and subsequently people stopped using that idiom and (more)
compilers started to reject it.
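
C99 later standardized a variant of this idiom as the flexible array member, written with empty brackets instead of `[0]`. The following is a minimal sketch of that form, not code from the program under discussion; `foo_create` is a hypothetical helper, and the fill values are placeholders standing in for `get_some_data()`:

```c
#include <stdlib.h>

/* C99 flexible array member: the standardized form of the old data[0] idiom. */
struct foo {
    size_t size;
    short whatever;
    double data[];   /* contributes no storage of its own; must be last */
};

/* Allocate a struct foo with room for nelems doubles in a single malloc().
   Returns NULL on allocation failure. Overflow of the size computation is
   not checked in this sketch. */
static struct foo *foo_create(size_t nelems)
{
    struct foo *p = malloc(sizeof *p + nelems * sizeof p->data[0]);
    size_t i;

    if (p == NULL)
        return NULL;
    p->size = nelems;
    p->whatever = 42;
    for (i = 0; i < nelems; i++)
        p->data[i] = (double)i;   /* placeholder for get_some_data() */
    return p;
}
```

The caller releases the whole object with a single `free(p)`. A flexible array member is only permitted as the last member of a struct, so it offers no explanation for a zero-length array declared at block scope.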

I don't know what possible use a zero-sized array could have as an
automatic variable. If it's really zero-sized it's completely useless,
and if it isn't it must be some fixed size (at least if it is used with
memcpy as you showed - if it was used with ordinary indexes I could
imagine some compiler magic implementing a dynamic array[0]), and if it's
some fixed size, why not use that?

hp

[0] Yes, there could of course be some other compiler magic which calls
__builtin_dynamic_array_memcpy if memcpy is used on a zero-sized
array and __builtin_normal_memcpy otherwise.

neildferguson · Aug 25, 2007

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.

Here's a very off-topic possibility: the variables are defined as they are in
order to appear in particular segments of the linker's memory map, with
particular symbolic identification, so that the memory's addresses can be
associated with particular hardware I/O operations.

Neil

kuyper · Aug 25, 2007

neildferguson said:
I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.


Here's a very off-topic possibility: the variables are defined as they are in
order to appear in particular segments of the linker's memory map, with
particular symbolic identification, so that the memory's addresses can be
associated with particular hardware I/O operations.

Ingenious possibility, but unfortunately not a plausible explanation
for this program. Its only purpose is breaking up a large partitioned
data set into several smaller files, which is non-trivial only because
the records are variable length, the files have to be split at the
boundary between two records, and each output file requires a separate
header. cr_file is the array containing that header. It's not a very
complicated program. As a result, this weird way of filling in the
headers adds significantly and unnecessarily to the total complexity.

For now, I'm assuming that he didn't have a valid reason for writing
the code this way. I suspect that I'll probably never learn what the
invalid reason was that motivated him to do so.

kuyper · Aug 25, 2007

CBFalconer said:
Jensen Somers said:

kuyper said:

I've run across some rather peculiar code; here are the relevant
lines that left me confused :

unsigned char cr_file[384];


This defines an array of 384 unsigned chars, indices 0 through 383.

Yes, of course. The size of that array doesn't confuse me. The bizarre
thing is the way it was used.

unsigned char num_char[0];


This is illegal. 0 size arrays cannot be declared.

Well, of course. Nonetheless, it was declared, and it does compile,
and it does work, apparently as a result of using a compiler flag
which enables SGI-specific extensions. A (small) part of my question
is "why was it declared with a length of 0?" The bigger part is given
in the Subject: header.

Keith Thompson · Aug 25, 2007

kuyper said:
CBFalconer wrote: [...]

This is illegal. 0 size arrays cannot be declared.


Well, of course. Nonetheless, it was declared, and it does compile,
and it does work, apparently as a result of using a compiler flag
which enables SGI-specific extensions. A (small) part of my question
is "why was it declared with a length of 0?" The bigger part is given
in the Subject: header.

You said elsethread that this appears to be an SGI-specific extension.
Have you tried one of the comp.sys.sgi.* newsgroups? Or can you
contact SGI customer support?

kuyper · Aug 25, 2007

Keith Thompson said:
kuyper said:

CBFalconer wrote: [...]

This is illegal. 0 size arrays cannot be declared.


Well, of course. Nonetheless, it was declared, and it does compile,
and it does work, apparently as a result of using a compiler flag
which enables SGI-specific extensions. A (small) part of my question
is "why was it declared with a length of 0?" The bigger part is given
in the Subject: header.


You said elsethread that this appears to be an SGI-specific extension.
Have you tried one of the comp.sys.sgi.* newsgroups? Or can you
contact SGI customer support?

No, I haven't. While some people have suggested otherwise, I don't
think that the 0-sized array is related to the peculiar memcpy() calls
- the proposed connections are all pretty implausible to me. Since
it's the memcpy() calls that I'm mainly confused by, I haven't
followed up on the SGI extensions angle. Of course, the memcpy()
seems pretty implausible too; but there it is. Maybe I should check
out SGI sources, though at this time I'm more inclined to simply drop
it.

Richard Tobin · Aug 26, 2007

kuyper said:
I've run across some rather peculiar code;

Is there any possibility that this code was originally machine-generated,
or results from macro-expansion of something more plausible?

-- Richard

kuyper · Aug 27, 2007

Richard Tobin said:
Is there any possibility that this code was originally machine-generated,
or results from macro-expansion of something more plausible?

Not likely. We don't use much machine-generated code in our project. I
believe that this program was created by hand, most likely by
modification of an existing program by the same author intended to
handle the same data set in a different fashion; but I have no idea
which program that was, nor where it might be found. Literally
inheriting responsibility for a program can be very difficult,
particularly when, as in this case, the late author was not strong on
documentation, either internal or external.
