memcpy() where assignment would do?

kuyper

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.
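A minimal sketch of the claimed equivalence, assuming a standard one-element array in place of the zero-length declaration (which is not valid standard C). The names HEADER_SIZE, FIELD_OFFSET, set_via_memcpy, set_direct and headers_match are hypothetical and chosen only for illustration; the size 384 and offset 147 come from the lines quoted above.

```c
#include <string.h>

enum { HEADER_SIZE = 384, FIELD_OFFSET = 147 };

/* The style found in the program, using num_char[1] instead of num_char[0]. */
static void set_via_memcpy(unsigned char *cr_file)
{
    unsigned char num_char[1];

    num_char[0] = 1;
    memcpy(&cr_file[FIELD_OFFSET], num_char, 1);
}

/* The conventional style: a plain assignment. */
static void set_direct(unsigned char *cr_file)
{
    cr_file[FIELD_OFFSET] = 1;
}

/* Returns 1 if both styles leave byte-identical buffers, 0 otherwise. */
static int headers_match(void)
{
    unsigned char a[HEADER_SIZE] = {0};
    unsigned char b[HEADER_SIZE] = {0};

    set_via_memcpy(a);
    set_direct(b);
    return memcmp(a, b, HEADER_SIZE) == 0;
}
```

This only checks the observable result in standard C. It says nothing about what a vendor extension for zero-length arrays might do differently.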
 
Jensen Somers

Hi,

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

IIRC, this is a GCC extension. I had problems with it when using the
Microsoft Visual C compiler (which still follows some variant of the C89
standard). After further investigation it turned out to be an extension
that should produce a warning or error when compiling with --pedantic.

Jensen.
 
Eric Sosman

kuyper wrote on 08/23/07 11:27:
I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.

A guess: The bogus definition of num_char[0] may
actually allocate memory as would num_char[1], but has
some other bizarre effect as well. If it didn't do
something unusual the programmer would have written [1]
in the first place, instead of writing [0] and then
going to the extra work of figuring out how to turn the
error message off. The use of memcpy() instead of
`cr_file[147] = num_char[0]' or `cr_file[147] = 1' may
have something to do with whatever that weird effect is.

Guess #2: Does the code call "the" memcpy(), or some
out-of-the-blue substitute? Writing your own substitutes
for Standard library functions is a no-no, but we've
already seen that the author didn't feel held bound to
respect the Standard at all times ...

Guess #3: Somewhere in the dusty annals of the code's
ancestry you will find the word IOCCC -- or was it XYZZY?
 
Mark Bluemel

kuyper said:
I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

I'd be inclined to investigate what my compiler generated for each of
these constructs and look at what the differences might imply...

As you have given us very little context - platform, compiler, etc -
unless someone here has seen exactly this, it's unlikely we can comment
much more.
 
kuyper

Eric Sosman wrote:
....
A guess: The bogus definition of num_char[0] may
actually allocate memory as would num_char[1], but has
some other bizarre effect as well. If it didn't do
something unusual the programmer would have written [1]
in the first place, instead of writing [0] and then
going to the extra work of figuring out how to turn the
error message off. The use of memcpy() instead of
`cr_file[147] = num_char[0]' or `cr_file[147] = 1' may
have something to do with whatever that weird effect is.

That would make sense; but it seems very unlikely. On the other hand,
up until yesterday, I would have said that code like this was very
unlikely. :-}

It will be easy to test for this. I intend to replace the odd code
with more conventional code. If your first guess is correct, the
resulting output files won't match those created with the original
code. I'll be performing that test sometime today or tomorrow.

Guess #2: Does the code call "the" memcpy(), or some
out-of-the-blue substitute? Writing your own substitutes
for Standard library functions is a no-no, but we've
already seen that the author didn't feel held bound to
respect the Standard at all times ...

There's no alternative definition of memcpy() in the source code, and
it doesn't link to any libraries that might contain one.
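The output comparison planned earlier in this post amounts to a byte-for-byte check of the files produced by the old and new code, which is what a tool such as cmp does. A minimal sketch in C follows; the function name files_identical and its return convention are hypothetical, not taken from the program under discussion.

```c
#include <stdio.h>

/* Returns 1 if the two files are byte-for-byte identical, 0 if they
 * differ, and -1 if either file cannot be opened. A read error is
 * treated the same as end of file in this sketch. */
static int files_identical(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int result = 1;

    if (fa == NULL || fb == NULL) {
        result = -1;
    } else {
        int ca, cb;

        do {
            ca = getc(fa);
            cb = getc(fb);
            if (ca != cb) {      /* differing byte, or one file is shorter */
                result = 0;
                break;
            }
        } while (ca != EOF);
    }

    if (fa != NULL)
        fclose(fa);
    if (fb != NULL)
        fclose(fb);
    return result;
}
```

If the replacement code is truly equivalent, every pair of output files should compare as identical. A mismatch would support the guess that the zero-length array has some additional effect.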
 
kuyper

Mark Bluemel wrote:
....
As you have given us very little context - platform, compiler, etc -
unless someone here has seen exactly this, it's unlikely we can comment
much more.

Platform: SGI Origin 300 running IRIX 6.5. The compiler is the SGI C
compiler distributed with that version of IRIX. Compiler options:
-O2 -mips4 -xansi -fullwarn. I first noticed this code when I changed
-xansi to -ansi, which apparently turns off an SGI extension supporting
0-length arrays.
 
Keith Thompson

kuyper said:
Mark Bluemel wrote:
...

Platform: SGI Origin 300 running IRIX 6.5. The compiler is the SGI C
compiler distributed with that version of IRIX. Compiler options:
-O2 -mips4 -xansi -fullwarn. I first noticed this code when I changed
-xansi to -ansi, which apparently turns off an SGI extension supporting
0-length arrays.

Can you find SGI's documentation for that extension?
 
kuyper

Keith said:
Can you find SGI's documentation for that extension?

No. I've downloaded their C manual, and wandered around their website,
without finding anything. I've found mentions of the fact that they
have extensions, but no comprehensive list of the extensions, and no
mention of this specific extension. However, when I use -xansi, the
compiler tolerates declaration of a zero-length array without comment,
and the program works as if the array has a non-zero length; when I
use -ansi, compilation fails. The distinction between those two
options is supposed to be that -xansi enables SGI-specific extensions
to ANSI C.
 
Old Wolf

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Do these lines occur inside a structure definition?
 
kuyper

Old said:
I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Do these lines occur inside a structure definition?

No - they occur at block scope.
 
Peter J. Holzer

Old said:
I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Do these lines occur inside a structure definition?

No - they occur at block scope.

That's strange. Before C89 many compilers accepted zero-sized arrays and
it was a common idiom to define a structure like this:

struct foo {
    size_t size; /* more likely int at the time */
    short whatever;
    double data[0];
};

and use it like this:

struct foo *p = malloc(sizeof(struct foo) + sizeof(double) * nelems);

p->size = nelems;
p->whatever = 42;
for (i = 0; i < nelems; i++) {
    p->data[i] = get_some_data();
}

/* do some more processing */

free(p);

data didn't actually use any space in the struct, but enforced proper
alignment and padding, so the single malloc would allocate the exact
amount of memory needed.

C89 didn't standardize zero-sized arrays (presumably because they
didn't fit with the "pointer arithmetic only defined within an object"
model) and subsequently people stopped using that idiom and (more)
compilers started to reject it.
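For comparison, C99 later standardized a variant of this idiom as the flexible array member, written with empty brackets rather than a zero size. The sketch below restates the struct above in that form; the helper name foo_create and the placeholder values stored in data are hypothetical, and it assumes a C99 or later compiler.

```c
#include <stddef.h>
#include <stdlib.h>

/* Same layout idea as the zero-sized array idiom, but in standard C99:
 * the last member is declared without a size. */
struct foo {
    size_t size;
    short whatever;
    double data[];
};

/* Allocates a struct foo with room for nelems doubles in a single
 * malloc call. Returns NULL if the allocation fails. */
static struct foo *foo_create(size_t nelems)
{
    struct foo *p = malloc(sizeof *p + nelems * sizeof p->data[0]);

    if (p == NULL)
        return NULL;

    p->size = nelems;
    p->whatever = 42;
    for (size_t i = 0; i < nelems; i++)
        p->data[i] = (double)i;  /* placeholder values for illustration */
    return p;
}
```

A flexible array member is only permitted as the last member of a struct, so it does not explain a zero-sized array declared at block scope.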

I don't know what possible use a zero-sized array could have as an
automatic variable. If it's really zero-sized it's completely useless,
and if it isn't it must be some fixed size (at least if it is used with
memcpy as you showed - if it was used with ordinary indexes I could
imagine some compiler magic implementing a dynamic array[0]), and if it's
some fixed size, why not use that?

hp

[0] Yes, there could of course be some other compiler magic which calls
__builtin_dynamic_array_memcpy if memcpy is used on a zero-sized
array and __builtin_normal_memcpy otherwise.
 
neildferguson

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.

Here's a very off-topic possibility: the variables are defined as they are in
order to appear in particular segments of the linker's memory map, with
particular symbolic identification, so that the memory's addresses can be
associated with particular hardware I/O operations.

Neil
 
kuyper

I've run across some rather peculiar code; here are the relevant lines
that left me confused :

unsigned char cr_file[384];
unsigned char num_char[0];

Note: this declaration actually works on our compiler, and it appears
to be equivalent to giving a length of 1. The developer inserted
compiler options into the make file to turn off the relevant warning
messages. Sadly, this is not the most confusing part of the code. This
is an example of the confusing part:

num_char[0] = 1;
memcpy(&cr_file[147], num_char, 1);

num_char is used only in this fashion; its value after the call to
memcpy() has no bearing on the behavior of the program. I may be
missing something, but it seems to me that this code is therefore
exactly equivalent to

cr_file[147] = 1;

In fact, I would expect that some compilers would generate identical
code for both ways of writing it.

Am I missing something? If not, could someone at least suggest a
plausible reason why the developer might write such bizarre code? I
can't ask the developer, he died recently, which is how I became
responsible for this code.

Here's a very off-topic possibility: the variables are defined as they are in
order to appear in particular segments of the linker's memory map, with
particular symbolic identification, so that the memory's addresses can be
associated with particular hardware I/O operations.

Ingenious possibility, but unfortunately not a plausible explanation
for this program. Its only purpose is breaking up a large partitioned
data set into several smaller files, which is non-trivial only because
the records are variable length, the files have to be split at the
boundary between two records, and each output file requires a separate
header. cr_file is the array containing that header. It's not a very
complicated program. As a result, this weird way of filling in the
headers adds significantly and unnecessarily to the total complexity.

For now, I'm assuming that he didn't have a valid reason for writing
the code this way. I suspect that I'll probably never learn what the
invalid reason was that motivated him to do so.
 
kuyper

CBFalconer said:
Jensen said:
kuyper said:
I've run across some rather peculiar code; here are the relevant
lines that left me confused :

unsigned char cr_file[384];

This defines an array of 384 unsigned chars, indices 0 through 383.

Yes, of course. The size of that array doesn't confuse me. The bizarre
thing is the way it was used.

unsigned char num_char[0];

This is illegal. 0 size arrays cannot be declared.

Well, of course. Nonetheless, it was declared, and it does compile,
and it does work, apparently as a result of using a compiler flag
which enables SGI-specific extensions. A (small) part of my question
is "why was it declared with a length of 0?" The bigger part is given
in the Subject: header.
 
Keith Thompson

kuyper said:
CBFalconer wrote: [...]
This is illegal. 0 size arrays cannot be declared.

Well, of course. Nonetheless, it was declared, and it does compile,
and it does work, apparently as a result of using a compiler flag
which enables SGI-specific extensions. A (small) part of my question
is "why was it declared with a length of 0?" The bigger part is given
in the Subject: header.

You said elsethread that this appears to be an SGI-specific extension.
Have you tried one of the comp.sys.sgi.* newsgroups? Or can you
contact SGI customer support?
 
kuyper

Keith said:
kuyper said:
CBFalconer wrote: [...]
This is illegal. 0 size arrays cannot be declared.

Well, of course. Nonetheless, it was declared, and it does compile,
and it does work, apparently as a result of using a compiler flag
which enables SGI-specific extensions. A (small) part of my question
is "why was it declared with a length of 0?" The bigger part is given
in the Subject: header.

You said elsethread that this appears to be an SGI-specific extension.
Have you tried one of the comp.sys.sgi.* newsgroups? Or can you
contact SGI customer support?

No, I haven't. While some people have suggested otherwise, I don't
think that the 0-sized array is related to the peculiar memcpy() calls
- the proposed connections are all pretty implausible to me. Since
it's the memcpy() calls that I'm mainly confused by, I haven't
followed up on the SGI extensions angle. Of course, the memcpy()
seems pretty implausible too; but there it is. Maybe I should check
out SGI sources, though at this time I'm more inclined to simply drop
it.
 
Richard Tobin

kuyper said:
I've run across some rather peculiar code;

Is there any possibility that this code was originally machine-generated,
or results from macro-expansion of something more plausible?

-- Richard
 
kuyper

Richard said:
Is there any possibility that this code was originally machine-generated,
or results from macro-expansion of something more plausible?

Not likely. We don't use much machine-generated code in our project. I
believe that this program was created by hand, most likely by
modification of an existing program by the same author intended to
handle the same data set in a different fashion; but I have no idea
which program that was, nor where it might be found. Literally
inheriting responsibility for a program can be very difficult,
particularly when, as in this case, the late author was not strong on
documentation, either internal or external.
 
