translating C++ -> C

Roderick Bloem · May 6, 2005

I am going to take the liberty of crossposting this to comp.lang.c++
(originally comp.lang.c), and to summarize the discussion for the sake
of those reading only c++.

The question is: if you are writing C and you have a struct P, can you
create a struct C that is an extension of the first (it starts just like
P and adds some data), and then use a C* as if it were a P*?

Example:

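#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int a;
    short b;
} P;

typedef struct {
    int a;
    short b;
    long long c;
} C;

int main(void)
{
    C *c = malloc(sizeof *c);   /* no cast needed in C */
    P *p;

    if (c == NULL)
        return EXIT_FAILURE;
    c->a = 1;
    c->b = 2;
    p = (P *) c;                /* the conversion in question */
    printf("%d\n", p->a);
    free(c);
    return 0;
}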

P and C stand for parent and child, and hint at the OO structure that we
are trying to mimic in C. We want to be able to use a child struct as a
parent struct, as you would in C++.

The basic answer in comp.lang.c is "it works on any compiler I have
seen, but there is no guarantee".

The standard appears to limit the freedom of the compiler in laying out
the struct: the order of the elements is fixed, padding can be added
between elements and at the end, but only if necessary for alignment.
This does not quite prescribe where the padding should be. If you have
three byte-aligned bytes, and a 4-byte aligned 4-byte word, you need one
byte of padding, but you can put that wherever you want before the
word: bbbpwwww or bpbbwwww are both allowed (b is a byte, p padding, and
w part of the word). The standard apparently does not require that the
padding is applied the same way in different structs.

Another problem that has been pointed out is this: what if P ends in a
4-byte aligned byte b and 3 bytes of padding? The compiler may decide
that the most efficient way to clear b is to do a four-byte clear
operation. If C adds 3 bytes to the struct, these may go in the
padding, and an attempt to assign p->b = 0 may clear the extra
bytes in C if p points to a C struct.

Now the reason to crosspost to comp.lang.c++: I think the c++ to c
translator used overlapping for inheritance, so the c++ people must be
experts. Am I correct? Does that mean that the translator depended on
features of the compilers that are not prescribed by the standard, or am
I missing something?

It is clear that there are alternatives; e.g., we may define C as
typedef struct {
    P p;
    long long c;
} C;
at the expense of some extra typing when accessing common elements.

[disclaimer: I do not have the C standard. Everything I write about it
is either hearsay or Harbison & Steele.]

Roderick

Keith Thompson · May 6, 2005

Roderick Bloem said:
The standard appears to limit the freedom of the compiler in laying
out the struct: the order of the elements is fixed, padding can be
added between elements and at the end, but only if necessary for
alignment.

C99 6.7.2.1p13:
Within a structure object, the non-bit-field members and the units
in which bit-fields reside have addresses that increase in the
order in which they are declared. A pointer to a structure object,
suitably converted, points to its initial member (or if that
member is a bit-field, then to the unit in which it resides), and
vice versa. There may be unnamed padding within a structure
object, but not at its beginning.

C99 6.7.2.1p15:
There may be unnamed padding at the end of a structure or union.

There is no implication that padding can be added only if necessary
for alignment. The compiler is free to insert padding because it
makes the struct look bigger and scares away predators.

[...]

Roderick Bloem said:
Now the reason to crosspost to comp.lang.c++: I think the c++ to c
translator used overlapping for inheritance, so the c++ people must be
experts. Am I correct? Does that mean that the translator depended
on features of the compilers that are not prescribed by the standard,
or am I missing something?

Are you referring to cfront?

It probably means that the author(s) of the translator either were
experts on C, or were lucky enough not to run into any problems. It
doesn't imply anything about the C expertise of C++ programmers other
than the ones who worked on the translator.

There's no fundamental reason why either the translator or the code it
generated had to be written in perfectly portable C. As long as it
did the job, that may have been good enough, and the authors were free
to take advantage of assumptions that happen to be valid for all C
implementations of interest, even if they're not guaranteed by the
standard. (Portable standard-conforming code is generally better, all
else being equal, but all else is not always equal.)

Chris Torek · May 6, 2005

Roderick Bloem said:
Now the reason to crosspost to comp.lang.c++: I think the c++ to c
translator used overlapping for inheritance, so the c++ people must be
experts. Am I correct?

On the first, perhaps; on the second, well...

Roderick Bloem said:
Does that mean that the translator depended on features of the
compilers that are not prescribed by the standard ...

If you are referring to cfront, it *definitely* *did* depend on
non-portable features. In particular, you had to tell it all about
how the C compiler it used as its "assembler" laid out structures,
including padding, so that it could track the C compiler's work
and subvert it.

Note that cfront was in fact a "real compiler" according to the
definition I prefer:

To decide if Step S is a "preprocessor" or a "compiler",
answer the following question: if an error occurs *after*
Step S, is it a mistake by the programmer, or is it a
mistake in Step S?

Consider the following examples:

foo.c, line 123: invalid operand to unary &
# or same with "foo.cpp" as the file name

/tmp/151522.c, line 123: invalid operand to unary &

/tmp/151523.s, line 5012: invalid register operand to add

When compiling a C or C++ program named "foo.c" or "foo.cpp", the
first message is perfectly natural if you goofed up some "#define",
because the preprocessor part of the language does not understand
the language proper. But getting (just) the second message from
a C++ compiler, when compiling "foo.cpp", indicates a bug in the
C++ compiler, not invalid C++ code that was simply copied through
to the C compiler. So C++ is not a "preprocessor", because it is
a bug in the C++ system, not a bug in your own code, that produced
the message about file in /tmp.

In all cases, the last message (from the assembler) indicates a
bug in the compiler, because the compiler should not be emitting
invalid CPU register names. The exception to this rule occurs if
the compiler happens to have an "insert arbitrary assembly code"
escape clause (like __asm__), and you used it.
