Help with structs

Jack Trades · Aug 26, 2010

I'm trying to learn C by writing an interpreter for a Scheme-like
language in it. I have the following struct which is used for Scheme
objects...

typedef struct object {
object_type type;
union {
struct { // FIXNUM
long value;
} fixnum;
struct { // FLONUM
double value;
} flonum;
// Other types omitted
} data;
} object;

When I tried to implement + I couldn't figure out how to do it simply
and ended up writing this...

object *h_numeric_add(object *obj_1, object *obj_2) {
switch (obj_1->type) {
case FLONUM:
switch (obj_2->type) {
case FLONUM:
return make_flonum(obj_1->data.flonum.value +
obj_2->data.flonum.value);
case FIXNUM:
return make_flonum(obj_1->data.flonum.value +
obj_2->data.fixnum.value);
}
case FIXNUM:
switch (obj_2->type) {
case FLONUM:
return make_flonum(obj_1->data.fixnum.value +
obj_2->data.flonum.value);
case FIXNUM:
return make_fixnum(obj_1->data.fixnum.value +
obj_2->data.fixnum.value);
}
}
}

It works just fine but looks very ugly to me, especially considering
that the only thing I'm varying is the obj->data.flo/fixnum.value. Is
there a better way to do this?

Thanks
Jack Trades

Seebs · Aug 27, 2010

typedef struct object {
object_type type;
union {
struct { // FIXNUM
long value;
} fixnum;
struct { // FLONUM
double value;
} flonum;
// Other types omitted
} data;
} object;

When I tried to implement + I couldn't figure out how to do it simply
and ended up writing this...

object *h_numeric_add(object *obj_1, object *obj_2) {
switch (obj_1->type) {
case FLONUM:
switch (obj_2->type) {
case FLONUM:
return make_flonum(obj_1->data.flonum.value +
obj_2->data.flonum.value);
case FIXNUM:
return make_flonum(obj_1->data.flonum.value +
obj_2->data.fixnum.value);
}
case FIXNUM:
switch (obj_2->type) {
case FLONUM:
return make_flonum(obj_1->data.fixnum.value +
obj_2->data.flonum.value);
case FIXNUM:
return make_fixnum(obj_1->data.fixnum.value +
obj_2->data.fixnum.value);
}
}
}

I would suggest:

double object_to_flonum(object *o) {
switch (o->type) {
case FLONUM:
return o->data.flonum.value;
break;
case FIXNUM:
return (double) o->data.fixnum.value;
break;
}
}

and then a similar thing for fixnums.

But. I would also suggest:
union {
long int fixnum;
double flonum;
} data;

You don't need the structs in there.

It works just fine but looks very ugly to me, especially considering
that the only thing I'm varying is the obj->data.flo/fixnum.value. Is
there a better way to do this?

Write converters so you only need 2N functions instead of N^2 choices.

-s

Tom St Denis · Aug 27, 2010

I would suggest:

double object_to_flonum(object *o) {
switch (o->type) {
case FLONUM:
return o->data.flonum.value;
break;
case FIXNUM:
return (double) o->data.fixnum.value;
break;
}
}

and then a similar thing for fixnums.

Could actually do this as a #define e.g. (I'd make it cleaner ...)

#define OBJECT_CONV(type, cast) \
cast object_to_##type(object *o) { \
switch (o->type) { case FLONUM: return o->data.flonum.value; case
FIXNUM: return o->data->fixnum.value; ... }}

Then you just have

OBJECT_CONF(fixnum, long);
OBJECT_CONF(flonum, double);

Tom

Jack Trades · Aug 27, 2010

I would suggest:

double object_to_flonum(object *o) {
switch (o->type) {
case FLONUM:
return o->data.flonum.value;
break;
case FIXNUM:
return (double) o->data.fixnum.value;
break;
}
}

and then a similar thing for fixnums.

But. I would also suggest:
union {
long int fixnum;
double flonum;
} data;

You don't need the structs in there.

Thank you very much for this piece of advice. It really cleaned up my
code and helped me to understand unions better.

Write converters so you only need 2N functions instead of N^2 choices.

Maybe I'm missing something but I don't understand how this would
eliminate the need for a huge switch statement. I'm trying to return
the most compact type possible so that:

(+ 1 1) => 2
(+ 1 1.1) => 2.1
(+ 1 3/4) => 7/4 (or something similar)

So, in my mind, the steps I need to take are this:
1) Determine which object is greater in the numerical tower.
2) Coerce the lesser object to the type of the greater one.
3) Add the two numbers.
4) Create a new object of the correct type and return the result.

After writing this I realized that I should probably break this into
two functions, one for coercing numeric types (steps 1 & 2) and
another function for arithmetic. However, being new to C, I'm back
to:

// Pseudocode
object *coerce(object *x, object *y) {
switch (x->type) {
case FIXNUM:
switch (y->type) {
case FIXNUM: return x, y
case RATIONAL: return coerce_to_rational(x), y
case FLONUM: return coerce_to_flonum(x), y
case COMPLEX: return coerce_to_complex(x), y
case FLONUM:
...

Am I missing something obvious here?

Thanks
Jack Trades

Gene · Aug 27, 2010

I'm trying to learn C by writing an interpreter for a Scheme-like
language in it. I have the following struct which is used for Scheme
objects...

typedef struct object {
object_type type;
union {
struct { // FIXNUM
long value;
} fixnum;
struct { // FLONUM
double value;
} flonum;
// Other types omitted
} data;

} object;

When I tried to implement + I couldn't figure out how to do it simply
and ended up writing this...

object *h_numeric_add(object *obj_1, object *obj_2) {
switch (obj_1->type) {
case FLONUM:
switch (obj_2->type) {
case FLONUM:
return make_flonum(obj_1->data.flonum.value +
obj_2->data.flonum.value);
case FIXNUM:
return make_flonum(obj_1->data.flonum.value +
obj_2->data.fixnum.value);
}
case FIXNUM:
switch (obj_2->type) {
case FLONUM:
return make_flonum(obj_1->data.fixnum.value +
obj_2->data.flonum.value);
case FIXNUM:
return make_fixnum(obj_1->data.fixnum.value +
obj_2->data.fixnum.value);
}
}

}

It works just fine but looks very ugly to me, especially considering
that the only thing I'm varying is the obj->data.flo/fixnum.value. Is
there a better way to do this?

As has been discussed here previously another choice is

typedef object_type OBJECT;

typedef struct {
object_type type;
long val;
} FIXNUM_OBJECT;

#define FIXNUM_VAL(X) (((FIXNUM_OBJECT*)(X))->val)

typedef struct {
object_type type;
double val;
| FLONUM_OBJECT;

#define FLONUM_VAL(X) (((FLONUM_OBJECT*)(X))->val)

#define TYPE_PAIR(A, B) ((A) | ((B) << 8))

OBJECT *plus(OBJECT *a, OBJECT *b)
{
switch (TYPE_PAIR(*a, *b)) {
case TYPE_PAIR(FIXNUM, FIXNUM):
return new_fixnum(FIXNUM_VAL(a) + FIXNUM_VAL(b));
case TYPE_PAIR(FIXNUM, FLONUM):
return new_flonum(FIXNUM_VAL(a) + FLONUM_VAL(b));
case TYPE_PAIR(FLONUM, FIXNUM):
return new_flonum(FLONUM_VAL(a) + FIXNUM_VAL(b));
case TYPE_PAIR(FLONUM, FLONUM):
return new_flonum(FLONUM_VAL(a) + FLONUM_VAL(b));
default: error("bad arg types");
}
}

Seebs · Aug 27, 2010

Maybe I'm missing something but I don't understand how this would
eliminate the need for a huge switch statement. I'm trying to return
the most compact type possible so that:

It wouldn't completely, but it might reduce the hugeness of the switch.

(+ 1 1) => 2
(+ 1 1.1) => 2.1
(+ 1 3/4) => 7/4 (or something similar)

So, in my mind, the steps I need to take are this:
1) Determine which object is greater in the numerical tower.
2) Coerce the lesser object to the type of the greater one.
3) Add the two numbers.
4) Create a new object of the correct type and return the result.

Okay, consider this path:

1. Define your ordering.
2. Query the order of each object:
order1 = order_of(o1);
order2 = order_of(o2);
3. Pick the greater order:
use_this_order = (order1 > order2) ? order1 : order2;
4. Switch on it:
switch (use_this_order) {
case FLOAT_ORDER:
x = make_flonum(flonum_add(flonum_of(o1), flonum_of(o2)));
break;
case FIXNUM_ORDER:
x = make_fixnum(fixnum_add(fixnum_of(o1), fixnum_of(o2)));
break;
}

Do the calculation up front and use the results, instead of spreading
the logic all over.

Explicit rules like "if this is float and that is fix" (what your switch
statement does) are vulnerable to typos and thinkos; just do the general
rule all at once, expressed in the terms you put it in when you described
what you wanted to do. You don't want a million special rules; you want
the general rule "use the higher order".

You also want a thing that lets you, say, detect that something doesn't
fit the ordering and throw an error.

Am I missing something obvious here?

Maybe! Here's the great mystery:

It is (if you code for it) both cheap and harmless to coerce something to a
type it already has.

-s

BartC · Aug 27, 2010

Jack Trades said:
Maybe I'm missing something but I don't understand how this would
eliminate the need for a huge switch statement. I'm trying to return
the most compact type possible so that:

I think this is the multiple dispatch problem: there are probably N**2 ways
of tackling it too..

(When I had a similar problem, with the possibility of N up to 1000, I used
an NxN table to convert to single, *compact* code relevant to the type
combinations I was interested in. Then this became single dispatch, much
simpler.)

// Pseudocode
object *coerce(object *x, object *y) {
switch (x->type) {
case FIXNUM:
switch (y->type) {
case FIXNUM: return x, y
case RATIONAL: return coerce_to_rational(x), y
case FLONUM: return coerce_to_flonum(x), y
case COMPLEX: return coerce_to_complex(x), y
case FLONUM:
...

Your code will work. Presumably you call coerce() before you look at the
arithmetic op involved. This won't give you the fastest performance however.

You could just call coerce() only when the types don't agree (x->type !=
y->type).

Jack Trades · Aug 28, 2010

As has been discussed here previously another choice is

typedef object_type OBJECT;

typedef struct {
object_type type;
long val;

} FIXNUM_OBJECT;

#define FIXNUM_VAL(X) (((FIXNUM_OBJECT*)(X))->val)

typedef struct {
object_type type;
double val;
| FLONUM_OBJECT;

#define FLONUM_VAL(X) (((FLONUM_OBJECT*)(X))->val)

#define TYPE_PAIR(A, B) ((A) | ((B) << 8))

OBJECT *plus(OBJECT *a, OBJECT *b)
{
switch (TYPE_PAIR(*a, *b)) {
case TYPE_PAIR(FIXNUM, FIXNUM):
return new_fixnum(FIXNUM_VAL(a) + FIXNUM_VAL(b));
case TYPE_PAIR(FIXNUM, FLONUM):
return new_flonum(FIXNUM_VAL(a) + FLONUM_VAL(b));
case TYPE_PAIR(FLONUM, FIXNUM):
return new_flonum(FLONUM_VAL(a) + FIXNUM_VAL(b));
case TYPE_PAIR(FLONUM, FLONUM):
return new_flonum(FLONUM_VAL(a) + FLONUM_VAL(b));
default: error("bad arg types");
}

}

It took me a bit to understand what that was doing but it does look
like it would clean up some of my code. I'll have to do some more
reading on C macros. ATM the only thing I know about them is that I
need to be careful, so I've refrained from using them so far. I'm
used to macros in Scheme and I find it hard enough to debug "regular"
C right now.

Thanks
Jack Trades

Gene · Aug 28, 2010

It took me a bit to understand what that was doing but it does look
like it would clean up some of my code. I'll have to do some more
reading on C macros. ATM the only thing I know about them is that I
need to be careful, so I've refrained from using them so far. I'm
used to macros in Scheme and I find it hard enough to debug "regular"
C right now.

Thanks
Jack Trades

Macros just do text replacement. The main safety measure is to put
all parameters and the entire result in parens so you don't get
unexpected associativity problems.

#define MULT(A, B) A * B
MULT(a+b, c) / d; // Becomes a+b * c / d. Not what you want.

So you should instead use
#define MULT(A, B) ((A) * (B))

The other gotcha is multiple arg use:

#define SQR(X) ((X) * (X))
y = SQR(++x); // Becomes ((++x) * (++x)). Not what you want.

So if this looks too hairy,

#define FLONUM_VAL(X) (((FLONUM_OBJECT*)(X))->val)

try dividing the labor between two macros:

#define DOWNCAST(X, Type) ((Type ## _OBJECT*)(X))

#define FLONUM_VAL(X) (DOWNCAST(X, FLONUM)->val)

Then a nice thing is that you can optionally tell DOWNCAST to do
runtime type checks:

#ifndef NDEBUG
#define DOWNCAST(X, Type) ((Type ## _OBJECT*)type_check(X, Type))
#else
#define DOWNCAST(X, Type) ((Type ## _OBJECT*)(X))
#endif

OBJECT *type_check(OBJECT *x, object_type type)
{
if (*x == type)
error("downcast failure");
return x;
}

Jack Trades · Aug 28, 2010

Okay, consider this path:

1. Define your ordering.
2. Query the order of each object:
order1 = order_of(o1);
order2 = order_of(o2);
3. Pick the greater order:
use_this_order = (order1 > order2) ? order1 : order2;
4. Switch on it:
switch (use_this_order) {
case FLOAT_ORDER:
x = make_flonum(flonum_add(flonum_of(o1), flonum_of(o2)));
break;
case FIXNUM_ORDER:
x = make_fixnum(fixnum_add(fixnum_of(o1), fixnum_of(o2)));
break;
}

Do the calculation up front and use the results, instead of spreading
the logic all over.
Explicit rules like "if this is float and that is fix" (what your switch
statement does) are vulnerable to typos and thinkos; just do the general
rule all at once, expressed in the terms you put it in when you described
what you wanted to do. You don't want a million special rules; you want
the general rule "use the higher order".

You also want a thing that lets you, say, detect that something doesn't
fit the ordering and throw an error.

I think I follow that and it sounds like good advice. I've added very
little error checking so far as even my own code is starting to
confuse me a bit. I'm starting to catch on, but it's pretty slow
going.

I've been doing AI programming in Scheme for the last few years and I
think I'm trying to use too many functional idioms. I spent my first
week of learning C trying to figure out how to get anything useful
done without being able to return an array from a function in the
usual Scheme manner. Even now I still find myself unnecessarily
creating linked lists and using my homemade car/cdr functions on
them. I guess you can write Scheme in any language too

Maybe! Here's the great mystery:

It is (if you code for it) both cheap and harmless to coerce something to a
type it already has.

That helps quite a bit. I'm still getting used to the fact that some
operations in C are incredibly cheap. My next major hurdle is going
to be memory management and I've been putting that off since I started
a couple weeks ago. However I think I'm going to need to tackle that
sooner than later as my interpreter is growing pretty fast.

Thanks
Jack Trades

Jack Trades · Aug 28, 2010

Any solution amounts to the same thing. There are a few things to reduce the
complexity, like a generic balance routine the pre-coerces arguments to a common
type.

I'm starting to lean toward something like that. I wrote it off
initially because I thought it would be inefficient for something like
(+ 1 2 3 4 7/8) where I could do integer addition for the first 4
numbers then coerce the result to a rational. However I'm starting to
think that may be premature optimization.

You can also flatten the switches with something like switch
(100*obj_1->type + obj_2->type).

I'm not quite sure how this would work. Why multiply by 100?

I thought Nobody Owens ended your organisation.

We're an elusive group!

Thanks
Jack Trades

Seebs · Aug 28, 2010

I'm not quite sure how this would work. Why multiply by 100?

Imagine that type runs from 1 to 50.

100*type1 + type2 will then run from 101 to 5050, with each
pair unique. And you can do things like:

case (TYPE_FLONUM * 100) + TYPE_FIXNUM:
case (TYPE_FIXNUM * 100) + TYPE_FLONUM:
/* handle fix/float pair */
break;

-s

Keith Thompson · Aug 28, 2010

Gene said:
Macros just do text replacement. The main safety measure is to put
all parameters and the entire result in parens so you don't get
unexpected associativity problems.

[...]

My usual HHGTTG-inspired horrible example is:

#include <stdio.h>

#define SIX 1+5
#define NINE 8+1

int main(void)
{
printf("%d * %d = %d\n", SIX, NINE, SIX * NINE);
return 0;
}

Jack Trades · Aug 28, 2010

(When I had a similar problem, with the possibility of N up to 1000, I used
an NxN table to convert to single, *compact* code relevant to the type
combinations I was interested in. Then this became single dispatch, much
simpler.)

That's an interesting solution. When I started writing these
procedures I was pretty frustrated at the complexity of what I thought
should be a relatively simple problem. However I've learned about as
much about C in the past few days as in the last couple of weeks I've
been learning it.

Thanks
Jack Trades

Jack Trades · Aug 28, 2010

Imagine that type runs from 1 to 50.

100*type1 + type2 will then run from 101 to 5050, with each
pair unique. And you can do things like:

case (TYPE_FLONUM * 100) + TYPE_FIXNUM:
case (TYPE_FIXNUM * 100) + TYPE_FLONUM:
/* handle fix/float pair */
break;

OK. That makes perfect sense now.

Thanks
Jack Trades

Jack Trades · Aug 28, 2010

[...]> Macros just do text replacement. The main safety measure is to put

all parameters and the entire result in parens so you don't get
unexpected associativity problems.

Click to expand...

[...]

My usual HHGTTG-inspired horrible example is:

#include <stdio.h>

#define SIX 1+5
#define NINE 8+1

int main(void)
{
printf("%d * %d = %d\n", SIX, NINE, SIX * NINE);
return 0;

}

Which would become 1+5*8+1 and return 42?

I'm assuming that in addition to order of operations issues there is
also a problem with variable capture as is the case with LISP style
macros? I know this is probably a stupid question but are there any
third-party hygenic macro packages available for C?

Thanks
Jack Trades

Jack Trades · Aug 28, 2010

Then a nice thing is that you can optionally tell DOWNCAST to do
runtime type checks:

#ifndef NDEBUG
#define DOWNCAST(X, Type) ((Type ## _OBJECT*)type_check(X, Type))
#else
#define DOWNCAST(X, Type) ((Type ## _OBJECT*)(X))
#endif

OBJECT *type_check(OBJECT *x, object_type type)
{
if (*x == type)
error("downcast failure");
return x;

}

Until a few minutes ago I never knew about #ifndef. I think I
understand how it works and I'm reading about other preprocessor
directives now, but what does ## do? I can't seem to find it in any
of the ~10 pages I've read through.

Thanks
Jack Trades

Jack Trades · Aug 28, 2010

Then a nice thing is that you can optionally tell DOWNCAST to do
runtime type checks:

#ifndef NDEBUG
#define DOWNCAST(X, Type) ((Type ## _OBJECT*)type_check(X, Type))
#else
#define DOWNCAST(X, Type) ((Type ## _OBJECT*)(X))
#endif

OBJECT *type_check(OBJECT *x, object_type type)
{
if (*x == type)
error("downcast failure");
return x;

}

Until a few minutes ago I never knew about #ifndef. I think I
understand how it works and I'm reading about other preprocessor
directives now, but what does ## do? I can't seem to find it in any
of the ~10 pages I've read through.

Thanks
Jack Trades

Seebs · Aug 28, 2010

Until a few minutes ago I never knew about #ifndef. I think I
understand how it works and I'm reading about other preprocessor
directives now, but what does ## do? I can't seem to find it in any
of the ~10 pages I've read through.

Pastes things together into a single token.

HISTORY BREAK!

Long ago, people did stuff like this:

#define GLUE(x,y) x/**/y
int
GLUE(ma,in)() {
return 0;
}

But it was standardized and established that /**/ shall expand to some kind
of whitespace -- that it breaks tokens. So C90 added ## to let you do this:

#define GLUE(x,y) x ## y

-s

Jack Trades · Aug 28, 2010

Until a few minutes ago I never knew about #ifndef. I think I
understand how it works and I'm reading about other preprocessor
directives now, but what does ## do? I can't seem to find it in any
of the ~10 pages I've read through.

Thanks
Jack Trades

Sorry I just found documentation on ## which is used for
concatenation.

Thanks again to everyone for all the help.
Jack Trades

problem with memory allocation and structs	5	Dec 4, 2011
Union of structs with duplicate var names	4	May 10, 2010
Practical packing for structs of bytes	12	Sep 17, 2010
polymorphic structs with "methods"	33	Mar 23, 2008
Issue with textbox script?	0	Sep 5, 2022
I Need Help with making a function that draws in a canvas using location data.	1	Dec 17, 2021
GET NEIL DEGRASSES TYSON, I ripped a hole with this one...	0	Nov 10, 2022
comparing two structs .. revisited	5	Feb 7, 2006

Help with structs

Jack Trades

Seebs

Tom St Denis

Jack Trades

Gene

Seebs

BartC

Jack Trades

Gene

Jack Trades

Jack Trades

Seebs

Keith Thompson

Jack Trades

Jack Trades

Jack Trades

Jack Trades

Jack Trades

Seebs

Jack Trades

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads