"serializing" structs in C

copx

I want to save a struct to disk.... as plain text.
At the moment I do it with a function that just
writes the data using fprintf. I mean like this:
fprintf(fp, "%d %d", my_struct.a, my_struct.b)
This way I have to write another "serializing"
function for every new kind of struct I want
to write, though. Is there a way to write
functions that can write/read any struct
to/from plain text format in a portable way?

copx
 
lallous

Hello,

Since you don't want to write a serialization function, I would suggest
encoding (say base64) the struct into a string then writing that string into
the file.
For reading you would read the string, decode it to get the struct back
again.

If you add members however, you have to rewrite old files....unless you can
control it via 'version' field in your struct...
 
Martin Dickopp

copx said:
I want to save a struct to disk.... as plain text.
At the moment I do it with a function that just
writes the data using fprintf. I mean like this:
fprintf(fp, "%d %d", my_struct.a, my_struct.b)
This way I have to write another "serializing"
function for every new kind of struct I want
to write, though. Is there a way to write
functions that can write/read any struct
to/from plain text format in a portable way?

Such a serializing function would need to know the type and location
of each structure member. If you store this information once for every
structure, you can pass it as an additional argument to a structure
serializing function. The program below should give you some ideas.

Martin



#include <stddef.h>
#include <stdio.h>

/* Structure that will be serialized. */
struct foo {
    int a;
    unsigned long b;
    double c;
};

/* Information about a structure member. */
struct member_info {
    enum {NIL, TYPE_INT, TYPE_UINT, TYPE_LONG, TYPE_ULONG, TYPE_DOUBLE} type;
    size_t offset;
};

/* Information about the members of struct foo; NIL terminates the table. */
const struct member_info foo_info [] = {
    {TYPE_INT, offsetof (struct foo, a)},
    {TYPE_ULONG, offsetof (struct foo, b)},
    {TYPE_DOUBLE, offsetof (struct foo, c)},
    {NIL, 0}
};

/* Print every member described by info, separated by spaces. */
void serialize (const void *const data, const struct member_info *info)
{
    while (1)
    {
        /* Address of the current member inside the structure. */
        const char *const member = (const char *)data + info->offset;

        switch (info->type)
        {
            case TYPE_INT:
                printf ("%d", *(const int *)member);
                break;
            case TYPE_UINT:
                printf ("%u", *(const unsigned int *)member);
                break;
            case TYPE_LONG:
                printf ("%ld", *(const long *)member);
                break;
            case TYPE_ULONG:
                printf ("%lu", *(const unsigned long *)member);
                break;
            case TYPE_DOUBLE:
                printf ("%f", *(const double *)member);
                break;
            default:
                break;
        }

        if ((++info)->type == NIL)
        {
            putchar ('\n');
            break;
        }
        else
            putchar (' ');
    }
}

int main (void)
{
    struct foo myfoo = {42, 2004, 3.1415926536};
    serialize (&myfoo, foo_info);
    return 0;
}
 
Michel Bardiaux

Martin said:
Such a serializing function would need to know the type and location
of each structure member. If you store this information once for every
structure, you can pass it as an additional argument to a structure
serializing function. The program below should give you some ideas.

Martin

Two remarks:

(1) Although the need to write code for each and every structure member
can seem a PITA, it is a good investment. The same info can be used for
*other* serialization functions, such as printouts (consider the ease:
genericprintout(&myfoo, foo_info) et voila!), automatic garbage
collectors, ...

(2) If there are pointers in the structs, one must call the serializer
recursively; this requires two additional pieces of bookkeeping: a table
of the addresses of structures that have already been serialized, to
avoid duplication (a tag is written instead); and easy access to the info
table from the address of a struct (the easiest way is to always have an
info pointer as the first field of every struct).

Season greetings,
 
Flash Gordon

On Tue, 30 Dec 2003 12:40:00 +0200

Please don't top post. It is considered rude since it makes it harder to
read. It is also in contravention of the appropriate RFC, use your
favourite search engine for more information, terms such as netiquette,
top-posting, bottom-posting may help. Instead, after appropriate
snippage, post your reply under the parts of the post you are replying
to, as I have done.

Top-posting fixed.

This is correct.

Unfortunately true.

You can use pre-processor magic to automate a lot (if not all) of the
generation of such functions.
Since you don't want to write a serialization function, I would
suggest encoding (say base64) the struct into a string then writing
that string into the file.
For reading you would read the string, decode it to get the struct
back again.

Unfortunately this does not help for a number of reasons.

1) The OP wants plain text, base64 (or any other encoding) does not
normally count as plain text.

2) The file produced will not be portable since the representation of
the data may vary between systems, e.g. different padding, different
endianness, ones' complement vs two's complement etc.
If you add members however, you have to rewrite old files....unless
you can control it via 'version' field in your struct...

Not much help in dealing with all of the above. Also not much help if
the OP wants to be able to read the files with a text editor.
 
CBFalconer

copx said:
I want to save a struct to disk.... as plain text.
At the moment I do it with a function that just
writes the data using fprintf. I mean like this:
fprintf(fp, "%d %d", my_struct.a, my_struct.b)
This way I have to write another "serializing"
function for every new kind of struct I want
to write, though. Is there a way to write
functions that can write/read any struct
to/from plain text format in a portable way?

No, but you can write code to serialize some classes of
structures. For example:

#include <stdio.h>

typedef struct para {
    size_t linecount;
    char **lines;
} para, *paraptr;

(you can use some tricks to make the lines field a variable length
array in C90, but making it a pointer to an array of char *
pointers, which need to be allocated, and in turn need to be
allocated space to hold the individual strings, is cleaner)

Now the serializing routine might be:

/* Write the line count, then the lines from last to first. */
void writepara(paraptr pp)
{
    size_t count;

    count = pp->linecount;
    printf("%u\n", (unsigned int)count);
    while (count--) puts(pp->lines[count]);
}

If I have not goofed excessively, the resultant text file will
consist of an integer specifying the number of following lines,
followed by the lines in reversed order. This pattern will be
repeated as needed, and can be read back into entirely different
structures if desired. The lines proper cannot contain embedded
'\n's nor '\0's without confusion.
 
copx

[snip]
/* Information about the members of struct foo. */
const struct member_info foo_info [] = {
{TYPE_INT, offsetof (struct foo, a)},
{TYPE_ULONG, offsetof (struct foo, b)},
{TYPE_DOUBLE, offsetof (struct foo, c)},
{NIL, 0}
};
[snip]

Is offsetof() really part of ANSI/ISO C?
I don't claim to know every ANSI keyword
but I've never seen it before and it looks
a lot like typeof() which is a non-standard GNU
extension IIRC.

Otherwise this looks like a pretty good
solution for my problem.

copx
 
Kevin Goodsell

lallous said:

Please don't top-post. It's rude.
Since you don't want to write a serialization function, I would suggest
encoding (say base64) the struct into a string then writing that string into
the file.

This is very bad advice, in my opinion. Depending on what exactly you
mean, it's either more complicated than the original proposal or
completely non-portable.

-Kevin
 
Jack Klein

[snip]
/* Information about the members of struct foo. */
const struct member_info foo_info [] = {
{TYPE_INT, offsetof (struct foo, a)},
{TYPE_ULONG, offsetof (struct foo, b)},
{TYPE_DOUBLE, offsetof (struct foo, c)},
{NIL, 0}
};
[snip]

Is offsetof() really part of ANSI/ISO C?

Has been since the first ANSI 1989 standard, still is in the latest
C99 standard. Here's the relevant text from the current standard:

========
7.17 Common definitions <stddef.h>

1 The following types and macros are defined in the standard header
<stddef.h>. Some are also defined in other headers, as noted in their
respective subclauses.

[snip]

3 The macros are

NULL
which expands to an implementation-defined null pointer constant; and

offsetof(type, member-designator)
which expands to an integer constant expression that has type size_t,
the value of which is the offset in bytes, to the structure member
(designated by member-designator), from the beginning of its structure
(designated by type). The type and member designator shall be such
that given

static type t;

then the expression &(t.member-designator) evaluates to an address
constant. (If the specified member is a bit-field, the behavior is
undefined.)
========
I don't claim to know every ANSI keyword
but I've never seen it before and it looks
a lot like typeof() which is a non-standard GNU
extension IIRC.

Otherwise this looks like a pretty good
solution for my problem.

copx

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
Keith Thompson

copx said:
I want to save a struct to disk.... as plain text.
At the moment I do it with a function that just
writes the data using fprintf. I mean like this:
fprintf(fp, "%d %d", my_struct.a, my_struct.b)
This way I have to write another "serializing"
function for every new kind of struct I want
to write, though. Is there a way to write
functions that can write/read any struct
to/from plain text format in a portable way?

Not easily without having to do some manual work for each struct type
(e.g., setting up an array of offsets and types).

I once worked on a project to do something similar, with the intent of
allowing access to a struct member given a pointer to the struct and
the name of the member, as a string. It involved a multi-stage
process, starting with feeding the relevant header files to a C parser
that generated C code that printed out information about each struct
and member. (It was at a previous job, and I don't have the code.)

Unless you have a *lot* of structs you want to serialize, it's
probably easier to do some manual work.
 
copx

Jack Klein said:
[snip]
/* Information about the members of struct foo. */
const struct member_info foo_info [] = {
{TYPE_INT, offsetof (struct foo, a)},
{TYPE_ULONG, offsetof (struct foo, b)},
{TYPE_DOUBLE, offsetof (struct foo, c)},
{NIL, 0}
};
[snip]

Is offsetof() really part of ANSI/ISO C?

Has been since the first ANSI 1989 standard, still is in the latest
C99 standard. Here's the relevant text from the current standard:
[snip]

I see. Thanks.

copx
 
