Need help serializing data using C


Durango

Hello, I need some guidance on serializing data in C. I have
the following function prototype for serializing data:
int serializeData(unsigned short data_type, void *structure, unsigned char **buffer);

What I want to be able to do is take any type of structure and of course
write the data into the buffer.

So, for example, if I have a struct A defined:

struct A
{
char name[32];
};

then the buffer would contain the data for x, y, and z.

I also want to prepend an opcode of type unsigned short so that I can
determine which structure type the buffer contains.

I would like to pass this to the first 2 bytes of the buffer and
afterwards pass the data within the structure.

so for example I have:
unsigned short opcode = 12;
struct A *a_data;
char *buffer;

and struct A as defined above.

I would call the function in the following manner:
note: buffer gets allocated inside the function.

serializeData(opcode, a_data, &buffer);

within the function:

I try to pass the opcode to the first 2 bytes of buffer
int serializeData(unsigned short data_type, void *structure, unsigned
char **buffer)
{
struct A *a_data;
*buffer = malloc(sizeof(struct A)+2);
/* pass opcode to first 2 bytes */
(*buffer)[0] = data_type >> 8;
(*buffer)[1] = data_type;
a_data = (struct A*)structure;
strcpy(*buffer+2, start_pkt->name);
return strlen(*buffer);
}

Without trying to prepend the opcode, the code works fine and the buffer
contains the valid value from the structure passed. If I prepend the
opcode but do not use strcpy to write the structure data to the buffer
after the first 2 bytes, then the buffer will contain the opcode value in
the first 2 bytes. It's the combination of both that's the issue.
I am confused about what the problem is; my knowledge of pointers is a bit
weak, so I come here for advice.

Thank you.
 
Ian Collins

Hello I need some guidance doing serialization of data using C. I have
the following function prototype for serializing data:
int serializeData(unsigned short data_type, void *structure, unsigned
char **buffer);

What I want to be able to do is take any type of structure and of course
write the data into the buffer.

The best advice I can offer is to write a code generator that either
parses your structures and generates the serialisation code, or
generates the code and structures from another form of definition (XML,
JSON, ..).
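
As a rough illustration of the second option (generating both the structure and the serialisation code from one definition), here is a minimal sketch that uses the X-macro idiom in plain C instead of an external generator. That is a swapped-in technique, not something proposed in this thread. `POINT_FIELDS`, `struct point`, `put_u16` and `point_serialize` are hypothetical names, and the sketch assumes every field is a 16-bit unsigned integer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single source of truth: one X(type, name) entry per field. */
#define POINT_FIELDS(X) \
    X(uint16_t, x) \
    X(uint16_t, y) \
    X(uint16_t, z)

/* Expansion 1: generate the struct definition from the field list. */
#define AS_MEMBER(type, name) type name;
struct point { POINT_FIELDS(AS_MEMBER) };

/* Write a 16-bit value high byte first and return the advanced pointer. */
static unsigned char *put_u16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xFF);
    return p + 2;
}

/* Expansion 2: generate the field-by-field serialiser.
   Assumption: every field is a uint16_t, so the type argument is ignored. */
#define AS_PUT(type, name) p = put_u16(p, pt->name);
static size_t point_serialize(const struct point *pt, unsigned char *out)
{
    unsigned char *p = out;
    POINT_FIELDS(AS_PUT)
    return (size_t)(p - out);   /* number of bytes written */
}
```

Adding a field to `POINT_FIELDS` updates the struct and the serialiser together, which is the main benefit a real generator would give you as well.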
 
Barry Schwarz

Hello I need some guidance doing serialization of data using C. I have
the following function prototype for serializing data:
int serializeData(unsigned short data_type, void *structure, unsigned
char **buffer);

What I want to be able to do is take any type of structure and of course
write the data into the buffer.

so for example if I have a struc A defined:

struct A
{
char name[32];
};

than the buffer would contain the data for x, y, and z.

I also want to prepend an opcode of type unsigned short so that I can
determine which structure type the buffer contains.

I would like to pass this to the first 2 bytes of the buffer and
afterwards pass the data within the structure.

so for example I have:
unsigned short opcode = 12;
struct A *a_data;
char *buffer

and struct A as defined above.

I would call the function in the following manner:
note: buffer gets allocated inside the function.

serializeData(opcode, a_data, &buffer);

within the function:

I try to pass the opcode to the first 2 bytes of buffer
int serializeData(unsigned short data_type, void *structure, unsigned
char **buffer)
{
struct A *a_data;
*buffer = malloc(sizeof(struct A)+2);
/* pass opcode to first 2 bytes */
(*buffer)[0] = data_type >> 8;
(*buffer)[1] = data_type;
a_data = (struct A*)structure;

The cast is unnecessary but it's not what you want anyway.
strcpy(*buffer+2, start_pkt->name);

This works only if the structure contains nothing but a single string.
Since you want to be able to handle any kind of structure, you would
be better off adding another parameter which is the size of the
structure and then using memcpy to copy.

return strlen(*buffer);

Since you already know the size of the structure, I don't know what
you really want here.
}

Without trying to prepend the opcode, the code works fine and the buffer
contains the valid value from the structure passed. If I prepend the

By placing the structure data two bytes in, you have broken any
alignment. If you want to look at the structure data upon return, you
need to copy it back into a structure.
opcode but do not use strcpy to write the structure data to the buffer
after the first 2 bytes than the buffer will contain the opcode value in
the first 2 bytes. It's the combination of both that's the issue.
I am confused in what it is, my knowledge of pointer is a bit weak so I
come here for advice.

Given the limited applicability of the code you show, what is it that
leads you to believe it is not working?
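
A minimal sketch of the two suggestions above (an extra size parameter with memcpy, and copying the payload back into a structure before using it) might look like the following. The names `serialize_data`, `deserialize_data` and `OPCODE_SIZE` are hypothetical, and the sketch assumes the buffer is only read back by a program built for the same platform, since the raw bytes of the structure (padding included) are copied as they are.

```c
#include <stdlib.h>
#include <string.h>

#define OPCODE_SIZE 2   /* the opcode is written as two bytes, high byte first */

/* Returns the total number of bytes written, or 0 if allocation fails.
   On success *buffer belongs to the caller, who must free() it. */
size_t serialize_data(unsigned short opcode, const void *structure,
                      size_t size, unsigned char **buffer)
{
    unsigned char *out = malloc(OPCODE_SIZE + size);
    if (out == NULL)
        return 0;
    out[0] = (unsigned char)((opcode >> 8) & 0xFF);
    out[1] = (unsigned char)(opcode & 0xFF);
    memcpy(out + OPCODE_SIZE, structure, size);   /* raw bytes of the object */
    *buffer = out;
    return OPCODE_SIZE + size;
}

/* Copies the payload into a properly aligned object supplied by the caller
   and returns the opcode stored in the first two bytes. */
unsigned short deserialize_data(const unsigned char *buffer,
                                void *structure, size_t size)
{
    memcpy(structure, buffer + OPCODE_SIZE, size);
    return (unsigned short)((buffer[0] << 8) | buffer[1]);
}
```

A call would then look like `n = serialize_data(opcode, a_data, sizeof *a_data, &buffer);`, and the return value is the known length rather than whatever strlen() happens to find.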
 
Eric Sosman

Hello I need some guidance doing serialization of data using C. I have
the following function prototype for serializing data:
int serializeData(unsigned short data_type, void *structure, unsigned
char **buffer);

What I want to be able to do is take any type of structure and of course
write the data into the buffer.

so for example if I have a struc A defined:

struct A
{
char name[32];
};

than the buffer would contain the data for x, y, and z.

One wonders where x,y,z came from, and what became of name ...
I also want to prepend an opcode of type unsigned short so that I can
determine which structure type the buffer contains.

I would like to pass this to the first 2 bytes of the buffer and
afterwards pass the data within the structure.

so for example I have:
unsigned short opcode = 12;
struct A *a_data;
char *buffer

and struct A as defined above.

I would call the function in the following manner:
note: buffer gets allocated inside the function.

serializeData(opcode, a_data,&buffer);

within the function:

I try to pass the opcode to the first 2 bytes of buffer
int serializeData(unsigned short data_type, void *structure, unsigned
char **buffer)
{
struct A *a_data;
*buffer = malloc(sizeof(struct A)+2);

`sizeof(struct A) + sizeof(unsigned short)' would be better.
`sizeof(*a_data) + sizeof(unsigned short)' would be better yet,
and `sizeof(unsigned short) + sizeof(*a_data)' would be superior
even to that. But, oh, well ...

Also, malloc() can fail and return NULL. Calling malloc() is
like applying for a loan from the bank: Don't spend the money before
you're sure the loan has been approved ...
/* pass opcode to first 2 bytes */
(*buffer)[0] = data_type>> 8;
(*buffer)[1] = data_type;

For this sort of thing a buffer of `unsigned char' would be
marginally better than plain `char'.
a_data = (struct A*)structure;
strcpy(*buffer+2, start_pkt->name);

One wonders what this start_pkt thing is, and what became
of a_data ...

But assuming that start_pkt is a typo for a_data, the use
of strcpy() assumes two rather important things: First, that the
first '\0' in the struct marks the end of all the interesting
data (so you need serialize nothing after it), and second, that
there actually is a '\0' somewhere. For the `struct A' you've
shown that's a reasonable assumption (although by no means a
certainty), but for other struct types it's dubious indeed.
return strlen(*buffer);

Since strlen() will stop counting as soon as it finds a '\0',
the assumption that a '\0' is actually present comes into play
once again. More importantly, you're assuming there's no premature
'\0' -- not even in the two bytes of data_type!
}

Without trying to prepend the opcode, the code works fine and the buffer
contains the valid value from the structure passed. If I prepend the
opcode but do not use strcpy to write the structure data to the buffer
after the first 2 bytes than the buffer will contain the opcode value in
the first 2 bytes. It's the combination of both that's the issue.
I am confused in what it is, my knowledge of pointer is a bit weak so I
come here for advice.

Perhaps you need to study the difference between strcpy() and
memcpy(). What with all the confusion in your examples, though, I'm
not at all sure about what to recommend.
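
To make the strlen() point concrete, here is a small sketch that rebuilds the buffer the way the posted function does and reports what strlen() sees. It assumes start_pkt was meant to be a_data; `strlen_of_packet` is a hypothetical helper written for this illustration, not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

struct A { char name[32]; };

/* Builds the buffer as in the original post (opcode high byte first, then
   the name via strcpy) and returns what strlen() reports for it.
   Returns (size_t)-1 if malloc fails. */
size_t strlen_of_packet(unsigned short opcode, const struct A *a)
{
    size_t reported;
    unsigned char *buf = malloc(2 + sizeof *a);
    if (buf == NULL)
        return (size_t)-1;
    buf[0] = (unsigned char)((opcode >> 8) & 0xFF);
    buf[1] = (unsigned char)(opcode & 0xFF);
    strcpy((char *)buf + 2, a->name);        /* assumes name is '\0'-terminated */
    reported = strlen((const char *)buf);    /* stops at the first zero byte */
    free(buf);
    return reported;
}
```

With opcode 12 the high byte is 0, so the very first byte of the buffer is '\0' and strlen() reports 0 even though the name was copied correctly. Treating the buffer as a string is what breaks; returning the known size and copying with memcpy() avoids both assumptions.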
 
Rui Maciel

Ian said:
The best advice I can offer is to write a code generator that either
parses your structures and generates the serialisation code, or
generates the code and structures from another form of definition (XML,
JSON, ..).

For this application I would stay as far away from XML as possible. The poor
thing has been abused a lot as it is. After all, the 'M' in XML means
markup, and a massive shoehorn is needed to force XML to serve as a data
interchange format.

JSON, on the other hand, was developed with data interchange in mind, and
hence any language which is developed as a JSON subset is unquestionably
superior at that.


Rui Maciel
 
Rui Maciel

Durango said:
I am confused in what it is, my knowledge of pointer is a bit weak so I
come here for advice.

I would suggest that you start by defining a language to serve as a
communications protocol, and then write the routines to import/export data
based on that protocol.

In your post you've stated the following:

<quote>
I also want to prepend an opcode of type unsigned short so that I can
determine which structure type the buffer contains.
</quote>

If all you need is a way to pass opcodes and the associated data, then
this might be fairly simple to accomplish. Consider, for example,
that you develop your protocol based on the JSON data interchange language.
It provides you with a way to specify label:value pairs, which can be
grouped to form data objects. With this, you may use these label:value
pairs to specify your opcode:data pairs. For instance, let's assume you had
an opcode to set a name, which would assign the text string "foo" to a text
variable. One way to encode that opcode would be:

<code>
"set": { "identifier": "name", "value": "foo" }
</code>

As a side note, if you decide to go with any approach similar to this, then
it would be a good idea to design your protocol so that, when opening
the connection, the protocol version is announced. This could be as
simple as sending the following label:value pair:

<code>
"protocol version": "1.0"
</code>

The main reason for this is that, by supporting this, you will be able to
check which protocol version you are using and therefore set the appropriate
parser. To put it in another way, you will be able to future-proof your
protocol, as you will be able to freely redesign it without worrying if it
is backward compatible or not.

So, putting it all together, a proper JSON text stream (give or take any
whitespace) would appear like this:

<code>
{
"protocol version": "1.0",
"set":
{
"identifier": "name",
"value": "foo"
}
}
</code>
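
For what it's worth, producing that text from C needs nothing more than snprintf. The following is only a sketch: `format_set_message` is a hypothetical name, the output is written without the optional whitespace, and it assumes the identifier and value contain no characters that would need JSON escaping (quotes, backslashes, control characters).

```c
#include <stdio.h>

/* Formats the "set" message into out (capacity cap, terminator included).
   Returns the message length, or -1 if it did not fit or formatting failed.
   Assumption: identifier and value need no JSON escaping. */
int format_set_message(char *out, size_t cap,
                       const char *identifier, const char *value)
{
    int n = snprintf(out, cap,
        "{\"protocol version\":\"1.0\","
        "\"set\":{\"identifier\":\"%s\",\"value\":\"%s\"}}",
        identifier, value);
    if (n < 0 || (size_t)n >= cap)
        return -1;    /* encoding error or truncated output */
    return n;
}
```

Calling `format_set_message(msg, sizeof msg, "name", "foo")` fills `msg` with the stream shown above, minus the line breaks. Parsing it on the receiving side is the harder half and is not shown here.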


Hope this helps,
Rui Maciel
 
Malcolm McLean

On Sunday, 8 April 2012 23:46:29 UTC+1, Durango wrote:
I would like to pass this to the first 2 bytes of the buffer and
afterwards pass the data within the structure.

I would call the function in the following manner:
note: buffer gets allocated inside the function.

serializeData(opcode, a_data, &buffer);

within the function:
This is something that C doesn't do well. The central problem is that if you have a struct cotnaing a pointer, it's impossible to know what the pointer points to. So serialisation can't be built in, and it is extremely difficult to write a code generator or a signature-based struct writer. There's no support for serialisation in the language.

All you can do is write a serialisation function for each structure type, serialise_strcta, serialise_structb, and so on. Then to save, switch on the type passed in as a code, to load, do the same. That means manually updating the opcodes and the master fucntion each time to add a structure. it's far form ideal, but its the only real answer.
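
A minimal sketch of that layout, with one function per structure type and a master function that switches on the opcode, could look like this. The opcode values, the two struct definitions and the 2-byte high-byte-first opcode header are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

enum { OP_STRUCT_A = 1, OP_STRUCT_B = 2 };   /* hypothetical opcodes */

struct a { char name[32]; };
struct b { unsigned short x, y, z; };

static size_t serialise_structa(const struct a *a, unsigned char *out)
{
    memcpy(out, a->name, sizeof a->name);
    return sizeof a->name;
}

static size_t serialise_structb(const struct b *b, unsigned char *out)
{
    const unsigned short v[3] = { b->x, b->y, b->z };
    size_t i;
    for (i = 0; i < 3; i++) {                /* two bytes each, high byte first */
        out[2 * i]     = (unsigned char)((v[i] >> 8) & 0xFF);
        out[2 * i + 1] = (unsigned char)(v[i] & 0xFF);
    }
    return 6;
}

/* Master function: dispatches on the opcode, then writes the 2-byte header.
   out must have room for the largest payload plus 2 bytes.
   Returns the total bytes written, or 0 for an unknown opcode. */
size_t serialise(unsigned short opcode, const void *obj, unsigned char *out)
{
    size_t n;
    switch (opcode) {
    case OP_STRUCT_A: n = serialise_structa(obj, out + 2); break;
    case OP_STRUCT_B: n = serialise_structb(obj, out + 2); break;
    default: return 0;
    }
    out[0] = (unsigned char)((opcode >> 8) & 0xFF);
    out[1] = (unsigned char)(opcode & 0xFF);
    return n + 2;
}
```

Every new structure type means one more serialise function, one more opcode and one more case label, which is exactly the maintenance cost described above. Loading would mirror this with a switch on the opcode read from the first two bytes.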
 
Ian Collins

For this application I would stay as far away from XML as possible. The poor
thing has been abused a lot as it is. After all, the 'M' in XML means
markup, and a massive shoehorn is needed to force XML to serve as a data
interchange format.

JSON, on the other hand, was developed with data interchange in mind, and
hence any language which is developed as a JSON subset is unquestionably
superior at that.

I agree with you regarding data interchange, but XML is a better choice
for data definition.
 
Durango

On Sun, 08 Apr 2012 22:46:29 +0000, Durango wrote:

I appreciate everyone's input, I apologize for any confusion in my post
and the mistakes in my code. I do want to define my objective. This is
a personal project for the sole purpose of understanding the C language
better and gaining knowledge of more advanced concepts. I know there are
more convenient ways to accomplish the task I originally posted about,
but I am not interested in using any middleware or other "tools" to
accomplish this. I might sound foolish but I feel that this method is
best for me.

Thank you again :)
 
Ian Collins

On Sun, 08 Apr 2012 22:46:29 +0000, Durango wrote:

I appreciate everyone's input, I apologize for any confusion in my post
and the mistakes in my code. I do want to define my objective. This is
a personal project for the sole purpose of understanding the C language
better and gaining knowledge of more advanced concepts. I know there are
more convenient ways to accomplish the task I originally posted about,
but I am not interested in using any middleware or other "tools" to
accomplish this. I might sound foolish but I feel that this method is
best for me.

I guess some would consider writing some form of code generator to be a
"more advanced concept"! You still end up writing the code to perform
the serialisation, but you only have to do it once and you can easily
change the process or the on-the-wire format.
 
Sunus Lee

In fact, I was doing the same thing a few days ago, but I gave up
because I could not find a really good way to keep track of how big the object at each address is.
Still, there is something I can offer you, something I think my code can do:
keep track of the size of memory obtained through malloc/alloc/realloc;
keep track of the size of basic data types and structs.

The only thing that made me give up is this:
I don't know how to get the size of what a pointer refers to when the memory did NOT come from malloc/alloc/realloc, and I really don't know how to do that without heavily modifying the existing code. So that's why I gave up.
I hope you can make this work, and tell me if you do :)
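
One common way to track the size of heap blocks is to store it in a small header placed just before the memory handed to the caller. The sketch below assumes C11 (for `max_align_t`); `tracked_malloc`, `tracked_size` and `tracked_free` are hypothetical names, not standard functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every tracked block. The union makes the header
   size a multiple of the strictest alignment, so the user pointer that
   follows it stays suitably aligned for any object type. */
union header {
    size_t size;
    max_align_t align;
};

void *tracked_malloc(size_t size)
{
    union header *h;
    if (size > SIZE_MAX - sizeof *h)
        return NULL;                 /* header + size would overflow */
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                    /* user memory starts after the header */
}

/* Only valid for pointers returned by tracked_malloc(). */
size_t tracked_size(const void *p)
{
    return ((const union header *)p - 1)->size;
}

void tracked_free(void *p)
{
    if (p != NULL)
        free((union header *)p - 1);
}
```

This does nothing for pointers to local or static objects, or into the middle of a block. As far as I know there is no portable way to recover a size from such a pointer, so in those cases the size has to travel alongside the pointer, for example as an explicit length argument.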
 