Returning pointers to different items and memory storage

CptDondo · Oct 28, 2005

I am working on an embedded platform which has a block of battery-backed
RAM. I need to store various types of data in this block of memory -
for example, bitmapped data for control registers, strings for logging,
and structures for data points.

I want to use one function to read data from this block and one function
to write data, for example:

sram_read(OBJECT_IDENTIFIER) would return a pointer to the appropriate
object and

sram_write(OBJECT_IDENTIFIER, &object) would write the object to the memory

I have to do this in a way that is portable across architectures - we
support two platforms, one is ARM based and the other is x86 based.

Sometimes the object might be a string, sometimes a char, sometimes a
structure.....

What is the correct way to declare these functions?

Also, and a somewhat more difficult problem (for me at least) is the
actual storage mechanism....

One of the objects I have to store is a structure like this:

struct AnalogPoint {
    float value;
    float alarm_hi, alarm_low;
    float average, avg_weight;
    int log_freq;
    int hyst_timer;
    char ctrl_reg;
};

I've been kicking around various ways to store this structure. Figure
that I have to store 16 of these. I could store an array of 16
structures and return pointers to them. But then I see potential
problems; if the software is upgraded the newly compiled versions may
not put the same data in the same place, so I'd have garbage when the
new software tried to read the data saved by the older software.

I could 'unpack' the structure and store all of the elements as linear
arrays and build a structure on the fly and return a pointer to it, but
then I have problems with the data area being overwritten on the next call.

So.... Any suggestions? Stability is more important than speed.

Thanks,

--Yan

Eric Sosman · Oct 29, 2005

CptDondo said:
I am working on an embedded platform which has a block of battery-backed
RAM. I need to store various types of data in this block of memory -
for example, bitmapped data for control registers, strings for logging,
and structures for data points.

I want to use one function to read data from this block and one function
to write data, for example:

sram_read(OBJECT_IDENTIFIER) would return a pointer to the appropriate
object and

sram_write(OBJECT_IDENTIFIER, &object) would write the object to the memory

I have to do this in a way that is portable across architectures - we
support two platforms, one is ARM based and the other is x86 based.

Sometimes the object might be a string, sometimes a char, sometimes a
structure.....

What is the correct way to declare these functions?

The most straightforward is probably

void *sram_read(int objid);
void sram_write(int objid, const void *objptr);

Variations: The first argument might be something other than
an `int', sram_read() might return a `const void*' instead of
a plain `void*', ... There are lots of possibilities, but all
that I can think of share the same advantage and the same
disadvantage:

Pro: The scheme is general and extensible. A `void*' can
point to absolutely any C data type, so if you someday decide
you need the functions to handle `double complex' or something
even weirder, the function declarations can remain the same.

Con: Since a `void*' can point to anything at all, it's up
to the caller and the functions to deduce the "real" data type
and make sure it's correct. The compiler will be of no help
here: if you call sram_read() to fetch an `int' datum and
carelessly store the returned pointer in a `double*', your
error will not be detected until run-time, and then only by
the fact that demons start flying from your nose.

Also, and a somewhat more difficult problem (for me at least) is the
actual storage mechanism....

One of the objects I have to store is a structure like this:

struct AnalogPoint {
    float value;
    float alarm_hi, alarm_low;
    float average, avg_weight;
    int log_freq;
    int hyst_timer;
    char ctrl_reg;
};

I've been kicking around various ways to store this structure. Figure
that I have to store 16 of these. I could store an array of 16
structures and return pointers to them. But then I see potential
problems; if the software is upgraded the newly compiled versions may
not put the same data in the same place, so I'd have garbage when the
new software tried to read the data saved by the older software.

I could 'unpack' the structure and store all of the elements as linear
arrays and build a structure on the fly and return a pointer to it, but
then I have problems with the data area being overwritten on the next call.

I fear I do not completely understand what worries you.
Specific pointer values are, in general, not portable from one
execution of a program to the next; they are only valid within a
single execution. The situation may be different with a chunk of
special storage that has been laid out by other (non-portable)
means -- but why are you worried about pointer values anyhow? Your
functions use the first argument to designate the data to read or
write, not a pointer value. If you stick to the object IDs and
stop worrying about the pointer values, you may be better off.

As for storing a struct -- well, why not simply store it as
it is? Something is clearly bothering you about this (and the
"something" is probably a reasonable concern), but I don't yet
know the nature of your difficulty.

Thad Smith · Oct 29, 2005

CptDondo said:
I am working on an embedded platform which has a block of battery-backed
RAM. I need to store various types of data in this block of memory -
for example, bitmapped data for control registers, strings for logging,
and structures for data points.

I want to use one function to read data from this block and one function
to write data, for example:

sram_read(OBJECT_IDENTIFIER) would return a pointer to the appropriate
object and

sram_write(OBJECT_IDENTIFIER, &object) would write the object to the memory

I have to do this in a way that is portable across architectures - we
support two platforms, one is ARM based and the other is x86 based.

Sometimes the object might be a string, sometimes a char, sometimes a
structure.....

What is the correct way to declare these functions?

My suggestion is to make the storage module ignorant of the data type
being stored. As I understand, you want to assign handles to different
data items, then retrieve the last stored data, given the handle.

I would pass sram_write the identifier, pointer, and length:

typedef unsigned char pdtype; /* persistent data type identifier */

int                     /* 0 = OK, 1 = out of memory */
sram_write (
    pdtype type,        /* type of data */
    const void* pd,     /* data to be stored */
    size_t ldata        /* length of data in bytes */
);

const void*             /* ptr to data */
sram_read (
    pdtype type         /* data type to retrieve */
);

Your module should do its own allocation of memory. I assume here that
you only have a few data types and don't need to delete data, but you
will do periodic overwrites. If the length of the data items varies, then
you need to handle new memory allocation and garbage collection to
maintain integrity over the long run.

Internally you can have an array that gives an address pointer and
length for each item, indexed by pdtype.
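
A minimal sketch of such a pointer-and-length table, assuming a static byte
array stands in for the battery-backed block. PD_MAX_TYPES, PD_POOL_SIZE,
pd_pool and pd_index are names invented for this illustration. In a real
system the index would itself have to live in the persistent block, and the
space abandoned when an item changes length would need the garbage
collection mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char pdtype;      /* persistent data type identifier */

#define PD_MAX_TYPES 8             /* hypothetical number of identifiers */
#define PD_POOL_SIZE 256           /* hypothetical size of the stand-in block */

/* Stand-in for the battery-backed block (an assumption of this sketch). */
static unsigned char pd_pool[PD_POOL_SIZE];
static size_t pd_used;             /* bytes handed out so far */

/* One entry per identifier: where the item lives and how long it is. */
static struct { size_t offset, length; int present; } pd_index[PD_MAX_TYPES];

/* Returns 0 on success, 1 if out of memory or the arguments are unusable. */
int sram_write(pdtype type, const void *pd, size_t ldata)
{
    assert(pd_used <= PD_POOL_SIZE);
    if (type >= PD_MAX_TYPES || pd == NULL)
        return 1;
    if (!pd_index[type].present || pd_index[type].length != ldata) {
        /* First write, or the length changed: take fresh space.
           The old space is simply abandoned in this sketch. */
        if (ldata > PD_POOL_SIZE - pd_used)
            return 1;
        pd_index[type].offset = pd_used;
        pd_index[type].length = ldata;
        pd_index[type].present = 1;
        pd_used += ldata;
    }
    memcpy(pd_pool + pd_index[type].offset, pd, ldata);
    return 0;
}

/* Returns a pointer to the stored bytes, or NULL if nothing was stored. */
const void *sram_read(pdtype type)
{
    if (type >= PD_MAX_TYPES || !pd_index[type].present)
        return NULL;
    return pd_pool + pd_index[type].offset;
}
```

Because the returned pointer aims into a byte pool, a caller that stored a
float or a struct would copy it out with memcpy() rather than dereference a
cast pointer.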

To provide protection against database corruption due to power loss, you
need to do updates in steps so that if you stop executing at any point,
you can, on restart, either finish the transaction or cancel it. You
may also want to provide a separate call to start an update and complete
an update. All calls to sram_write that occur after the start are
nullified if the completion call didn't occur. That way you can
guarantee a consistent state on powerup. Don't forget that garbage
collection must be restartable.
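
One way to get that all-or-nothing behaviour for a single fixed-size item is
to keep two slots and a selector, sketched below. It assumes that a one-byte
store is atomic and that writes reach the SRAM in program order; real
hardware may need barriers or cache flushes. ITEM_SIZE, store, item_update
and item_current are hypothetical names, and this is only one of several
possible schemes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ITEM_SIZE 16   /* hypothetical fixed item size */

/* Hypothetical layout: two slots plus a one-byte selector. In a real system
   this would sit in the battery-backed block, not in ordinary RAM. */
static struct {
    unsigned char slot[2][ITEM_SIZE];
    volatile unsigned char active;   /* 0 or 1: which slot holds valid data */
} store;

/* Write the new data into the inactive slot, then commit by flipping the
   selector. Power loss before the flip leaves the old data active. */
void item_update(const void *data)
{
    unsigned char next = (unsigned char)((store.active ^ 1u) & 1u);
    assert(data != NULL);
    memcpy(store.slot[next], data, ITEM_SIZE);
    store.active = next;             /* the commit point */
}

/* Returns the most recently committed item. */
const void *item_current(void)
{
    return store.slot[store.active & 1u];
}
```

An interrupted update costs only the new value; the previously committed
one is still readable on restart.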

Also, and a somewhat more difficult problem (for me at least) is the
actual storage mechanism....

One of the objects I have to store is a structure like this:

struct AnalogPoint {
    float value;
    float alarm_hi, alarm_low;
    float average, avg_weight;
    int log_freq;
    int hyst_timer;
    char ctrl_reg;
};

I've been kicking around various ways to store this structure. Figure
that I have to store 16 of these. I could store an array of 16
structures and return pointers to them.

As suggested earlier, give each struct variable a separate handle.
Internally each one will have its own pointer and length.

But then I see potential
problems; if the software is upgraded the newly compiled versions may
not put the same data in the same place, so I'd have garbage when the
new software tried to read the data saved by the older software.

That's a different problem, but you eliminate it by storing the version
number identifying the data format used to store the data. On startup,
compare the stored data format version with the current version. If
less, do an explicit conversion of the data and restore, then update the
data format version number. The conversion, for example, could read the
old data structure and write the data with the new data structure. The
lengths could be different. Again, to be robust, you need to make the
procedure restartable in case of power loss. The easiest way to do
this, if you have sufficient memory, is to keep the old data stored
until all the data has been updated, then delete the older version.
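
A rough sketch of that startup check, assuming just one record and two
hypothetical layouts (point_v1 and point_v2). The names nv, nv_upgrade and
FORMAT_CURRENT, and the default of 60 for the new field, are invented for
illustration. Native structs are used only to keep the sketch short.

```c
#include <assert.h>

/* Hypothetical old and new layouts of one stored record. */
struct point_v1 { float alarm_hi, alarm_low; };
struct point_v2 { float alarm_hi, alarm_low; int log_freq; };

#define FORMAT_CURRENT 2u

/* Hypothetical persistent area: a version tag plus room for both layouts,
   so the old data survives until the new copy is complete. */
static struct {
    unsigned char format;        /* data format version of the valid record */
    struct point_v1 v1;
    struct point_v2 v2;
} nv;

/* Run once at startup. Safe to rerun after a power loss: the v1 record is
   never modified, and the version tag is written last. */
void nv_upgrade(void)
{
    assert(nv.format <= FORMAT_CURRENT);
    if (nv.format < FORMAT_CURRENT) {
        struct point_v2 tmp;
        tmp.alarm_hi  = nv.v1.alarm_hi;
        tmp.alarm_low = nv.v1.alarm_low;
        tmp.log_freq  = 60;           /* hypothetical default for the new field */
        nv.v2 = tmp;
        nv.format = FORMAT_CURRENT;   /* commit: the new copy is now valid */
    }
}
```

If power fails before the last assignment, the tag still says version 1 and
the conversion simply runs again on the next start.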

I could 'unpack' the structure and store all of the elements as linear
arrays and build a structure on the fly and return a pointer to it, but
then I have problems with the data area being overwritten on the next call.

That is too complicated, IMO. The storage module shouldn't know what
the data is, only where it is and how long each item is (as specified by
the sram_write call). The cost is the pointer and length storage per
item, which may be a structure.

If you need to store several long arrays, I would suggest adding
mechanisms to create flexible arrays in persistent storage. I have
implemented something along the lines I mentioned for non-volatile flash
storage (without a separate file system).

Captain Dondo · Oct 29, 2005

I fear I do not completely understand what worries you.
Specific pointer values are, in general, not portable from one
execution of a program to the next; they are only valid within a
single execution. The situation may be different with a chunk of
special storage that has been laid out by other (non-portable)
means -- but why are you worried about pointer values anyhow? Your
functions use the first argument to designate the data to read or
write, not a pointer value. If you stick to the object IDs and
stop worrying about the pointer values, you may be better off.

As for storing a struct -- well, why not simply store it as
it is? Something is clearly bothering you about this (and the
"something" is probably a reasonable concern), but I don't yet
know the nature of your difficulty.

Thank you both for the comments. By way of history, our products have a
field life of 20 years, so it is likely that this code will be maintained
by someone else at some point. Our current product is based on a
proprietary, closed controller running Basic, with limited memory. Over
the years, as features have been added, variable names and all legibility
have been sacrificed, until the system is now a nearly unmaintainable mess.

The initial decision to use this product made sense; it just turned out to
be unmaintainable over the long run. Unfortunately, I am now responsible
for maintaining this beast.... So, I am *very determined* to do things
right, in a cross-platform portable way just so my successor won't hunt
me down in my retirement.... ;-)

Now back to my issue....

My concern with storage of structures is this:

Assume that I compile version 1 with plain old compiler 3.3.3; it stores
structures as declared: if I declare int a, b, c; it will allocate a total
of 12 bytes and store a in the first 4, b in the next 4, and c in the last
4.

Now I recompile the code with optimizing compiler version 4.4.4; it
'optimizes' the storage locations and stores the variables as follows: c
in the first 4, b in the next, and a in the last 4.

Is this a legitimate concern? Perhaps not. I don't know if C
guarantees a particular memory order for storage in a structure. Storing
the structure as a block of bytes pretty much assumes on retrieval that
the receiving structure has the same internal representation as the
sending structure.

But field upgrades will be done by people with little to no computer
skills, or even by an automated process, and it is essential that the data
remain intact over software upgrades.

I could build an 'upgrader', something that would suck the data out into a
text file using the old version, then put it back with the new version of
the software, but I'd rather build stability into the data storage
mechanism.

--Yan

Eric Sosman · Oct 29, 2005

Captain said:
[...]
My concern with storage of structures is this:

Assume that I compile version 1 with plain old compiler 3.3.3; it stores
structures as declared: if I declare int a, b, c; it will allocate a total
of 12 bytes and store a in the first 4, b in the next 4, and c in the last
4.

Now I recompile the code with optimizing compiler version 4.4.4; it
'optimizes' the storage locations and stores the variables as follows: c
in the first 4, b in the next, and a in the last 4.

Is this a legitimate concern? Perhaps not. I don't know if C
guarantees a particular memory order for storage in a structure. Storing
the structure as a block of bytes pretty much assumes on retrieval that
the receiving structure has the same internal representation as the
sending structure.

The compiler cannot change the order of the elements
within a struct. The first element must start at the address
of the struct itself, the next at a higher address, and so on.

HOWEVER, the compiler is allowed to insert unnamed padding
bytes after any element of the struct. The total size of the
struct can therefore be greater than the sum of the sizes of
its elements; also, different compilers may use different
amounts of padding and thus disagree about the relative offsets
of struct elements other than the first (which is necessarily
at offset zero).

The usual reason to insert padding bytes is to satisfy
alignment requirements, so it would be very strange to find
padding between two adjacent struct elements of the same type.
Even in the absence of inter-element padding, you might still
find padding at the end of the struct -- maybe the compiler
expands the struct by an extra `int'-worth so it can use some
machine's sixteen-byte-aligned instructions to manipulate it.
Such shenanigans are strange but not unheard-of -- and the
compiler is not forbidden to be strange.

If you want to be absolutely sure there's no padding, use
an array of three elements instead of a struct with three
elements. If the rest of the program really wants to deal with
a struct, just do element-by-element copying as you pass the
data to and from your storage functions.
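
A small sketch of that element-by-element copying for the three-int case.
struct triple, triple_pack and triple_unpack are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct triple { int a, b, c; };   /* native form used by the program */

/* Stored form: an array has no padding between its elements. */
enum { TRIPLE_A, TRIPLE_B, TRIPLE_C, TRIPLE_N };

/* Copy the struct members into the array, one by one. */
void triple_pack(const struct triple *t, int out[TRIPLE_N])
{
    assert(t != NULL && out != NULL);
    out[TRIPLE_A] = t->a;
    out[TRIPLE_B] = t->b;
    out[TRIPLE_C] = t->c;
}

/* Copy the array elements back into the struct, one by one. */
void triple_unpack(const int in[TRIPLE_N], struct triple *t)
{
    assert(in != NULL && t != NULL);
    t->a = in[TRIPLE_A];
    t->b = in[TRIPLE_B];
    t->c = in[TRIPLE_C];
}
```

The array is exactly TRIPLE_N * sizeof(int) bytes whatever the compiler
does with the struct, so its layout cannot shift between compiler versions
on the same architecture.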

But field upgrades will be done by people with little to no computer
skills, or even by an automated process, and it is essential that the data
remain intact over software upgrades.

I could build an 'upgrader', something that would suck the data out into a
text file using the old version, then put it back with the new version of
the software, but I'd rather build stability into the data storage
mechanism.

I'd suggest that you fix the format of the stored data,
making it independent of the compiler's whims. Then you can
write "adapter" functions that convert the stored format to
and from whatever the current compiler happens to give you:
let your program deal with structs in their native form, and
convert between "native" and "stored" on the way in and out of
your storage system. A few hints:

- An `int' is not necessarily four bytes long. It will
have at least sixteen bits, but could have more. Also,
a `char' (the basic unit of size in C) will have at
least eight bits but could be wider. You may or may not
need to worry about such things; it depends on the set of
platforms your code needs to run on.

- Different machines arrange the individual bytes of multi-
byte values in different ways. Two popular arrangements
are "Big-Endian" (the most significant byte appears at
the lowest memory address) and "Little-Endian" (the
reverse). At least one machine used a variant sometimes
called "Middle-Endian." Even if you don't need to deal
with strange sizes, you're likely to need to cope with
variations in "Endianness" if you need to exchange stored
data between dissimilar machines.

- The offsetof() macro in <stddef.h> can tell you where the
compiler has chosen to place each element of a struct. You
can use it to build an array that describes the types and
relative positions of struct elements; this can let you
write just a few wrapper functions instead of many.
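
To illustrate the offsetof() hint, here is a sketch of a descriptor table
for the AnalogPoint struct from the original post. struct field, the FIELD
macro, fields_pack and fields_unpack are names made up for this example. It
removes padding from the stored form but does nothing about byte order or
type sizes, which would need separate handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct AnalogPoint {
    float value;
    float alarm_hi, alarm_low;
    float average, avg_weight;
    int log_freq;
    int hyst_timer;
    char ctrl_reg;
};

/* One descriptor per member: where the compiler put it and how big it is. */
struct field { size_t offset, size; };

#define FIELD(type, member) \
    { offsetof(type, member), sizeof(((type *)0)->member) }

static const struct field analog_fields[] = {
    FIELD(struct AnalogPoint, value),
    FIELD(struct AnalogPoint, alarm_hi),
    FIELD(struct AnalogPoint, alarm_low),
    FIELD(struct AnalogPoint, average),
    FIELD(struct AnalogPoint, avg_weight),
    FIELD(struct AnalogPoint, log_freq),
    FIELD(struct AnalogPoint, hyst_timer),
    FIELD(struct AnalogPoint, ctrl_reg)
};

#define ANALOG_NFIELDS (sizeof analog_fields / sizeof analog_fields[0])

/* Copy the members back to back into out, skipping any padding.
   Returns the number of bytes written. */
size_t fields_pack(const void *obj, const struct field *f, size_t n,
                   unsigned char *out)
{
    size_t i, pos = 0;
    assert(obj != NULL && f != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        memcpy(out + pos, (const unsigned char *)obj + f[i].offset, f[i].size);
        pos += f[i].size;
    }
    return pos;
}

/* Inverse of fields_pack: scatter the packed bytes into the struct. */
void fields_unpack(void *obj, const struct field *f, size_t n,
                   const unsigned char *in)
{
    size_t i, pos = 0;
    assert(obj != NULL && f != NULL && in != NULL);
    for (i = 0; i < n; i++) {
        memcpy((unsigned char *)obj + f[i].offset, in + pos, f[i].size);
        pos += f[i].size;
    }
}
```

The same two functions would serve any other struct, given its own
descriptor table.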

Netocrat · Oct 29, 2005

<OT>
From a separate posting in comp.os.linux.misc I gather that Captain Dondo
is using the POSIX mmap() function to return a pointer to a block of
memory representing the persistent RAM device. For comp.lang.c purposes
we might treat it as a call to malloc() which happens to return memory
already initialised with data. If Captain Dondo wants to clarify this
equivalence he should probably ask on comp.unix.programmer.

It seems that the OP is doing his own memory management on a block of
memory.

If he wants help with that part of his task, he could provide more
information on his constraints:
* Are there a fixed number of objects with fixed OBJECT_IDENTIFIERS, or
can objects and OBJECT_IDENTIFIERS be dynamically added/removed?
* What is the relationship between an OBJECT_IDENTIFIER and the object it
identifies?

As for storing a struct -- well, why not simply store it as
it is? Something is clearly bothering you about this (and the
"something" is probably a reasonable concern), but I don't yet know the
nature of your difficulty.


[...]
My concern with storage of structures is [the use of different versions
of the compiler to generate the program accessing the same data]
I don't know if C guarantees a particular memory order for storage in a
structure.

It does. What it doesn't guarantee is that the new compiler version won't
change the amount of padding between members (it does guarantee that there
is no initial padding).

Storing
the structure as a block of bytes pretty much assumes on retrieval that
the receiving structure has the same internal representation as the
sending structure.

One other potential problem is alignment. The initial pointer to the
memory block (when equated to a returned pointer from malloc()) is
suitably aligned for any data type. But in standard C, reading from and
writing to that memory block at a location other than the initial one
using an indirected pointer of type other than a char pointer could fail
due to incorrect alignment.

But field upgrades will be done by people with little to no computer
skills, or even by an automated process, and it is essential that the
data remain intact over software upgrades.

I could build an 'upgrader', something that would suck the data out into
a text file using the old version, then put it back with the new version
of the software, but I'd rather build stability into the data storage
mechanism.

One portable way of dealing with this type of situation (and one that I've
successfully used) is for your sram_read and sram_write functions to
decode the objects and read/write them into the memory block in a portable
representation of your devising using char * access only.

sram_read(OBJECT_IDENTIFIER) would either take an additional pointer to an
object of appropriate type that it would fill in after interpreting the
object's representation in the memory block, or it would malloc the memory
itself and return it.

Likewise, sram_write(OBJECT_IDENTIFIER, &object) would interpret the
object pointed to by &object and store it using char * (i.e. byte by byte)
into the memory block in a portable format of your own devising.

The downside is that you need to treat each structure separately, but with
some helper functions for the basic types (int, double, etc) it's not that
much work to add support for a new structure.
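
As an illustration of such helper functions, here is a sketch that stores
two int members as 32-bit values, most significant byte first, using
unsigned char access only. put_u32, get_u32, struct timers and the
TIMERS_STORED_SIZE layout are assumptions of this example, not anything
from the OP's code. It assumes the values fit in 32 bits and are
non-negative; floats would need a representation of their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value most significant byte first, whatever the host order. */
void put_u32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)((v >> 24) & 0xFFu);
    p[1] = (unsigned char)((v >> 16) & 0xFFu);
    p[2] = (unsigned char)((v >> 8) & 0xFFu);
    p[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the value from the four stored bytes. */
uint32_t get_u32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical record holding two of the integer members from the thread. */
struct timers { int log_freq; int hyst_timer; };

#define TIMERS_STORED_SIZE 8   /* two 32-bit fields in the stored format */

/* Native struct to stored bytes. */
void timers_store(unsigned char *dst, const struct timers *t)
{
    assert(dst != NULL && t != NULL);
    put_u32(dst, (uint32_t)t->log_freq);
    put_u32(dst + 4, (uint32_t)t->hyst_timer);
}

/* Stored bytes to native struct. */
void timers_load(const unsigned char *src, struct timers *t)
{
    assert(src != NULL && t != NULL);
    t->log_freq = (int)get_u32(src);
    t->hyst_timer = (int)get_u32(src + 4);
}
```

Each new structure then needs only a short store/load pair built from the
same helpers.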

This doesn't deal with how you determine _where_ in the memory block to
store those objects, which as I said above you need to provide further
information about if you wish assistance in that regard, although it may
be borderline w.r.t. topicality.

I see that Eric Sosman has just responded, and my advice parallels his.

Captain Dondo · Oct 29, 2005

<OT>
From a separate posting in comp.os.linux.misc I gather that Captain Dondo
is using the POSIX mmap() function to return a pointer to a block of
memory representing the persistent RAM device. For comp.lang.c purposes
we might treat it as a call to malloc() which happens to return memory
already initialised with data. If Captain Dondo wants to clarify this
equivalence he should probably ask on comp.unix.programmer.

There is another issue that I am dealing with - mmap and the whole
POSIX/SYS V/BSD memory mapping... I am totally clueless (but trying to
learn).... But that's OT for this group.

It seems that the OP is doing his own memory management on a block of
memory.

If he wants help with that part of his task, he could provide more
information on his constraints:
* Are there a fixed number of objects with fixed OBJECT_IDENTIFIERS, or
can objects and OBJECT_IDENTIFIERS be dynamically added/removed?
* What is the relationship between an OBJECT_IDENTIFIER and the object it
identifies?

OK, I realize that this is getting somewhat OT, but here's what I'm trying
to do:

I have a block of 512k of battery backed ram. I need to partition this
into roughly three areas:

one area contains three fixed length ring buffers for logging purposes
(these are pretty easy, just a pointer at the head of the buffer, and each
message being \n terminated)

one for config file storage (again, easy, a fixed length text area holding
a single \0 terminated string)

one for binary data storage (this is the one that's giving me fits)
The amount of binary data will be fixed - there will be a set number of
data points, control registers, etc. So I could for example lay out a map
that puts 16 data point structures first, then the control registers, then
the bus maps (16 bytes of bit-mapped data) and so on. Most of my data is
bitmapped, so I can use char for that. It's really the data points that I
am worried about. Each data point has alarm setpoints, and if the sensor
goes above or below that we need to stop the machine. This is the data I
really need to preserve.

The data storage is not for 'general purpose', it is very specific to my
uses. We can assume that sram_read and sram_write can have complete
knowledge of the type of data being stored.

From the discussion, it may be easier to unroll my data points into
arrays, and then fill a client-supplied struct to match. Since the type
and amount of data I have to deal with is fixed, I should be able to
assign blocks of memory to specific arrays to hold the unrolled structures.

Things like sizeof(float) should not change unless we change
architectures... In which case, I'm not going to be worried about
preserving data. The new hardware would be factory-programmed with the
correct data.

thus, on the client side,

struct DataPoint point;

sram_read(DATAPOINT5,&point);

and within sram_read() I do something like this:

point->alarm_hi = mmap_ptr_to_alarm_hi_array + sizeof(float) * 5;
...

sram_read and sram_write would have to have enough intelligence to know
the type of the second argument from the identifier, but that's easily
done.

This doesn't deal with how you determine _where_ in the memory block to
store those objects, which as I said above you need to provide further
information about if you wish assistance in that regard, although it may
be borderline w.r.t. topicality.

I think I can pretty much figure that part out... I just don't know
enough about the very low level stuff I have to deal with here....

--
o__
,>/'_ o__
(_)\(_) ,>/'_ o__
Yan Seiner, PE (_)\(_) ,>/'_ o__
Certified Personal Trainer (_)\(_) ,>/'_ o__
Licensed Professional Engineer (_)\(_) ,>/'_
Who says engineers have to be pencil necked geeks? (_)\(_)

Netocrat · Oct 29, 2005

The amount of binary data will be fixed - there will be a set number of
data points, control registers, etc.

Well that makes things easier.

[...]

From the discussion, it may be easier to unroll my data points into
arrays, and then fill a client-supplied struct to match. Since the type
and amount of data I have to deal with is fixed, I should be able to
assign blocks of memory to specific arrays to hold the unrolled
structures.

I'm not sure what you mean by "unrolling into arrays". Are you saying
that all of the members in your datapoint struct are of equal size and you
can conceptualise the struct as an array of "binary data" members?

Things like sizeof(float) should not change unless we change
architectures... In which case, I'm not going to be worried about
preserving data. The new hardware would be factory-programmed with the
correct data.

Representation as well as size is an issue, but as you say it's likely to
be largely hardware-dependent and unrelated to your compiler version (you
should probably verify that before deciding to use native float format
though).

It seems to me that the likeliest change (and it's somewhat remote unless
you use different compiler options) is structure padding.

thus, on the client side,

struct DataPoint point;

sram_read(DATAPOINT5,&point);

and within sram_read() I do something like this:

point->alarm_hi = mmap_ptr_to_alarm_hi_array + sizeof(float) * 5; ...

That still has alignment issues unless you are guaranteed that
mmap_ptr_to_alarm_hi_array is suitably aligned for a float. Of course you
may not have alignment problems on your particular platform(s) and be able
to safely write non-portable code, but that's beyond the scope of this
group.

To do it portably, you would copy the float using memcpy() or a loop with
char * pointers.
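
A short sketch of the memcpy() approach, assuming the array is reached
through an unsigned char pointer into the mapped block. read_float_at and
write_float_at are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read element `index` of a float array that starts at `base`, which need
   not be suitably aligned for float. memcpy has no alignment requirement. */
float read_float_at(const unsigned char *base, size_t index)
{
    float f;
    assert(base != NULL);
    memcpy(&f, base + index * sizeof f, sizeof f);
    return f;
}

/* Write element `index` of the same kind of array. */
void write_float_at(unsigned char *base, size_t index, float f)
{
    assert(base != NULL);
    memcpy(base + index * sizeof f, &f, sizeof f);
}
```

With these, the alarm_hi example would become something like
point->alarm_hi = read_float_at(mmap_ptr_to_alarm_hi_array, 5), which also
copies the value rather than its address.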

[...]

Captain Dondo · Oct 29, 2005

I'm not sure what you mean by "unrolling into arrays". Are you saying
that all of the members in your datapoint struct are of equal size and you
can conceptualise the struct as an array of "binary data" members?

The other way... I have 16 structures, each of which has float a, b,
c,.... as members. I am thinking I should store these as the equivalent
of

float a[16], b[16], .... which should get me around the padding issue, and
then reassemble the structure as requested.
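
Roughly what I have in mind, as a sketch only. Two members are shown, the
static arrays stand in for regions of the mapped block, and point_read and
point_write are placeholder names, not the final sram_read/sram_write.

```c
#include <assert.h>
#include <stddef.h>

#define NPOINTS 16

/* Stored form: one array per member (stand-ins for regions of the SRAM). */
static float stored_alarm_hi[NPOINTS];
static float stored_alarm_low[NPOINTS];

/* Client's native form, trimmed to two members for the sketch. */
struct DataPoint { float alarm_hi, alarm_low; };

/* Fill a client-supplied struct from the arrays. Returns 0 on success,
   1 if the index is out of range. */
int point_read(size_t i, struct DataPoint *out)
{
    assert(out != NULL);
    if (i >= NPOINTS)
        return 1;
    out->alarm_hi = stored_alarm_hi[i];
    out->alarm_low = stored_alarm_low[i];
    return 0;
}

/* Scatter a struct back into the arrays. Same return convention. */
int point_write(size_t i, const struct DataPoint *in)
{
    assert(in != NULL);
    if (i >= NPOINTS)
        return 1;
    stored_alarm_hi[i] = in->alarm_hi;
    stored_alarm_low[i] = in->alarm_low;
    return 0;
}
```

Since the client owns the struct, nothing gets overwritten on the next
call.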

That still has alignment issues unless you are guaranteed that
mmap_ptr_to_alarm_hi_array is suitably aligned for a float. Of course you
may not have alignment problems on your particular platform(s) and be able
to safely write non-portable code, but that's beyond the scope of this
group.

Good point. I have to make sure I start each array pointer on a multiple
of sizeof(float)....


Thad Smith · Oct 29, 2005

Captain said:
I have a block of 512k of battery backed ram. I need to partition this
into roughly three areas:

one area contains three fixed length ring buffers for logging purposes
(these are pretty easy, just a pointer at the head of the buffer, and each
message being \n terminated)

one for config file storage (again, easy, a fixed length text area holding
a single \0 terminated string)

one for binary data storage (this is the one that's giving me fits)
The amount of binary data will be fixed - there will be a set number of
data points, control registers, etc.

You say fixed size, but if you are concerned about software upgrades, it
is a reasonable possibility that the size may change in a newer version
due to additional variables to store. If so, think in terms of number
of items and item size changing infrequently.

The data storage is not for 'general purpose', it is very specific to my
uses. We can assume that sram_read and sram_write can have complete
knowledge of the type of data being stored.

In my earlier post I recommended making the persistent storage routines
ignorant of the type and size of data items. Let me amplify on that point.

I would recommend multiple layers of code to support your persistent data:
1. persistent storage routines, such as the versions of sram_read and
sram_write I recommended earlier.
2. access routines, which call the storage routines, that know about
data types.

The application code can then call the access routines, such as
pstore_meas (measurement_t *m) and
gstore_meas (measurement_t *m).
The access routines know about the format of the individual data item.
They, in turn, call the storage routines to put and get copies.

I recommend that only the storage routines move data to and from the
storage area, since they may need to shuffle data items in memory as
part of the data recovery strategy. The storage routines leave the
interpretation of the data to the upper layers.
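
A compact sketch of the two layers, under several assumptions:
measurement_t is given two made-up members, the storage layer is reduced to
fixed-size slots in an ordinary array, and the int return values and the
names store_put, store_get, ID_MEAS, SLOT_SIZE and SLOT_COUNT are invented
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- storage layer: knows only identifiers and lengths ---- */
#define SLOT_SIZE  32
#define SLOT_COUNT 4
static unsigned char slots[SLOT_COUNT][SLOT_SIZE];  /* stand-in for the SRAM */

/* Copy len bytes into slot id. Returns 0 on success, 1 on bad arguments. */
static int store_put(unsigned id, const void *p, size_t len)
{
    if (id >= SLOT_COUNT || len > SLOT_SIZE)
        return 1;
    memcpy(slots[id], p, len);
    return 0;
}

/* Copy len bytes out of slot id. Same return convention. */
static int store_get(unsigned id, void *p, size_t len)
{
    if (id >= SLOT_COUNT || len > SLOT_SIZE)
        return 1;
    memcpy(p, slots[id], len);
    return 0;
}

/* ---- access layer: knows the data type ---- */
typedef struct { float value; int log_freq; } measurement_t; /* hypothetical */
#define ID_MEAS 0u

/* Put a copy of the measurement into persistent storage. */
int pstore_meas(const measurement_t *m)
{
    assert(m != NULL);
    return store_put(ID_MEAS, m, sizeof *m);
}

/* Get a copy of the stored measurement. */
int gstore_meas(measurement_t *m)
{
    assert(m != NULL);
    return store_get(ID_MEAS, m, sizeof *m);
}
```

Application code calls only pstore_meas() and gstore_meas(); a format change
or a move to a portable byte layout is then confined to the access layer.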

[...]
thus, on the client side,

struct DataPoint point;

sram_read(DATAPOINT5,&point);

and within sram_read() I do something like this:

point->alarm_hi = mmap_ptr_to_alarm_hi_array + sizeof(float) * 5;
...

sram_read and sram_write would have to have enough intelligence to know
the type of the second argument from the identifier, but that's easily
done.

Again, I recommend that knowledge about data types be placed in an upper
layer, not the persistence logic, unless you really don't need to
provide for consistent data recovery in an environment where the power
may be lost at any time. Your data recovery can be robust if you design
it that way.
