J
Jef Driesen
Hi,
Let me give some background info first. I'm writing a library to
download data from a number of external devices (dive computers). All
devices have different transfer protocols, but they can be grouped in
two categories:
1. Devices which support random access to the internal memory.
For these type of devices, my library exposes the following function to
read a block of data at a specific memory address:
device_status_t
device_read (device_t *device,
unsigned int address,
unsigned char data[],
unsigned int size);
2. Devices which support sequential access to the internal memory.
These type of devices only support downloading all the data as a single
continuous stream of data. In some cases the data stream is split into
multiple smaller packets, but you can only request each packet sequentially.
Currently I have this function to download the data:
device_status_t
device_dump (device_t *device,
unsigned char data[],
unsigned int size,
unsigned int *result);
To use this function, you need to pass a buffer that is large enough the
store the downloaded data, and you get the actual size in the "result"
parameter after the download.
Unfortunately for some devices the actual amount of data is not known in
advance and you have to assume a worst case scenario. This is not
exactly optimal, because the actual amount can be much less than the
maximum (e.g. only a few KB downloaded from a device with a few MB's of
memory), and the maximum amount of memory varies greatly between
different devices (e.g. ranging from a few KB to a few MB). And to make
it even worse, for some devices the maximum amount of memory is simply
not known.
So I would like to replace this api function with something better. The
question is of course how to do this? I already have a number of ideas,
but none of them is perfect.
One idea would be to turn the "data" and "size" parameters into output
parameters (optionally wrapped in a new opaque buffer object):
device_status_t
device_dump (device_t *device,
const unsigned char *data[],
unsigned int *size);
But now you have to make a choice for the ownership for the data. Either
the library or the applications needs to own the data (to be able to
free it again).
A. If the library owns the data, the data pointer needs to remain valid
after the function had returned control to the application. Thus the
buffer will have to be stored inside the device handle, and destroyed
with the device handle (or another function call). The consequences for
the application are:
+ No need to free the returned data.
- The lifetime of the buffer is now tied to the lifetime of the device
handle. If the device handle is destroyed (or another device_* functions
is called on the same handle) the internal buffer will (or might) become
invalid. Thus if you want to preserve the data, you'll need to copy it,
but than you end up with two copies!
- The internals of the device object are leaked to the outside.
B. If the application owns the data, the library can malloc the required
amount of data and pass that pointer to the application.
+ No lifetime or leak issues.
- The application needs to free the data once it doesn't need it
anymore. But if it forget to do so, you end up with a memory leak.
- Different design compared with some other areas of my api (read further).
I have also a more highlevel api that does not download raw memory, but
individual pieces of information (dive profiles). This api is built on
top of the lower level device_read() and device_dump() functions. It's
an iterator style api, that is used like this (which is very similar to
a database cursor):
device_t *device = ...;
device_entry_t *entry = NULL;
while (device_entry_next (device, &entry) == SUCCESS) {
device_entry_get_data (entry, &data, &size);
device_entry_get_datetime (entry, &dt);
device_entry_get_fingerprint (entry, &fp_data, &fp_size);
device_entry_get_devid (entry, &id);
}
device_entry_reset (device);
In this case it makes more sense to use "library owns the data" I think.
There is no real lifetime problem because the device_entry_reset()
exists and that can easily be used to destroy the buffer before the
device handle is destroyed. Using "application owns the data" would
result in extra data copying, because internally usually a single buffer
is allocated for the data of all entries. Returning a newly allocated
buffer for each entry would result in many small malloc'ed buffers and
additional copying. Also not optimal.
Any other ideas or even an alternative design?
Thanks in advance,
Jef
Let me give some background info first. I'm writing a library to
download data from a number of external devices (dive computers). All
devices have different transfer protocols, but they can be grouped in
two categories:
1. Devices which support random access to the internal memory.
For these type of devices, my library exposes the following function to
read a block of data at a specific memory address:
device_status_t
device_read (device_t *device,
unsigned int address,
unsigned char data[],
unsigned int size);
2. Devices which support sequential access to the internal memory.
These type of devices only support downloading all the data as a single
continuous stream of data. In some cases the data stream is split into
multiple smaller packets, but you can only request each packet sequentially.
Currently I have this function to download the data:
device_status_t
device_dump (device_t *device,
unsigned char data[],
unsigned int size,
unsigned int *result);
To use this function, you need to pass a buffer that is large enough the
store the downloaded data, and you get the actual size in the "result"
parameter after the download.
Unfortunately for some devices the actual amount of data is not known in
advance and you have to assume a worst case scenario. This is not
exactly optimal, because the actual amount can be much less than the
maximum (e.g. only a few KB downloaded from a device with a few MB's of
memory), and the maximum amount of memory varies greatly between
different devices (e.g. ranging from a few KB to a few MB). And to make
it even worse, for some devices the maximum amount of memory is simply
not known.
So I would like to replace this api function with something better. The
question is of course how to do this? I already have a number of ideas,
but none of them is perfect.
One idea would be to turn the "data" and "size" parameters into output
parameters (optionally wrapped in a new opaque buffer object):
device_status_t
device_dump (device_t *device,
const unsigned char *data[],
unsigned int *size);
But now you have to make a choice for the ownership for the data. Either
the library or the applications needs to own the data (to be able to
free it again).
A. If the library owns the data, the data pointer needs to remain valid
after the function had returned control to the application. Thus the
buffer will have to be stored inside the device handle, and destroyed
with the device handle (or another function call). The consequences for
the application are:
+ No need to free the returned data.
- The lifetime of the buffer is now tied to the lifetime of the device
handle. If the device handle is destroyed (or another device_* functions
is called on the same handle) the internal buffer will (or might) become
invalid. Thus if you want to preserve the data, you'll need to copy it,
but than you end up with two copies!
- The internals of the device object are leaked to the outside.
B. If the application owns the data, the library can malloc the required
amount of data and pass that pointer to the application.
+ No lifetime or leak issues.
- The application needs to free the data once it doesn't need it
anymore. But if it forget to do so, you end up with a memory leak.
- Different design compared with some other areas of my api (read further).
I have also a more highlevel api that does not download raw memory, but
individual pieces of information (dive profiles). This api is built on
top of the lower level device_read() and device_dump() functions. It's
an iterator style api, that is used like this (which is very similar to
a database cursor):
device_t *device = ...;
device_entry_t *entry = NULL;
while (device_entry_next (device, &entry) == SUCCESS) {
device_entry_get_data (entry, &data, &size);
device_entry_get_datetime (entry, &dt);
device_entry_get_fingerprint (entry, &fp_data, &fp_size);
device_entry_get_devid (entry, &id);
}
device_entry_reset (device);
In this case it makes more sense to use "library owns the data" I think.
There is no real lifetime problem because the device_entry_reset()
exists and that can easily be used to destroy the buffer before the
device handle is destroyed. Using "application owns the data" would
result in extra data copying, because internally usually a single buffer
is allocated for the data of all entries. Returning a newly allocated
buffer for each entry would result in many small malloc'ed buffers and
additional copying. Also not optimal.
Any other ideas or even an alternative design?
Thanks in advance,
Jef