Design decisions

jacob navia

In the new "templated" form of the containers library, all
data-retrieving functions now return the data directly:

int (*GetElement)(intList *l,size_t idx);

The problem is that it is not possible to return an error
with this configuration: Since the return value is an
int, any error code could be actually some data!!

The solution is to return a pointer to the data and NULL
if there is an error, as the other functions of the library do.

But this doesn't look great with the syntax viewed from the
user's side:

Instead of:
int m = iintList.GetElement(myList,23);
you should write:
int m = *iintList.GetElement(myList,23);

risking a core dump if there is a NULL return.

Obviously if the list has only 20 elements, the core dump
is actually a good thing since it will show you the error immediately.
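A minimal sketch of the pointer-returning variant described above, for illustration only. The type IntList and the functions IntList_GetElement and IntList_GetOr are hypothetical stand-ins, not the library's actual definitions; the second function shows how a caller could avoid dereferencing a NULL return.

```c
#include <stddef.h>

/* Hypothetical stand-in for the library's int list; not its real layout. */
typedef struct {
    int   *data;
    size_t count;
} IntList;

/* Returns a pointer to the element, or NULL when idx is out of range. */
static int *IntList_GetElement(IntList *l, size_t idx)
{
    if (l == NULL || idx >= l->count)
        return NULL;
    return &l->data[idx];
}

/* Caller-side wrapper: checks for NULL instead of dereferencing blindly. */
static int IntList_GetOr(IntList *l, size_t idx, int fallback)
{
    int *p = IntList_GetElement(l, idx);
    return p ? *p : fallback;
}
```

Writing *IntList_GetElement(l, 23) directly keeps the short syntax but crashes on a bad index; the wrapper trades that crash for a caller-chosen fallback value.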

Before returning, the "GetElement" API calls the error function
and will return ONLY IF the error function returns.

I was relying on that as error handling, and just returning zero...

After some reflection I am convinced now that this would be a bad design
decision.

What do you think?

jacob
 
Andrew Cooper

In the new "templated" form of the containers library I have now that
all data retrieving functions return the data directly:

int (*GetElement)(intList *l,size_t idx);

The problem is that it is not possible to return an error
with this configuration: Since the return value is an
int, any error code could be actually some data!!

The solution is to return a pointer to the data and NULL
if there is an error, as the other functions of the library do.

But this doesn't look great with the syntax viewed from the
user's side:

Instead of:
int m = iintList.GetElement(mylist,23);
you should write:
int m = *iintList.GetElement(myList,23);

risking a core dump if there is a NULL return.

Obviously if the list has only 20 elements, the core dump
is actually a good thing since it will show you the error immediately.

Before returning, the "GetElement" API calls the error function
and will return ONLY IF the error function returns.

I was relying in that as error handling, and just returning zero...

After some reflection I am convinced now that this would be a bad design
decision.

What do you think?

jacob

(Other than the fact you appear to be trying to get C++ style code into C)

One design you could use is:

bool_t (*GetElement)(intList *l, size_t idx, int * dst);

so your code looks like:
int m;
if ( iintList.GetElement(myList, 23, &m) == FALSE )
    // Some error
else
    // All Good
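For illustration, the out-parameter design could be fleshed out roughly as follows. This is a sketch under assumptions: IntList and IntList_TryGet are hypothetical names, and C99 bool stands in for the bool_t/FALSE used above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list type, for illustration only. */
typedef struct {
    const int *data;
    size_t     count;
} IntList;

/* Stores the element in *dst and returns true.
   On a bad index (or NULL argument) returns false and leaves *dst untouched. */
static bool IntList_TryGet(const IntList *l, size_t idx, int *dst)
{
    if (l == NULL || dst == NULL || idx >= l->count)
        return false;
    *dst = l->data[idx];
    return true;
}
```

The return value carries only the status, so every int remains a legal element value.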

~Andrew
 
BartC

jacob navia said:
In the new "templated" form of the containers library I have now that all
data retrieving functions return the data directly:

int (*GetElement)(intList *l,size_t idx);

The problem is that it is not possible to return an error
with this configuration: Since the return value is an
int, any error code could be actually some data!!

The solution is to return a pointer to the data and NULL
if there is an error, as the other functions of the library do. ....
What do you think?

How does C++ deal with it?

My feeling is that GetElement should be as simple and convenient to use as
possible, especially for the 99.999% of calls (and usually 100%) where there
is no error. Perhaps have two versions, one where any errors are handled
internally (either aborting, returning an agreed default value, or
whatever), and another which has full error returns for people who insist on
checking every single list access themselves.
 
James Kuyper

How does C++ deal with it?

Of the C++ standard containers, only basic_string, array, deque, and
vector containers have a close equivalent of GetElement. The other
container types do not provide int-indexed access. The ones that do,
each have two GetElement equivalents. If Container is one of those
templates, then given the declaration:

Container<T> container;

the expressions container[idx] and container.at(idx) both have the
semantics of *(container.begin()+idx). If idx is out of range,
container[idx] has undefined behavior; container.at(idx) throws
std::out_of_range.
 
jacob navia

On 22/06/12 00:23, Gareth Owen wrote:
What do you have in mind for a well designed, non-returning error
function. Obviously one that calls abort()/exit() would work but what
else?

Obviously a longjmp combined with a prior setjmp can establish a
recovery point where all errors in a program (or in a part of the
program) would end.

If you use pooled memory access you can release the memory pool
associated with the current error, and you have the exact mechanism
of the C++ exceptions combined with cleanup by stack unwinding in
a much cheaper and efficient way.
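A rough sketch of the setjmp/longjmp recovery point combined with a memory pool, as described above. All names here (recovery_point, PoolAlloc, PoolRelease, ErrorJump, GetChecked, RunWithRecovery) are hypothetical and not part of the library, and the "pool" is just a linked list of malloc blocks. It only releases memory; other kinds of resources are not handled.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the library's API. */
static jmp_buf recovery_point;

/* Trivial pool: a linked list of blocks that are all freed together.
   The union keeps the payload suitably aligned (max_align_t is C11). */
typedef union PoolBlock {
    union PoolBlock *next;
    max_align_t      align;
} PoolBlock;

static PoolBlock *pool_head = NULL;

static void *PoolAlloc(size_t n)
{
    PoolBlock *b = malloc(sizeof *b + n);
    if (b == NULL)
        return NULL;
    b->next = pool_head;
    pool_head = b;
    return b + 1;               /* payload starts after the header */
}

static void PoolRelease(void)
{
    while (pool_head != NULL) {
        PoolBlock *next = pool_head->next;
        free(pool_head);
        pool_head = next;
    }
}

/* Error function that never returns to its caller. */
static void ErrorJump(void)
{
    longjmp(recovery_point, 1);
}

static int GetChecked(const int *data, size_t count, size_t idx)
{
    if (idx >= count)
        ErrorJump();
    return data[idx];
}

/* Returns 0 on success, 1 after recovering from an index error,
   -1 if the pool allocation itself failed. */
static int RunWithRecovery(size_t idx, int *out)
{
    if (setjmp(recovery_point) != 0) {
        PoolRelease();          /* drop everything allocated since the setjmp */
        return 1;
    }
    int *scratch = PoolAlloc(4 * sizeof *scratch);
    if (scratch == NULL)
        return -1;
    for (int i = 0; i < 4; i++)
        scratch[i] = i * 10;
    *out = GetChecked(scratch, 4, idx);
    PoolRelease();
    return 0;
}
```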
Could the error function provide a default value (e.g. out-of-bounds
access always returns a 0 value, or attempts to expand the vector and
pad with zero)?

I had this strategy but it is unworkable since you can't know in
advance if a zero will appear in the data!

Attempting to expand the vector is also a bad decision because:

1) In case the error is an index error you are just masking the error
making it more difficult to find. GetElement retrieves the element's
data but you have no data for a nonexistent element so that you fake
it. But then the rest of the software that expects that data point
will give WRONG RESULTS!

Debugging a wrong result bug is orders of magnitude more difficult
than debugging a simple crash.

2) If you get a huge index (say, 75876) you will start trying to
allocate 75876-20 elements and you could get a memory allocation
failure, making a double fault!

3) If you expand the vector you are probably overstretching the used
memory, especially if the wrong index is too big, provoking a memory
failure later in the code. Then, you will have to start figuring
out why you are using so much memory, and it could take a LONG while
until you figure out that a wrong index forced you to allocate
several megabytes of vector space...
 
Ian Collins

On 22/06/12 00:23, Gareth Owen wrote:

Obviously a longjmp combined with a prior setjmp can establish a
recovery point where all errors in a program (or in a part of the
program) would end.

If you use pooled memory access you can release the memory pool
associated with the current error, and you have the exact mechanism
of the C++ exceptions combined with cleanup by stack unwinding in
a much cheaper and efficient way.

Er, no you don't. Allocated memory isn't the only thing you may want to
clean up.

This situation is much the same as the early return cases discussed at
great length on the recent goto thread.
 
gwowen

On 22/06/12 00:23, Gareth Owen wrote:

Obviously a longjmp combined with a prior setjmp can establish a
recovery point where all errors in a program (or in a part of the
program) would end.

If you use pooled memory access you can release the memory pool
associated with the current error, and you have the exact mechanism
of the C++ exceptions combined with cleanup by stack unwinding in
a much cheaper and efficient way.

Really? Do you really believe that? What about resources that are not
memory - mutex/critical sections, network connections, open ports,
file descriptors.

How would I use your error handler to do this:

extern DAQ_source* daq;

int npieces = 100;

while(true){
    try {
        std::vector<std::vector<uint8_t> > datapts;
        for(int i=0; i<npieces; ++i){
            std::vector<uint8_t> signal;
            {
                // Acquire the next signal from DAQ
                Mutex DAQ_lock(daq); // mutex lock acquired
                signal = daq->acquire(); // can generate errors
                // mutex lock released
            }
            Filter multitap; // can throw
            multitap.process(signal); // can generate errors
            datapts.push_back(signal); // can generate errors
        }

        analyse(datapts);
    }
    catch(const std::runtime_error& e)
    {
        // all resources automatically cleaned up
        log << e.what();
        sleep(100); // retry on error
    }
}

I had this strategy but it is unworkable since you can't know in
advance if a zero will appear in the data!

But it might be desirable for a user to be able to specify such
behaviour in his error handler. For example, in signal processing it's
often useful to assume that the data-array is zero-padded-to-infinity,
so I may use a C++ object like this:

class zero_padded_array
{
private:
    std::vector<double> v_;

public:
    // Returns by value: a const double& bound to the literal 0.0 would dangle.
    double at(size_t idx) const {
        if(idx >= v_.size()){
            return 0.0;
        } else {
            return v_[idx];
        }
    }
    // Rest of interface.
};

Can I use an error handler to handle out-of-bounds access in your
library to get that behaviour?

Attempting to expand the vector is also a bad decision because:

1) In case the error is an index error you are just masking the error
   making it more difficult to find. GetElement retrieves the element's
   data but you have no data for a nonexistent element so that you fake
   it. But then the rest of the software that expects that data point
   will give WRONG RESULTS!

You misunderstand me. My point is about insertion, not retrieval.

e.g. In Matlab I can say:

vec = [1 2 4 8];
vec(6) = 9; % inserting out-of-bounds automatically expands the vector
% vec is now [1 2 4 8 0 9];
x = vec(9); % retrieving out-of-bounds is an error


class resizing_array
{
private:
    std::vector<double> v_;

public:
    // Non-const access grows the vector; new elements are zero-initialized.
    double& at(size_t idx) {
        if(idx >= v_.size()){
            v_.resize(idx+1);
        }
        return v_[idx];
    }
    // Rest of interface....
};

Can I use an error handler to get that behaviour? Exceptions make it
easy.
2) If you get a huge index (say, 75876) you will start trying to
    allocate 75876-20 elements and you could get a memory allocation
    failure, making a double fault!

I regularly create C++ matrices with far in excess of 76000 elements -
occasionally even as temporaries - knowing that properly designed
exceptions make memory allocation failure easy to handle. And since
these are designed to run on 64-bit processors, I know that memory
failure is actually unlikely.
3) If you expand the vector you are probably overstretching the used
    memory..

It works really well in matlab and C++. Exceptions make it easy.
 
jacob navia

On 22/06/12 10:00, gwowen wrote:
Really? Do you really believe that? What about resources that are not
memory - mutex/critical sections, network connections, open ports,
file descriptors.

How would I use your error handler to do this:

[snip code]


Very easy. Just use a pushdown stack. Each time you acquire a
mutex lock you push it in the stack, each time you release it you pop it.

If an exception occurs, you pop the whole mutex stack. This would
require just a few lines more of code.
 
jacob navia

On 22/06/12 10:00, gwowen wrote:
I had this strategy but it is unworkable since you can't know in
But it might be desirable for a user to be able to specify such
behaviour in his error handler. For example, in signal processing its
often useful to assume that the data-array is zero-padded-to-infinity,
so I may use a C++ object like this:

class zero_padded_array
{
private:
    std::vector<double> v_;

public:
    // Returns by value: a const double& bound to the literal 0.0 would dangle.
    double at(size_t idx) const {
        if(idx >= v_.size()){
            return 0.0;
        } else {
            return v_[idx];
        }
    }
    // Rest of interface.
};

Can I use an error handler to handle out-of-bounds access in your
library to get that behaviour?

Yes.

Method 1: Modify a vector's VTable (long)

You write a new "GetElement" function:


typedef void *(*fnPtr)(Vector *,size_t);

fnPtr oldGetElementFunction;

static double error_return = 0.0;

static double *MyGetElement(doubleVector *v,size_t idx)
{
    if (idx >= iVector.Size(v))
        return &error_return;
    // Call the original function;
    return oldGetElementFunction(v,idx);
}

Now, when you want to create an infinite vector you write:

Vector *CreateExtensibleVector(size_t realSize)
{
    VectorInterface *intf;
    Vector *result = idoubleVector.Create(20); // 20 real elements
    // Save the original function to be used later
    oldGetElementFunction = result->VTable->GetElement;
    intf = malloc(sizeof(VectorInterface));
    memcpy(intf,result->VTable,sizeof(VectorInterface));
    result->VTable = intf;
    // Replace the GetElement in THIS vector
    result->VTable->GetElement = MyGetElement;
    result->VTable->Finalize = MyFinalize;
    return result;
}

Instead of writing iVector.GetElement you use the macro:

#define iVector_GetElement(vector,idx) \
    ((vector)->VTable->GetElement((vector),(idx)))

That will work with normal vectors AND with extensible ones.
Note that the trivial new Finalize procedure is skipped. It just frees
the copied VTable allocated with malloc.

True, this approach is not as easy as the C++ one, but it has the
advantage of being much more flexible. You can subclass before
or after calling the original procedure, or you can bypass the
called procedure entirely.

Method 2: Shorter
Make ALL vectors extensible:
Vector *CreateExtensibleVector(size_t realSize)
{
    Vector *result = idoubleVector.Create(20); // 20 real elements
    // Save the original function to be used later
    oldGetElementFunction = result->VTable->GetElement;
    // Replace the GetElement in ALL vectors
    iVector.GetElement = MyGetElement;
    return result;
}
-----------------------------------------------------------------------------

Note that I haven't answered your original question:
Can I use an error handler to handle out-of-bounds access in your
library to get that behaviour?

I think you have a point here. I will modify the behavior of the
GetElement function so that it RETURNS the value returned by the
error handling procedure (if that procedure returns of course).

The return value of the error handling function should be a pointer
to a replacement data, or NULL or whatever.
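As a sketch of what that change could look like (not the library's actual code): GetElement forwards whatever the handler returns. DoubleVec, BoundsHandler, on_out_of_bounds and ZeroPadHandler are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Hypothetical types; the real library's layout and names may differ. */
typedef struct DoubleVec DoubleVec;
typedef double *(*BoundsHandler)(DoubleVec *v, size_t idx);

struct DoubleVec {
    double       *data;
    size_t        count;
    BoundsHandler on_out_of_bounds;   /* may be NULL */
};

/* On a bad index, return whatever the error handler returns
   (a pointer to replacement data, or NULL). */
static double *DoubleVec_GetElement(DoubleVec *v, size_t idx)
{
    if (idx < v->count)
        return &v->data[idx];
    return v->on_out_of_bounds ? v->on_out_of_bounds(v, idx) : NULL;
}

/* Handler giving "zero padded to infinity" reads. */
static double zero_pad;

static double *ZeroPadHandler(DoubleVec *v, size_t idx)
{
    (void)v;
    (void)idx;
    zero_pad = 0.0;   /* reset in case a caller wrote through the pointer */
    return &zero_pad;
}
```

With no handler installed, the caller still gets NULL for a bad index; with ZeroPadHandler installed, out-of-bounds reads yield 0.0.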

Thanks for your input. Really appreciated.

jacob
 
Neil Cerutti

In the new "templated" form of the containers library I have
now that all data retrieving functions return the data
directly:

int (*GetElement)(intList *l,size_t idx);
...
What do you think?

Taking a step back, why not just return garbage for bad indices?
That seems more like what a C programmer would expect.
 
gwowen

Taking a step back, why not just return garbage for bad indices?
That seems more like what a C programmer would expect.

Certainly "undefined behaviour", rather than returning garbage is the
most C-like thing to do. It "works" for arrays.
 
Ben Bacarisse

Neil Cerutti said:
Taking a step back, why not just return garbage for bad indices?
That seems more like what a C programmer would expect.

In some contexts, that's a good solution. I remember implementing a
simple scripting language that had typed containers. The 'get' and
'put' operations acted on pointers, but 'get' never returned a NULL
pointer. Every container always had a spare element, a pointer which
was returned by get for out-of-bounds indexes, and whose value would be
used for dereferenced out-of-bounds accesses.

The case is more murky for a C container API, but since Jacob also
provides the generic (untyped) interface, I think returning junk (zero?)
could be justified. Programs that need to know every error condition
can use the full interface.

Jacob might also want to consider having a dummy element. The most
efficient plan would probably be to have a single dummy element for each
size (properly aligned for all accesses of course). An erroneous lookup
would return this pointer rather than NULL. It could be tested for,
just like, NULL, but it would be valid to dereference it. Such a scheme
can simplify things a bit, but I'm not familiar enough with the existing
API to be sure.
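A minimal sketch of the dummy-element idea for a list of int, assuming one shared spare object per element type. IntList, int_dummy, IntList_Get and IntList_IsDummy are hypothetical names, not part of the existing API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list type, for illustration only. */
typedef struct {
    int   *data;
    size_t count;
} IntList;

/* Shared spare element; its value is junk as far as callers are concerned. */
static int int_dummy;

/* Never returns NULL: out-of-bounds lookups yield the dummy element. */
static int *IntList_Get(IntList *l, size_t idx)
{
    return idx < l->count ? &l->data[idx] : &int_dummy;
}

/* Callers that care can test for the dummy just as they would test for NULL. */
static bool IntList_IsDummy(const int *p)
{
    return p == &int_dummy;
}
```

Dereferencing the result is always valid, so int m = *IntList_Get(l, 23); cannot crash, while careful callers can still detect the error.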
 
David Resnick

Taking a step back, why not just return garbage for bad indices?
That seems more like what a C programmer would expect.

It's not even bad advice to declare it "undefined" to use an out-of-bounds index. Bear in mind that you are only checking one of the two parameters anyway: passing in an invalid pointer to an intList would also be bad and (unless NULL is passed) not readily subject to validation. Not that I'm against bounds checking in the library; it's just not totally clear that it needs to be the library's problem.

For ints, you could always go the strtol route, namely return a value that might be good or bad (e.g. LONG_MAX) on specified errors and set errno. Users would have to check errno on the sentinel return. The errno mechanism isn't so great, but, well, there you go, and it avoids cluttering up your interface with a pointer argument to use to return your value/error status...
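A sketch of that strtol-style convention applied to an int list. The names IntList and IntList_GetElement are hypothetical, and the choice of INT_MAX as sentinel and ERANGE as error code is an assumption made for the example.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical list type, for illustration only. */
typedef struct {
    const int *data;
    size_t     count;
} IntList;

/* Returns the element. On a bad index, returns INT_MAX and sets errno
   to ERANGE. errno is left untouched on success, as with strtol. */
static int IntList_GetElement(const IntList *l, size_t idx)
{
    if (idx >= l->count) {
        errno = ERANGE;
        return INT_MAX;
    }
    return l->data[idx];
}
```

Because INT_MAX can also be real data, a caller has to set errno = 0 before the call and look at errno only when the sentinel comes back.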

That said, my personal inclination if I wanted to use C with templates would be to use a C++ compiler and specify that a restricted set of features is desired, i.e., in this case, templates (yes, yes, with the caveat that C++ isn't quite a superset of C). It would be nice if such restrictions or policies could be enforced at the compiler level, mind you, or before you can say "multiple inheritance", other features you may or may not want to allow would be in use.
 
Neil Cerutti

Certainly "undefined behaviour", rather than returning garbage
is the most C-like thing to do. It "works" for arrays.

Returning garbage invokes undefined behavior, surely.
 
jacob navia

On 22/06/12 18:18, Gareth Owen wrote:
Which is, of course, precisely C++ exceptions give you for (almost) free.


Which is fine if *all* your resources are mutexes. Again, if you have
nested resources that must be acquired and released in precisely the
correct order, everything gets a little more complicated.

Maybe, depends on what you have in mind. In any case the unwinding of
the stack done by C++ is just that: stack unwinding and you can
simulate that with a generic stack without any big problems. Of course
it is a bit more work for you but you do NOT pay for an
enormous complexity increase!

I guess you just have to use a double stack of function pointers / data
pointers that get called in reverse order?

Not really. A generic stack using generic pointers prefixed by an opcode
would suffice:

#define MUTEX 1
#define FILE_TO_CLOSE 2
#define OBSERVER_TO_FREE 3

typedef struct tagStackInfo {
    int opcode;
    void *data;
} StackUnwindInfo;

You push that, giving the opcode and the pointer (FILE, MUTEX, whatever).
The unwind procedure acts given the opcode and closes the file, releases
the mutex, or breaks the binding of the observer procedure.
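One possible shape for such a cleanup stack, as a sketch only. The function names pushCleanup and UnwindCleanupStack follow the names used later in this thread; the fixed-size array, the MEMORY_TO_FREE and FLAG_TO_CLEAR opcodes, and the return codes are assumptions made here so the example stays self-contained. A real version would add a case that unlocks a mutex.

```c
#include <stdio.h>
#include <stdlib.h>

/* Opcodes chosen for this sketch; FLAG_TO_CLEAR stands in for a lock release. */
enum { FILE_TO_CLOSE = 1, MEMORY_TO_FREE = 2, FLAG_TO_CLEAR = 3 };

typedef struct {
    int   opcode;
    void *data;
} StackUnwindInfo;

#define CLEANUP_MAX 64
static StackUnwindInfo cleanup_stack[CLEANUP_MAX];
static size_t cleanup_top = 0;

/* Records a resource to release later. Returns 0, or -1 if the stack is full. */
static int pushCleanup(int opcode, void *data)
{
    if (cleanup_top == CLEANUP_MAX)
        return -1;
    cleanup_stack[cleanup_top].opcode = opcode;
    cleanup_stack[cleanup_top].data = data;
    cleanup_top++;
    return 0;
}

/* Releases everything in reverse order of acquisition. */
static void UnwindCleanupStack(void)
{
    while (cleanup_top > 0) {
        StackUnwindInfo *e = &cleanup_stack[--cleanup_top];
        switch (e->opcode) {
        case FILE_TO_CLOSE:  fclose(e->data);      break;
        case MEMORY_TO_FREE: free(e->data);        break;
        case FLAG_TO_CLEAR:  *(int *)e->data = 0;  break;
        }
    }
}
```

Nothing here runs automatically: every acquisition must be paired with a pushCleanup call by hand, and the error path (for instance the code after a longjmp) must call UnwindCleanupStack itself.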

NOTE:

This is C, not C++. The philosophy is that you do NOT rely on an
incredibly complex compiler/runtime that nobody fully understands.
You rely on your own skills.

Unsatisfied?

Use C++, it is a great language!
Any chance of an example?

Well, pushing and popping from a stack is not really that complicated.
 
Jens Gustedt

Hello,

On 21.06.2012 22:19, jacob navia wrote:
In the new "templated" form of the containers library I have now that
all data retrieving functions return the data directly:

int (*GetElement)(intList *l,size_t idx);

The problem is that it is not possible to return an error
with this configuration: Since the return value is an
int, any error code could be actually some data!!

The solution is to return a pointer to the data and NULL
if there is an error, as the other functions of the library do.

But this doesn't look great with the syntax viewed from the
user's side:

Instead of:
int m = iintList.GetElement(mylist,23);
you should write:
int m = *iintList.GetElement(myList,23);

risking a core dump if there is a NULL return.

Obviously if the list has only 20 elements, the core dump
is actually a good thing since it will show you the error immediately.

Before returning, the "GetElement" API calls the error function
and will return ONLY IF the error function returns.

I was relying in that as error handling, and just returning zero...

After some reflection I am convinced now that this would be a bad design
decision.

What do you think?

I would go for a solution similar to what some (already) standard
functions do when they could return optional information. Have an
additional pointer argument to store that information. If that pointer
argument is 0, the user isn't interested.

(You could combine that with techniques that provide the feature of
optional arguments to C functions/macros in a way that most users
wouldn't actually have to know that this optional parameter exists.)
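A sketch of the optional-pointer approach. IntList, IntList_GetElementEx and the wrapper macro IntList_GetElement are hypothetical names, and returning 0 on a bad index is an assumption of this example.

```c
#include <stddef.h>

/* Hypothetical list type, for illustration only. */
typedef struct {
    const int *data;
    size_t     count;
} IntList;

/* Returns the element, or 0 on a bad index. If err is not NULL,
   *err is set to 0 on success and 1 on a bad index. */
static int IntList_GetElementEx(const IntList *l, size_t idx, int *err)
{
    int failed = (idx >= l->count);
    if (err != NULL)
        *err = failed;
    return failed ? 0 : l->data[idx];
}

/* Convenience form for callers who are not interested in the error. */
#define IntList_GetElement(l, idx) IntList_GetElementEx((l), (idx), NULL)
```

Most callers would write int m = IntList_GetElement(&list, 23); and never see the extra parameter, while those who need the status pass a pointer.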

Jens
 
Ian Collins

On 22/06/12 18:18, Gareth Owen wrote:

Maybe, depends on what you have in mind. In any case the unwinding of
the stack done by C++ is just that: stack unwinding and you can
simulate that with a generic stack without any big problems. Of course
it is a bit more work for you but you do NOT pay for an
enormous complexity increase!

The complexity in both cases will be equal. In your case the complexity
has to be managed manually by the programmer; in the C++ case it is
managed automatically by the compiler.

The complexity becomes really hard to manage manually when the jump
target is several layers up the call stack.
 
jacob navia

On 22/06/12 22:44, Ian Collins wrote:
The complexity in both cases will be equal. In your case the complexity
has to be managed manually by the programmer; in the C++ case it is
managed automatically by the compiler.


Sure, but that "automatic" management by an incredibly complex
compiler has a price in difficulty of learning, understanding and using
all that machinery.

The complexity becomes really hard to manage manually when the jump
target is several layers up the call stack.

No, since it is conceptually simple: a stack. For each pointer that you
need to free/release whatever you add an opcode and a pointer. You push
that into the stack. It is very easy.
 
Ian Collins

On 22/06/12 22:44, Ian Collins wrote:

Sure, but that "automatic" management by an incredibly complex
compiler has a price in difficulty of learning, understanding and using
all that machinery.

If you find

struct ManageSomething
{
    Something thing;

    ManageSomething( Something s ) : thing(s) {}
    ~ManageSomething() { release( thing ); }
};

difficult to understand, I'd be worried.

I guess at some time in the past, our distant ancestors found wheels
complex, but they sure beat pushing things along the ground. Developing
technical solutions to repeating problems is what we do.
No, since it is conceptually simple: a stack. For each pointer that you
need to free/release whatever you add an opcode and a pointer. You push
that into the stack. It is very easy.

Easy, tedious, repetitive and error prone. Isn't the normal
programmer's reaction to a tedious, repetitive and error prone problem
automation?
 
jacob navia

On 23/06/12 00:21, Ian Collins wrote:
If you find

struct ManageSomething
{
    Something thing;

    ManageSomething( Something s ) : thing(s) {}
    ~ManageSomething() { release( thing ); }
};

difficult to understand, I'd be worried.

It is difficult to understand, maybe because "s" is undefined :)

ManageSomething(Something s) looks like a function call but probably
isn't one since you are declaring a struct, so it must be a declaration
of a constructor. Yes, and then a further function call (or similar)
separated by a colon. Mmmm I have seen that sometimes in C++ code
and "thing" is a "Something" so that must be a function call?

Or not?

You can be worried, I do not parse all that and I do not care actually
to parse that. You like C++, Ian, go ahead, it is a GREAT language when
you dedicate 50% of your memory space (brain memory space I mean) to
memorizing by heart all the fine points of a language that has no rules
but a "tradition" that you must also learn by heart.

Googling around I find
http://www.cplusplus.com/forum/beginner/4124/
where "Initializer lists" are explained and it should mean that
the constructor of "ManageSomething" will be passed "Something"
that will initialize "thing" to the passed parameter "s".

Why that?

Because it can save the call to the default constructor. OK.
And why I should learn all that?

I do not care, I like a language with LESS syntax, with LESS
complexity.

If you do not use initializer lists this can happen:

MyClass::MyClass( const std::string& s1, const std::string& s2 )
{
    string1 = s1;
    string2 = s2;
}

1. std::string default constructor is called to default construct string1;
2. std::string default constructor is called to default construct string2;
3. std::string assignment operator is called to assign s1 to string1;
4. std::string assignment operator is called to assign s2 to string2.

With initializer lists there are only two extra function calls
"by using an initializer list, you are telling the compiler that instead
of calling the default constructor for string1 and string2 it can
instead call the copy constructor, which means only two function calls
instead of four."

BIG PROGRESS.

Ian, go ahead and keep C++. It is a great language but for people
outside its scope it looks like an abomination come true.

I guess at some time in the past, our distant ancestors found wheels
complex, but they sure beat pushing things along the ground. Developing
technical solutions to repeating problems is what we do.

Sure, sure, and then somebody invented square wheels and all people
but some die hards cried GREAT!

SQUARE WHEELS that can climb up the stairs instead of just
rolling around. They are our "BETTER WHEELS".

And everybody started using those. Cars bumped horribly, the ride was
awful but... PROGRESS IS PROGRESS cried the crowd.

And then came a "solution to the bumping problem" proposed by the square
wheels people: a new motor would push down and up the carriage
automatically so that the passengers did not feel the square wheels.

And now really everybody was happy: fuel consumption skyrocketed but
that was a small hardware problem that got pushed beneath the rug.

:)


Easy, tedious, repetitive and error prone. Isn't the normal
programmer's reaction to a tedious, repetitive and error prone problem
automation?

Yes. Up to 30% or more of space are now taken by tables describing the
stack movements to the unwinding virtual machine that the C++ run time
needs.

But this makes no sense, Ian. Keep C++. I will go on developing simple
solutions to simple problems. A stack is a stack, and I prefer pushing
the things to cleanup into a stack with:

pushCleanup( Opcode,pointer);

each time I need to cleanup something, and then I just write:

UnwindCleanupStack();

and I am done. And I write pushCleanup() and Unwind once and I can
use them again and again.
 
