What's your experience with optional values?


DeMarcus

Hi,

In databases you provide NULL if the value is missing. When using
objects on the free store in C++ you can use NULL to represent a missing
value. When using non-pointer values you may not.

I know several ways to solve this, but what I'm looking for is a
/uniform/ way to deal with missing values both for non-pointer values
and free store values.

When dealing with non-pointer values I can do the following.
boost::optional<int> getMyInt();

When dealing with free store values I can do the following.
std::unique_ptr<int> getMyInt();

However, I would like to have something uniform for my whole
application. Something like.
my::opt<int> getMyInt();
my::opt<int*> getMyInt();
my::opt<std::unique_ptr<int>> getMyInt();

where my::opt works like boost::optional for non-pointer values and as a
transparent container if a pointer (to avoid the unnecessary check for
the object since we already can check if the pointer is NULL).

With some hard work I could probably come up with some template
solution, but my question is:

* Does this look intuitive to you? Would this uniformity add to the
understanding and consistency of the code?


I could also just use std::unique_ptr everywhere I need optional values,
but with big structures of data it would be a performance hit to
allocate memory for each non-pointer member.

The simplest solution would be to just go for boost::optional when using
non-pointer types and std::unique_ptr for pointer types, but I thought
there could be value in having a uniform name. Do you agree with me?
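
A minimal sketch of what such a my::opt might look like, assuming C++17 and using std::optional as a stand-in for boost::optional. The member names has_value() and get() are illustrative choices, not taken from any library:

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <utility>

namespace my {

// Primary template: non-pointer types defer to std::optional, which
// stores a separate "engaged" flag next to the value.
template <typename T>
class opt {
public:
    opt() = default;  // missing
    opt(T value) : value_(std::move(value)) {}
    bool has_value() const { return value_.has_value(); }
    const T& get() const { assert(has_value()); return *value_; }
private:
    std::optional<T> value_;
};

// Raw pointers: the null pointer already encodes "missing", so no
// extra flag is stored or checked.
template <typename T>
class opt<T*> {
public:
    opt() = default;
    opt(T* p) : ptr_(p) {}
    bool has_value() const { return ptr_ != nullptr; }
    T& get() const { assert(has_value()); return *ptr_; }
private:
    T* ptr_ = nullptr;
};

// Owning pointers: same idea, the unique_ptr itself is the flag.
template <typename T>
class opt<std::unique_ptr<T>> {
public:
    opt() = default;
    opt(std::unique_ptr<T> p) : ptr_(std::move(p)) {}
    bool has_value() const { return ptr_ != nullptr; }
    T& get() const { assert(has_value()); return *ptr_; }
private:
    std::unique_ptr<T> ptr_;
};

// The pointer specialization carries no overhead beyond the pointer.
static_assert(sizeof(opt<int*>) == sizeof(int*), "no extra flag expected");

}  // namespace my
```

With this shape, get() yields the int in all three cases, which is the uniformity asked about; whether that is worth the extra template machinery is exactly the open question.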


Thanks,
Daniel
 
Öö Tiib

What sort of uniformity do you want? There is a difference between
something being empty, unknown, failed/invalid (as in fallible) or
missing (as in optional).
When dealing with non-pointer values I can do the following.
boost::optional<int> getMyInt();

Is that a member function declaration? It may lead to the
get-and-get-getted idiom:

int intFromT = t.getMyInt().get();
When dealing with free store values I can do the following.
std::unique_ptr<int> getMyInt();

However, I would like to have something uniform for my whole
application. Something like.
my::opt<int> getMyInt();
my::opt<int*> getMyInt();
my::opt<std::unique_ptr<int>> getMyInt();
where my::opt works like boost::optional for non-pointer values and as a
transparent container if a pointer (to avoid the unnecessary check for
the object since we already can check if the pointer is NULL).

With some hard work I could probably come up with some template
solution, but my question is:

* Does this look intuitive to you? Would this uniformity add to the
understanding and consistency of the code?

This feels over-engineered. A lot of classes already have invalid or
exceptional states built in, and those are more likely to do the right
thing in context. So there is a danger that by making them all uniform
you lose valuable context information (is it empty, unknown, not
available, failed, invalid or missing?).
I could also just use std::unique_ptr everywhere I need optional values,
but with big structures of data it would be a performance hit to
allocate memory for each non-pointer member.

The simplest solution would be to just go for boost::optional when using
non-pointer types and std::unique_ptr for pointer types, but I thought
it could be a value having a uniform name. Do you agree with me?

When I want uniform semantics for different types, I write a
family of function overloads, something like bool IsMissing( T
const& ), that expresses such a state.
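
A rough sketch of such an overload family, assuming C++17. Treating a NaN double as "missing" is an assumption made only for this example, and countMissing() and usage() are hypothetical helpers:

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <optional>

// Raw pointers: a null pointer means "missing".
template <typename T>
bool IsMissing(const T* p) { return p == nullptr; }

// Owning pointers: same in-band convention.
template <typename T>
bool IsMissing(const std::unique_ptr<T>& p) { return p == nullptr; }

// Optional values: the out-of-band flag says whether a value is present.
template <typename T>
bool IsMissing(const std::optional<T>& o) { return !o.has_value(); }

// Assumption for this sketch only: a NaN double stands for "missing".
inline bool IsMissing(double d) { return std::isnan(d); }

// Generic code can now ask the same question of any of these types.
template <typename T>
int countMissing(const T& a, const T& b) {
    return (IsMissing(a) ? 1 : 0) + (IsMissing(b) ? 1 : 0);
}

inline void usage() {
    int x = 1;
    int* none = nullptr;
    assert(!IsMissing(&x) && IsMissing(none));
    assert(IsMissing(std::optional<int>{}));
    assert(IsMissing(std::unique_ptr<int>{}));
    assert(IsMissing(std::nan("")) && !IsMissing(1.5));
}
```

The storage stays whatever each type already uses; only the question "is it missing?" gets a uniform spelling.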
 
DeMarcus

What sort of uniformity you want? There is difference when something
is empty, unknown, failed/invalid (as fallible) or missing (as
optional).

A very good point actually if one wants to be truly consistent. When
looking at the Motivation for boost::optional they mention that failing
values should probably be signaled as an exception.

http://www.boost.org/doc/libs/1_45_0/libs/optional/doc/html/index.html#optional.motivation

You mention several other situations. I see them as this (please give
your comments if you don't agree).

* Empty - That's a valid value, like an empty string. Dangerous for all
entries where an empty string doesn't have a clear meaning. If it does
not, it should be caught and signaled with an exception.

* Failed, invalid - Should be signaled with an exception.

* Unknown, missing - Should be signaled in a special way, like NULL for
pointers and some other way for non-pointers.

Now, since you spotted exactly the things that go around in my head,
my question is: do you think it would be important to be able to
distinguish between Unknown and Missing, and maybe even be able to say
that a value is Empty? Or can we group them together: Unknown, Missing,
Uninitialized, Not available, Not applicable, Empty, all represented
with NULL?

It is member function declaration? It may result with get-and-get-
getted idiom:

int intFromT = t.getMyInt().get();

Yes, it has to be that way.
This feels over-engineered. Lot of classes have invalid or exceptional
states anyway in them and it is more likely that they do right thing
in context. So there is a danger that by making them all uniform you
lose valuable context information (is it empty, unknown, not
available, failed, invalid or missing).

You have a point, but I don't know how to proceed. I find it useful to
be able to say that an integer is uninitialized or unset. To me, that
would be represented with NULL. But if you say that it would be valuable
to be able to distinguish between e.g. Not available and unknown, then I
need to rethink the design.
When i want uniform semantics for different types then i write a
family of function overloads. Something like bool IsMissing( T
const& ) that expresses such state.

But where do you store that information? Let's say you have a struct
like this.

struct MyData
{
std::string name;
std::string address;
int age;
double height;
int weight;
};

How would you distinguish missing, invalid, empty, etc. values in this
struct? And how would you implement the signaling of those?


Thanks,
Daniel
 
Öö Tiib

A very good point actually if one wants to be truly consistent. When
looking at the Motivation for boost::optional they mention that failing
values should probably be signaled as an exception.

http://www.boost.org/doc/libs/1_45_0/libs/optional/doc/html/index.htm...

It depends on the nature of the failure. An exception is an active and
expensive way to give up in the hope that someone further up will
resolve it. For example: a "strategy for doing something" was requested;
no such strategy was found, or it was proven to be impossible. The
responder may throw, or it may return a "strategy of doing nothing". In
a lot of cases the requester has nothing else to do anyway (and if it
has, it can always reconsider on receiving the do-nothing response), so
why throw?
You mention several other situations. I see them as this (please give
your comments if you don't agree).

* Empty - That's a valid value, like an empty string. Dangerous for all
entries where an empty string doesn't have a clear meaning. If it does
not, it should be caught and signaled with an exception.

* Failed, invalid - Should be signaled with an exception.

* Unknown, missing - Should be signaled in a special way, like NULL for
pointers and some other way for non-pointers.

Now, since you spotted exactly the things that goes around in my head,
my question is; do you see it would be important to be able to
distinguish between Unknown and Missing, and maybe even be able to say
that a value is Empty? Or can we group them together; Unknown, Missing,
Uninitialized, Not available, Not applicable, Empty, all represented
with NULL?

Whether distinguishing makes sense depends on the problem domain and
the nature of the objects. A missing bag is very different from an empty
bag. Additionally there is the "null object" pattern: a real object that
does nothing on any request. Very useful sometimes. Andy gave a good
example elsewhere in the thread: floating point NaN. You can add,
subtract, or multiply it as much as you want; it is still a NaN.

[...]
But where do you store that information? Let's say you have a struct
like this.

struct MyData
{
    std::string name;
    std::string address;
    int age;
    double height;
    int weight;

};

How would you distinguish missing, invalid, empty, etc. values in this
struct? And how would you implement the signaling of those?

You think too much about data. You even named it "Data". For me the
behavior of an object is a far more important aspect than its data. So
to any other object in a system designed by me, it is an interface
exposed to them and not a pile of data. The interface may be abstract,
and references to all missing MyDatas in the system may point at a
single immutable MissingMyData that behaves in all situations as a
missing one should. When that is the case for MyData, it can be
distinguished by memory address:

bool IsMissing( MyData const& data )
{
return (&data == &MissingMyData);
}

<Nitpick> "age of something" is usually considered as time-span
between "start of it" and "now" so if "now" ever advances then "age"
should also increase. </Nitpick>
 
DeMarcus

On 27/12/2010 11:04, DeMarcus wrote:

</snip>

Can I draw a comparison with floating point? There are lots of NaN
values (Not A Number) - and it can be important to distinguish them.

Andy

That's a good point! How do you handle NaN? In my opinion these things
are important to have a uniform way to handle in your framework.
Otherwise all programmers in your company come up with their own
solutions for their particular task, reinventing the wheel every time.
 
Öö Tiib

For example, your responsibility is to record the results of some sort
of measurement 100 times per second, every 10 milliseconds. When your
sensors fail you record NaN. You don't interrupt your work with an
exception, because *you* are fine. It is the responsibility of others to
analyze the records. When such failures happen only once per second,
there are still 99 good data points per second and your measurements can
be considered very reliable. When your sensors are broken or have gone
offline, there is a run of NaNs and the analyzers raise alarms. You
still continue recording until explicitly told to stop. If you instead
filtered out the noise, you would also have to record the time together
with each measurement, and that would double the data storage
requirement.
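
The recording scheme above might be sketched like this. Recorder, longestNaNRun() and usage() are hypothetical names for illustration; the fixed 10 ms tick is what lets the sample index stand in for the timestamp:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical recorder: one sample per fixed tick, so the index alone
// identifies the time and no timestamp needs to be stored.
class Recorder {
public:
    // Store the reading, or NaN when the sensor failed on this tick.
    void record(bool sensorOk, double reading) {
        samples_.push_back(sensorOk
                               ? reading
                               : std::numeric_limits<double>::quiet_NaN());
    }
    const std::vector<double>& samples() const { return samples_; }
private:
    std::vector<double> samples_;
};

// Analyzer side: length of the longest run of consecutive NaN samples.
// A short run is an occasional dropout; a long run suggests a dead sensor.
inline std::size_t longestNaNRun(const std::vector<double>& samples) {
    std::size_t longest = 0;
    std::size_t current = 0;
    for (double s : samples) {
        current = std::isnan(s) ? current + 1 : 0;
        if (current > longest) longest = current;
    }
    return longest;
}

inline void usage() {
    Recorder r;
    r.record(true, 20.5);
    r.record(false, 0.0);  // single dropout, stored as NaN
    r.record(true, 20.7);
    assert(r.samples().size() == 3);
    assert(longestNaNRun(r.samples()) == 1);
}
```

The recorder never throws on a bad reading; deciding when a run of NaNs is alarming is left to whoever analyzes the samples.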
 
DeMarcus

I think we speak the same language. My point is not to decide what's a
NaN, an Empty entry or Missing Value, and which should raise an
exception or not.

What I'm striving to solve is how to store that second level
information. We can't store NaN in a double saying that -999.9 is NaN,
and we can't say that -1 in an int is a missing value.

So my question remains; is there a nice and uniform way of storing this?
We could use a std::pair for everything with pair.first giving the
signal (NaN, Missing, etc.) and pair.second giving the value.

I could probably come up with something, but I was just wondering if the
community had had similar experience and how it's been solved in
different contexts.
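
To make the std::pair idea concrete, here is a small sketch. The Signal enumerators and the helper names present(), absent() and hasValue() are hypothetical; the thread does not settle on a list of signals:

```cpp
#include <cassert>
#include <utility>

// Hypothetical set of signals for illustration only.
enum class Signal { Present, Missing, NotApplicable, Invalid };

// first = signal, second = value (meaningful only when first == Present).
template <typename T>
using Signaled = std::pair<Signal, T>;

template <typename T>
Signaled<T> present(T value) {
    return {Signal::Present, std::move(value)};
}

template <typename T>
Signaled<T> absent(Signal why) {
    assert(why != Signal::Present);
    return {why, T{}};  // requires T to be default-constructible
}

template <typename T>
bool hasValue(const Signaled<T>& s) {
    return s.first == Signal::Present;
}

inline void usage() {
    Signaled<int> age = present(42);
    Signaled<double> height = absent<double>(Signal::Missing);
    assert(hasValue(age) && age.second == 42);
    assert(!hasValue(height) && height.first == Signal::Missing);
}
```

One visible cost of the pair approach is that second always holds some value, so nothing stops a caller from reading it without looking at first.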
 
DeMarcus

Depends on nature of failure. Exception is active and expensive way to
give up in hope that someone up there will resolve it. For example:
Requested was a "strategy for doing something"; such was not found or
proven was that it is impossible; responder may throw or may return a
"strategy of doing nothing". On lot of cases requester has nothing to
do anyway (and if it has then it can always reconsider on response to
do nothing) so why to throw it?

You're right. The programmer has to decide whether a throw is appropriate.
You mention several other situations. I see them as this (please give
your comments if you don't agree).

* Empty - That's a valid value, like an empty string. Dangerous for all
entries where an empty string doesn't have a clear meaning. If it does
not, it should be caught and signaled with an exception.

* Failed, invalid - Should be signaled with an exception.

* Unknown, missing - Should be signaled in a special way, like NULL for
pointers and some other way for non-pointers.

Now, since you spotted exactly the things that goes around in my head,
my question is; do you see it would be important to be able to
distinguish between Unknown and Missing, and maybe even be able to say
that a value is Empty? Or can we group them together; Unknown, Missing,
Uninitialized, Not available, Not applicable, Empty, all represented
with NULL?

If distinguishing makes sense or not depends on problem domain and
objects nature. Missing bag is very different from empty bag.
Additionally there is saturated "null object" pattern that is real
object that does nothing on any request. Very useful sometimes. Andy
gave a good example else thread ... floating point NaN. You add,
subtract, or multiply it as much as you want, it is still a NaN.

[...]
But where do you store that information? Let's say you have a struct
like this.

struct MyData
{
std::string name;
std::string address;
int age;
double height;
int weight;

};

How would you distinguish missing, invalid, empty, etc. values in this
struct? And how would you implement the signaling of those?

You think too much about data. You even name it as "Data". For me
behavior of object is way more important aspect than data. So for any
other object in system designed by me it is an interface exposed to
them and not a pile of data. Interface may be abstract and references
to all missing MyDatas in system may point at single immutable
MissingMyData that behaves in all situations like missing one should.
When it is so on case of MyData then it can be distinguished by memory
address:

bool IsMissing( MyData const& data )
{
return (&data ==&MissingMyData);
}

Hm, interesting. Do you define MissingMyData as a global constant?
Anyway, still, with this solution you force all MyData to be pointers,
right?
 
Bart van Ingen Schenau

I think we speak the same language. My point is not to decide what's a
NaN, an Empty entry or Missing Value, and which should raise an
exception or not.

What I'm striving to solve is how to store that second level
information. We can't store Nan in a double saying that -999.9 is NaN,
and we can't say that -1 in an int is a missing value.

So my question remains; is there a nice and uniform way of storing this?
We could use a std::pair for everything with pair.first giving the
signal (NaN, Missing, etc.) and pair.second giving the value.

I could probably come up with something, but I was just wondering if the
community had had similar experience and how it's been solved in
different contexts.

My experience is that there are no "one size fits all" solutions in
programming. All attempts at such a solution eventually end up being a
"one size fits none".

There are essentially two ways of reporting an absent/empty/whatever
value: in-band or out-of-band. And which method works best depends on
the types involved and the possible values that can occur within the
application.
In-band signalling (null pointers, NaN, special value outside the
possible range) is often the easiest to work with. But sometimes there
are no spare values that could be used for such a purpose, so you have
to resort to out-of-band signalling, like boost::optional.
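
As a small illustration of the two styles, assuming C++17 and std::optional in place of boost::optional (findInBand() and lookup() are hypothetical helpers):

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// In-band: std::string::npos is a spare value of the return type itself,
// because no real position can ever equal it.
inline std::string::size_type findInBand(const std::string& s, char c) {
    return s.find(c);
}

// Out-of-band: any int may be a legitimate stored value, so there is no
// spare value left and absence has to be carried separately.
inline std::optional<int> lookup(const std::map<std::string, int>& table,
                                 const std::string& key) {
    auto it = table.find(key);
    if (it == table.end()) return std::nullopt;
    return it->second;
}

inline void usage() {
    assert(findInBand("abc", 'b') == 1);
    assert(findInBand("abc", 'z') == std::string::npos);

    const std::map<std::string, int> offsets{{"a", -1}, {"b", 0}};
    assert(lookup(offsets, "a").value() == -1);  // -1 is real data here
    assert(!lookup(offsets, "zz").has_value());
}
```

The lookup case shows why a sentinel such as -1 would not work there: it is a perfectly valid stored value.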

Bart v Ingen Schenau
 
Pavel Lepin

Bart said:
There are essentially two ways of reporting an absent/empty/whatever
value: in-band or out-of-band. And which method works best depends on
the types involved and the possible values that can occur within the
application.
In-band signalling (null pointers, NaN, special value outside the
possible range) is often the easiest to work with. But sometimes there
are no spare values that could be used for such a purpose, so you have
to resort to out-of-band signalling, like boost::optional.

This is probably a very sensible default in HPC, embedded or otherwise
space-sensitive contexts, but in general application programming I believe
using an explicit sum type is better in a sense of avoiding subtle gotchas,
by making your intent clear in all related declarations, thus potentially
reducing maintenance costs six months down the road.
 
Öö Tiib

On 12/27/2010 02:47 AM, Tiib wrote:
When i want uniform semantics for different types then i write a
family of function overloads. Something like bool IsMissing( T
const&    ) that expresses such state.
But where do you store that information? Let's say you have a struct
like this.
struct MyData
{
     std::string name;
     std::string address;
     int age;
     double height;
     int weight;
};
How would you distinguish missing, invalid, empty, etc. values in this
struct? And how would you implement the signaling of those?
You think too much about data. You even name it as "Data". For me
behavior of object is way more important aspect than data. So for any
other object in system designed by me it is an interface exposed to
them and not a pile of data. Interface may be abstract and references
to all missing MyDatas in system may point at single immutable
MissingMyData that behaves in all situations like missing one should.
When it is so on case of MyData then it can be distinguished by memory
address:
  bool IsMissing( MyData const&  data )
  {
      return (&data ==&MissingMyData);
  }

Hm, interesting. Do you define MissingMyData as a global constant?
Anyway, still, with this solution you force all MyData to be pointers,
right?

All interfaces are passed as pointers or references anyway, so that
solution should fit a lot of cases. MissingMyData can be an immutable
constant in the translation unit implementing MyData (and
IsMissing(MyData) or MyData::isMissing() ). The actual MissingMyData and
ExistingMyData are both derived from the MyData interface.

A variation of it goes together with the pimpl idiom, where ImplMissing
is used as a special private implementation for the exceptional Missing
state. That one works in more cases, except that the classes then have
to be built using the pimpl idiom. Objects in an exceptional state may
even appear mutable from outside, since they may switch underlying
implementations dynamically.

Whether it is a good solution depends on what behavioral flexibility is
expected from objects in an exceptional state. If the object must be
mutable while in that exceptional state (a rather rare need), then you
obviously need to instantiate such objects and cannot use a single
MissingMyData instance to incarnate them all. Exceptional states can
also be built as options into the MyData class, but that results in
if-else polymorphism or switch-case polymorphism, which I consider not
that elegant.
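
A minimal sketch of the first variant (a single shared missing instance, identified by address). The name() member and the "<missing>" placeholder are assumptions made for the example, and the shared instance is exposed through a hypothetical accessor function missingMyData() rather than a global object:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Interface seen by the rest of the system.
class MyData {
public:
    virtual ~MyData() = default;
    virtual std::string name() const = 0;
};

class ExistingMyData : public MyData {
public:
    explicit ExistingMyData(std::string name) : name_(std::move(name)) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

// Null object: answers every request the way a missing record should.
class MissingMyData : public MyData {
public:
    std::string name() const override { return "<missing>"; }
};

// The one shared, immutable missing instance.
inline const MyData& missingMyData() {
    static const MissingMyData instance{};
    return instance;
}

// Missing-ness is decided purely by address comparison.
inline bool IsMissing(const MyData& data) {
    return &data == &missingMyData();
}

inline void usage() {
    ExistingMyData ann("Ann");
    const MyData& nobody = missingMyData();
    assert(!IsMissing(ann) && ann.name() == "Ann");
    assert(IsMissing(nobody) && nobody.name() == "<missing>");
}
```

Callers that do not care can simply call name() on whatever reference they hold; only code that needs the distinction asks IsMissing().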
 
DeMarcus

Your solution is a good alternative. The only thing that seems
impossible in this whole discussion is to have the same solution for
non-pointer and pointer types.
 
DeMarcus

My experience is that there are no "one size fits all" solutions in
programming. All attempts at such a solution eventually end up being a
"one size fits none".

There are essentially two ways of reporting an absent/empty/whatever
value: in-band or out-of-band. And which method works best depends on
the types involved and the possible values that can occur within the
application.
In-band signalling (null pointers, NaN, special value outside the
possible range) is often the easiest to work with. But sometimes there
are no spare values that could be used for such a purpose, so you have
to resort to out-of-band signalling, like boost::Optional.

Bart v Ingen Schenau

I like your reasoning. It triggered my analysis. Let me add a subgroup.

In-band: Special values in the type. E.g. 9999.99 := Inf
Out-of-band:
    Non-intrusive: E.g. boost::optional
    Intrusive: E.g. MyDouble md; if( md.signal() == MyDouble::NaN ) ...

What I'm looking for in this post is something general that will work in
most situations.

* In-band are really dangerous unless you know exactly what you're
doing. Therefore that's not an option for something general.

* The intrusive solution may work in specific situations but is not general.

* The non-intrusive way is the only option left, and it's the one I
prefer, but unfortunately boost::optional forces a double check on
pointer types. It also doesn't support other signals like NaN, etc.

If anyone has any good tip on some math library that solves the NaN,
Inf, etc. in a neat way, please let me know.
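
For plain floating-point types, the C++ standard library already exposes NaN and Inf through <limits> and <cmath>. A short sketch, assuming an IEEE-style double (usage() is a hypothetical wrapper around the checks):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

inline void usage() {
    using lim = std::numeric_limits<double>;
    static_assert(lim::has_quiet_NaN && lim::has_infinity,
                  "this sketch assumes an IEEE-style double");
    // Integers have no such spare values, hence the need for something
    // out-of-band when an int may be missing.
    static_assert(!std::numeric_limits<int>::has_quiet_NaN,
                  "int has no NaN");

    const double qnan = lim::quiet_NaN();
    const double inf = lim::infinity();

    assert(std::isnan(qnan));
    assert(qnan != qnan);                    // NaN never compares equal
    assert(std::isnan(qnan * 2.0 + 1.0));    // NaN propagates through math
    assert(std::isinf(inf) && inf > lim::max());
    assert(!std::isnan(inf));
}
```

This only covers the in-band case for floating point; it does not help with signals such as "missing" or "not applicable".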
 
Bart van Ingen Schenau

This is probably a very sensible default in HPC, embedded or otherwise
space-sensitive contexts, but in general application programming I believe
using an explicit sum type is better in a sense of avoiding subtle gotchas,
by making your intent clear in all related declarations, thus potentially
reducing maintenance costs six months down the road.

I am sorry, but I don't see how your reply relates to what I wrote.
Can you elaborate?

Bart v Ingen Schenau
 
Pavel Lepin

Bart said:
I am sorry, but I don't see how your reply relates to what I wrote.
Can you elaborate?

Perhaps I misunderstood the intended message, but it seemed to me you
advocated using exceptional values as a default implementation for option
types, with the only reason against that being inability to spare a value
for a NULL/Nothing/None/... flag. I wanted to offer an alternate opinion,
as I feel implementation using exceptional values is inferior (even if you
wrap it in an interface somehow precluding the possibility of incorrect
usage of said exceptional value) unless there are performance
considerations involved. There are several main reasons why I find
optional<T> a better default choice:

1. It seems unnecessarily clunky to me to roll out a generic implementation
of a safe interface to option types implemented using exceptional values.

- The corollary here being that a single generic option type, such as
optional<T>, makes the intent clear and explicit at the
point-of-declaration and uniform across the codebase.

2. Option types using exceptional values do not generalize well to variant
types.

3. Option types using exceptional values are more fragile in face of spec
changes (although this is probably only a minor consideration in most
practical settings).

I hope this does clarify my viewpoint a bit? As I mentioned before, perhaps
I simply misinterpreted your words upthread.
 
James Kanze

What sort of uniformity you want? There is difference when
something is empty, unknown, failed/invalid (as fallible) or
missing (as optional).

This is a naming problem. The traditional name for the idiom
implemented in boost::optional is Fallible: although the
implementations are more or less identical, the names imply
a completely different use. In data base type uses, Nullable
would be the preferred name. The best name I've seen to date is
Maybe, which seems fairly neutral with regards to why the value
might not be there. (Historically, Fallible was more or less
universal before someone at Boost decided otherwise. And I've
used my "Fallible" for things like cached values, where neither
Fallible nor Optional really correspond to the use case.)
It is member function declaration? It may result with
get-and-get- getted idiom:
int intFromT = t.getMyInt().get();

Or more likely:
int intFromT = t.getMyInt().elseDefaultTo(someDefault);
(I'm not familiar with boost::optional, so I don't know what
name they actually use for "elseDefaultTo", but some such
function should definitely exist.) Otherwise, you save the
optional, and verify its validity before trying to access its
value.
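
For comparison, the later std::optional (C++17) spells this operation value_or. A small sketch with a hypothetical Thing class standing in for the object t above:

```cpp
#include <cassert>
#include <optional>

// Hypothetical class standing in for the `t` of the thread.
struct Thing {
    std::optional<int> myInt;
    std::optional<int> getMyInt() const { return myInt; }
};

inline void usage() {
    const int someDefault = -1;
    Thing withValue{42};
    Thing without{};

    // Present: the stored value wins. Absent: the default is returned.
    assert(withValue.getMyInt().value_or(someDefault) == 42);
    assert(without.getMyInt().value_or(someDefault) == -1);
}
```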

Or just int*. (int* is, of course, the idiomatic equivalent of
optional<int>.)

If the object actually is on the free store, and the getter is
This feels over-engineered. Lot of classes have invalid or exceptional
states anyway in them and it is more likely that they do right thing
in context. So there is a danger that by making them all uniform you
lose valuable context information (is it empty, unknown, not
available, failed, invalid or missing).

No you couldn't. std::unique_ptr has very specific semantics,
which often don't correspond to what you want. In cases where
you are actually returning an lvalue (if the object exists),
e.g. something like AssocArray<T>::get(key), returning a pointer
is the idiomatic equivalent of optional<T&>/Fallible<T&> etc.
(Implementing such a class so that it works for references is
over-engineering.)
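
The pointer-as-optional-reference idiom might be sketched as follows. This AssocArray is a hypothetical minimal wrapper around std::map, written only to show the shape of get():

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical minimal associative array; only get() matters here.
template <typename T>
class AssocArray {
public:
    void set(const std::string& key, T value) {
        data_[key] = std::move(value);
    }
    // Returns a pointer to the stored element, or nullptr if the key is
    // absent. The pointer plays the role of an optional reference.
    T* get(const std::string& key) {
        auto it = data_.find(key);
        return it == data_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, T> data_;
};

inline void usage() {
    AssocArray<int> ages;
    ages.set("ann", 30);
    if (int* p = ages.get("ann")) {
        *p += 1;  // lvalue access: modifies the stored element in place
    }
    assert(*ages.get("ann") == 31);
    assert(ages.get("bob") == nullptr);
}
```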
When i want uniform semantics for different types then i write
a family of function overloads. Something like bool IsMissing(
T const& ) that expresses such state.

Good idea. In general. Most of the time, it's
over-engineering, but from time to time, it's justified.

Note that both the pointer idiom and data base use would suggest
something along the lines of == NULL or != NULL. I've never had
to go that route systematically, but the possibility should also
be considered.
 
James Kanze

On 12/27/2010 02:47 AM, Tiib wrote: [...]
What sort of uniformity you want? There is difference when something
is empty, unknown, failed/invalid (as fallible) or missing (as
optional).
A very good point actually if one wants to be truly consistent. When
looking at the Motivation for boost::optional they mention that failing
values should probably be signaled as an exception.

That doesn't sound right. The whole point of optional is to
provide an out of band value to signal errors. Afterwards, it
is the responsibility of the caller to decide what he does with
this value, but a precondition of accessing the in band value is
that it is present. Precondition failure is normally best
handled as an assertion failure, *not* an exception.
You mention several other situations. I see them as this
(please give your comments if you don't agree).
* Empty - That's a valid value, like an empty string.
Dangerous for all entries where an empty string doesn't have
a clear meaning. If it does not, it should be caught and
signaled with an exception.

An empty string is *not* an empty value, it is a valid string
value. There is a difference between a valid string, even
empty, and no string.
* Failed, invalid - Should be signaled with an exception.

It depends on what the failure is. If it really should be
signaled by an exception, then you don't need optional; just
throw the exception in the called function if it can't provide
the value.
* Unknown, missing - Should be signaled in a special way, like NULL for
pointers and some other way for non-pointers.

Like NULL for pointer, yes. I'd not considered it before, but
it does make sense, sort of:

optional<MyType> o = someFunction();
if (o != NULL) {
// got it...
}

Although I still prefer a simple isValid() function; an optional
is *not* a pointer, and NULL definitely suggests pointer in the
minds of most C++ programmers. And:

optional<MyType> o = someFunction();
if (o.isValid()) {
// got it...
}

seems far clearer and less confusing.

[...]
Yes, it has to be that way.

No, it can't be that way. The call to get() is undefined
behavior (an assertion failure, in most cases) unless you've
proved that t.getMyInt().isValid() (or whatever the name of the
function is in Boost).
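
A sketch of the check-before-access pattern using std::optional (C++17), where has_value() plays the role of isValid(). The getMyInt() and valueOrZero() functions here are hypothetical:

```cpp
#include <cassert>
#include <optional>

// Hypothetical getter: returns an empty optional for odd ids.
inline std::optional<int> getMyInt(int id) {
    if (id % 2 != 0) return std::nullopt;
    return id * 10;
}

inline int valueOrZero(int id) {
    std::optional<int> o = getMyInt(id);  // save the optional first
    if (o.has_value()) {
        return *o;  // safe: presence was just checked
    }
    return 0;  // the caller decides what "missing" means here
}

inline void usage() {
    assert(valueOrZero(4) == 40);
    assert(valueOrZero(3) == 0);

    // Unchecked operator* on an empty std::optional is undefined
    // behavior; value() is the checked accessor and throws instead.
    bool threw = false;
    try {
        (void)getMyInt(3).value();
    } catch (const std::bad_optional_access&) {
        threw = true;
    }
    assert(threw);
}
```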

[...]
You have a point, but I don't know how to proceed. I find it useful to
be able to say that an integer is uninitialized or unset. To me, that
would be represented with NULL.

Not in C++. In C++, NULL is *not* the same thing as null in SQL.

[...]
But where do you store that information? Let's say you have a struct
like this.
struct MyData
{
std::string name;
std::string address;
int age;
double height;
int weight;
};
How would you distinguish missing, invalid, empty, etc. values in this
struct?

That sounds like a data base use. In that case, the logical
solution would be "Nullable<T>". Except that it's probably not
worth the bother of implementing another class just to have
a different name---I'd use whatever class was usually used in
house for this sort of thing: Fallible, Maybe, boost::optional
or whatever.
And how would you implement the signaling of those?

My preference goes for an isValid member function, but comparing
with a predefined nulValue object seems acceptable as well. Of
course, if you're using an existing class, you do what it
requires.
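
As a rough sketch of the "Nullable<T>" idea applied to the struct quoted above, with std::optional (C++17) standing in for whatever in-house class is used. Which fields may be missing is purely an assumption made for the example, and missingFieldCount is a hypothetical helper.

```cpp
#include <optional>
#include <string>

struct MyData
{
    std::string                name;     // assumed mandatory
    std::optional<std::string> address;  // may be missing
    std::optional<int>         age;
    std::optional<double>      height;
    std::optional<int>         weight;
};

// Counts the nullable fields that currently hold no value.
int missingFieldCount(MyData const& d)
{
    return int(!d.address) + int(!d.age) + int(!d.height) + int(!d.weight);
}
```

Note that an empty string and a missing string stay distinct: after `d.address = std::string()`, the field is present but empty.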
 
James Kanze

On 27/12/2010 11:04, DeMarcus wrote:


Can I draw a comparison with floating point? There are lots of NaN
values (Not A Number) - and it can be important to distinguish them.

But not always, or not even very often. I used my Fallible
class for something like 20 years before I found the need to add
support for extended error codes to it. And even now, I think
I've only used that support once or twice.
 
James Kanze

On 12/28/2010 10:17 AM, Öö Tiib wrote:

[...]
What I'm striving to solve is how to store that second level
information. We can't store NaN in a double saying that -999.9 is NaN,
and we can't say that -1 in an int is a missing value.
So my question remains; is there a nice and uniform way of storing this?
We could use a std::pair for everything with pair.first giving the
signal (NaN, Missing, etc.) and pair.second giving the value.
I could probably come up with something, but I was just wondering if the
community had had similar experience and how it's been solved in
different contexts.

It's certainly not universal, but... This is larger than I like
to post, but since my site is currently unavailable, here's the
code I use. (There are some dependencies in there, but they
should be more or less obvious, and easy to replace.)

Fallible.hh:
============
/****************************************************************************/
/*  File:       Fallible.hh                                                 */
/*  Author:     J. Kanze                                                    */
/*  Date:       25/10/1994                                                  */
/*      Copyright (c) 1994,2003 James Kanze                                 */
/* ------------------------------------------------------------------------ */
/*  Modified:   22/05/2003  J. Kanze                                        */
/*      Converted documentation to Doxygen.                                 */
/*  Modified:   26/04/2008  J. Kanze                                        */
/*      Eliminated need for default constructor.                            */
/* ------------------------------------------------------------------------ */
//[email protected] Fallible.hh
//[email protected]
//! A generic class for returning "maybe" values.
//!
//! Here, we've extended the version in Barton and Nackman to
//! support a more generalized error code, by means of a second
//! traits template parameter. By default, the class works as
//! before (and in fact, we didn't have to modify a single line of
//! code because of the added facility, although the class is
//! widely used in our code), but the user can add a second,
//! traits template parameter to specify how to use a different
//! status type.
// ---------------------------------------------------------------------------

#ifndef GB_Fallible_hh_20061203izn6Lk4kky3qvxlFVfxSpKam
#define GB_Fallible_hh_20061203izn6Lk4kky3qvxlFVfxSpKam

#include "Gabi/Global.hh"
#include "Gabi/Util/HashCode.hh"

namespace GabiNS {
namespace Util {

// DefaultFallibleTraits:
// ======================
//
//[email protected]
//! The default <tt>Traits</tt> for <tt>Fallible</tt>, giving the
//! behavior of the <tt>Fallible</tt> in Barton and Nackman.
// ---------------------------------------------------------------------------
class DefaultFallibleTraits
{
public:
// StatusType:
// ===========
//
//! Logically, the status type should be <tt>bool</tt>.
//! However... <tt>Fallible<&nbsp;std::string&nbsp;></tt> is
//! probably the most frequent single instantiation of
//! <tt>Fallible</tt>, and it's not rare to want to use it to
//! return a string constant. Which has the type <tt>char
//! const[]</tt>, which degenerates rapidly into a <tt>char
//! const*</tt>. Regretfully, for historical reasons,
//! pointers (including the <tt>char const*</tt> here) convert
//! implicitly into <tt>bool</tt>. Which means that if
//! <tt>StatusType</tt> were simply <tt>bool</tt>, using a
//! "constant string" to initialize a
//! <tt>Fallible<&nbsp;std::string&nbsp;></tt> would resolve
//! to the constructor taking a <tt>StatusType</tt> (the one
//! for invalid values), and not to the one taking a
//! <tt>std::string const&</tt> (for valid values). So we
//! wrap. And without an implicit conversion, since that
//! would still result in an ambiguity.
//!
//! A priori, this makes it more difficult to use the various
//! functions which take an explicit StatusType. But given
//! that here, there are only two possible values, and that
//! for any given function, only one is legal (and that one is
//! the default value), it's not a problem.
// -----------------------------------------------------------------------
struct StatusType
{
explicit StatusType( bool value ) : value( value ) {}
bool value ;
} ;

static bool isOk( StatusType status )
{
return status.value ;
}
static StatusType defaultInvalid()
{
return StatusType( false ) ;
}
static StatusType defaultValid()
{
return StatusType( true ) ;
}
} ;

// Fallible :
// ==========
//
//[email protected]
//! A generic class for returning "maybe" values.
//!
//! This class is used to return values from functions that may
//! fail. Normally (supposing no failure), the class converts
//! automatically (but with a "user-defined" conversion) to the
//! type on which it is instantiated. In the failure case, an
//! attempted conversion causes an assertion failure. There are
//! also functions for testing for failure.
//!
//! See Barton and Nackman, <i>Scientific and Engineering C++</i>,
//! Addison-Wesley, 1994, section 6.4.4.
//!
//! Compared to the version described in Barton and Nackman, this
//! version has been extended to take a second template parameter,
//! which allows defining a user specific type for the validation;
//! in particular, this type allows distinguishing between
//! different types of errors, or even different types of
//! non-errors.
//!
//! Also, unlike the implementation described in Barton and
//! Nackman, this implementation does not require a default
//! constructor (except for one function); it is sufficient that
//! ValueType be CopyConstructible and Assignable.
//!
//! The template is thus defined over two parameters:
//!
//! <dl>
//! <dt><tt>ValueType</tt></dt>
//! <dd>
//! The type of the value when everything is OK. This type
//! requires a copy constructor and an assignment operator,
//! both accessible. (A default constructor is needed only
//! for the <tt>validate()</tt> overload that takes no
//! value.)</dd>
//!
//! <dt><tt>Traits</tt></dt>
//! <dd>
//! This type determines how we decide whether an instance is
//! a valid value or not. It must contain:
//!
//! <ul>
//! <li>The definition or the declaration (e.g.
//! <tt>typedef</tt>) of a type <tt>StatusType</tt>, which
//! can be used to evaluate validity. For the classical
//! Fallible, as described in Barton and Nackman, it would
//! be <tt>bool</tt>, but it may also be an <tt>enum</tt>,
//! with values for good and different types of bad, or a
//! string. All that is needed is that somehow, it be
//! possible, using just a value of this type, to
//! determine whether the object is valid or not. This
//! type must support copy and assignment.
//!
//! <li>A static function <tt>bool
//! isOk(&nbsp;StatusType&nbsp;)</tt>, which given a
//! <tt>StatusType</tt>, returns true if it corresponds to
//! a valid value, and false otherwise.
//!
//! <li>Two static functions <tt>StatusType
//! defaultInvalid()</tt> and <tt>StatusType
//! defaultValid()</tt>. With the exception of assignment
//! of a <tt>ValueType</tt> to a <tt>Fallible</tt>, these
//! functions are only used as default arguments.
//! </ul></dd>
//! </dl>
//!
//! \warning
//! The types <tt>ValueType</tt> and
//! <tt>Traits::StatusType</tt> are used to disambiguate some
//! of the functions, including the constructors, and must
//! thus be distinct.
//!
//! Note that because the value is held in raw storage, the copy
//! constructor, the copy assignment operator and the destructor
//! are defined explicitly by the class.
// ---------------------------------------------------------------------------
template< typename ValueType,
typename Traits = DefaultFallibleTraits >
class Fallible
{
public :
typedef typename Traits::StatusType
StatusType ;

//! \pre
//! <tt>! Traits::isOk( status )</tt>
//!
//! \post
//! - <tt>! isValid()</tt>
//! - <tt>status() == status</tt>
//!
//! Note that because of the default argument, this
//! constructor also serves as the default constructor.
// -----------------------------------------------------------------------
explicit Fallible( StatusType status = Traits::defaultInvalid() ) ;

//! \pre
//! <tt>Traits::isOk( status )</tt>
//!
//! \param value
//! The value of the object.
//!
//! \param status
//! The status to be associated with the object. (This
//! defaults to the default valid status, as defined by
//! the traits class.)
//!
//! \post
//! - <tt>isValid()</tt>
//! - <tt>value() == value</tt>
//! - <tt>status() == status</tt>
// -----------------------------------------------------------------------
explicit Fallible( ValueType const& value,
StatusType status = Traits::defaultValid() ) ;

//! \post
//! - <tt>status() == other.status()</tt>
//! - <tt>! isValid() || value() == other.value()</tt>
// -----------------------------------------------------------------------
Fallible( Fallible const& other ) ;

~Fallible() ;

//! \post
//! - <tt>status() == other.status()</tt>
//! - <tt>! isValid() || value() == other.value()</tt>
// -----------------------------------------------------------------------
Fallible& operator=( Fallible const& other ) ;

//! \param value
//! The value of the object.
//!
//! \post
//! - <tt>isValid()</tt>
//! - <tt>value() == value</tt>
//! - <tt>status() == Traits::defaultValid()</tt>
// -----------------------------------------------------------------------
Fallible& operator=( ValueType const& value ) ;

//! \param status
//! The new status.
//!
//! \pre
//! <tt>! Traits::isOk( status )</tt>
//!
//! \post
//! - <tt>! isValid()</tt>
//! - <tt>status() == status</tt>
// -----------------------------------------------------------------------
Fallible& operator=( StatusType status ) ;

//! \return
//! <tt>Traits::isOk( status() )</tt>
// -----------------------------------------------------------------------
bool isValid() const ;

//! \return
//! The validity state.
// -----------------------------------------------------------------------
StatusType status() const ;

//! \return
//! The current value.
//!
//! \pre
//! <tt>isValid()</tt>
// -----------------------------------------------------------------------
ValueType const& value() const ;

//! \return
//! <tt>value()</tt>
//!
//! \pre
//! <tt>isValid()</tt>
// -----------------------------------------------------------------------
operator ValueType() const ;

//! \param defaultValue
//! The value to be returned if <tt>!&nbsp;isValid()</tt>.
//!
//! \return
//! <tt>isValid() ? value() : defaultValue</tt>
// -----------------------------------------------------------------------
ValueType const& elseDefaultTo( ValueType const& defaultValue ) const ;

//! \param newStatus
//! The new validity status.
//!
//! \pre
//! <tt>! Traits::isOk( newStatus )</tt>
//!
//! \post
//! <tt>! isValid()</tt>
//!
//! The equivalent of assigning
//! <tt>Fallible&lt;&nbsp;ValueType,&nbsp;Traits&nbsp;&gt;(&nbsp;newStatus&nbsp;)</tt>
//! to the object.
// -----------------------------------------------------------------------
void invalidate( StatusType newStatus = Traits::defaultInvalid() ) ;

//! \param newValue
//! The value of the object.
//!
//! \param newStatus
//! The new validity status.
//!
//! \post
//! - <tt>isValid()</tt>
//! - <tt>value() == newValue</tt>
//! - <tt>status() == newStatus</tt>
//!
//! The equivalent of assigning
//! <tt>Fallible&lt;&nbsp;ValueType,&nbsp;Traits&nbsp;&gt;(&nbsp;newValue,&nbsp;newStatus&nbsp;)</tt>
//! to the object.
// -----------------------------------------------------------------------
void validate( ValueType const& newValue,
StatusType newStatus = Traits::defaultValid() ) ;

//! A special version of <tt>validate()</tt>, designed to be
//! used with base types which are expensive to copy. It can
//! be used efficiently, for example, even when the base type
//! is an <tt>std::vector</tt>. Rather than take a new value
//! as an argument, it returns a non-<tt>const</tt> reference
//! to the value contained in the Fallible object, which the
//! caller can then modify as he wishes.
//!
//! \param newStatus
//! The new validation status.
//!
//! \pre
//! <tt>Traits::isOk( newStatus )</tt>
//!
//! \return
//! Non-<tt>const</tt> reference to the contained object.
//!
//! \post
//! - <tt>isValid()</tt>
//!
//! \attention
//! This function requires ValueType to be
//! DefaultConstructible. If <tt>isValid()</tt> before
//! calling this function, the state of the value object
//! is not changed; otherwise, the value object will be
//! default constructed.
//!
//! \warning
//! Note too that using this function may require some
//! particular attention to issues of thread and exception
//! safety, since it sets the state as valid \e before
//! having made the necessary modifications in the value.
//!
//! (Note that if a function returns a <tt>Fallible</tt>,
//! rather than a reference to a <tt>Fallible</tt>, the
//! contained object will also be copied. Thus, this function
//! is not particularly interesting in such cases. On the
//! other hand, it can be quite useful for objects which
//! contain <tt>Fallible</tt> in an implementation of lazy
//! evaluation.)
// -----------------------------------------------------------------------
ValueType& validate( StatusType newStatus = Traits::defaultValid() ) ;

//! \return
//! True if both objects are invalid, or if both are valid
//! and <tt>value() == other.value()</tt>.
// -----------------------------------------------------------------------
bool isEqual( Fallible const& other ) const ;

//! Defines an ordering relationship between objects, with all
//! invalid objects considered equal, and inferior to all
//! valid objects.
//!
//! \return
//! <tt>value().compare( other.value() )</tt> if both
//! objects are valid, otherwise 1 if this object is
//! valid, and the other not, otherwise 0 if both objects
//! are invalid, otherwise -1 (i.e. if this object is
//! invalid, and the other not).
// -----------------------------------------------------------------------
int compare( Fallible const& other ) const ;

//! \return
//! <tt>isValid()
//! ? HashTraits< ValueType >( value() )
//! : hashInit()</tt>
// -----------------------------------------------------------------------
HashType hashCode() const ;

//[email protected] implementation
private :
StatusType myStatus ;
union {
unsigned char myValue[ sizeof( ValueType ) ] ;
MaxAlignFor< ValueType >
dummyForAlignement ;
} ;

ValueType* valueAddress() ;
ValueType const* valueAddress() const ;
void construct( ValueType const& newValue ) ;
void assign( ValueType const& newValue ) ;
void setValue( ValueType const& newValue ) ;
void destroy() ;
//[email protected]
} ;
}
}
#include "Fallible.tcc"
#endif
// Local Variables: --- for emacs
// mode: c++ --- for emacs
// tab-width: 8 --- for emacs
// End: --- for emacs
// vim: set ts=8 sw=4 filetype=cpp: --- for vim
=== end ===
Fallible.tcc:
=============
/****************************************************************************/
/*  File:       Fallible.tcc                                                */
/*  Author:     J. Kanze                                                    */
/*  Date:       03/11/1994                                                  */
/*      Copyright (c) 1994 James Kanze                                      */
/* ------------------------------------------------------------------------ */

#include <assert.h>
#include <new>

namespace GabiNS {
namespace Util {

template< typename ValueType, typename Traits >
inline ValueType*
Fallible< ValueType, Traits >::valueAddress()
{
assert( Traits::isOk( myStatus ) ) ;
return static_cast< ValueType* >(
static_cast< void* >( myValue ) ) ;
}

template< typename ValueType, typename Traits >
inline ValueType const*
Fallible< ValueType, Traits >::valueAddress() const
{
assert( Traits::isOk( myStatus ) ) ;
return static_cast< ValueType const* >(
static_cast< void const* >( myValue ) ) ;
}

template< typename ValueType, typename Traits >
inline void
Fallible< ValueType, Traits >::construct(
ValueType const& newValue )
{
::new ( myValue ) ValueType( newValue ) ;
}

template< typename ValueType, typename Traits >
inline void
Fallible< ValueType, Traits >::assign(
ValueType const& newValue )
{
*valueAddress() = newValue ;
}

template< typename ValueType, typename Traits >
inline void
Fallible< ValueType, Traits >::setValue(
ValueType const& newValue )
{
if ( Traits::isOk( myStatus ) ) {
assign( newValue ) ;
} else {
construct( newValue ) ;
}
}

template< typename ValueType, typename Traits >
inline void
Fallible< ValueType, Traits >::destroy()
{
valueAddress()->~ValueType() ;
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >::Fallible(
StatusType status )
: myStatus( status )
{
assert( ! Traits::isOk( status ) ) ;
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >::Fallible(
ValueType const& value,
StatusType status )
: myStatus( status )
{
assert( Traits::isOk( status ) ) ;
construct( value ) ;
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >::Fallible(
Fallible const& other )
: myStatus( other.myStatus )
{
if ( Traits::isOk( myStatus ) ) {
construct( other.value() ) ;
}
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >::~Fallible()
{
if ( Traits::isOk( myStatus ) ) {
destroy() ;
}
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >&
Fallible< ValueType, Traits >::operator=(
Fallible const& other )
{
if ( other.isValid() ) {
setValue( other.value() ) ;
} else if ( isValid() ) {
destroy() ;
}
myStatus = other.status() ;
return *this ;
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >&
Fallible< ValueType, Traits >::operator=(
ValueType const& value )
{
validate( value, Traits::defaultValid() ) ;
return *this ;
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >&
Fallible< ValueType, Traits >::operator=(
StatusType status )
{
invalidate( status ) ;
return *this ;
}

template< typename ValueType, typename Traits >
bool
Fallible< ValueType, Traits >::isValid() const
{
return Traits::isOk( myStatus ) ;
}

template< typename ValueType, typename Traits >
typename Traits::StatusType
Fallible< ValueType, Traits >::status() const
{
return myStatus ;
}

template< typename ValueType, typename Traits >
ValueType const&
Fallible< ValueType, Traits >::value() const
{
assert( isValid() ) ;
return *valueAddress() ;
}

template< typename ValueType, typename Traits >
Fallible< ValueType, Traits >::operator ValueType() const
{
return value() ;
}

template< typename ValueType, typename Traits >
ValueType const&
Fallible< ValueType, Traits >::elseDefaultTo(
ValueType const& defaultValue ) const
{
return isValid() ? value() : defaultValue ;
}

template< typename ValueType, typename Traits >
void
Fallible< ValueType, Traits >::invalidate(
StatusType newStatus )
{
assert( ! Traits::isOk( newStatus ) ) ;
if ( isValid() ) {
destroy() ;
}
myStatus = newStatus ;
}

template< typename ValueType, typename Traits >
void
Fallible< ValueType, Traits >::validate(
ValueType const& value,
StatusType newStatus )
{
assert( Traits::isOk( newStatus ) ) ;
setValue( value ) ;
myStatus = newStatus ;
}

template< typename ValueType, typename Traits >
ValueType&
Fallible< ValueType, Traits >::validate(
StatusType newStatus )
{
assert( Traits::isOk( newStatus ) ) ;
if ( ! isValid() ) {
construct( ValueType() ) ;
}
myStatus = newStatus ;
return *valueAddress() ;
}

template< typename ValueType, typename Traits >
bool
Fallible< ValueType, Traits >::isEqual(
Fallible const& other ) const
{
return isValid()
? ( other.isValid()
&& HashTraits< ValueType >::isEqual( value(),
other.value() ) )
: ! other.isValid() ;
}

template< typename ValueType, typename Traits >
int
Fallible< ValueType, Traits >::compare(
Fallible const& other ) const
{
return isValid()
? ( other.isValid()
? value().compare( other.value() )
: 1 )
: ( other.isValid()
? -1
: 0 ) ;
}

template< typename ValueType, typename Traits >
HashType
Fallible< ValueType, Traits >::hashCode() const
{
return isValid()
? HashTraits< ValueType >::hashCode( value() )
: hashInit() ;
}
}
}
// Local Variables: --- for emacs
// mode: c++ --- for emacs
// tab-width: 8 --- for emacs
// End: --- for emacs
// vim: set ts=8 sw=4 filetype=cpp: --- for vim
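
For the "second level information" asked about earlier, the documented Traits requirements (a StatusType, isOk(), defaultInvalid() and defaultValid()) suggest something like the following. This is only a sketch: FieldStatusTraits, its enumerators and describe() are hypothetical names, and the block is kept independent of Fallible.hh so that it compiles without the Gabi headers.

```cpp
#include <string>

// Hypothetical traits class following the requirements documented in
// Fallible.hh: a StatusType, isOk(), defaultInvalid() and defaultValid().
struct FieldStatusTraits
{
    enum StatusType { ok, missing, notANumber, outOfRange } ;

    static bool isOk( StatusType status )
    {
        return status == ok ;
    }
    static StatusType defaultInvalid()
    {
        return missing ;
    }
    static StatusType defaultValid()
    {
        return ok ;
    }
} ;

// Illustrative helper (not part of Fallible): a printable name for a status.
inline std::string describe( FieldStatusTraits::StatusType status )
{
    switch ( status ) {
    case FieldStatusTraits::ok:         return "ok" ;
    case FieldStatusTraits::missing:    return "missing" ;
    case FieldStatusTraits::notANumber: return "not a number" ;
    case FieldStatusTraits::outOfRange: return "out of range" ;
    }
    return "unknown" ;
}
```

With the header above, one would presumably instantiate Fallible< double, FieldStatusTraits > and construct a failure from FieldStatusTraits::notANumber; status() then tells the caller why there is no value, while isValid() still answers the simple yes/no question.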
 
Jorgen Grahn

Hi,

In databases you provide NULL if the value is missing. When using
objects on the free store in C++ you can use NULL to represent a missing
value. When using non-pointer values you may not.

I know several ways to solve this, but what I'm looking for is a
/uniform/ way to deal with missing values both for non-pointer values
and free store values. ....

The simplest solution would be to just go for boost::optional when using
non-pointer types and std::unique_ptr for pointer types, but I thought
it could be a value having a uniform name. Do you agree with me?

No. Pointers and e.g. scalar/struct/class values have such different
semantics that I don't want to unify them too much. Not in general, at
least.

I also happily let the integers 0 or -1 have special meanings where it
makes sense, for example when dealing with APIs that do the same.
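
One standard-library instance of such a sentinel is std::string::npos, the "not found" result of find(). The sketch below shows both styles side by side: using the sentinel directly, and converting it to an optional at the boundary. The function names are made up for the example, and std::optional assumes C++17.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Uses the API's own sentinel directly, as the API intends.
bool containsComma(std::string const& s)
{
    return s.find(',') != std::string::npos;
}

// Converts the sentinel into an explicit "no value" at the boundary.
std::optional<std::size_t> commaPosition(std::string const& s)
{
    std::size_t const pos = s.find(',');
    if (pos == std::string::npos) {
        return std::nullopt;
    }
    return pos;
}
```

Which of the two reads better probably depends on how far the result travels from the call that produced it.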

/Jorgen
 
