Encapsulation and Operator[]

Roger Lakner · Mar 18, 2006

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Thank you,

Roger

Bob Hairgrove · Mar 18, 2006

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.
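
Bob's first suggestion, throwing when the index is out of range, might look roughly like the sketch below. The names CheckedFooList and kSize and the stand-in Foo are assumptions made for illustration, since the original post leaves Foo and num undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Stand-in for the Foo class, whose definition the thread elides.
struct Foo { int value = 0; };

// Hypothetical variant of FooList whose operator[] validates the index.
class CheckedFooList {
public:
    static constexpr std::size_t kSize = 8;  // assumed capacity (the post's `num`)

    const Foo& operator[](std::size_t index) const {
        check(index);
        return array_[index];
    }
    Foo& operator[](std::size_t index) {
        check(index);
        return array_[index];
    }

private:
    // Throws instead of reading past the end of the array.
    static void check(std::size_t index) {
        if (index >= kSize) {
            throw std::out_of_range("CheckedFooList: index out of range");
        }
    }
    Foo array_[kSize];
};
```

Client code still writes list[i]; only the behavior for a bad index changes, which a public array member could not offer.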

Bo Persson · Mar 18, 2006

Roger Lakner said:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use?

That is an important feature for an interface!

Or do you just bite the bullet and accept it?

It is not about accepting or not, it is about what you want to do to
an object. Without knowing what a Foo is, it is hard to tell if

FooList f;

f[5] = someFoo;

is "natural" or not. It depends!

Considering that C++ provides a std::vector with this kind of
interface, it must be correct some of the time.

Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

There aren't always universal rules for all situations. You have to
consider each one individually.

Bo Persson

Greg · Mar 18, 2006

Roger said:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

"Wrecks encapsulation?" On the contrary, FooList demonstrates exactly
how a class should encapsulate its data - with private data members and
a public interface. Note that clients cannot access FooList's data
member directly, instead they must invoke methods in FooList's public
interface to access FooList's data. In other words, FooList has
encapsulated its data by mediating all access to it.

Encapsulation makes it possible for FooList to change its underlying
storage model without affecting its clients - and that quality is the
primary benefit of encapsulation. For instance we could imagine an
implementation in which FooList accessed a network server, or a
database, or some other kind of store to retrieve the Foo object
returned by operator[]. To the client, this change to FooList would go
undetected - because even though its data representation may have
changed - its public interface, which its clients all use - would not
have changed.
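
Greg's point about swapping the storage model can be sketched as two interchangeable classes. Everything here (ArrayFooList, SparseFooList, the stand-in Foo, the helper sumFirstThree) is hypothetical and only meant to show that the same client code compiles against either one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Stand-in for the Foo class, whose definition the thread elides.
struct Foo { int value = 0; };

// Version 1: fixed-size array storage, as in the original post.
class ArrayFooList {
public:
    Foo& operator[](std::size_t index) { return array_[index]; }
    const Foo& operator[](std::size_t index) const { return array_[index]; }
private:
    Foo array_[16];  // assumed capacity
};

// Version 2: sparse storage; elements are created on first write access.
class SparseFooList {
public:
    Foo& operator[](std::size_t index) { return items_[index]; }
    const Foo& operator[](std::size_t index) const {
        static const Foo kDefault{};  // returned for slots never written
        auto it = items_.find(index);
        return it == items_.end() ? kDefault : it->second;
    }
private:
    std::map<std::size_t, Foo> items_;
};

// Client code written only against operator[]; it cannot tell the two apart.
template <class List>
int sumFirstThree(List& list) {
    list[0].value = 1;
    list[1].value = 2;
    list[2].value = 3;
    const List& view = list;
    return view[0].value + view[1].value + view[2].value;
}
```

Both versions still hand out references to stored objects, so this sketch illustrates Greg's claim about the storage mechanism, not the stronger claim that the data could live outside memory.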

Greg

Daniel T. · Mar 18, 2006

Roger Lakner said:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation.

Returning a reference/pointer to a member object breaks both the LSP
(Liskov Substitution Principle) and the UAP (Uniform Access Principle),
and thus "wrecks encapsulation". You are entirely correct on that
point. However there are times when both can be broken... Take
std::vector for example: the vector does not logically own the objects
it contains, it is simply in charge of deleting said objects at the
appropriate time. The object that owns the vector is the one who
actually owns the objects contained in the vector. In other words:

class Foo {
vector<Bar> itsBars;
};

Logically, Foo objects own the Bar objects in the vector, *not* the
vector. If you were to write the above in UML it would look like:

0..n
[Foo]<#>--------->[Bar]

Note how the vector<Bar> is not expressed in the diagram, it is simply
an implementation artifact.

Also, although returning a const reference does break the OO principles
above, as long as it is only used as a performance optimization over
returning an object, it's OK to do. In other words, changing "const T&"
to "T" should not break client code, only slow down the function call.
Any client code that *does* break as a result of such change should be
modified.

In summary, by returning a Foo& in FooList::op[], you (as the designer
of FooList) are saying that FooList doesn't actually own the Foos it
contains, it is simply managing their lifetime for some client who
*does* own them.

Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

First you have to ask yourself, "who owns the Foos that FooList
contains?" If the answer is "FooList" then you should not provide an
op[] for the contained objects (although if you want, you can provide an
op[] const.) Use something like this instead:

class FooList {
    Foo array[num];
public:
    const Foo operator[](unsigned index) const { return array[index]; }
    void setFoo( unsigned id, const Foo& foo ) {
        array[id] = foo;
    }
};

(If Foo's are expensive to copy and you find that performance is
suffering because of the return by const value, you can later change the
code to "const Foo& operator[](unsigned)const".)

This way, encapsulation is preserved. You can replace 'array' with any
other class, set of classes, or remove it completely as your needs
dictate without affecting FooList clients.
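
A short usage sketch of the interface Daniel proposes, under the assumption of a trivial stand-in Foo and a made-up capacity kSize (the class is renamed ValueFooList here to mark it as illustrative). It shows that a client only ever receives a copy, so every change has to go through setFoo.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the Foo class, whose definition the thread elides.
struct Foo { int value = 0; };

// Illustrative version of the value-returning interface.
class ValueFooList {
public:
    static constexpr std::size_t kSize = 4;  // assumed capacity (the post's `num`)

    // Returns a copy, so callers never hold a handle into the list.
    const Foo operator[](std::size_t index) const { return array_[index]; }

    // The only way to change an element.
    void setFoo(std::size_t index, const Foo& foo) { array_[index] = foo; }

private:
    Foo array_[kSize];
};

// Modifying the copy leaves the stored element untouched.
inline bool copyIsIndependent() {
    ValueFooList list;
    Foo five;
    five.value = 5;
    list.setFoo(1, five);

    Foo copy = list[1];
    copy.value = 99;  // changes only the local copy
    return list[1].value == 5;
}
```

With this interface a statement such as list[1].value = 3 does not compile, because operator[] yields a const temporary.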

Daniel T. · Mar 18, 2006

Bob Hairgrove said:
I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.
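
The aliasing described here can be made concrete with a small sketch. The class below mirrors the original FooList with an assumed capacity and a stand-in Foo; aliasDemo is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the Foo class, whose definition the thread elides.
struct Foo { int value = 0; };

// Same shape as the FooList from the original post (capacity assumed).
class FooList {
public:
    const Foo& operator[](std::size_t index) const { return array_[index]; }
    Foo& operator[](std::size_t index) { return array_[index]; }
private:
    Foo array_[4];
};

// A client keeps the reference and later changes the list's state
// without calling any FooList member function at that point.
inline int aliasDemo() {
    FooList list;
    Foo& kept = list[2];   // reference into the list's private storage
    kept.value = 7;        // FooList is not involved in this write
    const FooList& view = list;
    return view[2].value;  // observes the change made through `kept`
}
```

Because the write through kept bypasses FooList entirely, the class has no opportunity to validate, log, or redirect it.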

Roger Lakner · Mar 19, 2006

Bob,
I see your point about the fact that more can go on in the op[]
implementation than just passing a reference. And in that context, it
makes some sense, though I still think encapsulation is violated. But
I have seen several contexts in which nothing goes on except as I've
illustrated (e.g., in std::vector), and it is in this context that it
seems to me just a conceit to make the array private.

Roger

Roger Lakner · Mar 19, 2006

Bo Persson said:
There aren't always universal rules for all situations. You have to
consider each one individually.

Amen to that. But if one of the primary advantages of C++ over C is
encapsulation, and that feature is routinely and easily subverted,
then...

Roger

Roger Lakner · Mar 19, 2006

Greg said:
On the contrary, FooList demonstrates exactly
how a class should encapsulate its data - with private data members and
a public interface. Note that clients cannot access FooList's data
member directly, instead they must invoke methods in FooList's public
interface to access FooList's data. In other words, FooList has
encapsulated its data by mediating all access to it.

I guess I don't see the difference between op[] as instantiated here
and making array public.

Encapsulation makes it possible for FooList to change its underlying
storage model without affecting its clients - and that quality is the
primary benefit of encapsulation.

I agree that is one of the primary benefits of encapsulation, but so
is data hiding. The data is not being hidden in any robust sense here.
I don't see the difference, in my example, between op[] and making
array public. It's disheartening to me to see that one of the
much-vaunted advantages of C++ over C is so easily, so naturally, so
intuitively and quite often subverted.

Roger

Roger Lakner · Mar 19, 2006

Daniel T. said:
Returning a reference/pointer to a member object breaks both the LSP and
UAP (and thus "wrecks encapsulation".) You are entirely correct on that
point. However there are times when both can be broken...

What sense of "can" do you mean in "can be broken"? I know it is
possible to break both. Do you mean "should" be broken? Should be
broken only in rare cases? Should be broken when other considerations
warrant?

Take std::vector for example, the vector does not logically own the
objects it contains, it is simply in charge of deleting said objects at
the appropriate time. The object that owns the vector is the one who
actually owns the objects contained in the vector. In other words:

class Foo {
vector<Bar> itsBars;
};

Logically, Foo objects own the Bar objects in the vector, *not* the
vector. If you were to write the above in UML it would look like:

0..n
[Foo]<#>--------->[Bar]

Note how the vector<Bar> is not expressed in the diagram, it is simply
an implementation artifact.

Also, although returning a const reference does break the OO principles
above, as long as it is only used as a performance optimization over
returning an object, it's OK to do. In other words, changing "const T&"
to "T" should not break client code, only slow down the function call.
Any client code that *does* break as a result of such change should be
modified.

So performance is what makes it OK to break a primary feature of OO
principles? I agree it's advantageous to the programmer, but at what
cost?

In summary, by returning a Foo& in FooList::op[], you (as the designer
of FooList) are saying that FooList doesn't actually own the Foos it
contains, it is simply managing their lifetime for some client who
*does* own them.

This is a very interesting way of looking at this. I need to give this
some more thought.

Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

First you have to ask yourself, "who owns the Foos that FooList
contains?" If the answer is "FooList" then you should not provide an
op[] for the contained objects (although if you want, you can provide an
op[] const.) Use something like this instead:

class FooList {
    Foo array[num];
public:
    const Foo operator[](unsigned index) const { return array[index]; }
    void setFoo( unsigned id, const Foo& foo ) {
        array[id] = foo;
    }
};

(If Foo's are expensive to copy and you find that performance is
suffering because of the return by const value, you can later change the
code to "const Foo& operator[](unsigned)const".)

This way, encapsulation is preserved. You can replace 'array' with any
other class, set of classes, or remove it completely as your needs
dictate without affecting FooList clients.

Thank you very much, this is very helpful.

Roger

Daniel T. · Mar 19, 2006

Roger Lakner said:
What sense of "can" do you mean in "can be broken"? I know it is
possible to break both. Do you mean "should" be broken? Should be
broken only in rare cases? Should be broken when other considerations
warrant?

What I mean to say is that at times, other conditions warrant breaking
the LSP and/or the UAP. One issue that can cause us to break
encapsulation is, as I have already mentioned, performance. C++ is,
first and foremost I should think, designed for high-performance,
time-critical applications, more so than any other OO language. Another
reason to break encapsulation, which I have also alluded to, has to do
with lifetime issues and a feature unique to C++ among OO languages: the
destructor.

Take std::vector for example, the vector does not logically own the
objects it contains, it is simply in charge of deleting said objects at
the appropriate time. The object that owns the vector is the one who
actually owns the objects contained in the vector. In other words:

class Foo {
vector<Bar> itsBars;
};

Logically, Foo objects own the Bar objects in the vector, *not* the
vector. If you were to write the above in UML it would look like:

0..n
[Foo]<#>--------->[Bar]

Note how the vector<Bar> is not expressed in the diagram, it is simply
an implementation artifact.

Also, although returning a const reference does break the OO principles
above, as long as it is only used as a performance optimization over
returning an object, it's OK to do. In other words, changing "const T&"
to "T" should not break client code, only slow down the function call.
Any client code that *does* break as a result of such change should be
modified.

So performance is what makes it OK to break a primary feature of OO
principles?

The cost in other matters (such as maintainability, reuse, extensibility
&c.) is irrelevant if the system performs so slowly that it cannot be
used for its intended purpose. Please don't get me wrong, I heartily
agree that premature optimization is the root of many problems in
programming in general and possibly especially so in C++, but let's not
throw the baby out with the bath water...

I agree it's advantageous to the programmer, but at what cost?

You don't agree with me. I think breaking encapsulation is
disadvantageous to the programmer. However, it is sometimes a necessary
evil.

In summary, by returning a Foo& in FooList::op[], you (as the designer
of FooList) are saying that FooList doesn't actually own the Foos it
contains, it is simply managing their lifetime for some client who
*does* own them.

This is a very interesting way of looking at this. I need to give this
some more thought.

Well then, let me expound on it some more. C++ is unique (in my
experience at least) among OO languages in that every class has one
member function whose semantics are such that it must be the last method
called on the object, and it *must* be called. This unique requirement
means that C++ has some correspondingly unique idioms to deal with it
that other languages need not deal with.

The best and brightest in the C++ community have found over the years
that it isn't necessarily advantageous for the parent object in a
composite relationship to also be the object responsible for ensuring
that the destructor is properly called. C++ has a plethora of classes
whose sole responsibility is to ensure the destructor is called on one
or more objects at the appropriate time (namely smart pointers and the
standard containers). So, we often find classes in C++ sharing their
aggregates with others, the latter of which have the sole responsibility
of monitoring the aggregate's lifetime.
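
As a rough illustration of that division of labor, the sketch below lets a std::vector handle destruction of the Bars that a Foo logically owns. The live-object counter Bar::alive, barCount, and the helper aliveInsideScope are hypothetical additions used only to make the destructor calls observable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bar counts its live instances so destructor calls can be observed.
struct Bar {
    static int alive;
    Bar() { ++alive; }
    Bar(const Bar&) { ++alive; }
    ~Bar() { --alive; }
};
int Bar::alive = 0;

// Foo logically owns its Bars but writes no destructor of its own;
// the vector member is the one that destroys each Bar.
class Foo {
public:
    explicit Foo(std::size_t count) : itsBars(count) {}
    std::size_t barCount() const { return itsBars.size(); }
private:
    std::vector<Bar> itsBars;
};

// Reports how many Bars exist while a Foo holding three is still in scope.
inline int aliveInsideScope() {
    Foo f(3);
    return Bar::alive;  // read before f (and its vector) is destroyed
}
```

Once the Foo goes out of scope, the vector's destructor runs each Bar destructor without Foo containing any cleanup code.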

Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?

First you have to ask yourself, "who owns the Foos that FooList
contains?" If the answer is "FooList" then you should not provide an
op[] for the contained objects (although if you want, you can provide an
op[] const.) Use something like this instead:

class FooList {
    Foo array[num];
public:
    const Foo operator[](unsigned index) const {
        return array[index];
    }
    void setFoo( unsigned id, const Foo& foo ) {
        array[id] = foo;
    }
};

(If Foo's are expensive to copy and you find that performance is
suffering because of the return by const value, you can later change
the code to "const Foo& operator[](unsigned)const".)

This way, encapsulation is preserved. You can replace 'array' with any
other class, set of classes, or remove it completely as your needs
dictate without affecting FooList clients.

Thank you very much, this is very helpful.

It is my pleasure.

Bob Hairgrove · Mar 19, 2006

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!
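
The std::vector<bool> case can be checked directly. The sketch below uses only standard library facilities; the helper name flipThroughProxy is made up for the example.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// For ordinary element types, vector's reference type is a real reference.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int>::reference is int&");

// For bool, the reference type is a proxy class, not bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is a proxy class");

// Writing through the proxy still updates the element in the vector.
inline bool flipThroughProxy() {
    std::vector<bool> bits(8, false);
    std::vector<bool>::reference r = bits[3];  // proxy object held by value
    r = true;                                  // forwards the write to the vector
    return bits[3];
}
```

A declaration such as bool& b = bits[3]; does not compile, so a client that insists on a true reference can tell the difference.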

Bob Hairgrove · Mar 19, 2006

[top-posting corrected]

Roger Lakner said:
Bob Hairgrove said:

I often see operator[] implemented something like this:

class Foo { ... };

class FooList
{
public:
    const Foo& operator[] (unsigned index) const { return array[index]; }
    Foo& operator[] (unsigned index) { return array[index]; }
private:
    Foo array[num];
};

And this seems natural and intuitive (at least to me). But it seems to
wreck encapsulation. Is there some standard way to avoid this
transgression and still provide the client with an interface that is
natural and easy to use? Or do you just bite the bullet and accept it?
Please pitch responses to someone whose level of knowledge is about
one year of C++ experience.

Why do you say that it wrecks encapsulation? Admittedly, the
implementation shown above doesn't buy you anything over direct public
access to the array member variable, but that doesn't mean that it
can't be done differently.

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

Bob,
I see your point about the fact that more can go on in the op[]
implementation than just passing a reference. And in that context, it
makes some sense, though I still think encapsulation is violated. But
I have seen several contexts in which nothing goes on except as I've
illustrated (e.g., in std::vector), and it is in this context that it
seems to me just a conceit to make the array private.

It does leave room for later modifications to the implementation
without having to change the interface. Often, you will see things
like these seemingly silly "do-nothing" implementations in the
pre-production stages of code. When the preliminary testing phase
passes, programmers can "harden up" the code by adding stuff to the
implementation body. Clients only see the headers, so they wouldn't
have to recompile their own code.

Bob Hairgrove · Mar 19, 2006

Amen to that. But if one of the primary advantages of C++ over C is
encapsulation, and that feature is routinely and easily subverted,
then...

Encapsulation doesn't mean "non-hackable". In "The Design and
Evolution of C++" by Bjarne Stroustrup, there is a passage about
public/private access mechanisms on page 55 which addresses a
similar issue (quoted from his "Annotated C++ Reference Manual"):

"The C++ access control mechanisms provide protection against accident
-- not against fraud. Any programming language that supports access to
raw memory will leave data open to deliberate tampering in ways that
violate the explicit type rules specified for a given data item."

Bob Hairgrove · Mar 19, 2006

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

I would like to amend this a little ... of course, for the example
given by the OP, you are correct. I was only trying to illustrate that
an implementation of operator[] can be done in other, non-trivial
ways, and that having an operator[] which returns a non-const lvalue
doesn't necessarily break encapsulation.

But let's also consider that it would be perfectly legal for
operator[] to return a reference to a static object or a dummy member
variable which acts as a proxy for the real array element. One could
then document this fact somewhere so that clients would know not to
attempt to store a pointer or reference to the object.

Daniel T. · Mar 19, 2006

Bob Hairgrove said:
For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

By all means, show me some code. I'd love to be proven wrong...

#include <cassert>

class Foo {
    // implement as you see fit.
public:
    int& bar(); // implement as you see fit.
};

int main() {
    Foo f;
    int& i = f.bar();
    i = 1967;
    assert( f.bar() == 1967 );
    i = 1942;
    assert( f.bar() == 1942 );
}

If you can implement the Foo interface such that the int returned is not
stored in RAM and the assertions in main don't abort the program, I'd
love to see how. The learning experience would be wonderful.

Greg · Mar 19, 2006

Roger said:
Greg said:

On the contrary, FooList demonstrates exactly
how a class should encapsulate its data - with private data members and
a public interface. Note that clients cannot access FooList's data
member directly, instead they must invoke methods in FooList's public
interface to access FooList's data. In other words, FooList has
encapsulated its data by mediating all access to it.

I guess I don't see the difference between op[] as instantiated here
and making array public.

There is a huge difference between using operator[] and making the
array public. With a public array, clients can bypass FooList's
interface (and FooList's methods) and obtain the data FooList stores -
directly. And without FooList's interface interposing itself between
its stored data and its clients there is no easy way for FooList to use
a different data storage mechanism in the future, since its clients
will all be relying on the data being stored in an array.

Accessing the data objects through the operator[] is a completely
different story. The operator[] is a function - it is therefore code in
FooList that clients must call in order to retrieve data from a FooList
object. Since FooList can implement operator[] however it likes, it can
get the data it returns from anywhere it likes. And since every client
must call this method to get the data, no client is relying on any
detail of how FooList actually stores its data. In this example FooList
happens to use an array data member - but it need not. For clients,
the overloaded operator[] creates the illusion that FooList is - or has
- an array. But an interface is separate from the implementation. In
fact, with operator[] access, FooList clients have no way of knowing
how FooList actually stores its data.

Therefore we can conclude from these two cases, that FooList properly
encapsulates the details of its data storage implementation when making
public the operator[] but not when making its array data member public.

Encapsulation makes it possible for FooList to change its underlying
storage model without affecting its clients - and that quality is the
primary benefit of encapsulation.

I agree that is one of the primary benefits of encapsulation, but so
is data hiding. The data is not being hidden in any robust sense here.
I don't see the difference, in my example, between op[] and making
array public. It's disheartening to me to see that one of the
much-vaunted advantages of C++ over C is so easily, so naturally, so
intuitively and quite often subverted.

It's important to ask what exactly FooList aims to encapsulate - and
the answer is not Foo objects, not by any means. Simply put, FooList
does not encapsulate the Foo objects that it stores. Containment has
nothing to do with encapsulation. Since FooList does not implement Foo,
FooList cannot encapsulate Foo. Foo is an independent class with its
own interface and its own, encapsulated implementation.

So what then does FooList encapsulate? It encapsulates just what it
implements: a storage mechanism for Foo objects. That's it. But
otherwise FooList knows almost nothing about Foo objects. The classes
are not related, so FooList is just a client (and barely one at that)
of Foo.

So does the fact that FooList returns Foo objects break its
encapsulation? Absolutely not. FooList is a container - a container is
expected to return whatever it logically contains. That's why it's
called a container after all. As we can conclude from the FooList
example, a container does not encapsulate the items that it stores -
only the way that it stores them. Encapsulation is not about data
organization or relationships between classes - it describes solely the
relationship (within a single class) between a public interface and an
implementation.

Greg

Bob Hairgrove · Mar 19, 2006

Daniel T. said:
Bob Hairgrove said:

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored,) thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

By all means, show me some code. I'd love to be proven wrong...

Look at your favorite STL's implementation of vector<bool>.

#include <cassert>

class Foo {
    // implement as you see fit.
public:
    int& bar(); // implement as you see fit.
};

int main() {
    Foo f;
    int& i = f.bar();
    i = 1967;
    assert( f.bar() == 1967 );
    i = 1942;
    assert( f.bar() == 1942 );
}

If you can implement the Foo interface such that the int returned is not
stored in RAM and the assertions in main don't abort the program, I'd
love to see how. The learning experience would be wonderful.

First, please read my follow-up post which you probably didn't see
before posting this.

Also, we were talking about operator[] which is overloaded for
const/non-const. They are two separate functions and don't necessarily
have to return the same reference at all (unless you expect them to be
consistent, which needs to have a requirement/business rule).

As to the challenge, those assertions of yours are new requirements,
and I didn't say that the object whose reference is returned isn't
stored in RAM ... that's not really possible. It just doesn't have to
refer to the private data member (array) contained within the class.
It can be a static object or a dummy member variable, for example. Or
you could even implement operator[] to return a temporary proxy object
which has an automatic conversion to the reference type.

I've actually done this before for an ODBC class library wrapper I
wrote about two or three years ago. The cursor class had overloaded
operator[] which was used for reading and writing to the columns of
the current row. Since each column can have a completely different
data type, it wasn't possible to have it return a reference to the
real type at all, so we returned a proxy class which handled the
actual conversion. The gory details were all in the proxy, but
transparent to the clients. And it worked just fine, although there
was a lot of casting from raw memory buffers going on behind the
scenes. Clients could write stuff like:

OdbcConnection db(/*...*/);
// x is an int, y is a string, and z is a double...
OdbcCursor cr(db, "SELECT x,y,z FROM my_table;");
// prepare or execute query...
while(!cr.EOF()) {
    cr.Edit();
    cr[0] = 123;
    cr[1] = "some string";
    cr[2] = 3.14159;
    cr.Update();
    cr.Next();
}
cr.Close();

Operator[] was also overloaded to take a string argument so that
columns could be accessed by name, of course. I just used the above
for illustration purposes. (Now if I had been a little more STL savvy
at the time, I would have probably implemented iterators for
OdbcCursor...)
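
For readers who want to see the general shape of such a proxy, here is a minimal sketch. It is not the ODBC wrapper described above; Row, Cell, ColumnProxy and the as_int()/as_double()/as_text() readers are hypothetical names, and an in-memory vector stands in for the database buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical cell that can hold an int, a double, or text.
struct Cell {
    enum class Kind { Int, Double, Text };
    Kind kind = Kind::Int;
    int i = 0;
    double d = 0.0;
    std::string s;
};

// Proxy returned by Row::operator[]; it records which type was assigned.
class ColumnProxy {
public:
    explicit ColumnProxy(Cell& cell) : cell_(cell) {}

    ColumnProxy& operator=(int v) { cell_.kind = Cell::Kind::Int; cell_.i = v; return *this; }
    ColumnProxy& operator=(double v) { cell_.kind = Cell::Kind::Double; cell_.d = v; return *this; }
    ColumnProxy& operator=(const std::string& v) { cell_.kind = Cell::Kind::Text; cell_.s = v; return *this; }
    ColumnProxy& operator=(const char* v) { return *this = std::string(v); }

    // Typed reads; each throws if the column currently holds another type.
    int as_int() const { require(Cell::Kind::Int); return cell_.i; }
    double as_double() const { require(Cell::Kind::Double); return cell_.d; }
    std::string as_text() const { require(Cell::Kind::Text); return cell_.s; }

private:
    void require(Cell::Kind k) const {
        if (cell_.kind != k) throw std::runtime_error("column holds a different type");
    }
    Cell& cell_;
};

// One row of columns; operator[] hands out proxies, never raw storage.
class Row {
public:
    explicit Row(std::size_t columns) : cells_(columns) {}
    ColumnProxy operator[](std::size_t index) { return ColumnProxy(cells_.at(index)); }
private:
    std::vector<Cell> cells_;
};

void row_demo() {
    Row row(3);
    row[0] = 123;
    row[1] = "some string";
    row[2] = 3.14159;
    assert(row[0].as_int() == 123);
    assert(row[1].as_text() == "some string");
}
```

The sketch uses named readers instead of implicit conversion operators because several conversion operators on one proxy easily lead to ambiguous overloads in comparisons; that is a design choice of the example, not a claim about the original library.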

Even then, Foo::bar() doesn't necessarily have to return the same
value that you assign to i; it can even silently change i inside its
implementation...which is why we need to document the fact or disallow
keeping a pointer or reference to the object returned. Think about
vector<bool>::operator[]. It must return a reference to bool, yet we
know that it is not possible to take the address of a single bit.
(Hmmm ... how do they do it??)
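
The usual answer is a small proxy class: std::vector<bool>::operator[] returns an object of the nested type std::vector<bool>::reference rather than a real bool&. The sketch below shows the idea with a hypothetical BitArray and BitRef; it is a simplified illustration, not any library's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical packed bit container, in the spirit of std::vector<bool>.
class BitArray {
public:
    // Proxy standing in for "a reference to one bit".
    class BitRef {
    public:
        BitRef(unsigned char& byte, unsigned char mask) : byte_(byte), mask_(mask) {}
        BitRef(const BitRef&) = default;

        // Writing through the proxy sets or clears the bit.
        BitRef& operator=(bool value) {
            if (value) byte_ = static_cast<unsigned char>(byte_ | mask_);
            else       byte_ = static_cast<unsigned char>(byte_ & ~mask_);
            return *this;
        }
        // Assigning one proxy to another copies the bit value.
        BitRef& operator=(const BitRef& other) { return *this = static_cast<bool>(other); }

        // Reading through the proxy tests the bit.
        operator bool() const { return (byte_ & mask_) != 0; }

    private:
        unsigned char& byte_;
        unsigned char mask_;
    };

    explicit BitArray(std::size_t bits) : bytes_((bits + 7) / 8), size_(bits) {}
    std::size_t size() const { return size_; }

    BitRef operator[](std::size_t i) {
        check(i);
        return BitRef(bytes_[i / 8], static_cast<unsigned char>(1u << (i % 8)));
    }
    bool operator[](std::size_t i) const {
        check(i);
        return ((bytes_[i / 8] >> (i % 8)) & 1u) != 0;
    }

private:
    void check(std::size_t i) const {
        if (i >= size_) throw std::out_of_range("BitArray index");
    }
    std::vector<unsigned char> bytes_;  // eight bits packed per byte
    std::size_t size_;
};

void bit_demo() {
    BitArray bits(10);
    bits[3] = true;
    assert(bits[3]);
    // bool* p = &bits[3];  // would not compile: there is no bool to point at
}
```

Client code reads and writes bits[i] as if it were a bool, yet no bool object exists anywhere in the container.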

I will show you another example of what I am saying. Consider having a
class which controls access to elements of a vector -- conceptually
speaking, that is. Internally it doesn't have to be a vector at all
unless maybe there are O(1) time random-access requirements on it. We
want to grant access according to user permissions. You want some
users to be able to read and write all values via operator[], but
limit what other users with fewer permissions can read and write.
Furthermore, the stored values are encrypted, but you want to read and
write plain-text (e.g. passwords). Also, you want each user to think
that they are accessing values at index 0..n, where in reality these
are located just about anywhere in the vector or might be looked up
somewhere else. Finally, in order to prohibit doing pointer arithmetic
to access elements, just in case we are using a vector, we use proxy
elements which serialize access (i.e. one user at a time) by
implementing a locking mechanism. This prevents clients who ARE being
sneaky and holding a reference to what Foo::bar() returns from
accessing the next user's password or code -- we allocate the dummy or
proxy object dynamically for one call only, then delete it and
reallocate it on the next call, so that the previous references or
pointers would dangle (of course, we will have to document that
somewhere...<g>)
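
A stripped-down sketch of part of that design might look like the following. It covers only the index remapping, the permission check on writes, and the transform on read and write; the locking and the deliberately short-lived proxy are left out. UserView, Entry and scramble() are hypothetical names, and the XOR "encryption" is a placeholder, not something to use for real passwords.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy reversible scrambling (XOR with a fixed byte). NOT real encryption;
// it only marks where encryption and decryption would happen.
inline std::string scramble(std::string text) {
    for (char& c : text) c = static_cast<char>(c ^ 0x5A);
    return text;
}

// Hypothetical per-user view over shared storage.
class UserView {
public:
    // Proxy for one element: decrypts on read, checks permission on write.
    class Entry {
    public:
        Entry(std::string& slot, bool writable) : slot_(slot), writable_(writable) {}
        std::string read() const { return scramble(slot_); }
        Entry& operator=(const std::string& plain) {
            if (!writable_) throw std::runtime_error("no write permission");
            slot_ = scramble(plain);
            return *this;
        }
    private:
        std::string& slot_;
        bool writable_;
    };

    UserView(std::vector<std::string>& storage, std::vector<std::size_t> slots, bool writable)
        : storage_(storage), slots_(std::move(slots)), writable_(writable) {}

    // The user's index 0..n-1 is mapped to wherever the value really lives.
    Entry operator[](std::size_t index) {
        return Entry(storage_.at(slots_.at(index)), writable_);
    }

private:
    std::vector<std::string>& storage_;
    std::vector<std::size_t> slots_;
    bool writable_;
};

void view_demo() {
    std::vector<std::string> storage(5);
    UserView admin(storage, {4, 1}, true);   // read/write, two scattered slots
    UserView guest(storage, {1}, false);     // read-only, one slot
    admin[1] = "beta";
    assert(guest[0].read() == "beta");       // same slot, seen as index 0
    assert(storage[1] != "beta");            // stored form is scrambled
}
```

Neither user can tell from operator[] where a value is stored or in what form, which is the kind of freedom the paragraph above is arguing for.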

Shall I continue? I think you get where I am going... but if you want,
I'll take a few more minutes and post some code. It's not hard, but
I'm a little lazy. <g> Somehow I think we aren't really talking about
the same thing, though...

Daniel T. · Mar 19, 2006

Bob Hairgrove said:
For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored); thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

I would like to amend this a little ... of course, for the example
given by the OP, you are correct. I was only trying to illustrate that
an implementation of operator[] can be done in other, non-trivial
ways, and that having an operator[] which returns a non-const lvalue
doesn't necessarily break encapsulation.

But let's also consider that it would be perfectly legal for
operator[] to return a reference to a static object or a dummy member
variable which acts as a proxy for the real array element. One could
then document this fact somewhere so that clients would know not to
attempt to store a pointer or reference to the object.
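
To make the "dummy member" idea concrete, here is a small sketch with a hypothetical Squares class whose elements are computed on demand. It shows that the approach is legal, and also why the documentation caveat matters: a reference kept across calls aliases the scratch variable, and writes through it are silently lost.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container whose elements are computed, not stored.
class Squares {
public:
    // Returns a reference to a scratch member that is overwritten on every
    // call. Documented contract: do not keep the reference across calls.
    int& operator[](std::size_t i) {
        scratch_ = static_cast<int>(i * i);
        return scratch_;
    }
private:
    int scratch_ = 0;
};

void squares_demo() {
    Squares sq;
    assert(sq[3] == 9);       // fine: the value is used immediately

    int& kept = sq[3];        // a client ignoring the documented rule
    int next = sq[4];
    assert(next == 16);
    assert(kept == 16);       // kept now aliases the latest result, not 9

    sq[2] = 100;              // compiles, but the write goes nowhere
    assert(sq[2] == 4);
}
```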

#include <cassert>

// Assume the functions below update some file or database, such that
// assigning a value to foo.bar() calls update() and retrieving a value
// from foo.bar() calls get().
void update( unsigned id, int i );
int get( unsigned id );

class Foo {
public:
    Foo( unsigned id );
    ~Foo();

    int& bar();
};

int main() {
    Foo foo1( 1 );
    foo1.bar() = 1963;
    Foo foo2( 1 ); // note, same ID
    assert( foo2.bar() == 1963 );
    assert( get( 1 ) == 1963 );
}

Implement Foo however you see fit such that the asserts in main won't
fire...

Daniel T. · Mar 19, 2006

Bob Hairgrove said:

On Sat, 18 Mar 2006 19:31:25 GMT, "Daniel T."

For example, if index is out of range, an exception can be thrown. The
implementation can also be very complex. Consider that there might not
even be a member "array", but operator[] does a database lookup
instead (somehow). Or that the real array is held in another class,
and FooList holds a pointer or reference to that class to which it
forwards the call. There are many possibilities here.

The above is not quite true. The Foos in FooList *must* be objects in
RAM because clients of FooList may keep a pointer/reference to the value
returned, or modify the state of a FooList object by modifying the Foo
returned. That breaks the UAP (clients know that the return value was
not computed, but rather stored); thus encapsulation is broken.

I'm sorry, but you are wrong. The reference returned may not
necessarily even be a reference (see section 23.2.5, paragraph 2 of
the C++ standard for what it says about
std::vector<bool>::operator[]). This is probably an example of
encapsulation at its best!

By all means, show me some code. I'd love to be proven wrong...

Look at your favorite STL's implementation of vector<bool>.

#include <cassert>

class Foo {
    // implement as you see fit.
public:
    int& bar(); // implement as you see fit.
};

int main() {
    Foo f;
    int& i = f.bar();
    i = 1967;
    assert( f.bar() == 1967 );
    i = 1942;
    assert( f.bar() == 1942 );
}

If you can implement the Foo interface such that the int returned is not
stored in RAM and the assertions in main don't abort the program, I'd
love to see how. The learning experience would be wonderful.

First, please read my follow-up post which you probably didn't see
before posting this.

Also, we were talking about operator[] which is overloaded for
const/non-const. They are two separate functions and don't necessarily
have to return the same reference at all (unless you expect them to be
consistent, which needs to have a requirement/business rule).

As to the challenge, those assertions of yours are new requirements,
and I didn't say that the object whose reference is returned isn't
stored in RAM ... that's not really possible.

And there you go. You might want to look up the UAP (which is what I
said a reference return breaks.) The Uniform Access Principle says, "All
services offered by a module should be available through a uniform
notation, which does not betray whether they are implemented through
storage or through computation."

Obviously, and by your own admission, returning a reference from a
member function betrays whether the return value is implemented through
storage or through computation: you can't switch from one to the other.
Sure, you can go through all kinds of contortions to try to keep the RAM
location returned synchronized with some computation, and if the client
only uses the reference in a prescribed set of ways, your contortions
will work, but it is so much easier to just follow the UAP in the first
place.
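
As a sketch of what following the UAP could look like for the challenge above: the Foo below deliberately changes the interface from int& bar() to a value-returning bar() plus a hypothetical set_bar(), and uses an in-memory std::map as an assumed stand-in for the file or database behind update() and get().

```cpp
#include <cassert>
#include <map>

// Hypothetical in-memory stand-in for the file or database behind
// update() and get().
inline std::map<unsigned, int>& backing_store() {
    static std::map<unsigned, int> store;
    return store;
}
void update(unsigned id, int i) { backing_store()[id] = i; }
int get(unsigned id) { return backing_store()[id]; }

// Value-based interface: callers cannot tell whether bar() reads a member
// or performs a lookup, so the implementation is free to change.
class Foo {
public:
    explicit Foo(unsigned id) : id_(id) {}
    int bar() const { return get(id_); }             // looked up on every call
    void set_bar(int value) { update(id_, value); }  // written straight through
private:
    unsigned id_;
};

void uap_demo() {
    Foo foo1(1);
    foo1.set_bar(1963);
    Foo foo2(1);  // note, same ID
    assert(foo2.bar() == 1963);
    assert(get(1) == 1963);
}
```

Because no reference escapes, Foo could later cache the value in a member, or stop caching it, without any client noticing.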

Shall I continue? I think you get where I am going... but if you want,
I'll take a few more minutes and post some code. It's not hard, but
I'm a little lazy. <g> Somehow I think we aren't really talking about
the same thing, though...

No need for you to continue. I think we are basically agreeing; it's
just that you are trying to put a more positive spin on reference
returns than I am.
