An interesting blog

jacob navia

http://apenwarr.ca/log/?m=201007#22

<quote>

Okay, one more example of C++ terribleness. This one is actually a
tricky one, so I can almost forgive the C++ guys for not thinking up the
"right" solution. But it came up again for me the other day, so I'll
rant about it too: dictionary item assignment.

What happens when you have, say, a std::map of std::string and you do
m[5] = "chicken"? Moreover, what happens if there is no m[5] and you do
std::string x = m[5]?

Answer: m[5] "autovivifies" a new, empty string and stores it in
location 5. Then it returns a reference to that location, which in the
first example, you reassign using std::string::operator=. In the second
example, the autovivified string is copied to x - and left happily
floating around, empty, in m[5].

Ha ha! In what universe are these semantics reasonable? In what rational
set of rules does the right-hand-side of an assignment statement get
modified by default? Maybe I'm crazy - no, that's not it - but when I
write m[5] and there's no m[5], I think there are only two things that
are okay to happen. Either m[5] returns NULL (a passive indicator that
there is no m[5], like you'd expect from C) or m[5] throws an exception
(an aggressive indicator that there is no m[5], like you'd see in python).

Ah, you say. But look! If that happened, then the first statement - the
one assigning to m[5] - wouldn't work! It would crash because you end up
assigning to NULL!

Yes. Yes it would. In C++ it would, because the people who designed C++
are idiots.

But in python, it works perfectly (even for user-defined types). How?
Simple. Python's parser has a little hack in it - which I'm sure must
hurt the python people down to the cores of their souls, so much do they
hate hacks - that makes m[5]= parse differently than just plain m[5].

The python parser converts o[x]=y directly into o.__setitem__(x,y).
Whereas o[x] without a trailing equal sign converts directly into
o.__getitem__(x). It's very sad that the parser has to do such utterly
different things with two identical-looking uses of the square bracket
operator. But the result is you get what you expect: __getitem__ throws
an exception if there's no m[5]. __setitem__ doesn't. __setitem__ puts
stuff into your object; it doesn't waste time pulling stuff out of your
object (unless that's a necessary internal detail for your data
structure implementation).

But even that isn't the worst thing. Here's what's worse: C++'s crazy
autovivification stuff makes it slower, because you have to construct an
object just so you can throw it away and reassign it. Ha ha! The crazy
language where supposedly performance is all-important actually assigns
to maps slower than python can! All in the name of having language
purity, so we don't have to have stupid parser hacks to make [] behave
two different ways!

....

"...Well," said the C++ people. "Well. We can't have that."

So here's what they invented. Instead of inventing a sensible new []=
operator, they went even more crazy. They redefined things such that, if
your optimizer is sufficiently smart, it can make all the extra crap go
away.

There's something in C++ called the "return value optimization."
Normally, if you do something like "MyObj x = f()", and f returns a
MyObj, then what would need to happen is that 'x' gets constructed using
the default constructor, then f() constructs a new object and returns
it, and then we call x.operator= to copy the object from f()'s return
value, then we destroy f()'s return value.

As you might imagine, when implementing the [] setter on a map, this
would be kind of inefficient.

But because the C++ people so desperately wanted this sort of thing to
be fast, they allowed the compiler to optimize out the creation of x and
the copy operation; instead, they just tell f() to construct its return
value right into x. If you think about it hard enough, you can see that,
assuming the stars all align perfectly, m[5] = "foo" can benefit from
this operation. Probably only if m.operator[] is inlined, but of course
it is - it's a template! Everything in a template is inlined! Ha ha!

So actually C++ maps are as fast as python maps, assuming your compiler
writers are amazingly great, and a) implement the (optional)
return-value optimization; b) inline the right stuff; and c) don't screw
up their overcomplicated optimizer so that it makes your code randomly
not work in other places.

Okay, cool, right? Isn't this a triumph of engineering - an amazingly
world class optimizer plus an amazingly supercomplex specification that
allows just the right combination of craziness to get what you want?

NO!

No it is not!

It is an absolute failure of engineering! Do you want to know what real
engineering is? It's this:

map_set(m, 5, "foo");
char *x = map_get(m, 5);

That plain C code runs exactly as fast as the above hyperoptimized
ultracomplex C++. *And* it returns NULL when m[5] doesn't exist, which
C++ fails to do.

In the heat of the moment, it's easy to lose sight of just how much of
C++ is absolutely senseless wankery.

And this, my friends, is the problem.

<end quote>
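
For readers who want to check the behaviour the quoted post describes, here is a minimal sketch of the "read inserts an element" effect. It assumes a std::map<int, std::string>; the function name is made up for illustration and is not taken from the blog.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Reads m[key] for a key that is assumed to be absent and returns how many
// elements the map holds afterwards. (Hypothetical helper for illustration.)
std::size_t size_after_reading_missing_key(std::map<int, std::string>& m, int key) {
    assert(m.find(key) == m.end());  // precondition: the key is not stored yet
    std::string x = m[key];          // operator[] inserts an empty string, x copies it
    assert(x.empty());               // the copy is the value-initialized string
    return m.size();                 // one larger than before the "read"
}
```

Calling it on an empty map with key 5 should return 1: the plain read has left an empty string stored under 5.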

<publicity mode ON>

Note that lcc-win implements operator []= as a different operator than
the plain operator [].

<publicity mode OFF>
 
jacob navia

Ian Collins wrote:
This has already been discussed where the discussion belongs, on c.l.c++.

There is no discussion there. Just the usual dismissal without any real
arguments, because there are none. The guy is right.

Besides, since you said that you stopped reading, you can't discuss it
anyway.

It is interesting how well-founded criticism is just dismissed with
vague justifications like "insults" or "c++ market share is not
diminishing", while all the facts the author of the blog presents are ignored.
 
Dennis (Icarus)

jacob navia said:
[snip]
Ha ha! In what universe are these semantics reasonable?

Mine, and likely several others :)

In what rational set of rules does the right-hand-side of an assignment
statement get modified by default? Maybe I'm crazy - no, that's not it -
but when I write m[5] and there's no m[5], I think there are only two
things that are okay to happen. Either m[5] returns NULL (a passive
indicator that there is no m[5], like you'd expect from C) or m[5] throws
an exception (an aggressive indicator that there is no m[5], like you'd
see in python).

How about foo(m[5])? Same thing, right? m[5] is being
assigned/copied/referenced to the first argument of foo. That won't be
helped by defining a new operator. If you want the object created in that
case, but not for assignment, why?

Just use find and dereference the iterator. Dereferencing the
iterator when it's m.end() may indeed throw an exception.
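
A rough sketch of the find-based lookup suggested here, assuming C++11 or later. Two caveats: in standard C++, dereferencing m.end() is undefined behaviour rather than a guaranteed exception, so the sketch compares against end() first; and std::map::at() is the member that does throw (std::out_of_range) for a missing key. The helper names lookup, at_throws and demo_lookup are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper: pointer to the mapped string, or nullptr if the key
// is absent. find() never inserts, so it also works on a const map.
const std::string* lookup(const std::map<int, std::string>& m, int key) {
    auto it = m.find(key);
    return it == m.end() ? nullptr : &it->second;
}

// Hypothetical helper: true if m.at(key) reports a missing key by throwing.
bool at_throws(const std::map<int, std::string>& m, int key) {
    try {
        (void)m.at(key);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Small self-check; call it from main() to run the sketch.
void demo_lookup() {
    std::map<int, std::string> m;
    m[1] = "chicken";
    assert(*lookup(m, 1) == "chicken");
    assert(lookup(m, 5) == nullptr);  // missing key, nothing inserted
    assert(at_throws(m, 5));          // at() throws instead of inserting
    assert(m.size() == 1);            // neither lookup grew the map
}
```
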
operator[] is a shortcut for insert, except that it provides immediate
access to the mapped item.
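
To make the "shortcut for insert" reading concrete, the sketch below counts default constructions when assigning through operator[] versus calling emplace directly. It assumes C++11 or later; the Tracked type and both function names are invented for this illustration. It only counts constructor calls and says nothing about what an optimizer does with them.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical value type that counts how often it is default-constructed.
struct Tracked {
    static int default_ctor_calls;
    std::string value;
    Tracked() { ++default_ctor_calls; }
    explicit Tracked(const char* s) : value(s) {}
};
int Tracked::default_ctor_calls = 0;

// m[5] = ... on a missing key: operator[] first creates a default element,
// then the assignment overwrites it.
int defaults_for_subscript_assign() {
    Tracked::default_ctor_calls = 0;
    std::map<int, Tracked> m;
    m[5] = Tracked("chicken");
    assert(m.at(5).value == "chicken");
    return Tracked::default_ctor_calls;
}

// emplace builds the element from the given value; no default element needed.
int defaults_for_emplace() {
    Tracked::default_ctor_calls = 0;
    std::map<int, Tracked> m;
    m.emplace(5, Tracked("chicken"));
    assert(m.at(5).value == "chicken");
    return Tracked::default_ctor_calls;
}
```

The first function should report one default construction and the second none, which is the extra work the quoted blog objects to and the reason insert or emplace is available when that work matters.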

If you don't like the semantics of std::map[]
operator......don't....use.....it.

Dennis
 
jacob navia

Dennis (Icarus) wrote:
[snip]
How about foo(m[5])? Same thing, right? m[5] is being
assigned/copied/referenced to the first argument of foo. That won't be
helped by defining a new operator.

Look, he wrote:

when I write m[5] and there's no m[5], I think there are only two
things that are okay to happen.

Either m[5] returns NULL (a passive indicator that there is no m[5],
like you'd expect from C) or m[5] throws an exception (an aggressive
indicator that there is no m[5], like you'd see in python).


What is so difficult to understand there?


???
 
Ben Bacarisse

jacob navia said:
Dennis (Icarus) wrote:
How about foo(m[5])? Same thing, right? m[5] is being
assigned/copied/referenced to the first argument of foo. That won't
be helped by defining a new operator.

Look, he wrote:

when I write m[5] and there's no m[5], I think there are only two
things that are okay to happen.

Either m[5] returns NULL (a passive indicator that there is no m[5],
like you'd expect from C) or m[5] throws an exception (an aggressive
indicator that there is no m[5], like you'd see in python).

What is so difficult to understand there?

There is a C point here. The original author (at least I think this is
a quote) is mistaken about C. If m[5] does not exist, it would be
wrong to expect NULL.
 
jacob navia

Ben Bacarisse wrote:
[snip]
There is a C point here. The original author (at least I think this is
a quote) is mistaken about C. If m[5] does not exist, it would be
wrong to expect NULL.

The C behavior should be an exception
(exception address violation).

In my implementation, overloading of the operator [] can throw an exception
if you define it like that.

I quoted the blog because it confirms my implementation of operator
overloading, which uses the []= operator as a different operator than
[] and =.
 
Dennis (Icarus)

jacob navia said:
[snip]
Look, he wrote:

when I write m[5] and there's no m[5], I think there are only two things
that are okay to happen.

Either m[5] returns NULL (a passive indicator that there is no m[5], like
you'd expect from C) or m[5] throws an exception (an aggressive indicator
that there is no m[5], like you'd see in python).

My point was that
m[5] = some_value;
is no different than a function that'd write to m[5] as an output.
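
A small sketch of that point, under the assumption that "a function that'd write to m[5] as an output" means a function taking a non-const reference. The names fill and demo_output_argument are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical function that delivers its result through an output parameter.
void fill(std::string& out) { out = "chicken"; }

// Small self-check; call it from main() to run the sketch.
void demo_output_argument() {
    std::map<int, std::string> m;
    fill(m[5]);                    // same shape as m[5] = "chicken":
                                   // operator[] must hand back a live element
    assert(m.size() == 1);
    assert(m.at(5) == "chicken");  // the write landed in the map
}
```

In both forms the subscript expression has to yield a reference to an existing element before anything is written, which is why a separate assignment-only operator would not cover the function-call case.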

Dennis
 
Uno

Ian said:
This has already been discussed where the discussion belongs, on c.l.c++.

I wasn't invited.

I found this interesting:

Things you absolutely must not do if you want to replace C

1. Do not remove the ability to directly call into (and be called
by) C and ASM without any wrapper/translation layers. When I want to
call printf() from C or C++, I #include stdio.h and move on with my
life. No other language makes it that easy. None. Zero. Do not be those
other languages.

2. Do not remove the cpp preprocessor. Look, I realize you are
morally opposed to preprocessors. Well **** you too. Your moralizing is
getting in my way. If you take it out, I can't #include stdio.h, and I
can't implement awesome assert-like macros. (Note: see update below.)

3. Avoid garbage collection. Garbage collection is fine as a
concept, but you will never, ever, be able to write a good kernel if you
try to use garbage collection. This is non-negotiable. Also, plugins to
existing C programs won't fly with garbage collection, because you won't
be able to usefully mark-and-sweep through the majority of
non-garbage-collected memory, and you can't safely pass gc'd vs.
non-gc'd memory back and forth between C and your language. Maybe your
language can have optional garbage collection, but optional has to mean
globally disabled across the entire executable.

4. Avoid mandatory "system" threads. If you're writing a kernel,
you're the guy implementing the threading system, so if your language
requires threads, you're instantly dead in the water. Garbage collection
often uses a separate mark-and-sweep thread, which is another reason gc
just isn't an option. But it's even more insidious than that: what
happens when you fork() a program that has threads? Do you even know? If
the threads were created by the runtime, will it be sane even 1% of the
time? You can't invent Unix if you can't fork().

5. Avoid a mandatory standard library. People can - and do - compile
entire C programs without using any standard library functions at all.
Think about a kernel, for example. Even memory allocation is undefined
until the kernel defines it. Most modern languages are integrated with
their standard library - ie. some syntax secretly calls into functions -
and this destroys their suitability for some jobs that C can do.

6. Avoid dynamic typing. Dynamic typing always requires some sort of
dictionary lookups and is, at best, slightly slower than static typing.
To replace C in the cases where it refuses to die, you can't have a
language that's almost as fast as C. It has to be as fast as C. Period.
Google Go has some great innovations here with its static duck typing.
Objective C is okay here because the dynamic typing is optional.

7. Avoid support for exception handling. It's just too complicated,
and moreover, C people just hate exceptions so they will hate you, too.
And since C doesn't know about exceptions, you will make a mess when C
calls you, then you throw (but don't catch) an exception. Just leave it out.

8. Do not make it harder to do things in your language than they
would be in C. Maybe this isn't even worth mentioning. But the upper
bound on the lines of code it takes to do something should be whatever
it would take in C. Making your language backward-compatible with C is
one way (not the only way) to achieve this.
 
Richard Bos

jacob navia said:
Ben Bacarisse wrote:
jacob navia said:
Look, he wrote:

when I write m[5] and there's no m[5], I think there are only two
things that are okay to happen.

Either m[5] returns NULL (a passive indicator that there is no m[5],
like you'd expect from C) or m[5] throws an exception (an aggressive
indicator that there is no m[5], like you'd see in python).

What is so difficult to understand there?

There is a C point here. The original author (at least I think this is
a quote) is mistaken about C. If m[5] does not exist, it would be
wrong to expect NULL.
The C behavior should be an exception
(exception address violation).

Bollocks. The C behaviour is undefined. Pay attention, please:
_un_defined. Not "reasonably" defined, or POSIX-defined, or even
navia-defined. _Un_-defined.

Richard
 
Keith Thompson

jacob navia said:
Ben Bacarisse wrote: [...]
There is a C point here. The original author (at least I think this is
a quote) is mistaken about C. If m[5] does not exist, it would be
wrong to expect NULL.
The C behavior should be an exception
(exception address violation).

Bollocks. The C behaviour is undefined. Pay attention, please:
_un_defined. Not "reasonably" defined, or POSIX-defined, or even
navia-defined. _Un_-defined.

Didn't we recently have a debate about the meaning of "should"?

It's correct, of course, that the behavior of m[5] is undefined
if m is, or points to the first element of, an array with
insufficient elements. But it's perfectly reasonable to say that an
implementation *should* cause it to behave in some particular way.
(I'm only guessing that that's what jacob meant by the "C behavior".)
 
Ike Naar

[snip]
when I write m[5] and there's no m[5], I think there are only two
things that are okay to happen.

Either m[5] returns NULL (a passive indicator that there is no m[5],
like you'd expect from C) or m[5] throws an exception (an aggressive
indicator that there is no m[5], like you'd see in python).

What is so difficult to understand there?

If NULL cannot be a value of the element type of m, then m[5] cannot
return NULL because it is of the wrong type (like, in the original
example, where m[5] is of type std::string, and thus cannot be NULL).

For instance, this test

if (m[5] == NULL) /* does m[5] exist ? */

would fail to compile because the operands of ``=='' are of
incompatible types.

If NULL can be a value of the element type of m, then returning NULL
for a missing element is not particularly useful, because you would
not be able to distinguish the situation "m[5] exists and contains NULL"
from "m[5] does not exist".
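
The ambiguity can be sketched with a map whose element type admits a null value. This assumes C++11 or later; map_get here is a hypothetical C++ stand-in named after the blog's C function, not its actual implementation, and has_key is likewise invented for illustration.

```cpp
#include <cassert>
#include <map>

// Hypothetical getter in the style of the blog's map_get: returns the stored
// pointer, or nullptr when the key is missing.
const char* map_get(const std::map<int, const char*>& m, int key) {
    auto it = m.find(key);
    return it == m.end() ? nullptr : it->second;
}

// Hypothetical presence test that does not depend on the stored value.
bool has_key(const std::map<int, const char*>& m, int key) {
    return m.find(key) != m.end();
}

// Small self-check; call it from main() to run the sketch.
void demo_null_ambiguity() {
    std::map<int, const char*> m;
    m[5] = nullptr;                    // key 5 exists and holds a null pointer
    assert(map_get(m, 5) == nullptr);  // present, value is null
    assert(map_get(m, 6) == nullptr);  // absent: same answer from the getter
    assert(has_key(m, 5));             // only a separate test tells them apart
    assert(!has_key(m, 6));
}
```
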
 
