Dennis Ritchie -- An Appreciation

James Kuyper · Nov 1, 2011

On 11/01/2011 12:25 PM, Malcolm McLean wrote:
....

There are two components to the system, the computer and the human
programmer.

It's no good having a programming language which is superbly
efficient, malleable, with automatic error-checking and so on, if for
some reason humans find it difficult to use. ...

Everything is difficult to use, for some humans. C++ is a substantially
more complicated language than C, which means that there's a larger number
of people who find it difficult. I personally think that it comes
uncomfortably close to being too difficult to learn to be useful; but I
don't think it's actually crossed that line - yet. It looks to me like
the next version of the C++ standard might do the trick.

... There's not much point
blaming the human for his stupidity, unless you can hire someone else,

However, people who find these aspects of C++ easier to deal with than
you do are not too hard to find. Whether they can be hired depends upon
how much you can afford to pay them. I've never been authorized to hire
someone to do C++ programming, so I've no idea what that level of
expertise costs in the current market.

nroberts · Nov 1, 2011

On 11/01/2011 12:25 PM, Malcolm McLean wrote:
...

Everything is difficult to use, for some humans. C++ is a substantially
more complicated language than C...

http://stackoverflow.com/questions/3027177/what-are-the-differences-between-c-and-c/3027347#3027347

James Kuyper · Nov 1, 2011

http://stackoverflow.com/questions/3027177/what-are-the-differences-between-c-and-c/3027347#3027347

I notice that the answer you pointed me at is number 4 in terms of
number of votes, and conflicts with all three answers that got a higher
number of votes. Voting isn't proof, but when a lot of people vote in
favor of answers asserting that C++ is the harder language, I think we
can assume, at the very least, that those particular people found it
more difficult.

The fundamental problem with the idea that C++ has an "easy" part that
you can learn without needing to worry about the "difficult" C part, is
that both are parts of the same language, and that most people writing
in that language write code that mixes features from both parts.

Therefore, you won't be able to understand the code written by those
other people without gaining a certain amount of mastery of the C-like
part of C++. Indeed, a fair number of people consider themselves to be
C++ programmers because they used a C++ compiler to compile their code,
despite the fact that their code makes little or no use of C++
features that are not also supported by the C standard.

Therefore, you can't gain an understanding of C++ that is sufficient to
do maintenance on other people's code, without also mastering most of
the features that it shares in common with C; and those features add up
to something like 99% of the features specified in the C90 standard (a
slightly smaller percentage of the C99 standard).

nroberts · Nov 1, 2011

[snip]

The array of structs and the struct of arrays have the same data, but
the data ordering is different.
Is this clear?

Q1:

OK, what is C specific about preferring one over the other?

Ans: C syntax generally favors one over the other. In particular
pointers work better with an array of structs rather than a struct of
arrays.

This isn't an answer at all, just a restatement of the assertion under
query. You haven't shown how any aspect of C's syntax alters the
preference over design when deciding whether to treat a bunch of
records as an array of records or a record of arrays of its elements.

Q2:

Why is *C* biased to using arrays of structs rather than these struct of
arrays [snip editorializing]?

I surmise that data order wasn't an important issue when C was being
developed. Treating a record as contiguous storage was a very
convenient default decision.

Well, I suppose you did answer the question as you quoted it.
Unfortunately since you snipped the important part, your answer
doesn't help much when applied to the actual question asked.

I "surmise" that the issue is intrinsic to the data being structured,
and has nothing to do with syntax. Structures and arrays are in no
way interchangeable whether you are writing in C, C++, BASIC, or
Brainfuck.

nroberts · Nov 1, 2011

I notice that the answer you pointed me at is number 4 in terms of
number of votes, and conflicts with all three answers that got a higher
number of votes. Voting isn't proof, but when a lot of people vote in
favor of answers asserting that C++ is the harder language, I think we
can assume, at the very least, that those particular people found it
more difficult.

I don't think that's an assumption that can be made either way. As
can be seen through the ample evidence here in this thread, people
like to chatter on about the difficulty of things they've never
bothered to learn.

99% of the words that come out of the mouths of human beings is
complete and total crap based more on their preconceptions, biases,
and ignorance than on reason or experience. Measuring votes is even
more tenuous as people vote for things they like to hear, if they're
not just voting for the people without even reading what was said (a
common occurrence on stackoverflow where there's even chat rooms for
people to clique up in and vote up their friends and down people they
don't like). This is especially true of religiously motivated
conceptions like programming language bigotry.

I posted the link because I agree with what it says. I should, I
wrote it. It also responds to this "C++ is too big and complex" issue
you were bringing up. That site's voting system though is more an
interesting social experiment than a viable technological rating.

James Kuyper · Nov 1, 2011

.....
99% of the words that come out of the mouths of human beings is
complete and total crap based more on their preconceptions, biases,
and ignorance than on reason or experience. ...

That's a pretty misanthropic assessment. Even Sturgeon's law tops out at
90%. Even if it's true, I see no reason why it shouldn't apply equally
strongly to both sides of that question. You might also want to consider
the possibility that your claim is self-referential.

Rather than posting a rather comprehensive slander aimed at ... just
about everybody - wouldn't it have been more productive to address the
issue I raised? You know - the one about the need to learn enough about
the C-like side of C++ to be able to read and understand code written by
people who do not choose to avoid that part of the language.

nroberts · Nov 1, 2011

That's a pretty misanthropic assessment. Even Sturgeon's law tops out at
90%.

Weren't paying much attention, were you?

James Kuyper · Nov 1, 2011

On 11/01/2011 03:59 PM, nroberts wrote:
....

and has nothing to do with syntax. Structures and arrays are in no
way interchangeable whether you are writing in C, C++, BASIC, or
Brainfuck.

No one has suggested that they are interchangeable. Only that data which
can be stored as a single array of structures can also be stored as a
single structure by converting each member to an array. The syntax to
access the data in either form is quite similar: object[i].member vs.
objects.member[i], though clearly different.
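
The two access forms (index the array, then select the member; or select the member, then index it) can be sketched in C roughly as follows. This is only an illustration: the types `struct point` and `struct points`, the size `N`, and the helper functions are hypothetical names that do not appear in the thread.

```c
#include <stddef.h>

#define N 4  /* hypothetical element count, chosen for illustration */

/* One record; an array of these is the "array of structures" form. */
struct point { int x; double y; };

/* Each member converted to an array: the "structure of arrays" form. */
struct points { int x[N]; double y[N]; };

/* Read member x of record i from an array of structures. */
int aos_get_x(const struct point *object, size_t i)
{
    return object[i].x;
}

/* Read the same logical value from a structure of arrays. */
int soa_get_x(const struct points *objects, size_t i)
{
    return objects->x[i];
}
```

Both functions return the same logical value; only the order of the subscript and the member selection differs.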

nroberts · Nov 1, 2011

On 11/01/2011 03:59 PM, nroberts wrote:
...

No one has suggested that they are interchangeable. Only that data which
can be stored as a single array of structures can also be stored as a
single structure by converting each member to an array.

2+2 = 27?

Unless you're making a completely pointless statement then you're
implying that it makes equal amount of sense either way. This is
especially true when it is stated that the reason for preference of
one over the other is a byproduct of syntax and not something else.
This is the essence of interchangeability.

Seebs · Nov 1, 2011

Do you have the same problem with int32_t? The only difference is
that one is *standardized* and the other 'DWORD' is not.

No, there's a much bigger difference:

One of them says what it actually means, the other says something that
is almost exactly the opposite of what it means.

The underlying concept is the same; you want a double-word on
architectures that support 16-bit integers and those that support
32-bit integers.

No, the underlying concept is "you want a 32-bit integer". DWORD is, on
many modern machines, a *half-word*. But int32_t is always 32 bits.

-s
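
The point about int32_t can be checked directly. The sketch below assumes a hosted C99 implementation that provides int32_t (the type is optional in the standard, but where it exists it has exactly 32 bits and no padding). The function names are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* Size in bits of int32_t. Where the type exists, this is exactly 32. */
int int32_width_bits(void)
{
    return (int)(sizeof(int32_t) * CHAR_BIT);
}

/* Storage size in bits of plain int, which varies between implementations. */
int int_storage_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

A name such as DWORD carries no such guarantee by itself; its width depends on whatever typedef a given platform header supplies.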

Keith Thompson · Nov 1, 2011

nroberts said:
"I once wrote a matrix class that was only capable of working on
integers. I didn't need anything more than that. Thus I don't see
why anyone would need something that worked on more types."

Got it.

No, you haven't got it.

He didn't say "I don't see why anyone would need something that
worked on more types.". He didn't even imply it. He said that *he*
didn't need it to work on anything other than integers.

Making it generic might have been fairly easy (depending on the
language), but it would have been a non-zero effort. Not only
that, but it's something that could be done later, when *and if*
he had a need to make it work on something other than integers.

And what about testing? If he'd made it generic from the beginning,
then either he'd have to expend extra effort testing it with other
types, *or* he could leave it untested, possibly resulting in unused
untested code.

My "Hello, world" program might need an e-mail client some day; that
doesn't mean I should build it into the first version.

[snip]

James Kuyper · Nov 1, 2011

Weren't paying much attention, were you?

OK - self-referential it is.

James Kuyper · Nov 1, 2011

2+2 = 27?

Unless you're making a completely pointless statement ...

Richard Harter's point was precisely that it's significantly more
complicated iterating through the arrays in a structure than through an
array of structures. He's right, though I would not make a big deal
about the difference. You seem to have missed that point in your
obsession with challenging the concept that the two data structures are
otherwise interchangeable.

The point of my statement was to try to remind you of the fact that no
one has made any such claim. It seems not to have worked:

... then you're
implying that it makes equal amount of sense either way. ...

but it was indeed my intention that you realize that no one is saying
anything of the kind.

Nick Keighley said on 2011-10-31 at 03:20:15 -0700 (PDT):

again wouldn't arrays of structs be more natural than structs of
arrays?

Richard Harter said at 15:07:43 -0500:

With an array of structs we have something like:

    for (ap = a; ap < ap_end; ap++) {
        if (f(ap->x)) calc(ap);
    }

Writing the equivalent code using a struct of arrays is not quite so
simple.
The catch is that sometimes it makes a real difference in performance.
In the array of structs code the stride is the width of the struct; in
the struct of arrays code the stride is the width of x.

and on 22:59:48 -0500:

Nobody is saying that arrays and structs are interchangeable.

and on 2011-11-01 at 13:56:54 -0500:

As a followup: In some programs data order matters a lot. (Latency
and caches, you know.) Suppose we have a data set ds consisting of n
records with fields f1,f2,..,fm. There are two natural ways to store
the data, record by record, or field by field. If it is record by
record (the array of structs order) the data is stored as

Every single one of those quotes emphasizes that the two options are not
equivalent. Everyone who's talking with you about this subject has made
at least one such quote. Yet for some reason you keep feeling the need
to counter the suggestion, that no one has made, that they are equivalent.
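
For readers who want the quoted loop next to its struct-of-arrays counterpart, here is a minimal sketch. It is not Richard Harter's code: the record types `struct rec` and `struct recs`, the size `N`, the predicate `f`, and the summing of a second field `y` are all assumptions made up for illustration.

```c
#include <stddef.h>

#define N 4  /* hypothetical record count */

struct rec  { int x; int y; };        /* one record */
struct recs { int x[N]; int y[N]; };  /* same fields, one array per field */

/* Hypothetical predicate standing in for f() in the quoted loop. */
static int f(int x) { return x > 0; }

/* Array of structs: a single pointer identifies a whole record,
   and successive x values are sizeof(struct rec) bytes apart. */
int sum_y_aos(const struct rec *a, size_t n)
{
    int total = 0;
    const struct rec *ap_end = a + n;
    for (const struct rec *ap = a; ap < ap_end; ap++) {
        if (f(ap->x))
            total += ap->y;
    }
    return total;
}

/* Struct of arrays: a record is identified by (struct, index),
   and successive x values are sizeof(int) bytes apart. */
int sum_y_soa(const struct recs *s, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        if (f(s->x[i]))
            total += s->y[i];
    }
    return total;
}
```

The second loop is not hard to write, but there is no single pointer that can be handed to a function like calc(); the index has to travel with the struct.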

Nick Keighley · Nov 2, 2011

Exactly my point. Thanks.

you made it poorly then

ImpalerCore · Nov 2, 2011

No, there's a much bigger difference:

One of them says what it actually means, the other says something that
is almost exactly the opposite of what it means.

No, the underlying concept is "you want a 32-bit integer". DWORD is, on
many modern machines, a *half-word*. But int32_t is always 32 bits.

Interesting. I suppose my conception of "word" has been more defined
by its use in the documentation of the hardware protocols I work with
rather than a property of the processor.

But I agree with you, DWORD is too ambiguous of a name to represent a
fixed-width integer because of WORD's ambiguity.

Best regards,
John D.

James Kuyper · Nov 2, 2011

On Tue, 1 Nov 2011 12:59:43 -0700 (PDT), nroberts ....
struct of arrays we can't. With an array of structs we can do things
like

func(a[i]) /* pass a pointer to a struct to a function */

I think you meant func(&a[i]), or more simply (but obscurely) func(a+i)?
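
The two ways of passing a pointer to one element of an array of structs can be illustrated with a small sketch. The record type `struct item` and the helper `both_forms_agree` are hypothetical; `func` is borrowed from the quoted line, with a body invented for the example.

```c
#include <stddef.h>

struct item { int id; };  /* hypothetical record type */

/* Takes a pointer to one struct, as in the quoted comment. */
int func(const struct item *p)
{
    return p->id;
}

/* &a[i] and a + i designate the same element, so both calls agree. */
int both_forms_agree(const struct item *a, size_t i)
{
    return &a[i] == a + i && func(&a[i]) == func(a + i);
}
```

Passing a[i] itself would pass the struct by value, which is why the quoted line does not match its own comment.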

nroberts · Nov 2, 2011

Richard Harter's point was precisely that it's significantly more
complicated iterating through the arrays in a structure than through an
array of structures.

Because *C* has a bias in that direction. I was trying to understand
that assertion because it seems like nonsense to me, and now everyone
wants to pretend it was never made. That's fine. If you guys want to
make nonsensical assertions and then pretend you were making sense the
whole time it's no problem with me, I'll just remain unconvinced.

Malcolm McLean · Nov 2, 2011

Because *C* has a bias in that direction. I was trying to understand
that assertion because it seems like nonsense to me, and now everyone
wants to pretend it was never made. That's fine. If you guys want to
make nonsensical assertions and then pretend you were making sense the
whole time it's no problem with me, I'll just remain unconvinced.

We've got ten employees, each with name, payroll id, and salary.

we can represent the data like this

typedef struct
{
    char name[64];
    int id;
    float salary;
} EMPLOYEE;

EMPLOYEE employees[10];

or like this

typedef struct
{
    char name[10][64];
    int id[10];
    float salary[10];
} EMPLOYEES;

EMPLOYEES employees;

The two methods hold the same data, and have the same access
characteristics - iterating through the list is done in O(N) time,
random access is in O(constant) time, searching for the maximum salary
takes O(N) time, etc. They are logically equivalent. The only
difference is the way the employees are laid out in memory.

The first way is better for C, but that's largely because of C's
syntax. We can imagine language x where you can declare arrays of
records, and all the fields are contiguous, and this is transparent to
the user. You'd use a syntax like

    field(employees, i, salary) *= 1.1; /* increase employee's salary by 10% */

The second method has certain advantages. For instance, if we want the
average salary, we have an array of floats ready to be passed to a
generic mean() function. With the first method, you've got to either
write a special mean_employee_salary() function, use a temporary
buffer, or fake up the "field" syntax using a stride, offset, and some
pointer jiggery-pokery.
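
As a rough illustration of the "stride, offset, and pointer jiggery-pokery" option, here is one way it might look. The EMPLOYEE type is the one from the post; `mean` and `mean_strided` are hypothetical helpers, not functions from any library.

```c
#include <stddef.h>

typedef struct {
    char  name[64];
    int   id;
    float salary;
} EMPLOYEE;

/* Generic mean over a contiguous array of floats (the struct-of-arrays case). */
float mean(const float *v, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return n ? sum / (float)n : 0.0f;
}

/* Mean over a float field embedded in an array of records.
   base:   address of the first record
   offset: byte offset of the float field within a record
   stride: size in bytes of one record */
float mean_strided(const void *base, size_t offset, size_t stride, size_t n)
{
    const unsigned char *p = (const unsigned char *)base + offset;
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += *(const float *)(const void *)(p + i * stride);
    return n ? sum / (float)n : 0.0f;
}
```

With the array-of-structs declaration, a call would look like mean_strided(employees, offsetof(EMPLOYEE, salary), sizeof(EMPLOYEE), 10), whereas the struct-of-arrays form can call mean(employees.salary, 10) directly.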

nroberts · Nov 2, 2011

Because *C* has a bias in that direction. I was trying to understand
that assertion because it seems like nonsense to me, and now everyone
wants to pretend it was never made. That's fine. If you guys want to
make nonsensical assertions and then pretend you were making sense the
whole time it's no problem with me, I'll just remain unconvinced.

We've got ten employees, each with name, payroll id, and salary.

we can represent the data like this

typedef struct
{
    char name[64];
    int id;
    float salary;
} EMPLOYEE;

EMPLOYEE employees[10];

or like this

typedef struct
{
    char name[10][64];
    int id[10];
    float salary[10];
} EMPLOYEES;

EMPLOYEES employees;

The two methods hold the same data, and have the same access
characteristics - iterating through the list is done in O(N) time,
random access is in O(constant) time, searching for the maximum salary
takes O(N) time, etc. They are logically equivalent.

I thought nobody was saying that!!!!

The truth is that they are not at all logically equivalent. The
difference between the two is 100% logical.

The only
difference is the way the employees are laid out in memory.

There could be no difference in how they're laid out in memory, the
difference is in their *logical* structure--completely the opposite of
what you're saying. I can write the statements to access a field in
English and the difference is still there:

Get the N'th employee from the employees array and access its id
field.
Get the ids field from the employees structure and access its N'th
element.

No matter what syntax I use that is specific enough for talking to a
computer, I'm still going to have to access the elements in different
manners.

The first way is better for C, but that's largely because of C's
syntax. We can imagine language x where you can declare arrays of
records, and all the fields are contiguous, and this is transparent to
the user. You'd use a syntax like

    field(employees, i, salary) *= 1.1; /* increase employee's salary by 10% */

The second method has certain advantages. For instance, if we want the
average salary, we have an array of floats ready to be passed to a
generic mean() function.

Not in language X. In language X you're using the field() syntax.
You've essentially got an array of structures in language X no matter
what you might want and you're writing your generic mean function to
use binder expressions like: mean(field(employees, _1, salary)). You
can do this in C by the way though not at quite that high level
(you'll be writing custom functions to serve as binder expressions).

With the first method, you've got to either
write a special mean_employee_salary() function, use a temporary
buffer, or fake up the "field" syntax using a stride, offset, and some
pointer jiggery-pokery.

Which is all "language X" is. You could write it in C as a macro and
then still access the underlying memory. It's all going to depend
upon the needs of your program.
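
A macro of the kind mentioned here might look like the sketch below. It assumes the struct-of-arrays EMPLOYEES type from the quoted post; the macro name `FIELD` and the function `raise_salary` are hypothetical.

```c
typedef struct {
    char  name[10][64];
    int   id[10];
    float salary[10];
} EMPLOYEES;

/* Hypothetical macro: reads like "field of record i" but expands to a
   struct-of-arrays access. The expansion is an lvalue, so it can be
   assigned to. */
#define FIELD(recs, i, member) ((recs).member[(i)])

/* Give record i a 10% raise using the record-style spelling. */
void raise_salary(EMPLOYEES *e, int i)
{
    FIELD(*e, i, salary) *= 1.1f;
}
```

The underlying memory stays directly accessible: e->salary is still a plain array of floats that can be handed to any function expecting one.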

All you're doing here is arguing for abstractions. You're not showing
that C's syntax forces you into one form of data expression or
another, you're saying that it would be nice to be able to manipulate
your data at a higher level than structures and arrays. There's
nothing stopping you from doing that in C and there's nothing that
makes it more or less difficult except perhaps a lack of utility
functions in the standard library, which is not a syntax issue.

It is true that there are many languages at higher levels than C that
support the kind of expressions you're talking about. C++ in fact
provides much of the functionality you want here, or at least better
facilities to create it. But when you get to this point you're no
longer talking about structures and arrays, you're talking about even
higher level concepts. I do agree that abstractions can be a
wonderful thing, but this is clearly not a C specific issue.

jameskuyper · Nov 2, 2011

nroberts said:
On Nov 2, 10:14 am, Malcolm McLean wrote: ....

We've got ten employees, each with name, payroll id, and salary.

we can represent the data like this

typedef struct
{
    char name[64];
    int id;
    float salary;
} EMPLOYEE;

EMPLOYEE employees[10];

or like this

typedef struct
{
    char name[10][64];
    int id[10];
    float salary[10];
} EMPLOYEES;

EMPLOYEES employees;

The two methods hold the same data, and have the same access
characteristics - iterating through the list is done in O(N) time,
random access is in O(constant) time, searching for the maximum salary
takes O(N) time, etc. They are logically equivalent.

I thought nobody was saying that!!!!

No, the assertion that no one was making was that they were
interchangeable. While all of his O-notation statements are correct,
the actual coefficients in front of the corresponding power of N would
be quite different, on average, between the two data structures.

The truth is that they are not at all logically equivalent. The
difference between the two is 100% logical.

I would not have used the phrase "logically equivalent" for this
concept: I'm not at all sure what precisely that phrase means to him
in this context; neither am I sure what it means to you. However, I do
understand precisely what the two data structures have in common, and
I presume that he's using the term "logically equivalent" to describe
those things. Pay more attention to the list of common features he
gave above, and less to the particular phrase "logically equivalent"
that he used to cover those similarities.

There could be no difference in how they're laid out in memory,

At this point, we're still talking about C, aren't we? They must be
laid out in memory quite differently by any conforming implementation
of C. With the first version, employees[0].salary must be followed by
employees[1].name, with nothing between them except possibly some
padding. If there is padding, it must NOT be used to store any other
part of the data in that array of structures. With the second version
employees.salary[0] must be immediately followed by
employees.salary[1] - and in this case, no padding is allowed.
Strictly conforming code can test these requirements by converting
appropriate pointers to (char*) and comparing them for relative order.
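
The layout difference can be observed with the pointer conversion described above. The following is a sketch, assuming the two typedefs from the quoted post; the helper names `byte_gap`, `aos_salary_gap`, and `soa_salary_gap` are hypothetical.

```c
#include <stddef.h>

typedef struct { char name[64]; int id; float salary; } EMPLOYEE;
typedef struct { char name[10][64]; int id[10]; float salary[10]; } EMPLOYEES;

/* Byte distance between two addresses inside the same object. */
ptrdiff_t byte_gap(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}

/* Array of structures: consecutive salaries are one whole record apart. */
ptrdiff_t aos_salary_gap(const EMPLOYEE *e)
{
    return byte_gap(&e[0].salary, &e[1].salary);
}

/* Structure of arrays: consecutive salaries are adjacent floats. */
ptrdiff_t soa_salary_gap(const EMPLOYEES *e)
{
    return byte_gap(&e->salary[0], &e->salary[1]);
}
```

On any conforming implementation the first gap equals sizeof(EMPLOYEE) and the second equals sizeof(float), which is the difference in stride discussed earlier in the thread.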

Not in language X. In language X you're using the field() syntax.
You've essentially got an array of structures in language X no matter
what you might want and you're writing your generic mean function to
use binder expressions like: mean(field(employees, _1, salary)). ...

No, while I think he worded it poorly, what he appears to be saying is
that in language X, something which syntactically appears to be an
array of structures is actually a structure of arrays, laid out in
memory the same way as in his second C example. In language X, the
syntax field(employees, i, salary) has the same meaning that
employees.salary[i] would have in C, while field(employees, _1,
salary) apparently gives you the equivalent of the C expression
employees.salary. I'm not sure how the "_1" is meant to be
interpreted; it's Malcolm's hypothetical language.

... You
can do this in C by the way though not at quite that high level
(you'll be writing custom functions to serve as binder expressions).

Well, yes - the point is, in C, one method is supported directly, and
the other way requires "writing custom functions to serve as binder
expressions". That's precisely the bias he's talking about. I can't
say it's a very important bias, but it's a real one. It doesn't deserve
the amount of attention it's received so far, but that amount of
attention has been due almost entirely to the challenges you've made
against it.
