Justification for "->"?

not.here.now · Jul 19, 2007

A quick search of this group and its FAQ, and elsewhere, has not
answered this question to my satisfaction. Apologies if I missed
something obvious, either in the literature or my reasoning.

Can someone tell me why "->" exists? The compiler knows the difference
between a structure and a pointer to a structure, so why can't it just
let me write "foo.bar" in both cases and not have to go back and
rewrite things when I later decide I want a pointer instead of a value
or vice versa? I often change my mind about whether e.g. a function
should receive a pointer or the thing itself, and it's mildly
inconvenient to have to change all the code that references the
members.

Is there a historical reason? I read that dmr article on the history
of C, and he mentions a time when "foo->bar" would work regardless of
the type of "foo", with "bar" just describing an offset and the type
of the member value, but this confuses me -- was one not allowed to
use the same member name in two different structure types? But even if
the original reason was related to this, why was the division kept
after the type system got stronger?

Chris Torek · Jul 19, 2007

Can someone tell me why "->" exists?

History, as you surmise:

The compiler knows the difference between a structure and a
pointer to a structure, so why can't it just let me write
"foo.bar" in both cases [...] ?

There are some languages that do this, including some "debuggers
that speak C" (gdb in particular). C is just different, as usual.

I read that dmr article on the history
of C, and he mentions a time when "foo->bar" would work regardless of
the type of "foo", with "bar" just describing an offset and the type
of the member value, but this confuses me -- was one not allowed to
use the same member name in two different structure types?

Correct -- one was not; constructs like:

struct S { int a, b; };
struct T { int c, a; };

created problems, because the member named "a" had to mean both
"offset 0" and "offset 2" simultaneously (sizeof(int) being 2,
since this Prehistoric C was found only on the PDP-11).
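For contrast, here is a minimal sketch of how the same two declarations behave in standard C today, where each structure type has its own member namespace and the type of the left operand selects which "a" is meant. The helper names (offset_of_a_in_S, offset_of_a_in_T, sum_of_both_a) are invented for this illustration and are not from any historical compiler.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

struct S { int a, b; };
struct T { int c, a; };

/* Hypothetical helpers: report where "a" lives in each structure type. */
size_t offset_of_a_in_S(void) { return offsetof(struct S, a); }
size_t offset_of_a_in_T(void) { return offsetof(struct T, a); }

/* The same member name resolves to a different offset per type. */
int sum_of_both_a(void)
{
    struct S s = { 1, 2 };   /* s.a is the first member  */
    struct T t = { 3, 4 };   /* t.a is the second member */

    assert(offset_of_a_in_S() == 0);            /* first member: offset 0  */
    assert(offset_of_a_in_T() >= sizeof(int));  /* placed after member c   */
    return s.a + t.a;                           /* 1 + 4 */
}
```

On a typical modern implementation the second offset is sizeof(int) rather than 2, but the point is the same: one name, two offsets, and no conflict.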

In fact, you could even write:

struct { char lo, hi; };
int var = someval;
...
var.lo = 'a';

which produced machine code similar to:

*(char *)&var = 'a';

The "adb" debugger used this.
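The old "var.lo" spelling no longer compiles, but the equivalent store can still be written out by hand. A rough sketch, assuming nothing beyond standard C; the function name first_byte_after_poke is made up for this example:

```c
#include <assert.h>

/* Hypothetical helper (not from adb): overwrite the first byte of an
   int through a char pointer, then read that byte back. */
unsigned char first_byte_after_poke(int someval)
{
    int var = someval;
    unsigned char first;

    *(char *)&var = 'a';               /* the store shown above          */
    first = *(unsigned char *)&var;    /* first byte of var's storage    */
    assert(first == 'a');              /* that byte now holds 'a'        */
    return first;
}
```

Which part of the int's value changes depends on byte order. On a little-endian machine such as the PDP-11 it is the low-order byte, which fits the member name "lo".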

But even if the original reason was related to this, why was the
division kept after the type system got stronger?

Some source code -- such as the "adb" debugger, and perhaps the
Bourne shell -- depended on this. As of the early 1990s, a few
programs were still compiled with "cc -W", where the -W option
*suppressed* warnings. (This was, of course, *not* gcc, where -W
*adds* warnings.)

(I helped rewrite "adb", as part of the effort at UCB to make
the "BSD networking release" generally available and remove
various machine dependencies.)

jacob navia · Jul 19, 2007

Chris said:
Can someone tell me why "->" exists?

History, as you surmise:

The compiler knows the difference between a structure and a
pointer to a structure, so why can't it just let me write
"foo.bar" in both cases [...] ?

There are some languages that do this, including some "debuggers
that speak C" (gdb in particular). C is just different, as usual.

I read that dmr article on the history
of C, and he mentions a time when "foo->bar" would work regardless of
the type of "foo", with "bar" just describing an offset and the type
of the member value, but this confuses me -- was one not allowed to
use the same member name in two different structure types?

Correct -- one was not; constructs like:

struct S { int a, b; };
struct T { int c, a; };

created problems, because the member named "a" had to mean both
"offset 0" and "offset 2" simultaneously (sizeof(int) being 2,
since this Prehistoric C was found only on the PDP-11).

In fact, you could even write:

struct { char lo, hi; };
int var = someval;
...
var.lo = 'a';

which produced machine code similar to:

*(char *)&var = 'a';

The "adb" debugger used this.

But even if the original reason was related to this, why was the
division kept after the type system got stronger?

Some source code -- such as the "adb" debugger, and perhaps the
Bourne shell -- depended on this. As of the early 1990s, a few
programs were still compiled with "cc -W", where the -W option
*suppressed* warnings. (This was, of course, *not* gcc, where -W
*adds* warnings.)

(I helped rewrite "adb", as part of the effort at UCB to make
the "BSD networking release" generally available and remove
various machine dependencies.)

OK, OK. Those are "hysterical reasons"...

But those are not reasons, just explanations of why the current
situation exists.

There are no reasons then, for continuing to make that confusion...
What would happen if we dropped "->"?

Do you see any real problems?

jacob

Chris Dollin · Jul 19, 2007

jacob said:
There are no reasons then, for continuing to make that confusion...
What would happen if we dropped "->"?

Lots of existing code would break; no-one would continue to use the
offending compilers, for the good reason that they have better things
to do with their time than cope with a feature arbitrarily being
switched off.

Richard Bos · Jul 19, 2007

OK, OK. Those are "hysterical reasons"...

But those are not reasons, just explanations of why the current
situation exists.

There are no reasons then, for continuing to make that confusion...
What would happen if we dropped "->"?

Do you see any real problems?

Yes. Massive amounts of code would stop working for no good reason.

OTOH, supporting ptr.member _as well as_ ptr->member would, AFAICT,
break nothing.

Richard

Richard Heathfield · Jul 19, 2007

jacob navia said:

What would happen if we dropped "->"?

Do you see any real problems?

Others have already commented on the huge amount of code it would break.
The flip-side "solution" (to this non-existent problem), i.e. codifying
p.m as a synonym for p->m, would blur a perfectly good distinction
between the concept of a structure and the concept of a pointer to a
structure. This blurring would make C *more* confusing, not less.

santosh · Jul 19, 2007

jacob navia said:

Others have already commented on the huge amount of code it would
break. The flip-side "solution" (to this non-existent problem),
i.e. codifying p.m as a synonym for p->m, would blur a perfectly
good distinction between the concept of a structure and the concept
of a pointer to a structure. This blurring would make C *more*
confusing, not less.

Indeed. If 'ptr' is an int * and 'i' is an int object, we write:

i = some_value;
or
*ptr = some_value;

to store a value into i.

We don't write:

ptr = some_value;

Similarly the '->' is not just a syntactic convenience, it's also a
form of documentation for the programmer. IMHO of course.

jacob navia · Jul 19, 2007

Richard said:
Yes. Massive amounts of code would stop working for no good reason.

OTOH, supporting ptr.member _as well as_ ptr->member would, AFAICT,
break nothing.

Richard

Obviously I expressed myself wrong.

Of course we would retain the ->.

Only in the case of

foo *pFoo;

pFoo.field = 56;

instead of emitting an error we would just do the right thing.

jacob

jacob navia · Jul 19, 2007

Chris said:
Lots of existing code would break; no-one would continue to use the
offending compilers, for the good reason that they have better things
to do with their time than cope with a feature arbitrarily being
switched off.

Sorry, I did not say what I thought. Of course (for the same
historical reasons) we would retain the "->"!

Only when
Foo *pFoo;
pFoo.field = 67;

we would do the right thing instead of emitting an error.

David Trallero · Jul 19, 2007

santosh said:
i = some_value;
or
*ptr = some_value;

pss pss wake up!

*ptr = some_value
means that the address pointed by ptr is "some_value".

ptr = some_value
means that ptr points to some_value.

Someone (from the Ada world, if I remember correctly) said that if you
have to write long things in a statement, you will think twice before
writing them, so your code will be safer. I don't really agree with
the person who said that, but what I think is true is that if you see

a -> b = 10

you immediately spot some potential problems at a glance, for example
that a cannot be null or *a will have side effects. Another good thing
about having '->' is that in C++ it can be overloaded while '.' cannot.

IMHO -> helps the programmer a lot in understanding a program (especially
when working at a low level). If -> really upsets you, you can compile
your program as C++ and use references.

Eric Sosman · Jul 19, 2007

David said:
pss pss wake up!

Hunh? Wha--? Whuzzup?

*ptr = some_value
means that the address pointed by ptr is "some_value".

No; ptr points wherever it points, and the statement stores
some_value in that pointed-to object (converting its type if
needed). The value of ptr -- the address it points to -- does
not change.
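A small sketch of that distinction, in case it helps. The function name demo_store_through_pointer and the value 42 are arbitrary choices for illustration:

```c
#include <assert.h>

/* Hypothetical demo: storing through ptr changes the pointed-to int,
   while the address held in ptr stays exactly what it was. */
int demo_store_through_pointer(void)
{
    int i = 0;
    int *ptr = &i;
    int *before = ptr;        /* remember where ptr points          */

    *ptr = 42;                /* writes into i, not into ptr        */

    assert(ptr == before);    /* ptr still points at the same place */
    return i;                 /* the pointed-to object changed      */
}
```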

Mark Bluemel · Jul 19, 2007

David Trallero wrote:

David - you snipped some vital context from santosh's post. He wrote
"... If 'ptr' is an int * and 'i' is an int object, we write:"

pss pss wake up!

I'm not convinced Santosh is the one who needs to wake up.

His point was perfectly valid. If we wish to store "some_value" in the
int "i" we use the first form he gave. If we wish to store "some_value"
in the int whose address is in "ptr", we use the second form.

His underlying point was that in the same way that we don't use the same
syntax for updating an int directly and updating an int via a pointer,
we shouldn't (for clarity's sake) use the same syntax for updating a
struct directly and updating a struct via a pointer.

*ptr = some_value
means that the address pointed by ptr is "some_value".

No - it means that "some_value" (converted if need be) is stored at the
address which ptr contains.

ptr = some_value
means that ptr points to some_value.

Poorly phrased. It means that ptr now contains "some_value" (translated
as necessary) and hence that ptr points to the address represented by
"some_value". Of course, how "some_value" is mapped to an address is
another question entirely.

Richard · Jul 19, 2007

David Trallero said:
pss pss wake up!

*ptr = some_value
means that the address pointed by ptr is "some_value".

No it doesn't. I bet you're blushing now.

ptr = some_value
means that ptr points to some_value.

Err? Are you sure?

Someone (from the Ada world, if I remember correctly) said that if you
have to write long things in a statement, you will think twice before
writing them, so your code will be safer. I don't really agree with

That person is clearly an idiot of the first order. One of the first
things you should think about is NOT long things, but breaking long
statements into smaller things in order to aid debugging and code
maintenance for future generations of developers.

the person who said that, but what I think is true is that if you see

a -> b = 10

you immediately spot some potential problems at a glance, for example
that a cannot be null or *a will have side effects. Another good thing
about having '->' is that in C++ it can be overloaded while '.' cannot.

Are you sure you should be in a C group?

IMHO -> helps the programmer a lot in understanding a program (especially
when working at a low level). If -> really upsets you, you can compile
your program as C++ and use references.

Nice troll! Amusing

Mark Bluemel · Jul 19, 2007

Richard said:
Nice troll! Amusing

Was it? I didn't notice.

Keith Thompson · Jul 19, 2007

jacob navia said:
Sorry, I did not say what I thought. Of course (for the same
historical reasons) we would retain the "->"!

Only when
Foo *pFoo;
pFoo.field = 67;

we would do the right thing instead of emitting an error.

So we'd be stuck (probably forever) with two distinct ways to indicate
exactly the same operation.

I have no problem with a language that uses "." for both structs and
pointers (<OT>Ada does this</OT>). I have no problem with a language
that uses "." for structs and "->" for pointers, and C *could* have
been designed that way from the beginning; even in the early versions
of C that didn't associate member names with specific structure types,
I don't think it would have produced any ambiguity. I'd have a big
problem with a language that suddenly introduced a new syntax for
something that's already perfectly easy to write, creating a
meaningless syntactic distinction (between ptr->member and ptr.member)
where there's no semantic distinction.

If you're designing a new language from scratch, you have a lot of
freedom to make it look exactly the way you want. If you're modifying
an existing language, you *have* to maintain backward compatibility;
introducing a better way to do something doesn't help much if you
still have to support the old way, and it doesn't help at all if the
old way was already good enough.

A counterargument might be that introducing function prototypes while
keeping support for old-style function declarations and definitions
was similar to what's being proposed. The difference is that
prototypes are clearly better than the old forms. I just don't see
that 'ptr.member' is clearly better than 'ptr->member'. And,
inevitably, many programmers will continue to believe that
'ptr->member' is *better* and will continue to use it, so deprecating
it won't be a realistic possibility.

Mark McIntyre · Jul 19, 2007

There are no reasons then, for continuing to make that confusion...

what confusion?

What would happen if we dropped "->"?

About a billion lines of existing code would break. Fixing this would
be absurdly costly, and some shops might find it cheaper to switch
compiler.

Do you see any real problems?

The other problem I see is that the change buys almost nothing, at
great potential pain. It might be "neater" to remove this feature, but
the cost is too high.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Lew Pitcher · Jul 20, 2007

A quick search of this group and its FAQ, and elsewhere, has not
answered this question to my satisfaction. Apologies if I missed
something obvious, either in the literature or my reasoning.

Can someone tell me why "->" exists?

"The declaration
struct date *pd;
says that pd is a pointer to a structure of type date. The notation
exemplified by
pd->year
is new. If p is a pointer to a structure, then
p->member-of-structure
refers to the particular member.

Since pd points to the structure, the year member could also be
referred to as
(*pd).year
but pointers to structures are so frequently used that the -> notation
======================================================================
is provided as a convenient shorthand."
=======================================

(The C Programming Language, 1978)
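To make the equivalence concrete, here is a minimal sketch. The struct date below is a cut-down stand-in with assumed members, not the exact declaration from the book, and the function names are invented for this example:

```c
#include <assert.h>

/* Assumed, simplified layout; only the year member matters here. */
struct date { int day, month, year; };

/* The two spellings read the same member. */
int year_via_arrow(const struct date *pd)    { return pd->year; }
int year_via_star_dot(const struct date *pd) { return (*pd).year; }

/* Hypothetical check: both spellings designate the very same object. */
int same_member(struct date *pd)
{
    int same = (&pd->year == &(*pd).year);
    assert(same);
    return same;
}
```

The parentheses in (*pd).year are required because "." binds more tightly than unary "*".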

--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------

Keith Thompson · Jul 20, 2007

Lew Pitcher said:
"The declaration
struct date *pd;
says that pd is a pointer to a structure of type date. The notation
exemplified by
pd->year
is new. If p is a pointer to a structure, then
p->member-of-structure
refers to the particular member.

Since pd points to the structure, the year member could also be
referred to as
(*pd).year
but pointers to structures are so frequently used that the -> notation
======================================================================
is provided as a convenient shorthand."
=======================================

(The C Programming Language, 1978)

Yes, that answers the question of why we have 'ptr->member' rather
than having to write '(*ptr).member'. But the OP's question was
different; he was asking why we can't just write 'ptr.member' where we
now write 'ptr->member'.

The obvious answer is that '.' and '->' are distinct operations -- but
then, '+' for int and '+' for float are also distinct operations.
A language *could* use '.' for both C's '.' operator and C's
'->' operator, and in fact some languages do exactly that (Ada, for
one).

The real answer is that that's the way Dennis Ritchie decided to
define it. He could have defined it differently; either he didn't
think of it, or he didn't like the idea, or he was maintaining
compatibility with some pre-C language (which pushes the question back
to the designer(s) of B and BCPL, and perhaps Algol, most likely with
the same answers).

IMHO it's not a big deal one way or the other. Changing it now would
be totally impractical. Adding '.' while keeping '->' would just make
a mess. If you're designing a new language with no concern for C
compatibility, do whatever you like.

Richard · Jul 20, 2007

Mark Bluemel said:
Was it? I didn't notice.

You were probably too busy trying to get first in with "Off Topic".

Nice snipping.

Richard Bos · Jul 20, 2007

santosh said:
Indeed. If 'ptr' is an int * and 'i' is an int object, we write:

i = some_value;
or
*ptr = some_value;

to store a value into i.

We don't write:

ptr = some_value;

No, but that is because - at least in theory, although in C it requires
a cast, and is one of the few contexts which IMO _should_ require one -
assigning an int to a pointer has a meaning. It's a non-portable,
implementation-defined meaning, but it _does_ have a meaning.

OTOH,

ptr.member = value;

has only one conceivable meaning. It can only reasonably mean the same
thing as

(*ptr).member = value;

alias

ptr->member = value;

There's nothing you can _do_ with a struct pointer and a member name
except dereference the pointer to access the member of the struct
pointed to. You cannot, even in theory, even non-portably, access the
member of the pointer _itself_, because the pointer doesn't have any
members - the struct pointed to does.
Seen in this light,

ptr.member = value;

is less similar to

ptr = int_value;

and more to

function_ptr();

which should, in theory, be written

(*function_ptr)();

but for which we have also been given leave to eliminate the explicit
dereference, because there's no other way it can be interpreted anyway.

IMO, the same reasoning which gave us dereferenceless function pointer
calling - and that, /nota bene/, newly introduced by the first Standard,
because TTBOMK it was not allowed in K&R - could, and perhaps should,
give us dereferenceless struct and union pointer member taking.
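For anyone who wants to see the function pointer case side by side, a short sketch. The names twice and call_both_ways are made up for the example. Note that the explicit form needs parentheses, since the call operator binds more tightly than unary "*":

```c
#include <assert.h>

/* Arbitrary function to point at; hypothetical, for illustration only. */
static int twice(int x) { return 2 * x; }

/* Call the same function through a pointer in both spellings. */
int call_both_ways(int x)
{
    int (*function_ptr)(int) = twice;

    int without_deref = function_ptr(x);     /* no explicit dereference */
    int with_deref    = (*function_ptr)(x);  /* explicit dereference    */

    assert(without_deref == with_deref);     /* same call either way    */
    return without_deref;
}
```

Standard C accepts both spellings today, which is the precedent being pointed to.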

Richard
