A C Adventure: your comments are welcome

bartc

[Be able to write a.b instead of a->b]
This seems to be the wrong way round. You say the auto-dereference in
C would not be appropriate but it might be in a language with
dynamic typing but then you show a problem that C would not have and
the hypothetical dynamic language would.

My 'problem' is probably a red herring. I just tried this auto-dereference
change on a not-so hypothetical language. It works. And it does hang on
circular references (until I put in a deref limit anyway).

I still contend that auto-dereferencing of struct pointers is better suited
to a dynamic language than a static one.

Suppose you start off in C with a type of say *T. If you take the address of
this, the type is **T. If you dereference *T, you get type T. All very
strict and proper.

Now start off with a type of say T in a dynamically typed language. If you
take the address of this, the type is still T. If you dereference T, you get
type T still! This is as far as the compiler is concerned of course.
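To make the dynamic case concrete, here is a minimal C sketch of a runtime-tagged value, assuming T is something like a variant that can hold either an int or a reference to another variant. The names `struct variant`, `auto_deref` and `DEREF_LIMIT` are made up for illustration and are not taken from any real language implementation. The static type stays the same however many references are stacked up, and the hop limit is what stops a circular reference from hanging:

```c
#include <stddef.h>

enum tag { TAG_INT, TAG_REF };

/* One static type for everything; what it holds is decided at run time. */
struct variant {
    enum tag tag;
    union {
        int i;
        struct variant *ref;
    } u;
};

#define DEREF_LIMIT 16  /* hypothetical cap on the number of hops */

/* Follow references until a non-reference value is reached.
   Returns NULL on a null reference or when the hop limit is exceeded,
   which is what happens for a circular chain. */
static const struct variant *auto_deref(const struct variant *v)
{
    for (int depth = 0; v != NULL && depth <= DEREF_LIMIT; depth++) {
        if (v->tag != TAG_REF)
            return v;
        v = v->u.ref;
    }
    return NULL;
}
```

With a reference to a reference to an int, `auto_deref` lands on the int; a variant that refers to itself yields NULL once the limit is hit.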

That's why I'm less concerned with auto-dereferencing of these pointers in the
more informal dynamic language than in C:

struct point{int x,y;};
struct point a = {10,20};
struct point *b = &a;
struct point **c = &b;
struct point ***d = &c;
int i;

i=(***d).x;
i=d.x; /* Doesn't look right */
i=a.x;
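For anyone who wants to compile the example, here is a small sketch that wraps it in functions (the names `x_through_three_levels` and `x_one_level_at_a_time` are invented here, not part of the original). It only shows what standard C accepts today; the `d.x` line is left out because it is not valid C:

```c
struct point { int x, y; };

/* Three explicit dereferences: struct point *** down to struct point. */
static int x_through_three_levels(struct point ***d)
{
    return (***d).x;
}

/* The same walk one level at a time, with the type spelled out at each step. */
static int x_one_level_at_a_time(struct point ***d)
{
    struct point **c = *d;   /* one '*' removed */
    struct point *b = *c;    /* another removed */
    struct point a = *b;     /* a copy of the struct itself */
    return a.x;
}
```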
 
Ben Bacarisse

bartc said:
[Be able to write a.b instead of a->b]
This seems to be the wrong way round. You say the auto-dereference in
C would not be appropriate but it might be in a language with
dynamic typing but then you show a problem that C would not have and
the hypothetical dynamic language would.

My 'problem' is probably a red herring. I just tried this
auto-dereference change on a not-so hypothetical language. It works. And
it does hang on circular references (until I put in a deref limit
anyway).

I still contend that auto-dereferencing of struct pointers is better
suited to a dynamic language than a static one.

Suppose you start off in C with a type of say *T. If you take the
address of this, the type is **T. If you dereference *T, you get type
T. All very strict and proper.

Yes, this is why I think the idea flies in C (theoretically -- the
boat has sailed on this one). The operator can be unambiguously
decided, at compile time, to mean the normal . or an alternate form
of ->. BTW, my preference would be to allow only one level of
dereference on the theory that a single operator should not imply an
essentially unbounded number of indirections (this would be fine in
Algol 68 but not in the spirit of C).

Now start off with a type of say T in a dynamically typed language. If
you take the address of this, the type is still T. If you dereference
T, you get type T still! This is as far as the compiler is concerned
of course.

That seems to be a property of the type system rather than it being
dynamic. The type system you describe seems to have no pointer types.

That's why I'm less concerned with auto-dereferencing of these pointers
in the more informal dynamic language than in C:

I'd want to keep separate the static/dynamic axis from the strict/lax
axis. Dynamic typing does not have to mean none, little or loose typing,
just as C's static typing does not actually mean strict typing (by the
normal meaning of the term, that is).

struct point{int x,y;};
struct point a = {10,20};
struct point *b = &a;
struct point **c = &b;
struct point ***d = &c;
int i;

i=(***d).x;
i=d.x; /* Doesn't look right */

I agree. I'd allow only one indirection if I were Emperor of C.
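As an aside, C11's `_Generic` can approximate the one-level rule today, which shows that the choice really can be made at compile time from the operand's type alone. This is only a sketch: `POINT_X` and the two helper functions are hypothetical names, and a macro per member is obviously not the language change being discussed.

```c
struct point { int x, y; };

static int point_x_from_value(struct point p)          { return p.x; }
static int point_x_from_pointer(const struct point *p) { return p->x; }

/* Picks the helper from the static type of v. Only struct point and a
   single pointer level are listed, so a struct point ** argument is
   rejected at compile time rather than being dereferenced twice. */
#define POINT_X(v) _Generic((v), \
    struct point: point_x_from_value, \
    struct point *: point_x_from_pointer, \
    const struct point *: point_x_from_pointer)(v)
```

`POINT_X(a)` and `POINT_X(b)` both compile for `a` and `b` as declared above, while `POINT_X(c)` and `POINT_X(d)` do not, which matches the "one indirection only" preference.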
 
bartc

Ben said:
Yes, this is why I think the idea flies in C (theoretically -- the
boat has sailed on this one). The operator can be unambiguously
decided, at compile time, to mean the normal . or an alternate form
of ->. BTW, my preference would be to allow only one level of
dereference on the theory that a single operator should not imply an
essentially unbounded number of indirections (this would be fine in
Algol 68 but not in the spirit of C).

Yes allowing "." to replace "->" for a single dereference is workable
(although I don't like it).

"->" doesn't scale however to multiple dereferencing (such as (**a),
(***a).b etc), unless you allow -->, ---> or some such construct.

So double dereferencing would still need (*a)->b. The whole thing is messy
because of the parentheses, or this quirky -> operator. That's why the ^.
syntax where you add or remove one pointer level by adding or removing one
"^" works well.
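For reference, the spellings C accepts today for two and three levels can be lined up like this. It is a small sketch with made-up function names; each pair returns the same member, and the only thing that varies is whether the final '*' is traded for a '->':

```c
struct point { int x, y; };

/* Two levels: both forms remove two levels of indirection. */
static int x_two_levels_star(struct point **c)  { return (**c).x; }
static int x_two_levels_arrow(struct point **c) { return (*c)->x; }

/* Three levels: '->' only ever stands in for the final dereference,
   so the parentheses and the remaining '*' do not go away. */
static int x_three_levels_star(struct point ***d)  { return (***d).x; }
static int x_three_levels_arrow(struct point ***d) { return (**d)->x; }
```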

That seems to be a property of the type system rather than it being
dynamic. The type system you describe seems to have no pointer types.

I forgot to say that T would be a variant type. Then, although the compiler
might infer some of the types from the operators, it could just leave it to
the runtime to worry about.

I'd want to keep separate the static/dynamic axis from the strict/lax
axis. Dynamic typing does not have to mean none, little or loose typing,
just as C's static typing does not actually mean strict typing (by the
normal meaning of the term, that is).


I agree. I'd allow only one indirection if I were Emperor of C.

Who should be Emperor then, Jacob Navia? He seems to have some vision about these
things.
 
David Thompson

On Mon, 17 Aug 2009 14:17:49 +0000, Richard Heathfield
How short is short? Depends on your news client. As long as you send
out the code correctly, how it is received is not your problem. But
you do need to send it out correctly, and that means configuring your
news client with a wide enough wrap setting (or no wrap setting) to
deal with your code.

Almost. The RFC only allows (reliably) 990-something, so setting AND
using more than that is officially broken. But that's unimportant,
since more than about 150 or at most 200 is totally crazy from a human
factors perspective; no one can read it. And less than 80 (typically
about 70, or even less) is conventional and much more courteous.

The parser can simply look out for the opening parenthesis's matching
closing parenthesis to know when the arguments have all been parsed.
This is normally pretty easy because the lexer has done a lot of the
hard work already.

Well, the lexer identifies operator and punctuator tokens, among
others, here and elsewhere, but I wouldn't call that very hard.

Another part of the answer, arguably more important, is that
<ObCaveat> in typical compiler designs, not required by the standard
</> the parser doesn't know about number (or types) of arguments;
it treats all function calls uniformly as having arbitrary arglists.
Matching to declarations is done by a subsequent semantic phase --
and only for fixed prototypes; for K&R1-declared functions, and C89
implicitly-declared ones, and vararg ones, such checking is usually
not possible and definitely not required by the standard.
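The vararg case is easy to demonstrate. In the sketch below (`sum_ints` is an illustrative name), the compiler checks only the fixed first parameter against the prototype; how many values follow, and whether they really are ints, is left for the caller and the callee to agree on:

```c
#include <stdarg.h>

/* Adds up 'count' int arguments. Nothing in the prototype lets the
   compiler verify that the caller actually passes 'count' ints. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

A call such as `sum_ints(3, 1, 2)` needs no diagnostic from the compiler, yet its behaviour is undefined at run time because the third value is never supplied.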
 
David Thompson

If you cut yourself on IBM cards, you needed to slow down a little.

Although I did once scratch a cornea lifting a clump of (11x14-7/8)
fanfold paper MUCH too forcefully.

Oh well, at that time I was formatting beautifully in 1971, using punched
bloody tape. So, no line numbers at the end...

I never knew anyone who managed to cut on paper tape, even fanfold
paper tape. If you were using the metal(lized), yes, but that was
expensive enough it shouldn't be used for programming as meant here.
[NC] 'programming' was also the term for the tapes for some machine
tools and industrial robots, where metal was usually appropriate. And
also some lineprinter VFUs, which is kind of in-between.

while(42) { #include <sufficient_smileys> }
 
David Thompson

spinoza1111 said:
Yes, it does mean that. Comma as separator [in function call]
trumps comma as operator,

Which means that although having an operator do what the comma does
is a good idea, since it lets you code operations with side effects
clearly, the comma was a poor choice. The tilde would have been
better, but it may not have been available on Teletype keyboards in
1971.

It was on all the Teletype models that had lowercase and { | } which C
(also) required. And TTBOMR all other makers' ASCIIoid terminals the
same. It was one of the two graphics not 'perfectly' translatable to
EBCDIC, but I don't think that was a big concern early on.

And comma was certainly more associated with this meaning in
mathematical and common notation than anything like tilde was.

The tilde is otherwise engaged.

Monadic/unary tilde is; dyadic/binary tilde is available, although per
above I wouldn't have liked it for this meaning.

$ would have worked (and would
certainly be available). But, as it happens, the comma works out
quite well in the long term.

Although there was a fair bit of use of dollar as an (additional)
identifier character, particularly in Multics which was a strong
influence (good or ill) on the Unix and C work. Maybe they were
leaving open the possibility of using it that way. Maybe not.
VMS, and I believe VAX C, did use dollar, quite a bit later.
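To illustrate the separator-versus-operator point quoted at the top of this post: inside a call's argument list a bare comma separates arguments, and it takes an extra pair of parentheses to get the comma operator. A small sketch, with function names invented for the example:

```c
static int add(int x, int y) { return x + y; }
static int identity(int x)   { return x; }

/* Bare comma: two arguments are passed to add(). */
static int comma_as_separator(void)
{
    return add(1, 2);
}

/* Parenthesized comma: one argument. The left operand is evaluated for
   its side effect, then the right operand supplies the value. */
static int comma_as_operator(void)
{
    int a = 1;
    return identity((a += 4, a * 2));
}
```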
 
Chris M. Thomasson

[...]
Here is the code as of today, Tue 18 Aug. I spent both of my 30 mn
commutes making improvements and desk checking before I submit it for
a compile again, so there will be errors, in all probability. And
after it compiles the first time, I need to instrument it with
routines to display the rope and inspect it for errors.
// ***************************************************************
// * *
// * Rope implementation *
// * *
// * This program implements strings, called ropes, which can *
// * contain any character and can be of any length.
[...]

http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue12/spe986.pdf
 
