Writing single bits to a file

cr88192

CBFalconer said:
If you write code and try to routinely compile it with both C and
C++ compilers you deserve whatever happens to you. The languages
are _different_. The multiple compilers is not normally useful
anyhow, since you can always compile C in such a manner as to be
linkable to C++, but not the reverse.

at the time, it was an issue of whether or not I wanted C++ style name
mangling in the object files (after all, this mangling gives at least some
useful type info). eventually I decided that I did not (the proposition
posed more problems than it was worth).

as for C and C++ being different:
they are similar enough here that there is not much real problem in making
code that will work on both, as most of the differences at this level are
minor and fairly trivial to deal with.

eventually, I normalized on using good old plain and unmangled names (and,
more so, internally normalizing on not having underscore prefixes, though on
windows, this is still the convention within the object and exe files...).

and so on...
 
Willem

cr88192 wrote:
) 'const' is a keyword I don't really know if I have ever really used...
)
) reason: it doesn't really do anything, beyond telling the compiler to
) complain to me about stuff I should sanely know already anyways...

And telling the compiler that it can do certain optimizations.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
cr88192

Willem said:
cr88192 wrote:
) 'const' is a keyword I don't really know if I have ever really used...
)
) reason: it doesn't really do anything, beyond telling the compiler to
) complain to me about stuff I should sanely know already anyways...

And telling the compiler that it can do certain optimizations.

potentially, I guess, if the compiler does not figure this out on its own...

I guess it provides a means for constant folding:
const int foo=3;
....

if(foo) //well now, here we know it is 3...
{ ... }

but, I don't really see why one can't do similar by use of a dirty/clean
flag here (where a variable is dirty if not provable clean). this approach
worked in the past when I was writing interpreters (I guess const serves to
indicate that it is always clean...).


actually, in my compiler, some keywords are parsed but ignored.
at present this is the case with const.
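
A minimal sketch of the folding case above, assuming a file-scope const object whose initializer is visible. The names foo and scaled are hypothetical. Since modifying a const-qualified object is undefined behavior, a compiler is allowed to treat foo as 3 without any dirty/clean tracking; whether a given compiler actually folds the test is up to the implementation.

```c
/* Hypothetical example: foo is const-qualified and its initializer is
   visible, so the compiler may assume its value never changes. */
static const int foo = 3;

int scaled(int x)
{
    if (foo)                /* may be folded to "if (3)", i.e. always true */
        return x * foo;     /* may be compiled as x * 3 */
    return 0;
}
```

A compiler that parses but ignores const can still generate correct code here; it only gives up the shortcut.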
 
James Kuyper

cr88192 wrote:
...
'const' is a keyword I don't really know if I have ever really used...

reason: it doesn't really do anything, beyond telling the compiler to
complain to me about stuff I should sanely know already anyways...

The 'const' keyword provides the same kind of benefit that prototypes
do: you declare something about how an identifier is intended to be
used, enabling the compiler to warn you if it detects the fact that you
accidentally use it in a manner different from what the declaration
says. Then you get to decide whether it's the declaration or the usage
that is incorrect. This is one of the many things that the compiler can
check far quicker and more reliably than I can.

True, if I were a perfect programmer, I would never need that warning. I
don't know any perfect programmers. I'm certainly not one, and I'll
happily do what's needed to enable this kind of warning.
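
As an illustration of the kind of warning meant here, a small sketch with a hypothetical helper, count_char. The parameter is declared as pointer to const char, which states that the function only reads the string. The slip described in the comment is the classic = versus == typo.

```c
#include <stddef.h>

/* Hypothetical helper: counts occurrences of c in s without modifying s. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* Writing "if (*s = c)" here by mistake would be an assignment
           through a pointer to const, which the compiler must diagnose. */
        if (*s == c)
            n++;
    }
    return n;
}
```

Without the const, the same typo compiles and silently overwrites the caller's string.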
 
Charlie Gordon

Martin Wells said:
Chqrlie:



By "field names", do you mean the names I choose for functions and
variables? What in particular is wrong with them? Please take one of
them as an example and point out the (perceived) flaws to me.

No, I meant struct member names.

That's not what I mean. I meant the likes of:

int i;
double x;

...

i = (int const)x;

Writing "int const" instead of "int" has no merit.

I agree completely.

(Let's assume that the cast was present to suppress a compiler warning).

Stupid compiler in this case.

But even if we take a pointer type, we could have:

int i;

char *p = (char*)&i;
char *p = (char const *)&i; /* This is stupid */

I agree as well.

I disagree. The merit is that you'll get a compiler error if you try
to alter something which you didn't intend to alter in the first
place.

The be-all and end-all of it though is that I do be consistent with
const -- i.e. if I don't plan on changing it, then make it const.

Consistency has its merits. But do you also write:

int main(int const argc, char * const * const argv) { ... }
 
cr88192

James Kuyper said:
cr88192 wrote:
...

The 'const' keyword provides the same kind of benefit that prototypes do:
you declare something about how an identifier is intended to be used,
enabling the compiler to warn you if it detects the fact that you
accidentally use it in a manner different from what the declaration
says. Then you get to decide whether it's the declaration or the usage
that is incorrect. This is one of the many things that the compiler can
check far quicker and more reliably than I can.

prototypes provide a lot more:
they actually make the type handling work right...

(well, that and as a side benefit, I use them to help reinforce
modularity...).

True, if I were a perfect programmer, I would never need that warning. I
don't know any perfect programmers. I'm certainly not one, and I'll
happily do what's needed to enable this kind of warning.

and I am also a person who writes some amount of stuff in assembler as well,
where assembler provides no such niceties...


however, it is my belief that what const offers, for the most part, is
something people will have already long-since internalized. unlike some
other errors, these are likely to have a much lower chance of random-chance
incidence, which most often consist of IME missing/mistyped variables, major
type errors (often caused by another error), and missing/mixing function
arguments...

assigning a read-only variable is a little less likely, on the grounds that
this action is far more likely to be deliberate.


or such...
 
Charlie Gordon

cr88192 said:
prototypes provide a lot more:
they actually make the type handling work right...

(well, that and as a side benefit, I use them to help reinforce
modularity...).

James said "the same kind", not "the same amount".
I do enable a ton of warnings, and use extra tools such as valgrind, sparse,
and custom made ones.
I haven't looked at your compiler yet, I'm willing to bet the code would
benefit from such a treatment.

and I am also a person who writes some amount of stuff in assembler as
well, where assembler provides no such niceties...

I ride motorbikes, yet I fasten my seat belt in a car. Why take risks all
the time?

however, it is my belief that what const offers, for the most part, is
something people will have already long-since internalized. unlike some
other errors, these are likely to have a much lower chance of
random-chance incidence, which most often consist of IME missing/mistyped
variables, major type errors (often caused by another error), and
missing/mixing function arguments...

assigning a read-only variable is a little less likely, on the grounds
that this action is far more likely to be deliberate.

const correctness, although it requires discipline, pays off.
You are probably a bit young and still remember everything you type; when
you start experiencing memory lapses (from 25 up) you will find all these
little tricks pretty handy.

or such...

What do you mean by that? or is it your signature? or such ...
 
Ben Bacarisse

Charlie Gordon said:
"Martin Wells" wrote:

Consistency has its merits. But do you also write:

int main(int const argc, char * const * const argv) { ... }

<nit-pick size="micro">
A case could be made that such a definition is not legal. The
standard allows 'int main(int argc, char *argv[])' "or equivalent"
with a footnote that suggests the equivalence is to be exact
(e.g. type synonyms and 'char **' are OK but that is about it).

Since one can not even pass a 'char **' argument to a function that
expects a 'char *const *const' parameter it would be hard to argue
that your example is "equivalent".
</nit-pick>
 
Martin Wells

Chqrlie:
Consistency has its merits. But do you also write:

int main(int const argc, char * const * const argv) { ... }


Yes, I'd make them const if I didn't plan on changing it.

Martin
 
Charlie Gordon

Ben Bacarisse said:
Charlie Gordon said:
"Martin Wells" wrote:

Consistency has its merits. But do you also write:

int main(int const argc, char * const * const argv) { ... }

<nit-pick size="micro">
A case could be made that such a definition is not legal. The
standard allows 'int main(int argc, char *argv[])' "or equivalent"
with a footnote that suggests the equivalence is to be exact
(e.g. type synonyms and 'char **' are OK but that is about it).

Since one can not even pass a 'char **' argument to a function that
expects a 'char *const *const' parameter it would be hard to argue
that your example is "equivalent".
</nit-pick>

The prototype I wrote as a joke for main is compatible with the classic int
main(int argc, char *argv[]); in the sense that passing an int and a char *
array would be OK, but it is incompatible in terms of signatures.
Too bad, there is no way to hint that main does not modify the array (not
the strings it points to).

What about int main(int const argc, char ** const argv) { ... } ? This one
is compatible with the standard.
Is this how you define main ?

I can think of an even more verbose yet compatible one:

signed int main(register signed int const argc, register char ** const
restrict argv) { ... }

Great! it does not even fit on one line.
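
To illustrate why the top-level const variant is compatible, here is a sketch using an ordinary function rather than main (whether the standard's "or equivalent" wording covers main itself is the point being argued above). The name count_args is hypothetical. Qualifiers applied to the parameters themselves are not part of the function type, so the unqualified declaration and the const definition agree.

```c
/* Declaration without qualifiers on the parameters. */
int count_args(int argc, char **argv);

/* Definition with top-level const on both parameters. This is the same
   function type: qualifiers on the parameters themselves are ignored when
   checking compatibility, and callers are unaffected. */
int count_args(int const argc, char **const argv)
{
    int n = 0;
    /* "argc = 0;" or "argv = 0;" here would be rejected by the compiler. */
    while (n < argc && argv[n] != 0)
        n++;
    return n;
}
```

The const here only restricts the function body; it says nothing about the array or the strings that argv points to.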
 
Ben Bacarisse

Charlie Gordon said:
The prototype I wrote as a joke ...
What about int main(int const argc, char ** const argv) { ... } ? This one
is compatible with the standard.
Is this how you define main ?

No, I write 'int main(int argc, char *argv[])' -- I did not think you
were making a joke. I thought you were suggesting a legal, but daft,
alternative to make a point.
 
Charlie Gordon

Ben Bacarisse said:
Charlie Gordon said:
The prototype I wrote as a joke ...
What about int main(int const argc, char ** const argv) { ... } ? This one
is compatible with the standard.
Is this how you define main ?

No, I write 'int main(int argc, char *argv[])' -- I did not think you
were making a joke. I thought you were suggesting a legal, but daft,
alternative to make a point.

Well I thought I was making a joke, but Martin Wells does const argc and
argv in his definitions of main when they are not modified.
 
cr88192

Charlie Gordon said:
James said "the same kind", not "the same amount".
I do enable a ton of warnings, and use extra tools such as valgrind,
sparse, and custom made ones.
I haven't looked at your compiler yet, I'm willing to bet the code would
benefit from such a treatment.

potentially...
actually, I am ending up endlessly debugging and fixing things, but not so
much syntactic, primarily semantic issues.

a recent example was rigging up some 'bypass' code to allow me to more
efficiently move floats and doubles between the FPU and SSE (happens, say,
whenever one uses sin or cos), because, as it was before, this operation
would end up flushing the register allocator (bad...). now, it is just
storing into memory (the bottom of the compiler's notion of the stack) on
one end, and loading from the other (still not sure why x86 lacks opcodes
like 'fld32 xmm3', or 'fstp64 xmm0', these would be useful...).


another recent example was noting that my expression parsing, didn't exactly
closely match the C operator precedence rules (noted in part, because me
typing '*(vec3 *)(&v0)', failed to parse right...).

to a large degree, my parser was just sort of reused from my last scripting
language and beaten into shape, but I had failed to notice that I had not
gone and more correctly fixed up the precedence hierarchy (unary and postfix
operators were the same precedence, bitwise operators were the same as
normal arithmetic operators, ...).
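
For reference, a small sketch of the two standard C precedence rules mentioned here. The function names are hypothetical. Postfix operators bind tighter than unary ones, and bitwise & binds looser than both arithmetic and comparison operators.

```c
/* Postfix ++ binds tighter than unary *, so *p++ means *(p++):
   read the current element, then advance the pointer. */
int sum3(const int *p)
{
    int s = 0;
    s += *p++;
    s += *p++;
    s += *p++;
    return s;
}

/* Bitwise & binds looser than ==, so these parentheses are required;
   without them the expression would parse as flags & (bit == 0). */
int bit_is_clear(unsigned flags, unsigned bit)
{
    return (flags & bit) == 0;
}
```

A parser that puts unary and postfix operators at one level, or & at the level of +, would group both expressions differently from a conforming compiler.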


so, now, everything is much more closely in tune with the C standard, for
better or for worse (I don't entirely like C's precedence rules, but then
again, this is partly why my last script lang did them differently, but in
any case conformance forces me to live with them...).

well, at least in the upper-end of my parser (tokenizer mostly), I have gone
and added more operator and brace types (12 new brace types, based on
character combos that should not occur in valid well-formed C, and 22 new
operator tokens). more are possible if one is willing to go into the land of
horrible-looking tokens ('#<. stuff .>' is already pretty bad...).

if I used them, it would be mostly for compiler and language extensions (the
operators specifically to be overloaded...).

most of the operators take forms like '+.' or '.+', and I will define that
they have precedence similar to those of the operators they resemble (unless
defined for something, it will be an error to try to use them though...).

I also added '~' as an infix operator, which I am considering will operate
like an exponent operator ('a~3', since 'a^b' generally means xor, and
'a**b' is ambiguous). it could also serve as an alternative for 'dot
product', which is currently handled with '^', which, sadly, has a very low
precedence (this however, becomes ambiguous for quaternions, which are both
numeric and vector, and thus can have both exponents and dot product...).

could potentially also add `, as an operator, since it is not otherwise used
as a quote ('2`3', 'u`v', ...). likewise for @ and $ (though gcc allows the
latter in names, I may not, but as of yet I am undecided...).
hmm...

but, whatever, all this is non-standard anyways...

(my great cost: before writing a C compiler, I implemented script languages,
I guess I still sort of think in this way...).

I ride motorbikes, yet I fasten my seat belt in a car. Why take risks all
the time?

point is, bugs usually pop up, and with practice, one develops a tendency to
specifically avoid certain kinds of problems (the more painful the problem,
the more highly the user learns to avoid it). as a result, for people using
assembler, they learn to be careful, since even trivial errors will not be
caught by assembler, and will proceed to become potentially hard to track
down bugs (one develops a kind of 'blank stare' code checking ability).

making something easier, just makes it less painful to make errors, and thus
errors become more frequent.

I suspect this is also very likely the case with programmers who primarily
use statically typed languages that go over and use dynamically typed ones.
since they have not really felt the pain of the compiler missing their type
errors, they are a lot more likely to miss them, which is why, I think,
paradoxically, many good old C and C++ programmers experience pain with many
script languages, yet newbs seem a lot more adept at learning them, and old
timers assert that these kind of errors don't really occur...

(many such people also assert that one gets used to lisp style syntax, but I
never really stopped thinking that it looked ugly, nor did I ever really
like having to use emacs to avoid the pain this kind of syntax causes when
edited in notepad...).

meanwhile, in general, I like power and capability, at the possibly
necessary cost of comfort (and stability...).

const correctness, although it requires discipline, pays off.
You are probably a bit young and still remember everything you type, when
you start experiencing memory lapses (from 25 up) you will find all these
little tricks pretty handy.

well, I don't remember everything I type (there is just too much...).
as for age, I am getting there, sadly...
not 25 yet, but sadly, it is no longer that far away.
I am getting old it seems...

What do you mean by that? or it your signature? or such ...

habit I guess...

 
Charlie Gordon

cr88192 said:
potentially...
actually, I am ending up endlessly debugging and fixing things, but not so
much syntactic, primarily semantic issues.

const does help finding semantic issues. Just a moment ago, I was reviewing
Jacob's StringCollection module: he does not use const either, and as I added
constness for function arguments that should not be modified, I discovered a
few allocation bugs where strings were not duplicated as they should have
been. Believe me, const is very helpful.

another recent example was noting that my expression parsing, didn't
exactly closely match the C operator precedence rules (noted in part,
because me typing '*(vec3 *)(&v0)', failed to parse right...).

to a large degree, my parser was just sort of reused from my last
scripting language and beaten into shape, but I had failed to notice that
I had not gone and more correctly fixed up the precedence hierarchy (unary
and postfix operators were the same precedence, bitwise operators were the
same as normal arithmetic operators, ...).

You should try some of the freely available validation suites. Well
actually there aren't that many: gcc has a lot of stuff, and same for the
glibc.

so, now, everything is much more closely in tune with the C standard, for
better or for worse (I don't entirely like C's precedence rules, but then
again, this is partly why my last script lang did them differently, but in
any case conformance forces me to live with them...).

well, at least in the upper-end of my parser (tokenizer mostly), I have
gone and added more operator and brace types (12 new brace types, based on
character combos that should not occur in valid well-formed C, and 22 new
operator tokens). more are possible if one is willing to go into the land
of horrible-looking tokens ('#<. stuff .>' is already pretty bad...).

if I used them, it would be mostly for compiler and language extensions
(the operators specifically to be overloaded...).

You need to find a way to put these under the programmer's control, with
custom definable precedence and associativity, while keeping the parser
efficient: a non trivial thing.

most of the operators take forms like '+.' or '.+', and I will define that
they have precedence similar to those of the operators they resemble
(unless defined for something, it will be an error to try to use them
though...).

.+ has quirks: double x = 1.+y;

I also added '~' as an infix operator, which I am considering will operate
like an exponent operator ('a~3', since 'a^b' generally means xor, and
'a**b' is ambiguous). it could also serve as an alternative for 'dot
product', which is currently handled with '^', which, sadly, has a very
low precedence (this however, becomes ambiguous for quaternions, which are
both numeric and vector, and thus can have both exponents and dot
product...).

You seem very focussed on floating point stuff. I think ~ as an infix
operator should be a bitwise nand with the same priority as &: a ~ b would
be the same as a & ~b. This is quite consistent with the other bitwise
operators and with its unary function. dealing with bit flags would be
neater this way:

if (value) flags |= FLAG; else flags ~= FLAG;

infix ~ could have other semantics for non integer operands: it is quite
appropriate for string concatenation, but you need specific memory management
for that, and almost unavoidably garbage collection.
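
For comparison, this is how the flag example reads in standard C today. The proposed infix ~ would abbreviate the and-not operation a & ~b (the post calls it nand). FLAG and set_or_clear are hypothetical names.

```c
#define FLAG 0x04u   /* hypothetical flag bit */

/* Sets FLAG in flags when value is nonzero, clears it otherwise. */
unsigned set_or_clear(unsigned flags, int value)
{
    if (value)
        flags |= FLAG;
    else
        flags &= ~FLAG;   /* what the proposed "flags ~= FLAG" would mean */
    return flags;
}
```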

for exponentiation, I would suggest you do use ** as in Fortran. It is not
ambiguous because a ** b has no meaning for b scalar or struct type. You
would need to bend the grammar a bit, but at least the precedence would be
much better suited than that of ^

could potentially also add `, as an operator, since it is not otherwise
used as a quote ('2`3', 'u`v', ...). likewise for @ and $ (though gcc
allows the latter in names, I may not, but as of yet I am undecided...).
hmm...

If you do add string concatenation (after all you have GC if I recall
correctly), you should consider implementing the $ substitution inside
strings, à la Perl or PHP. To preserve compatibility with C, these should
be a different type of strings, either delimited with `` or prefixed with a
letter or a % or whatever.

but, whatever, all this is non-standard anyways...

Yes, some like it more than others ;-)

(my great cost: before writing a C compiler, I implemented script
languages, I guess I still sort of think in this way...).

I can relate to that: I like to implement scripting languages for breakfast
now and then.

point is, bugs usually pop up, and with practice, one develops a tendency
to specifically avoid certain kinds of problems (the more painful the
problem, the more highly the user learns to avoid it). as a result, for
people using assembler, they learn to be careful, since even trivial
errors will not be caught by assembler, and will proceed to become
potentially hard to track down bugs (one develops a kind of 'blank stare'
code checking ability).

Good C programmers learn to be careful, and humble too.

making something easier, just makes it less painful to make errors, and
thus errors become more frequent.

I suspect this is also very likely the case with programmers who primarily
use statically typed languages that go over and use dynamically typed
ones. since they have not really felt the pain of the compiler missing
their type errors, they are a lot more likely to miss them, which is why,
I think, paradoxically, many good old C and C++ programmers experience
pain with many script languages, yet newbs seem a lot more adept at
learning them, and old timers assert that these kind of errors don't
really occur...

Actually they do occur: approximate programming with scripting languages is
pretty common. In JavaScript, type mismatches are hard to catch because
+ for instance is both numeric addition and string concatenation
with implicit conversion!

(many such people also assert that one gets used to lisp style syntax, but
I never really stopped thinking that it looked ugly, nor did I ever really
like having to use emacs to avoid the pain this kind of syntax causes when
edited in notepad...).

meanwhile, in general, I like power and capability, at the possibly
necessary cost of comfort (and stability...).

simplicity is a greater goal.

well, I don't remember everything I type (there is just too much...).
as for age, I am getting there, sadly...
not 25 yet, but sadly, it is no longer that far away.
I am getting old it seems...

I thought you were younger, and still under parental control.
Where are you studying theology? How long to go?

habit I guess...

Good night.
 
cr88192

Charlie Gordon said:
const does help finding semantic issues. Just a moment ago, I was
reviewing Jacob's StringCollection module: he does not use const either,
and as I added constness for function arguments that should not be
modified, I discovered a few allocation bugs where strings were not
duplicated as they should have been. Believe me, const is very helpful.

<>

ok.



You should try some of the freely available validation suites. Well
actually there aren't that many: gcc has a lot of stuff, and same for the
glibc.

sadly, I don't trust my compiler that much yet.

most non-trivial tests are still likely to break things. so, I have mostly
been going on a "fixing bugs as they pop up" quest, and rigging up much
smaller and fairly specific tests (though, I have used it for general coding
as well, and have discovered and fixed more than a few bugs this route).

the above examples, were mostly me verifying that cast conversions between
float arrays and vectors actually worked (needed, since most of the rest of
my existing codebase is based on float arrays, rather than on builtin
vectors...).

You need to find a way to put these under the programmer's control, with
custom definable precedence and associativity, while keeping the parser
efficient: a non trivial thing.

I am likely to allow them to be "hooked into" at some point, where certain
tokens and keywords are "registered", and as a result invoke special parsing
handlers. I can't remember if anything like this is set up in the existing
parser (I ripped a lot of stuff out of the parser when reworking it into
something that parsed C, and likewise for a lot of the compiler).

as for custom precedence and associativity, this is harder (one issue is
that the current parser tries to be as context-independent as possible, and
generally knows essentially nothing about argument types, and thus, at
present, no custom precedence).

.+ has quirks: double x = 1.+y;

hmm, had not thought of this (since I always write things like '1.0').
I may use more care with these ones...
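
A tiny sketch of the quirk, using a hypothetical function add_one. In standard C, "1." is a complete floating constant, so "1.+y" is read as "1. + y". A tokenizer that adds a '.+' token has to keep reading it that way, or existing code like this changes meaning.

```c
/* In standard C the initializer below lexes as:  1.  +  y  */
double add_one(double y)
{
    double x = 1.+y;
    return x;
}
```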

You seem very focussed on floating point stuff. I think ~ as an infix
operator should be a bitwise nand with the same priority as &: a ~ b
would be the same as a & ~b. This is quite consistent with the other
bitwise operators and with its unary function. dealing with bit flags
would be neater this way:

if (value) flags |= FLAG; else flags ~= FLAG;

infix ~ could have other semantics for non integer operands: it is quite
appropriate for string concatenation, but you need specific memory management
for that, and almost unavoidably garbage collection.

interesting idea, and does make more sense for this operator...

yeah, float obsession I guess.
well, partly, it may be that I also do a lot of 3D and numeric stuff, and so
like fairly fast and convenient numerical support. also, it is another minor
intended use of the compiler to be for physics and some shader code (I am
odd I guess, in that I still do a lot of shaders on the main processor, but
then again, my obsession is not so much graphical pretties, unlike some
people...).

also, partly, it is that 'fexp(x, 3)' is ugly, and with larger expressions
more difficult to follow...
note that a lot of these basic numerical operations have been implemented
as compiler builtins.

for exponentiation, I would suggest you do use ** as in Fortran. It is
not ambiguous because a ** b has no meaning for b scalar or struct type.
You would need to bend the grammar a bit, but at least the precedence
would be much better suited than that of ^

I had a '**' operator at one time before, but it kept clashing with pointers
handling. as a result, if it existed, it would have to be parsed specially,
and have rules to avoid accidentally mis-parsing a pointer operation:
"i**s++", where the intention was actually "i*(*s++)". provably
disambiguating this would require more info than the parser has available
(the only real option would be, for example, requiring spaces...).
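
The clash is easy to reproduce in plain C. In this sketch (times_first is a hypothetical name), the expression is already valid today, so a '**' token would silently change what existing code means.

```c
/* "i**s++" lexes as  i * * s ++  and groups as  i * (*(s++)). */
int times_first(int i, const int *s)
{
    return i**s++;
}
```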

'a^.b' could be another option (where '^.', is given a high rather than a
low precedence).

If you do add string concatenation (after all you have GC if I recall
correctly), you should consider implementing the $ substitution inside
strings, à la Perl or PHP. To preserve compatibility with C, these should
be a different type of strings, either delimited with `` or prefixed with
a letter or a % or whatever.

potentially, or I would need a built in string type (or make the compiler
automatically assume that 'char *' means 'string'...).

actually, I may implement a lot of this, not in my main C compiler, but I am
considering a secondary "branch" language I would call 'DyC', which would
include a lot more extensions, and a more aggressive featureset (such as a
built in object system, string handling, ...).

my main C branch, still sticks primarily to non-intrusive extensions, and
thus far focuses primarily on numeric features...

Yes, some like it more than others ;-)


I can relate to that: I like to implement scripting languages for
breakfast now and then.

yeah.



Good C programmers learn to be careful, and humble too.

yeah.



Actually they do occur: approximate programming with scripting languages
is pretty common. In JavaScript, type mismatches are hard to catch
because + for instance is both numeric addition and string
concatenation with implicit conversion!

yes, that is the kind of 'pain' I am referring to (ugly hidden problems that
make a large issue when one has to hunt them down to fix them...).

simplicity is a greater goal.

ok.



I thought you were younger, and still under parental control.
Where are you studying theology? How long to go?

I am still younger than this, and am still under parental control.
I am just, not very young anymore (no longer a teenager or anything, still
early 20s though...).

PIBC, and a few years...


well, note that I do in fact believe in christianity, just it is a little
harder to be compelled by doctrine and theology. too much reading, mostly...

I am more one of the lazy and impious types I guess...

Good night.

for me it is afternoon...

 
Keith Thompson

cr88192 said:
I had a '**' operator at one time before, but it kept clashing with
pointers handling. as a result, if it existed, it would have to be
parsed specially, and have rules to avoid accidentally mis-parsing a
pointer operation:
"i**s++", where the intention was actually "i*(*s++)". provably
disambiguating this would require more info than the parser has
available (the only real option would be, for example, requiring
spaces...).

'a^.b' could be another option (where '^.', is given a high rather
than a low precedence).
[...]

"**" can be unambiguous if you allow parsing to be affected by
semantic analysis. But typically the source is tokenized before it's
parsed and analyzed (even though all these things theoretically happen
in translation phase 7). If you see ``x**y'', you can't tell whether
it's ``x ** y'' (an exponentiation) or ``x * *y'' without knowing the
type of y.

I can imagine ways to tweak the grammar, perhaps requiring white space
between a binary "*" operator and a unary "*" operator, but I wouldn't
recommend it; the result would be incompatible with C. And any such
solution would be complicated to define and to implement, and
therefore complicated to use in at least some cases.

You might consider ^^ as an exponentiation operator. It's not likely
that a future C standard will introduce a short-circuit xor operator.

Or you could use a keyword as an operator symbol (``sizeof'' is a
precedent for this).
 
Charlie Gordon

Keith Thompson said:
cr88192 said:
I had a '**' operator at one time before, but it kept clashing with
pointers handling. as a result, if it existed, it would have to be
parsed specially, and have rules to avoid accidentally mis-parsing a
pointer operation:
"i**s++", where the intention was actually "i*(*s++)". provably
disambiguating this would require more info than the parser has
available (the only real option would be, for example, requiring
spaces...).

'a^.b' could be another option (where '^.', is given a high rather
than a low precedence).
[...]

"**" can be unambiguous if you allow parsing to be affected by
semantic analysis. But typically the source is tokenized before it's
parsed and analyzed (even though all these things theoretically happen
in translation phase 7). If you see ``x**y'', you can't tell whether
it's ``x ** y'' (an exponentiation) or ``x * *y'' without knowing the
type of y.

Of course, I'm not proposing ** to be a token, but x * *y to be
reinterpreted as fexp(x, y) if y is a numeric type. This trick can be
played on the parse tree if you have one, at code generation time, or on the
fly if you generate code directly. The programmer would be more inclined to
write x ** y or x**y, but it is parsed as x * *y. This trick would be more
difficult to play in an interpreter with dynamic typing, but still possible,
by sticking the appropriate behaviour to fexp(x, y) for y pointer type.

I can imagine ways to tweak the grammar, perhaps requiring white space
between a binary "*" operator and a unary "*" operator, but I wouldn't
recommend it; the result would be incompatible with C. And any such
solution would be complicated to define and to implement, and
therefore complicated to use in at least some cases.

None of this should be needed.

You might consider ^^ as an exponentiation operator. It's not likely
that a future C standard will introduce a short-circuit xor operator.

Or you could use a keyword as an operator symbol (``sizeof'' is a
precedent for this).

These are other directions, but less appealing IMHO.
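A rough sketch of the x * *y reinterpretation as a pass over a parse tree, in C. Everything here is hypothetical: the node layout, the `N_POW` kind standing in for a call to fexp(x, y), and the assumption that an earlier pass has already recorded each operand's type without rejecting a dereference of a numeric value. It is one possible shape for the trick, not how any particular compiler does it.

```c
#include <stddef.h>

enum node_kind { N_VAR, N_MUL, N_DEREF, N_POW };
enum type_kind { T_NUMERIC, T_POINTER };

struct node {
    enum node_kind kind;
    enum type_kind type;     /* type of this expression, set by an earlier pass */
    struct node *lhs;        /* left operand, or the operand of N_DEREF */
    struct node *rhs;        /* right operand, NULL for N_VAR and N_DEREF */
};

/* Rewrites MUL(x, DEREF(y)) into POW(x, y) wherever y is numeric.
   MUL(x, DEREF(p)) with p a pointer is left alone, so valid C keeps
   its meaning. Returns the number of nodes rewritten. */
int rewrite_pow(struct node *n)
{
    int count = 0;

    if (n == NULL)
        return 0;

    /* Handle the children first so nested cases are rewritten bottom-up. */
    count += rewrite_pow(n->lhs);
    count += rewrite_pow(n->rhs);

    if (n->kind == N_MUL
        && n->rhs != NULL && n->rhs->kind == N_DEREF
        && n->rhs->lhs != NULL && n->rhs->lhs->type == T_NUMERIC) {
        n->kind = N_POW;
        n->rhs = n->rhs->lhs;   /* skip the DEREF node (not freed in this sketch) */
        count++;
    }
    return count;
}
```

Since dereferencing a numeric value is not valid C to begin with, only expressions that would otherwise be rejected get the new meaning.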
 
C

cr88192

Keith Thompson said:
cr88192 said:
I had a '**' operator at one time before, but it kept clashing with
pointers handling. as a result, if it existed, it would have to be
parsed specially, and have rules to avoid accidentally mis-parsing a
pointer operation:
"i**s++", where the intention was actually "i*(*s++)". provably
disambiguating this would require more info than the parser has
available (the only real option would be, for example, requiring
spaces...).

'a^.b' could be another option (where '^.', is given a high rather
than a low precedence).
[...]

"**" can be unambiguous if you allow parsing to be affected by
semantic analysis. But typically the source is tokenized before it's
parsed and analyzed (even though all these things theoretically happen
in translation phase 7). If you see ``x**y'', you can't tell whether
it's ``x ** y'' (an exponentiation) or ``x * *y'' without knowing the
type of y.

knowing the type of y is the problem, though theoretically it could be
handled by parse-tree tweaking, if I had the parse tree at the same point I
was doing type handling (in my compiler, I do not, since these are handled
as different stages).

a later frontend may also make type info available in more of the upper
compiler, such that such inferences can be made...

I can imagine ways to tweak the grammar, perhaps requiring white space
between a binary "*" operator and a unary "*" operator, but I wouldn't
recommend it; the result would be incompatible with C. And any such
solution would be complicated to define and to implement, and
therefore complicated to use in at least some cases.

yeah, I considered, but did not accept these ideas...
it matters to me that compatibility not be broken.

You might consider ^^ as an exponentiation operator. It's not likely
that a future C standard will introduce a short-circuit xor operator.

however, I may at some point add such an operator (after all, my last script
language had such an operator...).

'^.' still seems like a better option IMO, since it resembles '^', but is a
different operator...
(I can just make it have a very different precedence than '^').

ok, this drops the precedence-similarity idea (if the new operators have
different precedences than the old ones they resemble).

'&.', '|.', and '^.' might be made tightly binding (slightly more tightly
than '*' and '/').
'*.' and '/.' will be the same as '*' and '/'.
'+.' and '-.' will be the same as '+' and '-'.

'*.' could thus be an alternative for dot product, and maybe an additional
multiply form (in some other cases).
'/.' could be used for a 'reverse divide' for types with non-commutative
multiplication and division (such as quaternions, which currently use a
builtin function for this). potentially, it could also serve as a shorthand
for dividing ints and getting a float (aka: cast-free).
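To make the non-commutative case concrete, here is a small sketch in C of why quaternions have two distinct quotients. The struct and function names are invented for this example, and which of the two a '/.' operator would denote is not settled above, so both are shown.

```c
struct quat { double w, x, y, z; };

/* Hamilton product a*b. */
struct quat quat_mul(struct quat a, struct quat b)
{
    struct quat r;
    r.w = a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z;
    r.x = a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y;
    r.y = a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x;
    r.z = a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w;
    return r;
}

/* Inverse: conjugate divided by the squared norm. Assumes q is nonzero. */
struct quat quat_inv(struct quat q)
{
    double n2 = q.w*q.w + q.x*q.x + q.y*q.y + q.z*q.z;
    struct quat r = { q.w / n2, -q.x / n2, -q.y / n2, -q.z / n2 };
    return r;
}

/* "a divided by b" with the inverse applied on the right: a * b^-1. */
struct quat quat_div_right(struct quat a, struct quat b)
{
    return quat_mul(a, quat_inv(b));
}

/* The other order, with the inverse applied on the left: b^-1 * a. */
struct quat quat_div_left(struct quat a, struct quat b)
{
    return quat_mul(quat_inv(b), a);
}
```

For the unit quaternions i = (0,1,0,0) and j = (0,0,1,0), the right quotient is -k and the left quotient is +k, so a single '/' cannot cover both.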


or, all this could be misguided, who knows...

Or you could use a keyword as an operator symbol (``sizeof'' is a
precedent for this).

possible, but I don't really like this approach personally...
 
C

Charlie Gordon

cr88192 said:
Keith Thompson said:
cr88192 said:
news:[email protected]... [...]
for exponentiation, I would suggest you do use ** as in Fortran. It
is not ambiguous because a ** b has no meaning for b scalar or
struct type. You would need to bend the grammar a bit, but at least
the precedence would be much better suited than that of ^

I had a '**' operator at one time before, but it kept clashing with
pointers handling. as a result, if it existed, it would have to be
parsed specially, and have rules to avoid accidentally mis-parsing a
pointer operation:
"i**s++", where the intention was actually "i*(*s++)". provably
disambiguating this would require more info than the parser has
available (the only real option would be, for example, requiring
spaces...).

'a^.b' could be another option (where '^.', is given a high rather
than a low precedence).
[...]

"**" can be unambiguous if you allow parsing to be affected by
semantic analysis. But typically the source is tokenized before it's
parsed and analyzed (even though all these things theoretically happen
in translation phase 7). If you see ``x**y'', you can't tell whether
it's ``x ** y'' (an exponentiation) or ``x * *y'' without knowing the
type of y.

knowing the type of y is the problem, though theoretically it could be
handled by parse-tree tweaking, if I had the parse tree at the same point
I was doing type handling (in my compiler, I do not, since these are
handled as different stages).

a later frontend may also make type info available in more of the upper
compiler, such that such inferences can be made...

Probably your best option.

yeah, I considered, but did not accept these ideas...
it matters to me that compatibility not be broken.

Wise choice.

however, I may at some point add such an operator (after all, my last
script language had such an operator...).

Short-circuit xor does not get much usage IMHO.

'^.' still seems like a better option IMO, since it resembles '^', but is
a different operator...
(I can just make it have a very different precedence than '^').

ok, this drops the precedence-similarity idea (if the new operators have
different precedences than the old ones they resemble).

'&.', '|.', and '^.' might be made tightly binding (slightly more tightly
than '*' and '/').
'*.' and '/.' will be the same as '*' and '/'.
'+.' and '-.' will be the same as '+' and '-'.

'*.' could thus be an alternative for dot product, and maybe an additional
multiply form (in some other cases).
'/.' could be used for a 'reverse divide' for types with non-commutative
multiplication and division (such as quaternions, which currently use a
builtin function for this). potentially, it could also serve as a
shorthand for dividing ints and getting a float (aka: cast-free).

No, these tokens are really problematic. I pointed out that ``1.^2'' would
become ambiguous if you attach semantics to ^ for floating point values (as
you may have), as it is unequivocally parsed as ``1. ^ 2''. At least the .^
and, more generally, the .-prefixed arithmetic operators you are considering
would not cause incompatibilities with current C syntax, just parsing
surprises for programmers trying to use your extensions. Adding tokens with a
trailing . does create incompatibilities with the current C syntax, as it
would cause legitimate expressions to be parsed differently. Consider these:

.5==x^.5==y // x equals 0.5 or y equals 0.5 but not both.
.8<x&.9>x
1.1==x|.9==y
x/.1
y*.2
z-.2
...

I always put spaces around binary operators, but a lot of programmers don't.
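For reference, the expressions above are all valid standard C as written, which is the compatibility concern. A sketch in C, with made-up function names wrapping each unspaced spelling; the comments show how a standard compiler groups the tokens. Some compilers may suggest extra parentheses here, but the code is well-formed.

```c
/* Each body uses the unspaced spelling from the list above. */

int xor_eq(double x, double y)
{
    return .5==x^.5==y;      /* (.5 == x) ^ (.5 == y) */
}

int in_range(double x)
{
    return .8<x&.9>x;        /* (.8 < x) & (.9 > x) */
}

int either_eq(double x, double y)
{
    return 1.1==x|.9==y;     /* (1.1 == x) | (.9 == y) */
}

double div_tenth(double x)
{
    return x/.1;             /* x / .1 */
}

double mul_fifth(double y)
{
    return y*.2;             /* y * .2 */
}

double sub_fifth(double z)
{
    return z-.2;             /* z - .2 */
}
```

A lexer that takes the longest match and knows tokens such as '^.', '&.', '|.', '/.', '*.' and '-.' would split each of these differently, for example `x /. 1` instead of `x / .1`, so existing code would change meaning or stop compiling.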
 
