Syntax for union parameter

R

Rick C. Hodgin

I see. I thought you were saying something significant. I don't think
I've seen a language that does not work exactly like that, including C.

Except that with C some things are left to compiler author whims that can
tick-tock version after version.

Best regards,
Rick C. Hodgin
 
R

Rick C. Hodgin

You've lost me. If the hardware doesn't support something, how can a
compiler switch make it do it?

You answer it yourself in the next sentence, Dr Nick.
Are you suggesting that every C implementation on every architecture
should be able to emulate every architecture, including those not yet
invented?

You're round the bend, Dr Nick. I cannot communicate with you.
Or are you just suggesting that an ideal architecture should be defined
and all compilers on all architectures, by default, have to emulate this
unless turned off by compiler switches? So the machine I used 20-odd
years ago that had 64 bit shorts should have to do the same sort of slow
and painful fiddling it did for chars for every integer type except long
long?

That's Java. C isn't like that. Really. There are reasons why it's
like it is, and your incredulity about them doesn't make them not so.

And it's not going to change, so continuing to express your horror at
the way the world is, isn't going to change anything.

I am developing RDC. Provided I don't die, or something doesn't come
up to prevent it being completed, I most certainly can do something
about it, and I am.
This conversation really isn't going anywhere.

There is consensus.

Best regards,
Rick C. Hodgin
 
R

Rick C. Hodgin

I used to be a lot more critical of C than now. But a couple of years ago I
took a sizeable C program (some 20kloc) running on x86/Windows, and compiled
it on ARM/Linux. It worked first time, without changing a line of code
(afaicr).

From then on I've been a lot more impressed and less critical! (But still
fighting a private battle against its syntax and everything that makes
coding harder than it need be.)


You can impose rigid types on C with some effort. If the hardware can't
support a particular width, then you will have the same problem with RDC.
Worse, because it's rigid.

I'm writing C code. I don't care about how difficult it is for the hardware
to carry out my instructions. I want them carried out as I've indicated,
not as the compiler author dictated for me because he felt a particular way,
in contrast to other developers who felt another way, vis-a-vis Microsoft's
compiler, and GCC's compiler.
Command line switches are a bad dependency for your program to have. The
source code should stand by itself. Pragmas are better.

That's fine. My use of the words "command line switch" does not mean I'm
affixed to the mechanics of the thing, but rather to some kind of switch
which then enables the extensions.
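
(A small aside on the "pragmas are better" point above: a pragma keeps the
setting in the source itself. A minimal sketch, assuming a compiler that
supports #pragma pack (MSVC, GCC and Clang all do); it requests in-source
the same packing one might otherwise pass as /Zp1 or -fpack-struct on the
command line:)

/* Packing request carried in the source rather than on the command line. */
#pragma pack(push, 1)
struct wire_header {
    unsigned char type;    /* 1 byte                                  */
    unsigned int  length;  /* packed: no padding inserted before this */
};
#pragma pack(pop)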

Best regards,
Rick C. Hodgin
 
R

Rick C. Hodgin

If specific implementations can override standards-defined behaviour,
then the behaviour is no longer standard!

The standard is violated purposefully by the inclusion of some kind of
command line switch which enables the alternate behavior.
You can't have a "standard"
that says "int is always 32-bit" and then say "but for /this/ particular
compiler, int is 16-bit".

Yes you can. The standard says "int is always 32-bit" and the extension
says "when enabled, int is no longer 32-bit, but is now 16-bit."

The standard never changes. The extensions override the standard. Get it?
You have two choices - you can do as D does,
and specify that "int is always 32-bit" and therefore the language is
not suitable for smaller processors, or you can do as C does and say the
choice is "implementation dependent". A feature is /either/ fully
defined and specified, /or/ it is implementation dependent - it cannot
be both.

C's choice is a bad one. It is incorrect. It produces code which remains
in question until tested on a particular implementation because there is
no standard. Code that works perfectly well on 99 compilers might fail
on the 100th because that particular compiler author did something that
the C "standard" allowed through not having a definition, but yet is
inconsistent with other compilers. As such, my code, because I use the
behavior demonstrated through 99 out of 100 compilers, cannot be known
to work on any new compiler.

It is a variable. And it's a hideous allowance, one that the authors of
C should never have allowed to be introduced ever. The rigid definition
should've been created, and then overrides for specific and unique
platform variances should've been encouraged. I'm even open to a common
set of command line switches which allow them to all enable the same
extensions with the same switch.
The whole point with the C standards is that programmers know which
parts are fixed in the specs, and which are variable. They can rely on
the fixed parts. For /some/ code, you might want to rely on
implementation-specific features - not all code has to be portable.

Yes. Lunacy. Absolute lunacy.
In the particular case of bit sizes, it is often perfectly reasonable to
work with types that are defined as "at least 16 bits". If you are
counting up to 1000, you don't care if the variable has a max of 32K or
2G. If you need bigger numbers, you can use "long int" and know that it
is at least 32 bits. If you need specific sizes (I often do in my
work), you can use types like int16_t and uint32_t. The system is
clear, flexible, portable, and works well on big and small systems. Of
course, it all relies somewhat on the programmer being competent - but
that applies to all programming tasks.
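
(A minimal illustration of the type choices described above, assuming a C99
compiler with <stdint.h>; the variable names are only for the example:)

#include <stdint.h>

int      count;      /* counting to 1000: any conforming int (>= 16 bits) will do */
long int big_total;  /* long int is guaranteed to be at least 32 bits             */
int16_t  sample;     /* exact widths for when the layout itself matters           */
uint32_t checksum;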

If all I'm doing is counting up to 1000 then I don't care. How many useful
programs do you know of that only count up to 1000?

The language needs to be affixed so that people can have expectations
across the board, in all cases, of how something should work. And then,
on top of that, every compiler author is able to demonstrate their coding
prowess by introducing a host of extensions which make their C compiler
better than another C compiler. Competition would abound. New features
created. Creativity employed. The result would be a better compiler for
all, rather than niche implementations which force someone into a
particular toolset because the man-hours of switching away from compiler
X to compiler Y introduces unknown variables which could exceed reasonable
time frames, or expenses.
Apparently you have /no/ concept of how the processor market works. You
live in your little world of x86, with brief excursions to ARM. Did you
know that it is only a few years ago that shipments of 8-bit cores
exceeded those of 4-bit cores? And that there are still far more 8-bit
cores sold than 32-bit? As for CPUs that cannot access 8-bit or 16-bit
data, these are almost always DSPs - and there is good reason for that
behaviour. The manufacturers will continue to produce them, and
designers will continue to use them - because they give better value for
money (or power, or space) than alternative solutions. And they will
continue to program them in C, because C works fine with such cores.
That is the way the market works.

I don't care. We are not going backwards. We are going forwards. We
are not in the 1970s. We are in the 2010s. And things are not getting
smaller. They are getting bigger.

32-bit and 64-bit CPUs today exist for exceedingly low cost. ARM CPUs
can be created with full support for video, networking, external storage,
for less than $1 per unit.

We are not stuck in the past. We are standing at a switchover point.
Our mechanical processes are down below 20nm now. We can put so many
transistors on a chip today that there is no longer any comparison to
what existed even 15 years ago, let alone 40 years ago.

It's time to move on.
One thing that strikes me in your writing here, is that you seem to have
a belief that there is such a thing as "absolute" specifications - that
you can define your language and say /exactly/ how it will always work.
This is nonsense.
Nonsense.

You can give more rigid specifications than the C
standards do - but there are no absolutes here. There are /always/
aspects of the language that will be different for different compilers,
different options, different targets. Once you understand this, I think
you will get on a little better.

We'll see. RDC will have rigid standards, and then it will allow for
variances that are needed by X, Y, or Z. But people who write code,
even code like a = a[i++], in RDC, will always know how it is going
to work, platform A or B, optimization setting C or D, no matter the
circumstance.

People are what matter. People make decisions. The compiler is not
authorized to go outside of their dictates to do things for them. If
the developer wanted something done a particular way they should've
written it a particular way, otherwise the compiler will generate
EXACTLY what the user specifies, in the way the user specifies it,
even if it has to dip into clunky emulation to carry out the mandated
workload on some obscure CPU that probably has no business still
being around in the year 2014.
It is /precisely/ because C does not define these details, that you are
able to write for C and not for the machine. If C specified
requirements tuned for a particular processor type, then you would be
writing for that processor.

No. It would guarantee that your program would work the same on all
CPUs, regardless of internal mechanical abilities. The purpose of the
language is to write common code. Having it work one way on one CPU,
and potentially another way on another CPU, or maintaining such an
insane concept as "with int, you can always know that it is at least
16-bits," when we are today in the era of 64-bit computers, is to be
found on the page of examples demonstrating wrongness.

The language should determine how all things are done. Anything done
beyond that is through exceptions to the standard through switches
which must be explicitly enabled (because on a particular platform you
want to take advantage of whatever features you deem appropriate).
If you allow "overrides", you no longer have rigid specs.

Bzzz! Wrong. You still have rigid specs. The specs are exactly what
they indicate. Only the overrides change the spec'd behavior. A person
writing a program on ANY version of RDC will know that it works EXACTLY
the same on ANY version of RDC no matter what the underlying architecture
supports. And that will be true of every form of syntax a person could
invent, use, employ, borrow, and so on...

Rigid specs are what computers need. They expose a range of abilities
through the ISA, and I can always rely upon those "functions" to work
exactly as spec'd. I don't need to know whether the underlying hardware
uses method A or method B to compute a 32-bit sum ... I just want a
32-bit sum.

That's how it will be with RDC.
You have
"implementation dependent" behaviour.

NO! You only have "implementation dependent" behavior because the authors
of the C language specs did not mandate that behavior should be consistent
across all platforms, with extensions allowed.

C exists as it does today because they chose this model of "I will leave
things open in the standard so that people can implement whatever they
choose, possibly based upon whether or not the breakfast they ate is
settling well in their stomach today, or not," rather than the model of
"I will define everything, and let anyone who wants to step outside of
what I've defined do so, as per an extension."

To me, the latter is FAR AND AWAY the better choice.
This is precisely what the C
standards do. Sometimes I think the C standards could have some
multiple choice options rather than wider freedom (they have multiple
choices for the format of signed integers), but it is certainly better
that they say a particular point is "implementation dependent" than if
they were to say "/this/ is how to do it", and then allow
implementations to override that specification.

Yes. I'd rather be a farmer than deal with this variation in behavior
across platforms for the rest of my life. I am a person. I am alive.
I write something in a language, I expect it to do what I say. I do not
want to be subject to the personal inclinations of a compiler author who
may disagree with me on a dozen things, and agree with me on a dozen
others. I want to know what I'm facing, and then I want to explore the
grandness of that author's vision as to which extension s/he has provided
unto me, making use of them on architecture X whenever I'm targeting
architecture X.
Yes, and that is what happens today with C. So what is your point?

It should be that way on all aspects. On an 8-bit computer, ints should
be 32-bit and the compiler should handle it with emulation.
However, by using the compiler switch to enable the alternate size, then
my code can be compiled as I desire for the lesser hardware.
No, it should not hide /everything/.

Oh yes it should.
You should be free to develop
general portable code, and free to take advantage of particular
underlying architectures, depending on the sort of code you are writing.
C gives you that.

C gives you that in its spec. It should not ever do that. It should give
you that ONLY through switches which enable those features.
Yes, and C gives you that. You have to stick to the things that C
defines explicitly, but that's fine.

No. Only if the compiler author of tool X implemented a particular
mechanism to communicate the same way as the compiler author of tool Y
does such an ability exist. This may be the common situation, but from
what's been said in this thread, the spec does not require it.

Best regards,
Rick C. Hodgin
 
R

Rick C. Hodgin

Which language will you write the first version in? (I guess it won't be C.)

I am currently developing RDC using Visual Studio 2008, and am using the Visual
C++ compiler, and though the bulk of my code is in C I am using some C++ syntax
relaxations, and some features which employ Microsoft extensions.

Best regards,
Rick C. Hodgin
 
R

Rick C. Hodgin

In which case you don't want C - it's as simple as that.

Exactly why I'm writing RDC.
C doesn't work
that way because it's designed to make it a bit harder for the
programmer in order to squeeze the maximum out of a vast range of
architectures. C is not the language to use when "you don't care about
how difficult it is for the hardware to carry out my instructions".

My desire is to change that. "Rapid Development Compiler" is the name of
my compiler architecture framework, with RDC being a language component,
something C-like, but not C.

I desire to give the C developers of the world something more desirable,
less variable, more specific, and to do so in an edit-and-continue
framework from day one, something which allows them to rapidly develop
software in whatever languages are supported (RDC and VXB++ (an XBASE
language like dBase or Visual FoxPro) initially).
What you want is a perfectly reasonable thing for some purposes - and
Java shows that there is a role for a language that takes just that
approach. But it's not for other purposes.

Java is too protective. C provides far more flexibility. I desire to
have C with more rigid standards, and some features thrown out (like
undefined behavior), and some features added in (like union parameters).
I don't think the world needs as many programming languages as it has,

Couldn't agree more.
but it certainly needs more than one and trade-offs like this are part
of the reason that there are more than one.

Yes. And C will die unless it changes, or at the very least, be relegated
to those areas where the figurative COBOL code of yesterday needs to be modernized.
And you are still failing completely to understand that the
compiler-writer almost certainly didn't feel one way or the other. He
wrote it in a way that made sense and was compatible with the C standard
and the results fell out one way or the other.

I do recognize that. And it was still his choice (because the spec utterly
and completely failed to give him explicit guidance on those points).

Best regards,
Rick C. Hodgin
 
B

BartC

If all I'm doing is counting up to 1000 then I don't care. How many
useful
programs do you know of that only count up to 1000?

I would say the majority of individual 'int' variables in my programs don't
need to store values above 1000.
We'll see. RDC will have rigid standards, and then it will allow for
variances that are needed by X, Y, or Z. But people who write code,
even code like a = a[i++], in RDC, will always know how it is going
to work, platform A or B, optimization setting C or D, no matter the
circumstance.


Have you already implemented much of your language, or is this just talk?
Because it sounds incredibly ambitious (especially if you are doing this
single-handed; want it to work for any conceivable processor; and want to
also create a suite of tools including an IDE with an edit-and-continue
debugger; the MS versions will consist of millions of lines of source-code
and they have thousands of people and lots of money to help out!).

But some questions about this language:

- Will it have both signed and unsigned integer types?

- Will it allow binary operations between mixed signed and unsigned
integers? If so, how will it work (will the result be signed or unsigned,
how will it deal with one side out of range)?

- What will it do with overflow on signed arithmetic? What about unsigned;
will it be modular? (By overflow I mean calculations which, if there were
extra bits available, would make use of those bits)

- What about floating point; how can you guarantee exactly the same
behaviour on any possible hardware without resorting to slow software
emulation?

- How will it deal with the different endian-ness of various hardware? At
the moment, if I have a 32-bit int in memory, and access the
lowest-addressed byte of that, I will get different results depending on the
memory organisation; you say you want the results always to be predictable.

- How about things like structs, where the hardware demands certain
alignments? This can also give different results (different sizes and
offsets for a start). Will you do byte-by-byte load and store to get around
that?

- Suppose you finish the implementation, but discover that to work exactly
how you claim, its performance is a fraction of that of C; will you drop
many of the requirements, or will it just be another switch? So that either
you have it slow, or have it fast but with many of the same problems of C
(which means that you might as well have used C).
 
B

Ben Bacarisse

Rick C. Hodgin said:
Wrong. Developers need to know if on this computer the int size is 16-bits,
or 32-bits, or 64-bits, or a gazillion bits, which means they need to know
something other than things related to C alone,

No, I disagree. They need to know what the language says about the
various types so they can use the correct one. The cases where one needs
to know are very rare.

I took a peek at a bit of your code (the sha1 utility) and it breaks on
my machine for just this reason -- an assumed integer size rather than
using a type with a known size. (That's after fixing a couple of calls
to a function where the wrong number of arguments are passed. Your
compiler should have spotted that.)
and even the compiler alone,
but they need to know about the mechanics of the machine's underlying
architecture ... and that's just wrong to impose upon every developer in
that way.

WAY too much information for a developer to have to consider, given that
he is writing code in a computer language, and not at the machine level.

RDC will have rigid types defined for char, short, int, long, and others.
They will never change on any platform, apart from command line switches
which allow extensions not specified in the spec.

C has that, or very nearly that, depending on exactly what you mean.
The bonus that C brings (albeit at the expense of complexity) is that
you can use plain int where you know it will work, with a reasonable
assurance that it will be fast by being the "natural" integer type for
some hardware.
The more I learn about C, the more I realize how horrid this thing is I've
been using all these years. I am so thankful I never knew about its rusty
undersides or I never would've devoted so much time and energy into coding
for it.

Quite. It does not seem to meet your needs. Why are you using it?
 
R

Rick C. Hodgin

Yes. This is precisely the definition of "implementation-defined
behavior".

No. The standard should say "will exhibit behavior [whatever]." And then
any compiler author can override that with their own override, something
which then steps "outside of spec'd behavior."

The allowance to introduce what is nothing short of questionable behavior
on each implementation ... is insane.

Best regards,
Rick C. Hodgin
 
R

Rick C. Hodgin

I would say the majority of individual 'int' variables in my programs don't
need to store values above 1000.

The majority of mine don't typically go above 1,000 as well. But every
program I write has a range which do.
We'll see. RDC will have rigid standards, and then it will allow for
variances that are needed by X, Y, or Z. But people who write code,
even code like a = a[i++], in RDC, will always know how it is going
to work, platform A or B, optimization setting C or D, no matter the
circumstance.


Have you already implemented much of your language, or is this just talk?


It has been mostly designed, but not much has been coded yet. The framework
has been designed, but still has some additional work required.
Because it sounds incredibly ambitious (especially if you are doing this
single-handed; want it to work for any conceivable processor; and want to
also create a suite of tools including an IDE with an edit-and-continue
debugger; the MS versions will consist of millions of lines of source-code
and they have thousands of people and lots of money to help out!).

My targets are my virtual machine, x86, and ARM (in that order). I do not
intend to write code which works for every processor, but rather a compiler
framework which allows code to be written for any processor. My initial
offerings are pertinent to my target goals.
But some questions about this language:
- Will it have both signed and unsigned integer types?

Variables will be prefixed with "u" or "s" for unsigned and signed,
followed by their bit size. My initially supported types will be u8,
u16, u32, u64, s8, s16, s32, s64, f32, and f64. Char will be aliased
to the 8-bit form, short to 16-bit, int to 32-bit, long to 64-bit,
float to 32-bit floating point f32, double to 64-bit floating point
f64.
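
(A sketch of how the described aliases could be spelled in portable C using
the C99 exact-width types, where <stdint.h> is available; the short names
are the poster's RDC spellings, not standard C, and the float mappings
assume IEEE-754 single and double precision:)

#include <stdint.h>

typedef uint8_t  u8;   typedef int8_t  s8;
typedef uint16_t u16;  typedef int16_t s16;
typedef uint32_t u32;  typedef int32_t s32;
typedef uint64_t u64;  typedef int64_t s64;
typedef float    f32;  /* assumes IEEE-754 single precision */
typedef double   f64;  /* assumes IEEE-754 double precision */
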
- Will it allow binary operations between mixed signed and unsigned
integers? If so, how will it work (will the result be signed or unsigned,
how will it deal with one side out of range)?

Depends on their relative sizes. Typically, whenever signed values are
involved, the operands are converted to signed values of the target's size.
- What will it do with overflow on signed arithmetic?

Through my flow { } blocks I have introduced the ability to trap for
specific exceptions on a line-by-line manner, or an instance-by-instance
manner. Beyond explicit specification, behavior will be specified by
an explicitly programmed sticky feature (which changes the operation
from that point forward), or by command line switches which automatically
introduce a global handler.
What about unsigned;

Same way.
will it be modular?

The ability exists, but whether it actually is depends on how the
developer chooses to use those features.
(By overflow I mean calculations which, if there were
extra bits available, would make use of those bits)

- What about floating point; how can you guarantee exactly the same
behaviour on any possible hardware without resorting to slow software
emulation?

The architectures I'm targeting are all IEEE-754 (and optionally IEEE-854)
compliant. In the future, on a platform where I cannot guarantee compliance,
it will be through slow software emulation by default. A compiler override
switch will allow it to sacrifice standard computation for speed.
- How will it deal with the different endian-ness of various hardware?

All values will be internally presented as little-endian. If it is
operating on a big-endian machine, the compiler will introduce byte
swapping automatically. The language will only see little-endian
presentations.
At the moment, if I have a 32-bit int in memory, and access the
lowest-addressed byte of that, I will get different results depending on the
memory organisation; you say you want the results always to be predictable.
Yes.
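
(RDC itself is not implemented, so purely as a hedged C sketch of the kind
of normalisation described above: reading a 32-bit little-endian value with
shifts gives the same result on little- and big-endian hosts alike;
read_u32_le is a name invented for the example.)

#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer, independent of
   the host's byte order. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}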

- How about things like structs, where the hardware demands certain
alignments?

Everything will be byte aligned by default. If the developer wants to
explicitly align something a particular way, then it will be coded that
way in the source code, or through a compiler switch. My implementation
will sacrifice speed for consistency across platforms.
This can also give different results (different sizes and
offsets for a start). Will you do byte-by-byte load and store to get
around that?

Yes. Or some other commensurate mechanism.
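
(A minimal sketch of the byte-by-byte approach just mentioned, in C: memcpy
performs the byte-wise transfer and lets the compiler pick the cheapest safe
instruction sequence for the target; load_u32 is a name invented for the
example.)

#include <stdint.h>
#include <string.h>

/* Load a possibly unaligned 32-bit value without relying on the target's
   alignment rules. */
static uint32_t load_u32(const void *unaligned)
{
    uint32_t v;
    memcpy(&v, unaligned, sizeof v);
    return v;
}
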
- Suppose you finish the implementation, but discover that to work exactly
how you claim, its performance is a fraction of that of C; will you drop
many of the requirements, or will it just be another switch?

No. I expect it to be slower than C. My purpose in the whole "rapid
development compiler" is development time. By maintaining these
features for the developer, and several others, they will be able to
write code much faster, testing it much faster, completing their tasks
much quicker. Once it is developed, if they desire, they can take the
code and migrate to another platform and make machine-specific changes
as per their C spec.

So that either
you have it slow, or have it fast but with many of the same problems of C
(which means that you might as well have used C).

I will always sacrifice performance for consistency. If I need a particular
portion to go faster, I will code around the weaknesses, or manually write
some assembly or C code which gets linked in.

Best regards,
Rick C. Hodgin
 
D

David Brown

Wrong. Developers need to know if on this computer the int size is 16-bits,
or 32-bits, or 64-bits, or a gazillion bits, which means they need to know
something other than things related to C alone, and even the compiler alone,
but they need to know about the mechanics of the machine's underlying
architecture ... and that's just wrong to impose upon every developer in
that way.

I don't know how you manage to get things so consistently backwards...

The flexibility of C's integer sizes means that you can write code that
works for any C target, and is efficient on any target. Surely that is
the aim here?

If int were fixed at 32-bit, then code such as:

for (int i = 0; i < 1000; i++) { foo(i); }

would be very inefficient on 8-bit or 16-bit machines.


If you have data that you know must be 32-bit in size, you can use
"int32_t" to specify it - it is clear, explicit, and cannot be mistaken
(and it will give compile-time errors if you have a weird machine that
does not support 32-bit integers).

C gives you everything you need or want here.

The only thing required is that the programmer needs to know the basics
of the language they are using - they need to know that "int" means "the
fastest, most natural integer size for this target supporting at least
16 bits".


Of course, often you can make additional assumptions - you often know
some of the "implementation defined" features. If you are writing a
program for Linux, you know that "int" is at least 32-bit. If you are
writing for an AVR, you know that "int" is precisely 16-bit. But these
details depend on the target, therefore the C standards refer to them as
"implementation defined".


WAY too much information for a developer to have to consider, given that
he is writing code in a computer language, and not at the machine level.

RDC will have rigid types defined for char, short, int, long, and others.
They will never change on any platform, apart from command line switches
which allow extensions not specified in the spec.

The more I learn about C, the more I realize how horrid this thing is I've
been using all these years. I am so thankful I never knew about its rusty
undersides or I never would've devoted so much time and energy into coding
for it.

I don't know where you have been learning about C, since you are
determined not to learn anything here, but instead twist everything into
a confirmation of the weird ideas you have already decided on.
 
R

Rick C. Hodgin

No, I disagree. They need to know what the language says about the
various types so they can use the correct one. The cases where one needs
to know are very rare.

Unless they try to access a disk file with a given structure, or read a
network packet with a given structure, not an uncommon task (to say the
least). In such a case there are fixed sizes that are needed. And in
general programming there are fixed sizes that are needed.
I took a peek at a bit of your code (the sha1 utility) and it breaks on
my machine for just this reason -- an assumed integer size rather than
using a type with a known size. (That's after fixing a couple of calls
to a function where the wrong number of arguments are passed. Your
compiler should have spotted that.)

The core of the SHA-1 engine was taken from public domain code. I have
left it unchanged in terms of syntax. I have altered the definition of
the various types to the forms defined at the head of sha1.cpp so that
they can be redefined based on the requirements of a particular flavor
of C. The algorithm comes with its own self-test to determine if it
is working properly. Compiling my sha1.cpp program with _TEST_ME
defined allows it to be stand-alone. Otherwise it can be used as an
#include file.

To address your issues, I have bypassed this limitation in C with my
own mechanism. There is a definition at the header which allows typedefs
for determining what native variable type for the compiler is used for
the fixed size entities I use. And, being as C is lackadaisical in
this area, you will have to manually redefine those typedefs for your
particular version of C (to maintain the bit size indicated in their type
names, such as 8 for u8, 32 for u32, and so on).

From the top of sha1.cpp:

typedef unsigned long long u64;
typedef unsigned long u32;
typedef unsigned short u16;
typedef unsigned char u8;

typedef long long s64;
typedef long s32;
typedef short s16;
typedef char s8;

typedef float f32;
typedef double f64;

These work for Visual C++. If you use another compiler, re-define them
to work for your platform. Were you using RDC, they would be native
types and there would not be an issue. Ever.
C has that, or very nearly that, depending on exactly what you mean.
The bonus that C brings (albeit at the expense of complexity) is that
you can use plain int where you know it will work, with a reasonable
assurance that it will be fast by being the "natural" integer type for
some hardware.

Yeah, I don't care about that. I want my ints to always be 32-bit,
something I do care about (because I'm accessing data transmitted
across a network, or read from disk, and it comes with 32-bit and
64-bit quantities I need to explicitly access).

Computers today are blazingly fast on nearly everything they do. There
are components which make them slower than they actually are (reading
data from disk, a network, waiting for user input, etc.), but they are
amazing processor engines.

FWIW, I would rather have slower code operating in parallel than faster
code operating in serial. Intelligent design, not reliance upon hardware.
As was said about a decade ago when the MHz wars ended: "The free lunch
is over. Now it's time to do real programming."
Quite. It does not seem to meet your needs. Why are you using it?

Because, at present, it is the best tool for what I'm trying to accomplish.
Once I complete RDC, I will never look back ... except for when I also
desire to bring a true C standard into an add-on, so that existing C
code will compile without alteration using the spec. Prayerfully it will
be someone else who ultimately codes that engine. For me, it's about 9th
on my list of things to do:

RDC/VXB++
Visual FreePro
Whitebox
Journey database engine
Exodus-32
Armodus-23
Exodus-64
Armodus-64
Other languages, including C, ported to the RDC compiler framework.

Best regards,
Rick C. Hodgin
 
R

Rick C. Hodgin

I don't know how you manage to get things so consistently backwards...

The flexibility of C's integer sizes means that you can write code that
works for any C target, and is efficient on any target. Surely that is
the aim here?

No. I don't particularly care about code execution efficiency for the
vast majority of what I do. Computers are fast enough. I do care about
how much time it takes me to get something right. The Rapid Development
Compiler is designed with edit-and-continue, the ability to use Pursues
in the debugger, or in code, to test things on the virtual machine,
making rapid restarts of data targets exceedingly easy for rapid testing.

My goals with RDC are (1) correctness in processing, (2) rapid development
for the developer, and (3) performance. I might even have a few things
in there ahead of (3) that I'm not remembering right now ... like ease of
writing documentation.
If int were fixed at 32-bit, then code such as:
for (int i = 0; i < 1000; i++) { foo(i); }
would be very inefficient on 8-bit or 16-bit machines.

Oh yes. It would be awful. I quote Sylvester the cat's son as he dons
a bag over his head: "Oh the shame!"
If you have data that you know must be 32-bit in size, you can use
"int32_t" to specify it - it is clear, explicit, and cannot be mistaken
(and it will give compile-time errors if you have a weird machine that
does not support 32-bit integers).

Yes, and the unwieldy "int32_t" is not at all difficult to type repeatedly,
nor understand. Even now, I sit here wondering "why is there a _t on the
end?"

RDC uses:

u8, s8
u16, s16
u32, s32
u64, s64
f32
f64

For bool I have considered using u1. Still undecided there.
C gives you everything you need or want here.

That part was added with C99 (to address a former shortcoming in its spec).
Its addition recognized how fundamental this ability should have been.
And usage of such a clunky syntax requires the use of typedefs, which is
more or less what I have done in my version.
The only thing required is that the programmer needs to know the basics
of the language they are using - they need to know that "int" means "the
fastest, most natural integer size for this target supporting at least
16 bits".

I can see the reasoning here. I don't place high enough value on the
performance gains to make it a first class citizen. I would rather have
the u8, s8, u16, s16, u32, s32, u64, s64, f32, f64 sequence as first
class citizens, and then add a secondary clunky type called int_t which
is the fastest integer for the platform.

C has it backwards.
Of course, often you can make additional assumptions - you often know
some of the "implementation defined" features. If you are writing a
program for Linux, you know that "int" is at least 32-bit. If you are
writing for an AVR, you know that "int" is precisely 16-bit. But these
details depend on the target, therefore the C standards refer to them as
"implementation defined".

Exactly. So, I write my code on Linux and Windows using compilers which
implement 32-bit ... and then should I have some future need to port it
to CPU X where int is only 16-bits ... Oops, sorry. All of that speed
gained now comes at the expense of man-hours of labor to re-write the
algorithms. No thanks.
I don't know where you have been learning about C, since you are
determined not to learn anything here, but instead twist everything into
a confirmation of the weird ideas you have already decided on.

I have learned much here. None of it lends credence to the idea that using
C is a good idea for multi-platform development. Too many concerns over
peculiar hardware quirks which may or may not exist to be of use.

Best regards,
Rick C. Hodgin
 
I

Ian Collins

Rick said:
Unless they try to access a disk file with a given structure, or read a
network packet with a given structure, not an uncommon task (to say the
least). In such a case there are fixed sizes that are needed. And in
general programming there are fixed sizes that are needed.

If you were to use a real, rather than Microsoft's half arsed, C
compiler and were familiar with the C standard, you would be aware of
the standardised fixed width types.

To address your issues, I have bypassed this limitation in C with my
own mechanism. There is a definition at the header which allows typedefs
for determining what native variable type for the compiler is used for
the fixed size entities I use. And, being as C is lackadaisical in
this area, you will have to manually redefine those typedefs for your
particular version of C (to maintain the bit size indicated in their type
names, such as 8 for u8, 32 for u32, and so on).

From the top of sha1.cpp:

typedef unsigned long long u64;
typedef unsigned long u32;
typedef unsigned short u16;
typedef unsigned char u8;

typedef long long s64;
typedef long s32;
typedef short s16;
typedef char s8;

typedef float f32;
typedef double f64;

These work for Visual C++. If you use another compiler, re-define them
to work for your platform. Were you using RDC, they would be native
types and there would not be an issue. Ever.

The standardised types are there to remove the need for this pointless
wheel reinventing.
 
K

Keith Thompson

Rick C. Hodgin said:
From the top of sha1.cpp:

typedef unsigned long long u64;
typedef unsigned long u32;
typedef unsigned short u16;
typedef unsigned char u8;

typedef long long s64;
typedef long s32;
typedef short s16;
typedef char s8;

All of those could be replaced by

#include <stdint.h>

and, if you insist on the short names:

typedef uint64_t u64;
/* ... */

typedef int64_t s64;
/* ... */

The fixed-size types you want are already in the language. And if an
implementation doesn't define them (say, because there is no unsigned
32-bit type or because it's pre-C99), then you probably won't be able to
define them yourself anyway.
typedef float f32;
typedef double f64;

There are no float32_t and float64_t types, probably because
floating-point has more characteristics than size (exponent bits,
exponent representation, significand bits, base, and so forth). But
most implementations these days support IEEE floating-point
representations. You can check whether __STDC_IEC_559__ is defined, and
#error out if it isn't.
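
(A minimal sketch of that check:)

/* Fail the build early if the implementation does not claim IEC 60559
   (IEEE-754) floating point. */
#ifndef __STDC_IEC_559__
#error "IEEE-754 (IEC 60559) floating-point support required"
#endif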

[...]
Yeah, I don't care about that. I want my ints to always be 32-bit,
something I do care about (because I'm accessing data transmitted
across a network, or read from disk, and it comes with 32-bit and
64-bit quantities I need to explicitly access).

You can have exactly what you want; it's just spelled "int32_t" rather
than "int", and "uint32_t" rather than "unsigned int".

If your complaint that you can't do this in C is a ploy to irritate
someone into telling you how to do it, you could just *ask* how to do it
in C.
 
D

David Brown

Exactly why I'm writing RDC.


My desire is to change that. "Rapid Development Compiler" is the name of
my compiler architecture framework, with RDC being a language component,
something C-like, but not C.

I desire to give the C developers of the world something more desirable,
less variable, more specific, and to do so in an edit-and-continue
framework from day one, something which allows them to rapidly develop
software in whatever languages are supported (RDC and VXB++ (an XBASE
language like dBase or Visual FoxPro) initially).

What you don't seem to realise is that C developers do not "desire"
anything like that. You are not a C developer - you are a developer who
happens to use a bit of C in a very strange way, using inappropriate
tools (MSVC is a C++ toolkit, not a C toolkit). If /you/ want to make
"RDC", and /you/ want to use "RDC", then that's fine. The C programming
community, however, will continue to program in C, using tools that make
sense for C.

So have your own opinions, make your own choices, your own tools, your
own language if you want - but don't try and tell other people what
/they/ want or need.
Java is too protective. C provides far more flexibility. I desire to
have C with more rigid standards, and some features thrown out (like
undefined behavior), and some features added in (like union parameters).


Couldn't agree more.

How many other languages have you looked at in close detail before
deciding to write your own? Have you looked at D, for example? Or Go?
Since there are more than enough existing programming languages, maybe
there is already one that suits you, or at least is closer than C and
could give you a better starting point.
Yes. And C will die unless it changes, or at the very least, be relegated
to those areas where the figurative COBOL code of yesterday needs to be modernized.

I have worked with C for twenty years, and I expect to work with it for
another twenty years. It is not the only language I use (I have used
about a dozen), but I see no sign of its demise as the language of
choice for small systems and for low-level programming. You'd have to
be a masochist to pick plain C for implementing a compiler or
development toolkit (something like Python would be my first choice, but
C++ would be a reasonable choice). But the only real competitor for C
in its areas of strength is C++.

So no, C will /not/ die - even if it did not change. And in fact it
/does/ change - we have gone from ANSI C, C90, C99, to the current C11,
and there are proposals in the works for future changes and
enhancements. However, IMHO, there are not many important or useful
changes from C99 to C11 - certainly less than from C90 to C99. I take
that to be a sign of stability in the language - C does what it has to
do, and people (users, compiler implementers, and language designers)
are mostly happy with it.

(In comparison, C++ had major changes from C++98 to C++11, which I think
greatly benefit the language, and plans for C++14 and C++17 are underway.)
I do recognize that. And it was still his choice (because the spec utterly
and completely failed to give him explicit guidance on those points).

The C standards are very clear on these points - they say the compiler
implementer is free to do whatever he wants. That could mean picking a
particular method that is easy for him to implement, or making a smarter
variable method that gives better object code.

What you seem to think of as a failing in the language design is
actually a strength, because it gives freedom and flexibility to the
compiler (and compiler implementers), and it does not cause any problems
for developers. If you had learned to program in C, rather than merely
learning enough to complain about imaginary faults, then you would
understand that.
 
J

James Kuyper

Oh for crying out loud, James ... the point is not the mechanical
implementation of the option, but rather that the option exists.

That's what I meant by the dangers of over-specification. You did in
fact specify the mechanics, even though you had not intended to do so.

....
A standard should define how things operate in all cases. ...

The authors of the C standard very emphatically disagreed with you on
that point.
... It should
define specific behavior that can be relied upon no matter the platform,
no matter the implementation, no matter the circumstances. ...

That, on the other hand, they agree with. However, another key part of
the standard is that it should also clearly specify what you can NOT
rely upon. Ignore that part of the standard at your own risk.
The fact that C leaves things up in the air is absolutely mind blowing
to me. How could a computer language be so poorly written?

Because the people who wrote the standard would have considered it to be
poorly written if it had specified all of the things that you want it to
specify.
It is absolutely essential that nothing be left to chance or personal
desires for implementation. The standard should define it all,

The authors of the standard, on the other hand, considered it absolutely
essential that implementations be allowed a certain well-defined amount
of freedom in how they implemented the language. They wanted it to be
possible to efficiently implement C on a wide variety of platforms,
including ones where it would have been difficult bordering on
impossible to efficiently implement a more tightly specified language,
when that specification had been written with a very different kind of
platform in mind. As a result of making that decision, C is one of the
most widely implemented languages in the world. One of the first things
implemented on almost any new platform is a C compiler. Many of the
things implemented later are compilers, written in C, for languages with
less flexible specifications.
 
R

Rick C. Hodgin

If you were to use a real, rather than Microsoft's half arsed, C
compiler and were familiar with the C standard, you would be aware of
the standardised fixed width types.

I tried Solaris Studio. The older version of the toolset has issues in my
versions of Linux (Mint 14, and Mint 15 using MATE). In addition, I was
unable to import my code and get it to compile, though I was intrigued by
the idea of having a toolset I could continue to develop on outside of
Windows (as I truly hate Windows).
The standardised types are there to remove the need for this pointless
wheel reinventing.

If I desire to type int32_t everywhere I desire to use an s32, then I don't
have to redefine. However, that's hideous as well, requiring a typedef.

By me "reinventing the wheel," I was able to overcome a shortcoming not only
on the standard Microsoft C/C++ compiler, but also in any other I port to
because my code is already designed to run on fixed type sizes. They also
have the added advantage of being short names that are reasonably intuitive
(for someone interested in C-level programming).

Best regards,
Rick C. Hodgin
 
D

David Brown

Unless they try to access a disk file with a given structure, or read a
network packet with a given structure, not an uncommon task (to say the
least). In such a case there are fixed sizes that are needed. And in
general programming there are fixed sizes that are needed.

As has been pointed out to you multiple times, C has the answer - the
<stdint.h> types such as int32_t, uint16_t, etc., handle such cases
perfectly. The work I do is often dependent on these types.

Just because C has types whose sizes are not fixed, does not mean it
does not also have fixed size types!
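
(A short sketch of that usage, with a hypothetical record layout invented
for the example; byte-order handling is omitted for brevity:)

#include <stdint.h>
#include <string.h>

/* A fixed on-disk/on-wire layout read with exact-width types, so the field
   sizes mean the same thing under every conforming compiler. */
struct record {
    uint32_t id;      /* bytes 0..3 */
    uint16_t flags;   /* bytes 4..5 */
    int32_t  offset;  /* bytes 6..9 */
};

static struct record parse_record(const unsigned char *buf)
{
    struct record r;
    memcpy(&r.id,     buf + 0, sizeof r.id);
    memcpy(&r.flags,  buf + 4, sizeof r.flags);
    memcpy(&r.offset, buf + 6, sizeof r.offset);
    return r;
}
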
The core of the SHA-1 engine was taken from public domain code. I have
left it unchanged in terms of syntax. I have altered the definition of
the various types to the forms defined at the head of sha1.cpp so that

"cpp" is the file extension for C++, not for C. Your compiler will
therefore treat it as C++. For many purposes, C is a subset of C++ -
but there are plenty of exceptions. How can you talk so arrogantly
about C when you don't even know what language you are using?
they can be redefined based on the requirements of a particular flavor
of C. The algorithm comes with its own self-test to determine if it
is working properly. Compiling my sha1.cpp program with _TEST_ME
defined allows it to be stand-alone. Otherwise it can be used as an
#include file.

To address your issues, I have bypassed this limitation in C with my
own mechanism. There is a definition at the header which allows typedefs
for determining what native variable type for the compiler is used for
the fixed size entities I use. And, being as C is lackadaisical in
this area, you will have to manually redefine those typedefs for your
particular version of C (to maintain the bit size indicated in their type
names, such as 8 for u8, 32 for u32, and so on).

From the top of sha1.cpp:

typedef unsigned long long u64;
typedef unsigned long u32;
typedef unsigned short u16;
typedef unsigned char u8;

typedef long long s64;
typedef long s32;
typedef short s16;
typedef char s8;

typedef float f32;
typedef double f64;


It has been a /very/ long time since this was necessary - even before
C99 was in common use, a great many compilers supplied a <stdint.h>
header with fixed size types. But before <stdint.h>, it was very common
to include a platform-specific header so that you had things like fixed
integer types, boolean types, etc., easily available. Since you are
using Windows, you can include <WinDef.h> in your code to get these
platform-specific definitions. Or you can use <stdint.h> - even
though MSVC++ is not intended for C development (it is limited to C90).
These work for Visual C++. If you use another compiler, re-define them
to work for your platform. Were you using RDC, they would be native
types and there would not be an issue. Ever.


Yeah, I don't care about that. I want my ints to always be 32-bit,
something I do care about (because I'm accessing data transmitted
across a network, or read from disk, and it comes with 32-bit and
64-bit quantities I need to explicitly access).

Apparently we need to add networking and file handling to the things you
know nothing about.
Computers today are blazingly fast on nearly everything they do. There
are components which make them slower than they actually are (reading
data from disk, a network, waiting for user input, etc.), but they are
amazing processor engines.

FWIW, I would rather have slower code operating in parallel than faster
code operating in serial. Intelligent design, not reliance upon hardware.
As was said about a decade ago when the MHz wars ended: "The free lunch
is over. Now it's time to do real programming."

The "MHz wars" never ended. There will always be reasons to prefer
faster serial code over slower parallel code. Some tasks work well in
parallel, but many do not.
Because, at present, it is the best tool for what I'm trying to accomplish.

Please, tell us which other tools you considered for developing RDC?
Even if you believe edit-and-continue to be god's gift to programmers,
and are thus locked to MSVC++, the obvious choice of C++ is staring you
in the face and would be a far better choice than C (at least when using
MSVC++).
 
