Someone Told Me: Is C an Assembly Language

James Kuyper

(snip, I wrote)



Yes, but when those appear I write a C callable routine to
do just that function and nothing else.

Yes, that is precisely what I was talking about.
 
Öö Tiib

Finally, there are still some things that are difficult to specify in C,
such as handling CPU condition flags. Try writing *portable* C code
that tells you whether multiplying two given int values will overflow.

OK ... interesting challenge. I will make a try. Do not laugh. :)

/// will_imul_overflow.h
#include <limits.h>

#if LONG_MAX > INT_MAX
# define DOUBLE_INT long
#elif LLONG_MAX > INT_MAX
# define DOUBLE_INT long long
#else
# error "sorry, seems that will_imul_overflow can't be ported here."
#endif

inline int will_imul_overflow( int a, int b )
{
    DOUBLE_INT product = (DOUBLE_INT)a * b;
    return product < INT_MIN || product > INT_MAX;
}

#undef DOUBLE_INT

The counter-challenge to write it in assembler for all platforms
where above gives correct result is perhaps harder?
 
glen herrmannsfeldt

(snip, I wrote)
That's like saying "some things can't be done in assembly language, like
writing a business letter in Word format". It doesn't make sense.
"Information passed onto the linker" is just some arbitrary container
constraint with no relevance to the actual power of the language itself.

That is what I thought, but it seems that they added new fields to the
object program format without changing the assembler to generate them.

Now, the IBM assemblers have a PUNCH instruction which writes one card
directly to the object file, so it might be possible that way.
You can invent a machine that magically "executes" C-code as its native
language, sure. It would be a completely pointless academic exercise,
but, yes, this could be done. Such a machine would essentially "compile"
code at runtime (i.e. parse and transform into smaller pieces,
"assembly" instructions).

That is true, but one could also write the C compiler and then never get
around to writing an assembler. Now, in the unix world it is usual for
C compilers to write textual assembler code and then assemble that,
possibly with something less powerful than the usual macro assembler.
Many other systems write the object programs directly.

The IBM compilers I know have an option to write a human readable,
but not assembler readable, form of the generated assembly code.
However, all machines that we see today execute some form of machine
language which has a 1:1 mapping onto assembly language. It's only a
*representation* that is easy for humans to read, i.e. it says "nop"
instead of "0x90" and it says "int3" instead of "0xcc". That's why it's
absolutely trivial to write an assembler, but very much harder to write
a compiler.

Well, the usual assemblers do things like allow for symbolic addressing
including forward references, not to mention macros and the computation
of offsets for addresses that need offsets. Also, conversion of
constants from human readable form, especially floating point.

The manual for the S/370 assembler (describing the assembler
instructions but not machine instructions) is much larger than the
"Principles of Operations" describing the machine instructions.
The macro language for the IBM assemblers might be more complicated
than C89. (OS/360 Assembler F is a four pass assembler, partly because
it has to be able to run in small memory machines.)

There is, however, no requirement that one write the assembler before
the C compiler, no matter how trivial it may seem. One could design a
new processor, write the back end for a C compiler to generate code for
it, and never get around to writing an assembler.

-- glen
 
glen herrmannsfeldt

(snip, I wrote)
If it's just info passed to the linker, I bet there's a way
to do it in assembler, too.

It was some years ago that I knew that, and it may have been added
since. Even back to OS/360, the assembler and object program are
not so simple. Much has been added since.

-- glen
 
James Kuyper

OK ... interesting challenge. I will make a try. Do not laugh. :)

/// will_imul_overflow.h
#include <limits.h>

#if LONG_MAX > INT_MAX

That doesn't test the right thing. If INT_MAX were 16777215, and
LONG_MAX were 2147483647, that test would say that "long" was good
enough, even though it's not actually large enough to avoid overflow.
What you really want to check is whether LONG_MAX is greater than or
equal to the largest possible product of two ints, which is the
mathematical value of INT_MIN*INT_MIN. However, since INTMAX_MIN ==
INT_MIN is permitted, it's not possible to portably perform that test
directly without possibly running into overflow. I normally use
something like the following for cases like this:

#if LONG_MAX/INT_MIN < INT_MIN
# define DOUBLE_INT long
#elif LLONG_MAX > INT_MAX

The same issue applies here, of course.
# define DOUBLE_INT long long
#else
# error "sorry, seems that will_imul_overflow can't be ported here."
#endif

inline int will_imul_overflow( int a, int b )
{
    DOUBLE_INT product = (DOUBLE_INT)a * b;
    return product < INT_MIN || product > INT_MAX;
}

#undef DOUBLE_INT

There are approaches that can be used even when there's no type capable
of storing the product without overflow. Carefully considering how I
dealt with the possible overflow in the #if directive might give you a hint.
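One way to read that hint is to do in the function what the #if directive does: divide a limit by one operand and compare against the other, so that no multiplication is ever performed. The following is only a sketch of that idea, not necessarily what was intended above. It assumes C99 or later, where integer division truncates toward zero; in C90 the rounding for negative operands was implementation-defined.

```c
#include <limits.h>

/* Returns nonzero if the mathematical product a * b does not fit in an int.
   Only divisions are performed, and none of them can overflow, so the check
   itself has no undefined behavior. Assumes C99 truncating division. */
int will_imul_overflow(int a, int b)
{
    if (a == 0 || b == 0)
        return 0;                      /* a zero product always fits */

    if (a > 0) {
        if (b > 0)
            return a > INT_MAX / b;    /* positive product too large */
        else
            return b < INT_MIN / a;    /* negative product too small */
    } else {
        if (b > 0)
            return a < INT_MIN / b;    /* negative product too small */
        else
            return b < INT_MAX / a;    /* both negative, product positive */
    }
}
```

The four branches exist so that each division has a divisor whose sign is known, which keeps every quotient in range (INT_MIN is never divided by -1).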
 
Keith Thompson

Öö Tiib said:
OK ... interesting challenge. I will make a try. Do not laugh. :)

/// will_imul_overflow.h
#include <limits.h>

#if LONG_MAX > INT_MAX
# define DOUBLE_INT long
#elif LLONG_MAX > INT_MAX
# define DOUBLE_INT long long
#else
# error "sorry, seems that will_imul_overflow can't be ported here."
#endif

inline int will_imul_overflow( int a, int b )
{
    DOUBLE_INT product = (DOUBLE_INT)a * b;
    return product < INT_MIN || product > INT_MAX;
}

#undef DOUBLE_INT

Not a bad try. I'd use typedefs rather than macros, and you're
assuming that because a type is bigger than int, it can hold the
result of any int-by-int multiplication.

And of course it's not entirely portable.

Try doing the same thing for intmax_t.
The counter-challenge to write it in assembler for all platforms
where above gives correct result is perhaps harder?

Writing anything in "assembler for all platforms" is clearly
impractical, or even impossible if you let me invent new platforms
after you write the code.

My point is that for a given assembly language, you can likely solve
the problem by performing the multiplication and then checking
some bits, or by some similar mechanism. C doesn't make such
mechanisms visible.
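For what it is worth, some compilers do expose this mechanism as an extension. The sketch below assumes GCC 5 or later, or a reasonably recent Clang, both of which provide __builtin_mul_overflow. It is not standard C, so it does not answer the portability challenge; the wrapper name is made up for illustration.

```c
/* Assumes a GCC- or Clang-compatible compiler: __builtin_mul_overflow is a
   compiler extension, not part of standard C. It performs the multiply and
   reports whether the exact result fit in the destination type. */
int will_imul_overflow_builtin(int a, int b)
{
    int product;   /* receives the (possibly wrapped) result; unused here */
    return __builtin_mul_overflow(a, b, &product);
}
```

On targets with an overflow flag, the compiler is free to implement this as a multiply followed by a flag test, which is the assembly-level approach described above.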

You could do some complicated mathematical checks in pure C that
would tell you whether a multiplication *will* overflow before you
actually perform it, but such checks are going to be substantially
more expensive than what's available in assembly code.

(I frankly find it a bit disturbing that so many C programmers aren't
bothered by the inability to detect overflows.)
 
glen herrmannsfeldt

(snip)
And many compilers generate assembly language which is then assembled to
machine code. (Others generate machine code directly.)
Finally, there are still some things that are difficult to specify in C,
such as handling CPU condition flags. Try writing *portable* C code
that tells you whether multiplying two given int values will overflow.

TeX was originally written in Pascal, through a special "literate
programming" system Knuth developed. Among other things, WEB
(the name of the system) has macros and a line by line update
facility.

But anyway, one thing that TeX has to do fairly often is multiply
two 32 bit values with 16 bits after binary point, generating a
product in the same form.

Easy to do in assembly language on hardware that generates a double
width product, but not at all easy in Pascal. The original source
contains a Pascal routine to do it, but he suggests one should replace
it with an assembly version if possible. I believe that there is also
one to do divide, or maybe (A*B)/C where all three have 16 bits
after the binary point.
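To make the operation concrete, here is a minimal sketch of a 16.16 fixed-point multiply in C. It is not TeX's routine. It assumes int64_t exists (it is optional in the standard) and uses it in place of the hardware double-width product; the name fix16_mul is made up for illustration.

```c
#include <stdint.h>

/* Multiply two 16.16 fixed-point values (32 bits, 16 after the binary
   point) and return the product in the same format, truncated toward zero.
   The 64-bit intermediate plays the role of the double-width product.
   Results outside the int32_t range are not detected in this sketch. */
int32_t fix16_mul(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * b;     /* 32.32 intermediate, cannot overflow */
    return (int32_t)(wide / 65536);    /* drop 16 fraction bits */
}
```

For example, 1.5 is 0x18000 and 2.0 is 0x20000 in this format, and their product comes out as 0x30000, which is 3.0. Division is used rather than >> because right-shifting a negative value is implementation-defined.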

I believe the currently running TeX versions are from Web2C that
translates modified Pascal source (using the WEB CHANGEFILE system)
into C.

In any case, TeX requires a system with at least 32 bit integers, which
portable C doesn't. Given CHAR_BIT and sizeof(int), and the assumption
(or not) of twos complement, it shouldn't be all that hard to do.
Maybe tedious, though, and it might run slow. (You didn't specify
either of those.)

In the twos complement case, make the special tests for the most
negative integer value, take the absolute value of both operands,
divide them up into 8 bit pieces (using shift and & operators),
multiply them together, and combine them as appropriate to
generate a full, double length, product. Then test the product
for being bigger than int. Restore the sign of the product.

(The double length product might be in two or four int variables,
but it isn't hard to test the bits.)
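A sketch of the core of that approach, with two simplifications that are my assumptions rather than part of the description above: it works on unsigned 32-bit values (so the sign handling and the most-negative-value special case are left out), and it splits each operand into 16-bit rather than 8-bit pieces. The function name is made up for illustration.

```c
#include <stdint.h>

/* Full 64-bit product of two unsigned 32-bit values, returned as two 32-bit
   halves, using only 32-bit arithmetic on 16-bit pieces. */
void umul32_wide(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint32_t p1 = a_lo * b_hi;   /* weight 2^16 */
    uint32_t p2 = a_hi * b_lo;   /* weight 2^16 */
    uint32_t p3 = a_hi * b_hi;   /* weight 2^32 */

    /* Sum everything that lands in bits 16..31; at most 3 * 0xFFFF,
       so this addition cannot wrap. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    *lo = (p0 & 0xFFFFu) | (mid << 16);
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
}
```

With the product in two halves, the unsigned overflow test is simply whether *hi is nonzero. A signed version would apply this to the absolute values and then compare against the signed limits.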

-- glen
 
glen herrmannsfeldt

(snip)
My point is that for a given assembly language, you can likely solve
the problem by performing the multiplication and then checking
some bits, or by some similar mechanism. C doesn't make such
mechanisms visible.

It would be nice to have a C function to generate the full,
double length, product in two C int variables.
You could do some complicated mathematical checks in pure C that
would tell you whether a multiplication *will* overflow before you
actually perform it, but such checks are going to be substantially
more expensive than what's available in assembly code.

Oh, so expense counts now.
(I frankly find it a bit disturbing that so many C programmers aren't
bothered by the inability to detect overflows.)

Probably the same ones that don't test the return value of
fclose(), or probably any other I/O function call.

With buffering, and in the case of writing a small file,
it might be that the only error is detected by fclose()
when the buffer is written out.
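A minimal sketch of what checking it looks like. The helper name and its return convention are made up for illustration; fopen, fputs and fclose are the standard library calls.

```c
#include <stdio.h>

/* Writes text to the file at path.
   Returns 0 on success, -1 if opening, writing, or closing failed. */
int write_text_file(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int failed = (fputs(text, fp) == EOF);

    /* fclose flushes the buffer, so for a small file this may be the
       first point at which a write error is actually reported. */
    if (fclose(fp) == EOF)
        failed = 1;

    return failed ? -1 : 0;
}
```

Note that fclose is called even when fputs has already failed, so the stream is always released.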

-- glen
 
Eric Sosman

[...]
(I frankly find it a bit disturbing that so many C programmers aren't
bothered by the inability to detect overflows.)

In my formative years I wrote code (not in C) for machines
that ABENDed a program on integer over- and underflow. To this
day I often think C programs would be better if C didn't hide
behind "integer overflow is undefined" and instead said "integer
overflow is a capital offense, and you will rue it! SIGBANG!!!"

When I'm elected Almighty Arbiter of Automata, that's how
it'll be. Just you wait.
 
James Kuyper

On 04/03/2013 07:59 PM, glen herrmannsfeldt wrote:
....
In any case, TeX requires a system with at least 32 bit integers, which
portable C doesn't.

? The minimum permitted value for LONG_MAX is 2147483647; that's been
true since at least C99; I think it was true in C90 as well, but I don't
have a copy of that version of the standard, so I can't be sure.
 
glen herrmannsfeldt

Eric Sosman said:
[...]
(I frankly find it a bit disturbing that so many C programmers aren't
bothered by the inability to detect overflows.)
In my formative years I wrote code (not in C) for machines
that ABENDed a program on integer over- and underflow. To this
day I often think C programs would be better if C didn't hide
behind "integer overflow is undefined" and instead said "integer
overflow is a capital offense, and you will rue it! SIGBANG!!!"

The system that I know of that tended to ABEND has a mask bit
controlling the effect of integer overflow. Fortran, at least
didn't set the bit, so that overflow was ignored.

I believe (having never programmed in it) that COBOL did set it.

Of course, you could always write an assembly program and set or
clear the bit as desired.
When I'm elected Almighty Arbiter of Automata, that's how
it'll be. Just you wait.

Many systems don't make it so easy to test, though.

-- glen
 
glen herrmannsfeldt

(snip, someone wrote)
While you could obviously do the check in some fashion in all assembly
languages, not all ISAs include double width multiplication (or
overflow detection on multiplies).

Good ones do. A 16 bit ISA might have a 16x16 but not 32x32.

Well, if the product is double length, then it isn't overflow.
For S/360 and successors you do a 32 bit arithmetic left shift,
and the overflow will trap if the mask bit is one.
So it may not actually be easier in assembler than in C on
some platforms.

-- glen
 
Les Cargill

Öö Tiib said:
OK ... interesting challenge. I will make a try. Do not laugh. :)

/// will_imul_overflow.h
#include <limits.h>

#if LONG_MAX > INT_MAX
# define DOUBLE_INT long
#elif LLONG_MAX > INT_MAX
# define DOUBLE_INT long long
#else
# error "sorry, seems that will_imul_overflow can't be ported here."
#endif

inline int will_imul_overflow( int a, int b )
{
    DOUBLE_INT product = (DOUBLE_INT)a * b;
    return product < INT_MIN || product > INT_MAX;
}

#undef DOUBLE_INT


Perhaps

inline int will_imul_overflow( int a, int b )
{
    const int prod = a * b;
    const int nb = prod / a ;
    return ( b != nb );
}

Supporting unsigned may require a little more effort.
 
Keith Thompson

Les Cargill said:
[snip]

Perhaps

inline int will_imul_overflow( int a, int b )
{
    const int prod = a * b;
    const int nb = prod / a ;
    return ( b != nb );
}

If the multiplication overflows, the behavior is undefined.
Supporting unsigned may require a little more effort.
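The unsigned case is arguably the easier one, because unsigned arithmetic is defined to wrap modulo UINT_MAX + 1, so multiplying first and dividing afterwards is legitimate there. A minimal sketch (the function name is made up for illustration):

```c
/* Returns nonzero if a * b does not fit in an unsigned int.
   Unsigned multiplication wraps rather than invoking undefined behavior,
   so a wrapped product can be detected by dividing it back out. */
int will_umul_overflow(unsigned a, unsigned b)
{
    if (a == 0)
        return 0;             /* product is 0; also avoids dividing by zero */
    return (a * b) / a != b;
}
```

The a == 0 guard matters for the signed version quoted above as well, which divides by a without checking it.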

[...]
 
Les Cargill

Eric said:
[...]
(I frankly find it a bit disturbing that so many C programmers aren't
bothered by the inability to detect overflows.)

In my formative years I wrote code (not in C) for machines
that ABENDed a program on integer over- and underflow. To this
day I often think C programs would be better if C didn't hide
behind "integer overflow is undefined" and instead said "integer
overflow is a capital offense, and you will rue it! SIGBANG!!!"

There is the -ftrapv option for gcc...
 
Les Cargill

Keith said:
Les Cargill said:
On Wednesday, 3 April 2013 20:43:52 UTC+3, Keith Thompson wrote:
Finally, there are still some things that are difficult to specify in C,
such as handling CPU condition flags. Try writing *portable* C code
that tells you whether multiplying two given int values will overflow.
[snip]

Perhaps

inline int will_imul_overflow( int a, int b )
{
    const int prod = a * b;
    const int nb = prod / a ;
    return ( b != nb );
}

If the multiplication overflows, the behavior is undefined.

But not hopelessly so...
Supporting unsigned may require a little more effort.

[...]
 
glen herrmannsfeldt

(snip, I wrote)
While I don't happen to know of such a limitation, it could well
exist, I've not encountered it, and haven't gone looking.
But the linker and assembler (and compilers) are independent programs,
just because the current assembler doesn't provide a way to use some
feature of the current linker, while an HLL does, says little about the
capabilities of either assemblers or HLLs in general.

That could be. It has been some years now.
As a non-mainframe example, MS's linker supports LTCG, and there are
certain object module entries that pertain to that (namely object
modules entries that contain the internal representation of the parsed
program, and a link to the executable compiler back end that can
process those), that cannot be generated by the assembler. In that
case a linker feature (LTCG) is not really relevant to an assembler.
A mainframe example is when C started getting popular on the
mainframe, it needed a linker enhancement to support external
identifiers longer than eight characters.

The PL/I compilers used to take, I believe, the first four and last
three characters as the external name. (They generate more than one
CSECT for each source procedure, so need one more character.)

I thought that C allowed for something similar, but maybe not.
Originally there was a
pre-link step that would mangle long names into short names, and then
pass the result to the traditional linker. Eventually the linker was
replaced with the binder, which supported the extended format
directly. The assembler also eventually grew long name support, but
not at the same time as the original pre-link+linker process.

-- glen
 
James Kuyper

On Wed, 03 Apr 2013 10:56:35 -0400, James Kuyper



I don't disagree, but there's nothing to prevent assemblers from
providing higher level facilities as well (macros, block structure
statements, data structures, etc.). Use of those may not be
appropriate in every situation, but may help in many, particularly in
the many non-critical bits of code.

I agree - that's all covered by the word "convenient". The key question
is whether the primary purpose of the language is to specify the machine
language instructions to be generated, or only the behavior to be
achieved. Convenience features like the ones you describe don't affect
that distinction.
Of course that pretty much requires that we consider an HLL that
supports inline assembler (as do most C compilers), to be an
assembler. ...

I agree, the inline assembler option provided by many C compilers does
indeed support some form of assembly language. However, that does not
make C itself an assembly language.
... And while such a compiler might not be a very good
assembler, I can't really see how a compiler that supports:

int main()
{
    __asm{
        ...10,000 lines of assembler...
    }
    return 0;
}

isn't an assembler, no matter what else it might be.

So assemblers and HLLs are perfectly capable of intruding into each
others' turf, and trying to draw a bright line is impossible.

I draw the "bright" line at "__asm{" and "}", the constructs which
separate the C code from the assembly code.
I'm sure it's possible to blur the distinction; but I find it hard to
believe the claim that the distinction has already been so universally
blurred that things I would willingly call "assembly languages" have
disappeared. I've seen too many things recently posted to this very
newsgroup that look very much like snippets of assembly code, to believe
that claim.
 
Keith Thompson

Les Cargill said:
Keith said:
Les Cargill said:
On Wednesday, 3 April 2013 20:43:52 UTC+3, Keith Thompson wrote:
Finally, there are still some things that are difficult to specify in C,
such as handling CPU condition flags. Try writing *portable* C code
that tells you whether multiplying two given int values will overflow. [snip]

Perhaps

inline int will_imul_overflow( int a, int b )
{
    const int prod = a * b;
    const int nb = prod / a ;
    return ( b != nb );
}

If the multiplication overflows, the behavior is undefined.

But not hopelessly so...

What does that mean?
 
