assembly in future C standard

fermineutron · Oct 28, 2006

Some compilers support __asm{ } statement which allows integration of C
and raw assembly code. A while back I asked a question about such
syntax and was told that __asm is not a part of a C standard. My
question now is:

Is there a chance that such a statement will become part of the C standard
in the future? In some cases using asm language is the best way to
accomplish some small task, hence integration of C and asm would greatly
enhance C, or at least it would in my opinion.

Is there a good reason why __asm is not a part of current C standard?

I have bumped into compilers that support and others that ignore __asm
statement so obviously it is still not a part of C standard.

Richard Heathfield · Oct 28, 2006

fermineutron said:

Some compilers support __asm{ } statement which allows integration of C
and raw assembly code. A while back I asked a question about such
syntax and was told that __asm is not a part of a C standard. My
question now is:

Is there a chance that such a statement will become part of the C standard
in the future?

No.

In some cases using asm language is the best way to
accomplish some small task, hence integration of C and asm would greatly
enhance C, or at least it would in my opinion.

No.

Is there a good reason why __asm is not a part of current C standard?

Yes.

Andrew Poelstra · Oct 28, 2006

Is there a chance that such a statement will become part of the C standard
in the future? In some cases using asm language is the best way to
accomplish some small task, hence integration of C and asm would greatly
enhance C, or at least it would in my opinion.

I'd say no, but the fact that system() is a part of the C standard
makes that answer questionable.

Is there a good reason why __asm is not a part of current C standard?

It's 100% non-portable among different architectures, which is contrary
to the spirit of C.

Michal Nazarewicz · Oct 28, 2006

fermineutron said:
Is there a good reason why __asm is not a part of current C standard?

It would make the C language non-portable.

sjdevnull · Oct 28, 2006

fermineutron said:
Is there a good reason why __asm is not a part of current C standard?

Yes. There's no portable assembly; using it makes your code
non-portable between different implementations.

Even on the same OS and hardware, assembly language may differ. E.g., on
Windows, Microsoft VC++ uses MASM:

mov ebx, eax

(AKA "Intel syntax")

while gcc uses the AT&T-style:

movl %eax, %ebx

(AKA "AT&T syntax")

The instruction name, register ordering, and register syntax are all
different. And that's on the same OS & architecture. If you change
chips, the target assembly language won't even bear the superficial
similarities seen here. To standardize __asm or __asm__ even only on
one platform would mean standardizing the entire assembly language for
that platform too, and even then it wouldn't result in code being
portable to other hardware.
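
To make the dialect difference concrete, here is a minimal sketch of the same register copy written for two compilers. It assumes GCC or Clang on x86 for the first branch and 32-bit MSVC for the second; copy_value is a made-up helper name, and the plain C branch exists only so the example still builds elsewhere.

```c
/* copy_value is a hypothetical helper, not part of any library. */
static unsigned copy_value(unsigned src)
{
    unsigned dst;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* GCC/Clang extended asm, AT&T order: source first, then destination. */
    __asm__ ("movl %1, %0" : "=r"(dst) : "r"(src));
#elif defined(_MSC_VER) && defined(_M_IX86)
    /* 32-bit MSVC inline assembler, Intel order: destination first. */
    __asm {
        mov eax, src
        mov dst, eax
    }
#else
    /* Any other compiler or CPU: no assembly at all. */
    dst = src;
#endif
    return dst;
}
```

Neither assembly branch is accepted by the other compiler, even though both target the same chip, which is the point being made above.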

Thomas Lumley · Oct 28, 2006

fermineutron said:
Some compilers support __asm{ } statement which allows integration of C
and raw assembly code. A while back I asked a question about such
syntax and was told that __asm is not a part of a C standard. My
question now is:

Is there a good reason why __asm is not a part of current C standard?

As everyone else has pointed out, you can't portably specify the output
of a program that contains any call to __asm{} and so it isn't in the
proper domain of the C standard.

The C standard does specify that __asm is in the implementation
namespace. This means that an implementation can choose to specify what
__asm{} does, without causing any name clashes with portable code, and
presumably without any conflicts with subsequent versions of the
Standard. That's about all you *can* guarantee for raw assembly code,
and it is quite a useful guarantee.

-thomas

jacob navia · Oct 28, 2006

Yes. There's no portable assembly; using it makes your code
non-portable between different implementations.

My qfloat package has been ported to Linux and Windows, and it will run
(unmodified) under Solaris, Mac (x86) and AIX (x86).

Assembly is quite portable between OSes, but not between
different processors.

fermineutron · Oct 28, 2006

jacob said:
My qfloat package has been ported to Linux and Windows, and it will run
(unmodified) under Solaris, Mac (x86) and AIX (x86).

Assembly is quite portable between OSes, but not between
different processors.

Makes sense.

I guess the portability of C is a bigger ace than the gain from
clock-cycle-level CPU control.

Richard Heathfield · Oct 28, 2006

fermineutron said:

Makes sense.

Would that it were true - but it isn't. There is no such thing as "assembly
language". There are, rather, a great many assembly languages. One
particular assembly language may well be portable between two or even more
OSs, and yet not be portable between two different assemblers on the same
OS. One assembly language may be portable between two different assemblers
on the same OS, and yet not be portable to some other OS.

I guess the portability of C is a bigger ace than the gain from
clock-cycle-level CPU control.

It depends what you need. But the best solution to your immediate problem -
that of performance - lies in choosing better, faster algorithms and
implementing them well. You have a great many gains to realise from doing
this; if you do it well, you may well decide that you have no need for any
assembly language after all. Implementing your current algorithms in some
assembly language or other is unlikely to result in significant performance
improvements.

Eric Sosman · Oct 28, 2006

fermineutron said:
Makes sense.

I guess the portability of C is a bigger ace than the gain from
clock-cycle-level CPU control.

Inserting assembly language into the middle of C code (if
the compiler permits it) is rarely the road to a noticeable
performance improvement. It may even disimprove performance by
creating an "opaque" section whose purpose the compiler cannot
fathom, thus inhibiting optimizations that would span the area
of impenetrable code. Once in a very great while, embedded
assembly is the cat's pajamas -- but most of the time the cat
sleeps nude.

A more usual motivation is to make use of special machine
instructions the compiler would not generate on its own. If
you need to fetch a value with "cache-bypass load" or execute
the "refresh TLB tag bits" instruction, injecting assembly into
the middle of the C source may be attractive. But even in such
cases it is usually cleaner to package the screwball instructions
in external functions that are themselves written in assembly,
and write an ordinary function call in the C code. This has the
benefit of pulling the machine-dependent stuff out of the main
stream of your program, making it easier to substitute "morally
equivalent" external functions when porting the code to new
platforms. You are almost always better off writing and calling
an AtomicIncrementInt() function than trying to embed assembly
for a compare-and-swap loop.
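
As a rough sketch of that packaging idea: the caller sees only an ordinary function, and the machine-dependent part lives behind it. The body below uses C11's <stdatomic.h>, which postdates this thread, as a stand-in for a separately assembled routine; the function name is taken from the paragraph above, and its signature is an assumption.

```c
#include <stdatomic.h>

/* Hypothetical interface: atomically adds 1 to *p and returns the new value.
 * A port to a platform without C11 atomics would replace only this body,
 * for example with a routine written in that platform's assembly language. */
int AtomicIncrementInt(atomic_int *p)
{
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(p, 1) + 1;
}
```

Calling code just writes AtomicIncrementInt(&counter) and never sees which instructions implement it.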

Chris Torek · Oct 28, 2006

A more usual motivation is to make use of special machine
instructions the compiler would not generate on its own. If
you need to fetch a value with "cache-bypass load" or execute
the "refresh TLB tag bits" instruction, injecting assembly into
the middle of the C source may be attractive. But even in such
cases it is usually cleaner to package the screwball instructions
in external functions that are themselves written in assembly,
and write an ordinary function call in the C code. This has the
benefit of pulling the machine-dependent stuff out of the main
stream of your program, making it easier to substitute "morally
equivalent" external functions when porting the code to new
platforms. You are almost always better off writing and calling
an AtomicIncrementInt() function than trying to embed assembly
for a compare-and-swap loop.

I agree with all of this; however, in some (sometimes significant)
cases (e.g., the actual implementation for a mutex), you may want
to have an inline expansion of the underlying atomic operation,
typically via a macro. For instance, if you have a mutex construct
that -- at least in the uncontested case -- is just a (possibly
locked) compare-and-swap, you may want the x86-specific version
of:

MUTEX_GET(mutex_ptr);

to turn into the assembly equivalent of:

if (compare_and_exchange(mutex_ptr->key, __self()->key) != SUCCEEDED)
    mutex_get_contested(mutex_ptr); /* blocks until success */

The tricky part lies not only in arranging for the assembly equivalent
to be inserted inline, but in *also* informing the compiler that
it must not move certain memory operations across the "special"
instruction(s). That is, if the mutex protects a data structure,
the compiler *must not* turn:

MUTEX_GET(&data->mutex);
data->field = newvalue;
MUTEX_RELEASE(&data->mutex);

into, e.g.:

data->field = newvalue;
MUTEX_GET(&data->mutex);
MUTEX_RELEASE(&data->mutex);

The compiler may think the second version is superior (because it
uses less CPU time overall, e.g., due to reduced register pressure
or because it schedules better), but in fact, it is not.
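
For comparison, here is a minimal sketch of how the ordering requirement can be stated in C11 (which postdates this thread), using a spin lock rather than a blocking mutex. The type and function names are hypothetical. The acquire ordering on the lock keeps later memory accesses from being moved before it, and the release ordering on the unlock keeps earlier accesses from being moved after it.

```c
#include <stdatomic.h>

/* Hypothetical spin lock built on the one lock-free type C11 guarantees. */
typedef struct { atomic_flag locked; } spin_mutex;
#define SPIN_MUTEX_INIT { ATOMIC_FLAG_INIT }

static void spin_mutex_get(spin_mutex *m)
{
    /* Acquire: accesses after this call may not be hoisted above it. */
    while (atomic_flag_test_and_set_explicit(&m->locked, memory_order_acquire))
        ;   /* spin until the previous holder clears the flag */
}

static void spin_mutex_release(spin_mutex *m)
{
    /* Release: accesses before this call may not sink below it. */
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}

struct shared { spin_mutex mutex; int field; };

static void set_field(struct shared *data, int newvalue)
{
    spin_mutex_get(&data->mutex);
    data->field = newvalue;         /* stays inside the critical section */
    spin_mutex_release(&data->mutex);
}
```

With raw inline assembly the same constraint has to be communicated through whatever mechanism the particular compiler provides, which is the tricky part described above.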

fermineutron · Oct 29, 2006

Richard said:
It depends what you need. But the best solution to your immediate problem -
that of performance - lies in choosing better, faster algorithms and
implementing them well. You have a great many gains to realise from doing
this; if you do it well, you may well decide that you have no need for any
assembly language after all. Implementing your current algorithms in some
assembly language or other is unlikely to result in significant performance
improvements.

Well, it seems to me that it is not so much the speed gain as a
functionality gain that can be realized from assembly language. For
example, a while back I wrote a simple C profiler, which parses a C file
and inserts RDTSC statements before and after each C statement, hence
determining the number of clock cycles it took to execute that line of
code. Now the only compiler that my profiler will work with is lcc,
because RDTSC is part of the intrinsics library of lcc, but it is not
part of BC++ 5.02, for example. Had there been full support for
assembly code within C, I could have used inline assembly to do this and
not rely on the intrinsics library of lcc.

To the best of my knowledge, most modern C compilers produce assembly
code which is as good as one could optimize it by hand, so clearly there
is no speed gain from asm for general-purpose code.

Speaking of speed gains:
Richard, you may be pleased to hear that after reworking my code for
calculation of factorials of large numbers, to not use the stack space
but to use malloc instead, I realized a performance gain of about 60
times. So correctly written C code does not lose to correctly
written asm code in speed, but it is somewhat limited in CPU control.

Theoretically it seems possible to develop a subset of the assembly
languages used by modern CPUs and include in the C standard a C
library which would allow the user to use that assembly subset. Since it is
a limited subset and its syntax is governed by C, it should not present
portability challenges, with the possible exception of older systems. Any
thoughts about this?

Gordon Burditt · Oct 29, 2006

Some compilers support __asm{ } statement which allows integration of C
and raw assembly code. A while back I asked a question about such
syntax and was told that __asm is not a part of a C standard. My
question now is:

Is there a chance that such a statement will become part of the C standard
in the future?

NO. And I believe the same applies to __COBOL{}.

In some cases using asm language is the best way to
accomplish some small task, hence integration of C and asm would greatly
enhance C, or at least it would in my opinion.

In my opinion, you should write a function in pure asm, assemble
it separately, and link it with the C program. That requires you
to know things like function linkage conventions, symbol naming
conventions, etc.

If you write inline assembly, how does the assembly talk to the
non-assembly part, as far as passing data between them? Knowing the
function linkage conventions and symbol naming conventions
isn't enough. You have to have "unwarranted chumminess with the
compiler" to know what register the compiler puts stuff in.

Is there a good reason why __asm is not a part of current C standard?

There's no way to describe what the stuff in the {} DOES in any
reasonable way. There are, for example, often several
mutually-incompatible assembly languages for the *SAME* CPU.
And, unlike system(), you can't generate the assembly-language
code at runtime for the purpose of passing data to it.

Peter Nilsson · Oct 29, 2006

Gordon said:
There's no way to describe what the stuff in the {} DOES in any
reasonable way.

There doesn't actually need to be. C++ has...

An asm declaration has the form

asm-definition:
    asm ( string-literal ) ;

The meaning of an asm declaration is implementation-defined.

The question is better asked in comp.std.c. But as I see it, the
purpose of standardisation is to bring common elements into line. But
trying to bring into line a can of worms can be difficult. I know of
many C++ implementations that don't support the standard form of asm,
but have retained their own syntax.

Looking back, I imagine many prestandard compilers already had their
own inline assembler syntax and that those implementations, as today,
varied wildly.

There are, for example, often several mutually-incompatible assembly
languages for the *SAME* CPU.

True, but system() is often subject to the same problems. [Generate a
command that calls a function and pipes the output to a given name
and it will work under one implementation and not others, even on
the same platform. Quoting arguments that contain whitespace can
be handled differently by different implementations on the same
platform.]

Perhaps the most non-portable programming that is in the standard is
support for locales.

And, unlike system(), you can't generate the assembly-language
code at runtime for the purpose of passing data to it.

Some old implementations allowed you to put machine code into an
unsigned char array and 'call' that code like a function. However there
are problems on modern machines, e.g. instruction caching, and data
lying in non-executable segments.

The system() function brings up the topic of command line options.
Lots of programs use argc/argv, but their use itself is inherently
(even if not dramatically) non-portable. For example, some
implementations will perform wildcard replacement for you, others
won't.

Contrary to Richard Heathfield's categorical statement, it is not an
absolute given that there will never be an asm keyword in C. But it
is unlikely because it's already clear that the asm keyword in C++ has
not served to truly standardise the syntax of inline assembly.

At the end of the day, the committee could probably spend many man
weeks deciding issues on an __asm keyword, but for what? Most
implementations will keep their existing syntax, and most programmers
who use inline assembly will no doubt continue to prefer the localised
syntax because it's less cumbersome than any standard syntax.

Christopher Benson-Manica · Oct 29, 2006

(Crossposted to comp.std.c, with followups directed there, hopefully
appropriately. The original post discussed the possibility of whether
__asm or something similar to it would be added to the C standard.)

Contrary to Richard Heathfield's categorical statement, it is not an
absolute given that there will never be an asm keyword in C. But it
is unlikely because it's already clear that the asm keyword in C++ has
not served to truly standardise the syntax of inline assembly.

One idea that was not mentioned in the original thread (I imagine for
good reason, because it's a half-baked and probably stupid idea that
occurred to me reading your post) would be to allow for some kind of
conditional assembly, just perhaps something like

#pragma assemble
#pragma X86 /* Inner pragma's implementation-defined */
/* Inline assembly, which the implementation can ignore or not */
#pragma no-assemble
/* Stock C code for implementations that can't or won't accept the
* assemble pragma: */
for( i=1; i < 10; i++ ) {
    foo();
    /* ... */
}
#pragma end-assemble

The end result would be something like "If the implementation attempts
to inline the assembly code contained within a #pragma assemble
directive, the behavior is implementation-defined. Otherwise the
assembly code shall be ignored and the C code contained within any
corresponding #pragma no-assemble directive shall be compiled as
though no directives were present." It would require adding some
duties to the #pragma directive, but it would allow implementors to
take a reasonable shot at using targeted assembly instructions when
appropriate and available, and reverting to ordinary C otherwise.
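
Much the same effect is available today through ordinary conditional compilation, at the cost of relying on each compiler's predefined macros. A minimal sketch, assuming GCC or Clang on x86 for the assembly branch; byte_swap32 is a made-up name, and the #else branch plays the role of the "stock C code" in the proposal:

```c
#include <stdint.h>

/* Hypothetical helper: reverses the byte order of a 32-bit value. */
static uint32_t byte_swap32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* Targeted instruction where the compiler and CPU are known. */
    __asm__ ("bswap %0" : "+r"(x));
    return x;
#else
    /* Stock C for every implementation that cannot take the branch above. */
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}
```

The difference from the pragma idea is that the selection here is driven by implementation-specific macros, so nothing about it is standardised.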

I'm sure there are reasons why this is stupid and/or impossible, or it
would have been done already.

At the end of the day, the committee could probably spend many man
weeks deciding issues on an __asm keyword, but for what? Most
implementations will keep their existing syntax, and most programmers
who use inline assembly will no doubt continue to prefer the localised
syntax because it's less cumbersome than any standard syntax.

Indeed, but it's an interesting thought experiment to consider how the
committee *might* add assembly to C if they chose to do so. (Well,
interesting to me, at least.)

Keith Thompson · Oct 29, 2006

Peter Nilsson said:
Contrary to Richard Heathfield's categorical statement, it is not an
absolute given that there will never be an asm keyword in C. But it
is unlikely because it's already clear that the asm keyword in C++ has
not served to truly standardise the syntax of inline assembly.

At the end of the day, the committee could probably spend many man
weeks deciding issues on an __asm keyword, but for what? Most
implementations will keep their existing syntax, and most programmers
who use inline assembly will no doubt continue to prefer the localised
syntax because it's less cumbersome than any standard syntax.

C99 Annex J (J.5.10) shows "asm" as a common extension:

J.5.10 The asm keyword

The asm keyword may be used to insert assembly language directly
into the translator output (6.8). The most common implementation
is via a statement of the form:

asm ( character-string-literal );

Of course, such an extension would render the implementation
non-conforming, since it would break some strictly conforming
programs.

Richard Heathfield · Oct 29, 2006

fermineutron said:

Well, it seems to me that it is not so much the speed gain as a
functionality gain that can be realized from assembly language.

Um, *what*?

For
example, a while back I wrote a simple C profiler, which parses a C file
and inserts RDTSC statements before and after each C statement,

Note that the term "RDTSC" is meaningless unless you happen to be using an
Intel processor in the x86 family, from the Pentium onwards, or a clone
thereof. Any code you write that relies on an RDTSC instruction is
inherently non-portable.

hence
determining the number of clock cycles it took to execute that line of
code. Now the only compiler that my profiler will work with is lcc,
because RDTSC is part of the intrinsics library of lcc, but it is not
part of BC++ 5.02, for example. Had there been full support for
assembly code within C, I could have used inline assembly to do this and
not rely on the intrinsics library of lcc.

BC++ 5.02 supports inline assembly language. So does Visual C++. Both are C
compilers if you tickle them properly. In both it is possible to access the
RDTSC by using the inline assembly language supported by that
implementation. It is also possible to access the RDTSC through inline
assembly language in gcc, which exists for your platform. But since inline
assembly language itself is not standardised, you may well end up having to
rewrite the program for each new platform. If you don't like this, complain
simultaneously to every assembly language designer in the world.
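
For the gcc case, a minimal sketch of what that might look like, assuming GCC or Clang on x86. The function name is made up, and the clock() fallback is only there so the example builds on other targets; it measures processor time in different units, not cycles.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: returns the x86 time-stamp counter where available. */
static uint64_t read_cycle_counter(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo, hi;
    /* RDTSC leaves the low 32 bits in EAX and the high 32 bits in EDX. */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Not a cycle count: a stand-in for targets without RDTSC. */
    return (uint64_t)clock();
#endif
}
```

A profiler would call this before and after the statement of interest and subtract the two readings. The same source would need a different asm block for each other compiler's inline syntax.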

To the best of my knowledge, most modern C compilers produce assembly
code which is as good as one could optimize it by hand, so clearly there
is no speed gain from asm for general-purpose code.

Speaking of speed gains:
Richard, you may be pleased to hear that after reworking my code for
calculation of factorials of large numbers, to not use the stack space
but to use malloc instead, I realized a performance gain of about 60
times.

I am delighted to hear it, but it would have been wiser to fix the bugs
first. Still, okay, you've got the speed somewhere approaching sensible, so
- better late than never, and now would therefore be a good time to fix
those bugs.

So correctly written C code does not lose to correctly
written asm code in speed, but it is somewhat limited in CPU control.

It is certainly true that correctly written C code can be of comparable
performance to correctly written assembly language code, but with the added
advantage that it can run on any computer. Weighing C down with some way of
tickling the frobnitz might sound attractive to those with a frobnitz-based
machine, but everyone else is bound to see it as pointless fluff.

Theoretically it seems possible to develop a subset of the assembly
languages used by modern CPUs

Feel free to try. Don't forget to include CPUs manufactured by Cray, Unisys,
the mainframe division of IBM, Analog, Motorola... and many many more
besides. Once you see how long the list is, and how many different assembly
languages with different syntaxes are out there, you'll realise why nobody
is doing this.

sjdevnull · Oct 29, 2006

jacob said:
My qfloat package has been ported to Linux and Windows, and it will run
(unmodified) under Solaris, Mac (x86) and AIX (x86).

Assembly is quite portable between OSes, but not between
different processors.

Seems like you're using an odd definition of "portable" here. On
common platforms assembly is not portable between different
compilers/assemblers on the same OS and architecture, let alone between
OSes. Certainly on a single architecture it's possible to write an
assembler that runs under many OSes, but having just one standard
assembly language even in one OS is _not_ the current state of the
world on everyday architectures--witness the x86 example I gave in the
message you replied to.

Rod Pemberton · Oct 29, 2006

fermineutron said:
Theoretically it seems possible to develop a subset of the assembly
languages used by modern CPUs and include in the C standard a C
library which would allow the user to use that assembly subset. Since it is
a limited subset and its syntax is governed by C, it should not present
portability challenges, with the possible exception of older systems. Any
thoughts about this?

You should ignore any response Heathfield gives to your question.

This problem was solved in the very first assembly language:
http://en.wikipedia.org/wiki/Autocode

A modern version is available here:
http://microautocode.sourceforge.net/

It has also been solved by many other languages, most effectively by
FORTH and C. If it hadn't been solved, the basic features of the C language
wouldn't be portable (at all):

constants
variables
simple flow control (if,while,etc)
complex flow control (procedures, setjmp)
arithmetic (addition, bitshifts)
pointers

The above functionality of C can be represented by 16 "actions" and 20
arithmetic operations. That means that C can be run on an interpreter.
The highly portable QEMU emulator reduces host specific CPU instructions to
"micro-ops" for a virtual machine. Those "micro-ops" could be considered to
be a portable assembly. Research into the FORTH language shows that the
_entire_ functionality of the FORTH language (which is just as powerful as C)
reduces to 13 "primitives." Many years ago, I personally reduced the full
functionality of the 6502 instruction set (56 instructions) to a minimal set
of 13. Betov, the nemesis of Randall Hyde, has reduced the x86 instruction
set to a minimal set for his own use.
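
To illustrate what "a small set of primitives run by an interpreter" can look like, here is a toy stack machine in C. The five opcodes are invented for this sketch and do not correspond to any of the primitive sets mentioned in this post; there is no bounds checking on the stack or the program.

```c
#include <stddef.h>

/* Hypothetical primitive set for a toy stack machine. */
enum op { OP_PUSH, OP_ADD, OP_MUL, OP_DUP, OP_HALT };

/* Interprets a program of primitives; returns the top of stack at OP_HALT. */
static long run(const long *code)
{
    long stack[32];
    size_t sp = 0;      /* index of the next free stack slot */
    size_t pc = 0;      /* index of the next word of the program */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++];             break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp];     break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp];     break;
        case OP_DUP:  stack[sp] = stack[sp - 1]; sp++;      break;
        case OP_HALT: return stack[sp - 1];
        default:      return 0;     /* unknown opcode: give up */
        }
    }
}
```

A program such as { OP_PUSH, 3, OP_PUSH, 4, OP_ADD, OP_DUP, OP_MUL, OP_HALT } computes (3 + 4) squared on any machine with a C compiler. The portability comes from the interpreter, not from the host's instruction set.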

The table I compiled (below) is a basic comparison of the required
functionality of various FORTHs, C libraries (including Plauger, Redhat),
OSes (GNU, HURD), and Java.

The following table lists these:
1) primitives - smallest FORTH instructions, coded in assembly
2) functions - FORTH functions, coded in FORTH
3) syscalls - OS specific system calls, usually through an interrupt
interface
4) bytecodes - interpreter functions

Note that primitives and functions are small routines in assembly and FORTH
respectively, while the other two are large assembly routines.

Count     Kind        Description
3         primitives  Frank Sergeant's "3 Instruction Forth"
13        primitives  theoretical minimum needed to implement full FORTH
16, 29    primitives  CH Moore's word set for the F21 CPU (minimal or full)
18        syscalls    OS-specific functions required by P.J. Plauger's Standard C Library
19        syscalls    OS-specific functions required by Redhat's newlib
20        primitives  Philip Koopman's "dynamic instruction frequencies"
25        primitives  CH Moore's instruction set for MuP21 CPU
36        primitives  Dr. CH Ting's eForth, a highly portable forth
40        syscalls    Linux v0.01 (67 total, 13 unimplemented, 14 minimally, 40 moderately)
46        primitives  GNU's GFORTH for 8086
58-255    functions   FORTH-83 Standard (255 defined, 132 required, 58 nucleus)
60-63     primitives  considered the essence of FORTH by CH Moore
72        primitives  Brad Rodriguez's 6809 CamelForth
74-236    functions   FORTH-79 Standard (236 defined, 147 required, 74 nucleus)
94-229    functions   fig-FORTH Std. (229 defined, 117 required, 94 level zero)
~120      syscalls    OpenWATCOM v1.3, calls DOS, BIOS, DPMI for PM DOS apps.
133-?     functions   ANS-FORTH Standard (? defined, 133 required, 133 core)
150       syscalls    GNU HURD kernel
170       syscalls    DJGPP v2.03, calls DOS, BIOS, DPMI for PM DOS apps.
200       functions   FORTH 1970, the original Forth by CH Moore
200       syscalls    Linux Kernel (POSIX.1)
206       bytecodes   Java Virtual Machine bytecodes
240       functions   MVP-FORTH (FORTH-79)
~1000     functions   F83 FORTH
~2500     functions   F-PC FORTH

Rod Pemberton

Keith Thompson · Oct 29, 2006

Rod Pemberton said:
You should ignore any response Heathfield gives to your question.

[snip]

That's really bad advice.
