assembly in future C standard

Walter Banks · Nov 3, 2006

As this thread wanders off topic: this industry was introduced to a new
mnemonic in a Byte article about decoding the undocumented
Motorola 6800 instructions, the HCF (Halt and Catch Fire) opcode, $DD
or $D9. HCF locked up the processor and cycled the address bus.
The author of that article was Gerry Wheeler.

Gerry Wheeler, 54, died October 15, 2006, of advanced non-Hodgkin's
lymphoma. Gerry made significant contributions to the technology
of the embedded systems world and was a key part of the development
of many household name products.

Programmer, Ham KG4NBB, author, father, husband, active community
participant: Gerry will be missed by all.

w..

Andrew Poelstra · Nov 3, 2006

Andrew Poelstra said:

That too. Try running cygwin on a 4341.

The closest I got was running Linux on a Linksys WRT54GL router and an
iPod. There was also that foray into NES assembler, but that ended with
a lack of EEPROM hardware.

Chris Thomasson · Nov 7, 2006

Chris Torek said:
A more usual motivation is to make use of special machine
instructions the compiler would not generate on its own.

[...]

I agree with all of this; however, in some (sometimes significant)
cases (e.g., the actual implementation for a mutex), you may want
to have an inline expansion of the underlying atomic operation,
typically via a macro. For instance, if you have a mutex construct
that -- at least in the uncontested case -- is just a (possibly
locked) compare-and-swap, you may want the x86-specific version
of:

[...]

The tricky part lies not only in arranging for the assembly equivalent
to be inserted inline, but in *also* informing the compiler that
it must not move certain memory operations across the "special"
instruction(s).
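
For readers who want something concrete: a minimal sketch of such an uncontended lock path, written with C11's <stdatomic.h>, which postdates this thread and is not what anyone here was using. The names spin_mutex, spin_trylock and spin_unlock are hypothetical. The acquire/release orderings are how C11 expresses the "must not move memory operations across" requirement to both the compiler and the CPU; on x86 a compiler will typically emit a locked compare-and-exchange for the acquire path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal lock: 0 = unlocked, 1 = locked. */
typedef struct { atomic_int state; } spin_mutex;

/* Uncontended acquire: a single compare-and-swap 0 -> 1.
   memory_order_acquire forbids moving later memory operations
   ahead of a successful lock. */
static inline bool spin_trylock(spin_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &m->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Release: earlier accesses may not be moved after this store. */
static inline void spin_unlock(spin_mutex *m)
{
    /* Unlocking a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&m->state, memory_order_relaxed) == 1);
    atomic_store_explicit(&m->state, 0, memory_order_release);
}
```

This only covers the try-lock fast path; what to do on contention (spin, yield, block) is left out on purpose.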

[...]

Indeed!

http://groups.google.com/group/comp.arch/msg/c6f096adecdd0369
(refer to the last couple of paragraphs...)

;^)

FWIW, in order to correctly implement this kind of stuff, you simply have to
define exactly how you are going to address two fundamental problems:

1: Compiler Reordering
2: Hardware Reordering

--1-- The compiler reordering issue can "usually" be resolved by strictly
adhering to a design policy which declares that all functions that contain
"critical-sequences" of instructions that have to be executed in precise
order must be externally assembled. This is due to the current fact that the
C Standard doesn't think threads even exist. However, an assembler is a
different story IMO, simply because it gives you full access to the
architecture you're targeting and it will not reorder any of your assembly
statements; what you see is exactly what you get.

IMO, a typical C compiler is usually forced to treat any call into an
"unknown and external" function in a fairly "pessimistic" manner, which in
turn basically makes the call behave like a so-called "compiler barrier".
However, please note that some compilers are exploring link-time
optimizations, which can, and probably will, turn out to be an annoying and
potentially hazardous scenario for any function that simply expects every
instruction it's made up of to be executed exactly as-is. Period.
Unfortunately, this definitely includes basically all externally assembled
functions that a lock-free library may export by default.

;^(...

However, all is not lost, because it does seem like the compilers that do
support link-time optimizations also provide some method for turning them
on/off. Usually, they allow you to decorate your "critical-function"
declarations with something that guarantees that they will never be
subjected to this particular type of optimization.
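
The "compiler barrier" effect of an opaque external call can also be requested explicitly on some toolchains. What follows is only a sketch, assuming a GCC- or Clang-compatible compiler: the empty asm statement with a "memory" clobber is their extension, not standard C. The names COMPILER_BARRIER, payload, ready, publish and try_consume are hypothetical. It deals with compiler reordering only, and the plain int variables keep it a single-threaded illustration rather than a correct cross-thread protocol.

```c
#include <assert.h>

/* GCC/Clang extension (an assumption about the toolchain): an empty asm
   statement with a "memory" clobber. It emits no instruction, but the
   compiler may not move memory accesses across it. It does nothing
   about reordering performed by the CPU. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static int payload;   /* hypothetical data being handed over */
static int ready;     /* hypothetical "payload is valid" flag */

static void publish(int value)
{
    assert(!ready);       /* this sketch supports one publication only */
    payload = value;
    COMPILER_BARRIER();   /* keep the payload store ahead of the flag store */
    ready = 1;
}

/* Returns 1 and fills *out if something was published, else 0. */
static int try_consume(int *out)
{
    if (!ready)
        return 0;
    COMPILER_BARRIER();   /* keep the payload load after the flag load */
    *out = payload;
    return 1;
}
```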

--2-- Hardware reordering is easily solved by placing the proper memory
barrier instructions in the correct places throughout your externally
assembled lock-free library. The assembler won't reorder any instructions;
therefore, this is the only real solution wrt actually implementing this
kind of stuff.
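
As a rough illustration of where such barriers go, here is a sketch that uses C11 fences in place of hand-written barrier instructions. That is a swapped-in technique: C11 postdates this thread, and the post itself argues for external assembly. The names message, message_set, send_message and receive_message are hypothetical. The release fence sits between the data write and the flag store; the acquire fence sits between the flag load and the data read.

```c
#include <assert.h>
#include <stdatomic.h>

static int message;             /* plain data, hypothetical name */
static atomic_int message_set;  /* 0 until the message is published */

static void send_message(int value)
{
    /* This sketch supports a single message only. */
    assert(atomic_load_explicit(&message_set, memory_order_relaxed) == 0);
    message = value;
    /* Release fence: the write above is ordered before the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&message_set, 1, memory_order_relaxed);
}

/* Returns 1 and fills *out if a message has been published, else 0. */
static int receive_message(int *out)
{
    if (!atomic_load_explicit(&message_set, memory_order_relaxed))
        return 0;
    /* Acquire fence: the read below is ordered after the flag load. */
    atomic_thread_fence(memory_order_acquire);
    *out = message;
    return 1;
}
```

Which machine instruction, if any, each fence turns into depends on the target; on strongly ordered hardware some of them may only constrain the compiler.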

Therefore, it is my thesis that a safe method for ensuring that calls into
a "critical-function" will not be tampered with must include a combination of
solutions that directly resolve all of the reordering issues that are
attributed to both the hardware you're targeting and the C compiler you're
using...

Any thoughts?

[...]

The compiler may think the second version is superior (because it
uses less CPU time overall, e.g., due to reduced register pressure
or because it schedules better), but in fact, it is not.

;^)

Markus.Elfring · Nov 9, 2006

--1-- The compiler reordering issue can "usually" be resolved by strictly
adhering to a design policy which declares that all functions that contain
"critical-sequences" of instructions that have to be executed in precise
order must be externally assembled. This is due to the current fact that the
C Standard doesn't think threads even exist. However, an assembler is a
different story IMO, simply because it gives you full access to the
architecture you're targeting and it will not reorder any of your assembly
statements; what you see is exactly what you get.

I imagine that optimizing assembler implementations might exist that
break your expectation.
http://en.wikipedia.org/wiki/Assembly_language

Which pragmas or flags would you like to pass to those specific
compiler tools to preserve the specified instruction sequence until it is
put into an executable file by a linker that also optimizes?

I guess that such tools do not provide the guarantees that are required
for program correctness so far.

IMO, a typical C compiler is usually forced to treat any call into an
"unknown and external" function in a fairly "pessimistic" manner, which in
turn basically makes the call behave like a so-called "compiler barrier".
However, please note that some compilers are exploring link-time
optimizations, which can, and probably will, turn out to be an annoying and
potentially hazardous scenario for any function that simply expects every
instruction it's made up of to be executed exactly as-is. Period.
Unfortunately, this definitely includes basically all externally assembled
functions that a lock-free library may export by default.

I assume that the compiler barrier can be unsafe and fragile.
Portability will only be limited to some system environments.
There are so many preconditions to consider.

Regards,
Markus

Chris Thomasson · Nov 12, 2006

I imagine that optimizing assembler implementations might exist that
break your expectation.
http://en.wikipedia.org/wiki/Assembly_language

[...]

Here is my response:

http://groups.google.com/group/comp.programming.threads/msg/b09fe7e6a5b23de0

This is more on-topic in c.p.t anyway: an optimizing assembler doesn't exist;
it sounds suspiciously like a C compiler anyway...

Dave Thompson · Nov 20, 2006

On 29 Oct 2006 08:45:11 -0800, "fermineutron" wrote:

By the way, returning to the minimal list of CPU instructions, I think
that the true bare bones will have 2 instructions, Move and Add. Move
can be used to read and write, while Add can be used to do addition,
subtraction, multiplication and division. Shift instructions would
speed it up, but are redundant. So if C is to be truly compatible
with anything and everything, should it not limit the compiler output
to these 2 instructions? Obviously it's overkill, but it gets my point
across.

Actually that's neither necessary nor sufficient.

Add by itself cannot build negate, subtract, and divide, but subtract
can build add (and multiply) and compare.
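
To make that concrete, here is a small sketch in C that treats subtraction as the only arithmetic primitive and derives the rest from it. The function names are hypothetical, overflow is ignored (for example, negating INT_MIN), and the loop and the sign test stand in for the conditional and transfer-of-control operations a real machine would still need.

```c
#include <assert.h>

/* The only arithmetic primitive this sketch allows itself. */
static int sub(int a, int b) { return a - b; }

/* a + b == a - (0 - b) */
static int add_via_sub(int a, int b) { return sub(a, sub(0, b)); }

/* Compare: a < b exactly when a - b is negative (no overflow assumed). */
static int less_than(int a, int b) { return sub(a, b) < 0; }

/* Multiply by repeated addition; n must be non-negative. */
static int mul_via_sub(int a, int n)
{
    assert(n >= 0);
    int acc = 0;
    while (n != 0) {
        acc = add_via_sub(acc, a);
        n = sub(n, 1);
    }
    return acc;
}
```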

You need some kind of transfer of control; you can use move (or add)
for this if (and only if) you make the PC addressable as a register
(e.g. PDP-11) or put it in memory (I think some early machines did
this); or you can have explicit successors (as I know some did). And
you need some kind of conditional operation. If by move vs add you
mean the RISC-style division between memory reference instructions and
in-CPU-only computation ones, yes you need both; but if you have a
common but general format you can use ADD-zero for move. In fact a
common and practical RISC technique is not to waste an ALU code for
move rega to regb, just have a general three-address format and a zero
operand and OR rega with zero to regb or ADD rega plus zero to regb.

But those functions needn't be in separate instructions. The "one
instruction set computer" architecture -- there's a website somewhere
I've forgotten -- is something like "decrement and jump if zero". Then
you don't need an opcode field at all, your instruction (format)
consists entirely of address fields. But it's about as hugely and
gloriously inefficient as a (conventional) universal Turing machine,
so no consumer would ever buy one to use for anything, which rather
limits the market for programs written on it.
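
For the curious, a sketch of a one-instruction machine in C. It uses "subleq" (subtract and branch if the result is less than or equal to zero), a well-known single-instruction design, which may or may not be the one half-remembered above. The function name subleq_run, the step limit, and the convention that a negative jump target halts the machine are assumptions of this sketch. Note that an instruction is nothing but three address cells, with no opcode field.

```c
#include <assert.h>
#include <stddef.h>

/* Run a subleq program held in mem[0..len). Each instruction is three
   cells a, b, c: mem[b] -= mem[a]; if the result is <= 0, jump to c,
   otherwise continue with the next instruction. A negative jump target
   halts (a convention assumed here). Returns the number of instructions
   executed, capped at max_steps. */
static long subleq_run(int *mem, size_t len, long max_steps)
{
    long steps = 0;
    long pc = 0;
    while (pc >= 0 && steps < max_steps) {
        assert((size_t)pc + 2 < len);           /* instruction in range */
        int a = mem[pc], b = mem[pc + 1], c = mem[pc + 2];
        assert(a >= 0 && (size_t)a < len);      /* operands in range */
        assert(b >= 0 && (size_t)b < len);
        mem[b] -= mem[a];
        pc = (mem[b] <= 0) ? c : pc + 3;
        steps++;
    }
    return steps;
}
```

As a usage example, the memory image {9, 11, 3, 11, 10, 6, 11, 11, -1, 7, 5, 0} holds three instructions followed by the data cells A = 7, B = 5 and a scratch cell Z = 0. Running it computes B = B + A in three steps (Z = -A, then B -= Z, then Z -= Z and halt), which gives a feel for how much work even an addition takes.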

And that (kind of limitation) isn't what we C folks want.

- David.Thompson1 at worldnet.att.net
