assembly in future C standard


Walter Banks

As this thread wanders off topic: this industry was introduced to a new
mnemonic in a Byte article about decoding the undocumented
Motorola 6800 instructions, the HCF (Halt and Catch Fire) opcode, $DD
or $D9. HCF locked up the processor and cycled the address bus.
The author of that article was Gerry Wheeler.

Gerry Wheeler, 54, died October 15, 2006, of advanced non-Hodgkin's
lymphoma. Gerry made significant contributions to the technology
of the embedded systems world and was a key part of the development
of many household name products.

Programmer, Ham KG4NBB, author, father, husband, active community
participant: Gerry will be missed by all.

w..
 

Andrew Poelstra

Andrew Poelstra said:

That too. Try running cygwin on a 4341. :)

The closest I got was running Linux on a Linksys WRT54GL router and an
iPod. There was also that foray into NES assembler, but that ended with
a lack of EEPROM hardware.
 

Chris Thomasson

Chris Torek said:
A more usual motivation is to make use of special machine
instructions the compiler would not generate on its own.
[...]

I agree with all of this; however, in some (sometimes significant)
cases (e.g., the actual implementation for a mutex), you may want
to have an inline expansion of the underlying atomic operation,
typically via a macro. For instance, if you have a mutex construct
that -- at least in the uncontested case -- is just a (possibly
locked) compare-and-swap, you may want the x86-specific version
of:

[...]


The tricky part lies not only in arranging for the assembly equivalent
to be inserted inline, but in *also* informing the compiler that
it must not move certain memory operations across the "special"
instruction(s).

[...]

Indeed!

http://groups.google.com/group/comp.arch/msg/c6f096adecdd0369
(refer to the last couple of paragraphs...)

;^)




FWIW, in order to correctly implement this kind of stuff, you simply have to
define exactly how you are going to address two fundamental problems:


1: Compiler Reordering
2: Hardware Reordering




--1-- The compiler reordering issue can "usually" be resolved by strictly
adhering to a design policy which declares that all functions that contain
"critical-sequences" of instructions that have to be executed in precise
order must be externally assembled. This is due to the current fact that the
C Standard doesn't think threads even exist. However, an Assembler is a
different story IMO simply because it gives you full access to the
architecture you're targeting and it will not reorder any of your assembly
statements; what you see is exactly what you get.

IMO, a typical C compiler is usually forced to treat any call into an
"unknown and external" function in a fairly "pessimistic" manner, which in
turn basically renders its behavior to something that is analogous to a
so-called "compiler barrier". However, please note that some compilers are
exploring link-time optimizations which can, and probably will, turn out to
be an annoying and potentially hazardous scenario to any function that
simply expects that every instruction it's made up of will be executed exactly
as-is. Period. Unfortunately, this definitely includes basically all
externally assembled functions that a lock-free library may export by
default.

;^(...


However, all is not lost because it does seem like the compilers that do
support link-time optimizations also provide some method for turning it
on/off. Usually, they allow you to decorate your "critical-function"
declarations with something that guarantees that they will never be
subjected to this particular type of optimization.




--2-- Hardware reordering is easily solved by placing the proper memory
barrier instructions in the correct places throughout your externally
assembled lock-free library. The assembler won't reorder any instructions,
therefore, this is the only real solution wrt actually implementing this
kind of stuff.
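For illustration only, and not the externally assembled approach described above: C11, which was published after this thread, added <stdatomic.h>, whose fences constrain both the compiler and the hardware. The sketch below assumes a C11 compiler; the names payload, ready, publish and try_consume are made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary, non-atomic data */
static atomic_bool ready;  /* static storage, so it starts out false */

/* Writer: the release fence keeps the payload store from being moved
   after the flag store, by the compiler or by the CPU. */
static void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: the acquire fence keeps the payload load from being moved
   before the flag load. */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return true;
}
```

A reader thread that sees try_consume() return true is guaranteed to see the value handed to publish(). C11 also has atomic_signal_fence(), which restricts only compiler reordering and emits no barrier instruction, so it corresponds to issue 1 alone.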




Therefore, it is my thesis that a safe method for ensuring that calls into
"critical-function" will not be tampered with must include a combination of
solutions that directly resolve all of the reordering issues that are
attributed to both the hardware you're targeting, and the C compiler you're
using...
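As a rough sketch of what such a combination can look like for the compare-and-swap mutex mentioned in the quoted text: the version below uses C11 <stdatomic.h> (which postdates this thread) instead of external assembly, so one construct addresses both compiler and hardware reordering. The type spin_mutex and the spin_* functions are hypothetical names, and a real mutex would block rather than spin under contention.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* 0 = unlocked, 1 = locked */
typedef struct { atomic_int state; } spin_mutex;

static void spin_init(spin_mutex *m)
{
    atomic_init(&m->state, 0);
}

/* Uncontended path: a single compare-and-swap. Acquire ordering on
   success keeps later memory operations from moving above the lock. */
static bool spin_trylock(spin_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &m->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

static void spin_lock(spin_mutex *m)
{
    while (!spin_trylock(m)) {
        /* busy-wait until the holder releases the lock */
    }
}

/* Release ordering keeps earlier memory operations from moving
   below the unlock. */
static void spin_unlock(spin_mutex *m)
{
    atomic_store_explicit(&m->state, 0, memory_order_release);
}
```

On x86 a compiler will typically implement the compare-and-swap as a locked cmpxchg, which is the inline expansion the quoted text asks for. The memory_order arguments are what tell the compiler which memory operations it must not move across that instruction.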



Any thoughts?





[...]
The compiler may think the second version is superior (because it
uses less CPU time overall, e.g., due to reduced register pressure
or because it schedules better), but in fact, it is not. :)

;^)
 

Markus.Elfring

--1-- The compiler reordering issue can "usually" be resolved by strictly
adhering to a design policy which declares that all functions that contain
"critical-sequences" of instructions that have to be executed in precise
order must be externally assembled. This is due to the current fact that the
C Standard doesn't think threads even exist. However, an Assembler is a
different story IMO simply because it gives you full access to the
architecture you're targeting and it will not reorder any of your assembly
statements; what you see is exactly what you get.

I imagine that optimizing assembler implementations might exist that
break your expectation.
http://en.wikipedia.org/wiki/Assembly_language

Which pragmas or flags would you pass to those specific
compiler tools so that the specified instruction sequence is preserved until
it is put into an executable file by a linker that also optimizes?

I guess that such tools do not provide the guarantees that are required
for program correctness so far.

IMO, a typical C compiler is usually forced to treat any call into an
"unknown and external" function in a fairly "pessimistic" manner, which in
turn basically renders its behavior to something that is analogous to a
so-called "compiler barrier". However, please note that some compilers are
exploring link-time optimizations which can, and probably will, turn out to
be an annoying and potentially hazardous scenario to any function that
simply expects that every instruction it's made up of will be executed exactly
as-is. Period. Unfortunately, this definitely includes basically all
externally assembled functions that a lock-free library may export by
default.

I assume that the compiler barrier can be unsafe and fragile.
Portability will be limited to certain system environments.
There are so many preconditions to consider.

Regards,
Markus
 

Dave Thompson

On 29 Oct 2006 08:45:11 -0800, "fermineutron" <[email protected]>
wrote:
By the way, returning to the minimal list of CPU instructions, I think
that the true bare bones will have 2 instructions, Move and Add. Move
can be used to read and write, while Add can be used to do addition,
subtraction, multiplication and division. Shift instructions would
speed it up, but are redundant. So if C is to be truly compatible
with anything and everything, should it not limit the compiler output
to these 2 instructions? Obviously it's overkill, but it gets my point
across.

Actually that's neither necessary nor sufficient.

Add by itself cannot build negate, subtract, and divide, but subtract
can build add (and multiply) and compare.
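A small sketch of that claim in C, where sub() is the only arithmetic primitive and everything else is derived from it. The function names are invented for the example, overflow is ignored, and the multiply assumes a non-negative multiplier.

```c
#include <assert.h>

/* The one arithmetic primitive everything below is built from. */
static int sub(int a, int b) { return a - b; }

/* negate: 0 - a */
static int neg_via_sub(int a) { return sub(0, a); }

/* add: a - (0 - b) */
static int add_via_sub(int a, int b) { return sub(a, neg_via_sub(b)); }

/* compare: a < b exactly when a - b is negative */
static int less_via_sub(int a, int b) { return sub(a, b) < 0; }

/* multiply by repeated addition; requires b >= 0 */
static int mul_via_sub(int a, int b)
{
    int acc = 0;
    assert(b >= 0);
    while (less_via_sub(0, b)) {
        acc = add_via_sub(acc, a);
        b = sub(b, 1);
    }
    return acc;
}
```

Going the other way does not work: with add alone there is no way to produce -b from b, which is why negate and subtract are out of reach. Note that the loop in mul_via_sub() still needs a conditional branch, and that is not something the arithmetic itself provides.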

You need some kind of transfer of control; you can use move (or add)
for this if (and only if) you make the PC addressable as a register
(e.g. PDP-11) or put it in memory (I think some early machines did
this); or you can have explicit successors (as I know some did). And
you need some kind of conditional operation. If by move vs add you
mean the RISC-style division between memory reference instructions and
in-CPU-only computation ones, yes you need both; but if you have a
common but general format you can use ADD-zero for move. In fact a
common and practical RISC technique is not to waste an ALU code for
move rega to regb, just have a general three-address format and a zero
operand and OR rega with zero to regb or ADD rega plus zero to regb.

But those functions needn't be in separate instructions. The "one
instruction set computer" architecture -- there's a website somewhere
I've forgotten -- is something like "decrement and jump if zero". Then
you don't need an opcode field at all, your instruction (format)
consists entirely of address fields. But it's about as hugely and
gloriously inefficient as a (conventional) universal Turing machine,
so no consumer would ever buy one to use for anything, which rather
limits the market for programs written on it.
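To make the single-instruction idea concrete, here is a minimal interpreter sketch for subleq ("subtract and branch if less than or equal to zero"), one well-known design of this kind. It is not necessarily the machine meant above. The memory layout, the convention that a negative jump target halts, and the names run_subleq and subleq_add are assumptions of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Every instruction is three address fields a, b, c and no opcode:
   mem[b] -= mem[a]; if the result is <= 0, jump to c, else fall through.
   Returns true if the program halted by jumping to a negative address. */
static bool run_subleq(long *mem, size_t size, size_t max_steps)
{
    long pc = 0;
    for (size_t step = 0; step < max_steps; step++) {
        if (pc < 0)
            return true;                 /* negative target: halt */
        if ((size_t)pc + 2 >= size)
            return false;                /* ran off the end of memory */
        long a = mem[pc], b = mem[pc + 1], c = mem[pc + 2];
        if (a < 0 || b < 0 || (size_t)a >= size || (size_t)b >= size)
            return false;                /* operand address out of range */
        mem[b] -= mem[a];
        pc = (mem[b] <= 0) ? c : pc + 3;
    }
    return pc < 0;
}

/* Adds x and y with a three-instruction subleq program. */
static long subleq_add(long x, long y)
{
    long mem[12] = {
        9, 11, 3,      /* Z -= X, so Z becomes -x                 */
        11, 10, 6,     /* Y -= Z, so Y becomes y + x              */
        11, 11, -1,    /* Z -= Z gives 0, so jump to -1 and halt  */
        x, y, 0        /* data: X at 9, Y at 10, Z at 11          */
    };
    bool halted = run_subleq(mem, 12, 100);
    assert(halted);
    (void)halted;
    return mem[10];
}
```

It takes three instructions and a scratch cell just to add two numbers, which gives a feel for the inefficiency described above.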

And that (kind of limitation) isn't what we C folks want.

- David.Thompson1 at worldnet.att.net
 
