Problem with asm

Patrik Huber

Hello!

I got the following Code in Assembler (NASM), which prints out "5" in
realmode:

mov ax, 0xB800
mov es, ax
mov byte [es:0], '5'

I want to do the same in gcc now, but I'm stuck. GCC doesn't like the [es:0]
syntax...

asm("mov %ax, 0xB800");
asm("mov %es, %ax");
asm("mov [%es:0], '5'"); <-- Error about "[" :-(

Can anyone help me how to do this?

Or can I also do this directly in C with pointers and don't use the asm() at
all? I couldn't get this to work either...

Thanks for any help!

Patrik
 
Joona I Palaste

Patrik Huber said:
I got the following Code in Assembler (NASM), which prints out "5" in
realmode:
mov ax, 0xB800
mov es, ax
mov byte [es:0], '5'
I want to do the same in gcc now, but I'm stuck. GCC doesn't like the [es:0]
syntax...
asm("mov %ax, 0xB800");
asm("mov %es, %ax");
asm("mov [%es:0], '5'"); <-- Error about "[" :-(
Can anyone help me how to do this?

What makes you think a question about assembly language has anything to
do with C? Why not try Fortran or COBOL questions while you're at it?
Or can I also do this directly in C with pointers and don't use the asm() at
all? I couldn't get this to work either...

Trying my best to read your assembly code, I figure you want to do
this...
volatile char *p = (volatile char *)0xB8000;
*p = '5';
Which means storing the byte value corresponding to the character '5'
at the absolute address 0xB8000, which is where segment 0xB800, offset 0
ends up in real mode.

Be aware, though, that the above causes undefined behaviour by
indirecting through an absolute address. This might work on your
platform, but it might cause your program to segfault, or worse,
crash the entire computer, on some other platforms.
 
Jack Klein

Hello!

I got the following Code in Assembler (NASM), which prints out "5" in
realmode:

And why do you think your question belongs in comp.lang.c? Assembly
language is not C.

[snip off-topic code]
Can anyone help me how to do this?

Yes, the people in the proper newsgroup, which would be
almost certainly can. But you had better
specify what operating system you are using. Unless it is something
old like MS-DOS, most modern operating systems will not let that code
work even if you can get it to build.
Or can I also do this directly in C with pointers and don't use the asm() at
all? I couldn't get this to work either...

C allows assigning an arbitrary address, represented as an integer
type, to a pointer with an appropriate cast. The result is
implementation-defined. Attempting to use the pointer to read or
write is completely undefined. So you need to try the assembly
language group.
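
To make the distinction above concrete, here is a minimal sketch in C. It only forms a pointer from an integer and converts it back; it never reads or writes through the pointer. The helper name round_trip_address is made up for illustration, and the expectation that the value survives the round trip is an assumption that holds on common flat-address implementations, not something the standard promises.

```c
#include <stdint.h>

/* Hypothetical helper: converts an integer to a pointer and back again.
   The integer-to-pointer conversion is implementation-defined; the
   pointer is never dereferenced here, so no memory is touched. */
static uintptr_t round_trip_address(uintptr_t address)
{
    volatile unsigned char *p = (volatile unsigned char *)address;
    return (uintptr_t)p;
}
```

Calling round_trip_address(0xB8000) is harmless on a hosted system precisely because nothing is read or written through the resulting pointer.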
 
Richard Pennington

Jack Klein wrote:

[snip]
C allows assigning an arbitrary address, represented as an integer
type, to a pointer with an appropriate cast. The result is
implementation-defined. Attempting to use the pointer to read or
write is completely undefined. So you need to try the assembly
language group.

It may be undefined, but I suspect every computer system you use
is doing it constantly.

Hard to believe something so undefined is relied on to work.
Programmers must be a gullible lot. ;-)

I suspect the guys in comp.arch.embedded or an OS group could
give the OP a hand to convert his program to C (undefined, of course!)

I agree that it is implementation defined, but any implementation
that didn't do something reasonable would be considered broken.

-Rich
 
Joona I Palaste

Richard Pennington said:
Jack Klein wrote:
[snip]
C allows assigning an arbitrary address, represented as an integer
type, to a pointer with an appropriate cast. The result is
implementation-defined. Attempting to use the pointer to read or
write is completely undefined. So you need to try the assembly
language group.
It may be undefined, but I suspect every computer system you use
is doing it constantly.
Hard to believe something so undefined is relied on to work.
Programmers must be a gullible lot. ;-)

No one ever said using an arbitrary absolute address was undefined.
Doing it in C is, though.
I suspect the guys in comp.arch.embedded or an OS group could
give the OP a hand to convert his program to C (undefined, of course!)
I agree that it is implementation defined, but any implementation
that didn't do something reasonable would be considered broken.

Of course an implementation can, and frequently does, consistently do
things that cause undefined behaviour by the C standard. It would be
difficult to make an OS or an embedded device otherwise.
But this still doesn't mean that it has to be defined in C, too. The C
language and the implementation it runs on are two different things.
The C standard committee does not want to tie C down to platforms where
using arbitrary absolute addresses has a specific meaning, so they
leave this behaviour as undefined.
Undefined behaviour does not mean "must crash", "must cause an error"
or "must have unpredictable results". Replace "must" with "can" and you
are closer to the real meaning. Nothing is preventing anyone from
making a C implementation where arbitrary absolute addresses work in
the exact same way as they do in the underlying OS or hardware. OTOH,
nothing is preventing anyone from making one where they don't.
 
Richard Pennington

Joona I Palaste wrote:

[snip - lots of good stuff, including my ;-)]
Of course an implementation can, and frequently does, consistently do
things that cause undefined behaviour by the C standard. It would be
difficult to make an OS or an embedded device otherwise.
But this still doesn't mean that it has to be defined in C, too. The C
language and the implementation it runs on are two different things.
The C standard committee does not want to tie C down to platforms where
using arbitrary absolute addresses has a specific meaning, so they
leave this behaviour as undefined.
Undefined behaviour does not mean "must crash", "must cause an error"
or "must have unpredictable results". Replace "must" with "can" and you
are closer to the real meaning. Nothing is preventing anyone from
making a C implementation where arbitrary absolute addresses work in
the exact same way as they do in the underlying OS or hardware. OTOH,
nothing is preventing anyone from making one where they don't.

Nothing prevents someone from making an implementation where absolute
addresses don't work as people expect. That's true. The implementor
might be sad when no one uses the compiler, however.

-Rich
 
Flash Gordon

Joona I Palaste wrote:

[snip - lots of good stuff, including my ;-)]
Of course an implementation can, and frequently does, consistently
do things that cause undefined behaviour by the C standard. It would
be difficult to make an OS or an embedded device otherwise.
But this still doesn't mean that it has to be defined in C, too. The
C language and the implementation it runs on are two different
things. The C standard committee does not want to tie C down to
platforms where using arbitrary absolute addresses has a specific
meaning, so they leave this behaviour as undefined.
Undefined behaviour does not mean "must crash", "must cause an
error" or "must have unpredictable results". Replace "must" with
"can" and you are closer to the real meaning. Nothing is preventing
anyone from making a C implementation where arbitrary absolute
addresses work in the exact same way as they do in the underlying OS
or hardware. OTOH, nothing is preventing anyone from making one
where they don't.

Nothing prevents someone from making an implementation where absolute
addresses don't work as people expect. That's true. The implementor
might be sad when no one uses the compiler, however.

Try doing it with Microsoft Visual C++ either in C or C++ in a normal
application and watch it crash. Try doing it with gcc on either a Unix
derivative or a Windows NT derivative and again watch it crash.

Most modern operating systems prevent applications from accessing
arbitrary addresses (including memory mapped hardware) for very good
reasons. So most code running on a computer does not and cannot use
pointers to arbitrary locations.

Software which deals directly with the hardware (such as the OS, device
drivers, or parts of embedded applications) is written using documented
extensions to C or in some other language, where one documented
extension is how to access the HW directly.
 
Mac

Jack Klein wrote:

[snip]
C allows assigning an arbitrary address, represented as an integer
type, to a pointer with an appropriate cast. The result is
implementation-defined. Attempting to use the pointer to read or
write is completely undefined. So you need to try the assembly
language group.

It may be undefined, but I suspect every computer system you use
is doing it constantly.

Hard to believe something so undefined is relied on to work.
Programmers must be a gullible lot. ;-)

I suspect the guys in comp.arch.embedded or an OS group could
give the OP a hand to convert his program to C (undefined, of course!)

I agree that it is implementation defined, but any implementation
that didn't do something reasonable would be considered broken.

I think I have to disagree. What you say may be OK for kernels and what
not, but it is certainly not true for application code on systems with
memory management. You cannot assign an absolute address (a hardware
address, if you will) to a pointer and then dereference it.

You seem to be somewhat knowledgeable, and maybe you already realize this,
but you give the impression that it is perfectly normal to use hardware
addresses in any old C code.

--Mac
 
Richard Pennington

Mac said:
I think I have to disagree. What you say may be OK for kernels and what
not, but it is certainly not true for application code on systems with
memory management. You cannot assign an absolute address (a hardware
address, if you will) to a pointer and then dereference it.

You seem to be somewhat knowledgeable, and maybe you already realize this,
but you give the impression that it is perfectly normal to use hardware
addresses in any old C code.




--Mac

It could be OK in application code also. Imagine a system that memory
mapped a peripheral into a process's address space. A video frame
buffer for example.
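
As a rough illustration of that scenario, the sketch below maps a region into the process's address space and hands back an ordinary pointer. It assumes a POSIX-like system that provides mmap() with MAP_ANONYMOUS, and it uses anonymous memory purely as a stand-in for a device; a real frame buffer would be mapped from whatever device interface the platform offers. The helper name map_fake_framebuffer is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: maps 'len' bytes of anonymous memory as a
   stand-in for a memory-mapped peripheral. Returns NULL on failure. */
static unsigned char *map_fake_framebuffer(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}
```

Once the mapping exists, application code writes to it through a plain pointer, and the region is released again with munmap().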

I agree that this is not normally the case in application code.

I suspect the OP was targeting some kind of system where memory access
is allowed, especially since he mentioned "real mode".

There is nothing in the C standard that says C should only be used for
application code. I suspect, in numbers of processors running C code,
the reverse is true: There are probably many more C programs that *can*
access memory arbitrarily than can't.

My argument is meaningless, of course. ;-) I'm talking about all the
bazillions of embedded microcontrollers.

As for being somewhat knowledgeable, I don't know about that.
I've been doing embedded programming for 26 years and writing
compilers for 25 years. I do tend to forget the operating system
from time to time. Must be getting old.

-Rich
 
Fredrik Tolf

This is a bit off-topic in comp.lang.c, so I'm mailing my reply directly
to you instead.

Hello!

I got the following Code in Assembler (NASM), which prints out "5" in
realmode:

mov ax, 0xB800
mov es, ax
mov byte [es:0], '5'

I want to do the same in gcc now, but I'm stuck. GCC doesn't like the [es:0]
syntax...

asm("mov %ax, 0xB800");
asm("mov %es, %ax");
asm("mov [%es:0], '5'"); <-- Error about "[" :-(

Can anyone help me how to do this?

That's just so wrong on so many levels... sorry, but it really is.

First of all, the reason GCC is complaining to begin with is because
that construct is written "0(,1)" in gas syntax, if I recall correctly.
There are many other errors in your assembly syntax, since gas doesn't
use the same syntax as native x86 assemblers like NASM. In fact, there's
not a single part of that assembly code that is correct. You should read
about the differences in the gas texinfo, under the "Machine
Dependencies" section, "i386-Dependent" subsection.
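
For a feel of the syntax difference, here is a small sketch of GCC extended asm in AT&T syntax. In AT&T syntax the source operand comes first and the destination last, immediates carry a '$' prefix, and registers a '%' prefix. The function name load_screen_segment is made up for illustration, and the sketch assumes GCC or Clang targeting x86; on anything else it falls back to plain C so that it still builds.

```c
#include <stdint.h>

/* Hypothetical helper: returns 0xB800, loaded through inline assembly
   when built with GCC or Clang for x86. */
static uint32_t load_screen_segment(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t value;
    /* AT&T order: source first, destination last. '$' marks an
       immediate; %0 is replaced by the register chosen for 'value'. */
    __asm__("movl $0xB800, %0" : "=r"(value));
    return value;
#else
    return 0xB800u; /* fallback for other compilers and targets */
#endif
}
```

Letting the compiler pick the register through the "=r" constraint avoids hard-coding %eax and tells GCC which value the statement produces.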

Second, that code assumes that your program is running in real mode,
which it won't - GCC only compiles 32-bit code. After that, how to do
what you want depends on the operating system the program will be
running on. Many systems won't even allow you to do that, for
multitasking protection purposes. If you run that program on Linux, a
NTOS kernel (that is, WinNT4, Win2k or WinXP) or some other x86 UN*X,
for example, it will crash and burn.

If you're compiling it for DOS with DJGPP to run under some DPMI
interface like DOS4GW or a Win9x kernel, it's possible to get it to do
what you want, but it has been quite some time since I programmed under DPMI
interfaces, so I don't really recall all the details. IIRC, you still
have to unprotect that memory and add it to your segment. When you have
done so, that memory will be available on the linear address that
corresponds to the real-mode address that you wish to access. B800:0000
in real mode corresponds to the linear address 000B8000. This is because
real mode addresses a 20-bit address bus by shifting the segment
register four bits to the left and adding the offset to that to produce
an address bus value.
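
The shift-and-add rule just described can be sketched as a small C function. The name real_mode_linear is hypothetical, and the sketch ignores what happens when the sum exceeds 20 bits.

```c
#include <stdint.h>

/* Hypothetical helper: computes the linear address that a real-mode
   segment:offset pair refers to (segment shifted left by four bits,
   plus the offset). */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, real_mode_linear(0xB800, 0x0000) gives 0xB8000, the address used in the rest of this post.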

If you manage to look up somewhere how to unprotect the memory (it's
somewhere in the DJGPP manual), the following assembly code would
accomplish your purpose:

asm("movb $0x35, 0xb8000(,1)");

You mustn't touch the segment registers in protected mode unless you
really know what you're doing, since in protected mode, the segment
registers cease to be segment registers, and are instead selector
registers, for selecting the segment descriptor you wish to operate
through (a segment in protected mode is _not_ the same thing as in real
mode). For more info on this, I suggest reading "Intel Architecture
Software Developer's Manual, Volume 3: System Programming", published by
Intel, order number 243192, downloadable as PDF through Intel's website.
Or can I also do this directly in C with pointers and don't use the asm() at
all? I couldn't get this to work either...

Indeed:

struct {
char glyph, color;
} *textmem = (void *)0x000b8000;
textmem[0].glyph = '5';

That will accomplish the same as the assembly code I gave above. Of
course, since it will most likely yield the exact same assembler output,
it is subject to the same operating system and protection constraints as
described above.
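
Building on the same two-byte cell layout, here is a minimal sketch of writing a character at a given row and column. It assumes an 80-column text mode, and the names text_cell and put_cell are made up for illustration. The function takes the screen base as a parameter, so it can be tried against an ordinary array instead of real video memory.

```c
#include <stddef.h>

enum { TEXT_COLUMNS = 80 }; /* assumption: 80-column text mode */

/* One character cell: the glyph byte followed by its colour byte. */
struct text_cell {
    char glyph;
    char color;
};

/* Hypothetical helper: stores a glyph and colour at (row, column) in a
   screen buffer laid out row by row. */
static void put_cell(volatile struct text_cell *screen,
                     size_t row, size_t column, char glyph, char color)
{
    volatile struct text_cell *cell = &screen[row * TEXT_COLUMNS + column];
    cell->glyph = glyph;
    cell->color = color;
}
```

On a system where the text buffer really is reachable at 0x000b8000, the first argument would be that address cast to the struct pointer type; everywhere else, passing a local array keeps the experiment harmless.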

Fredrik Tolf
 
Fredrik Tolf

This is a bit off-topic in comp.lang.c, so I'm mailing my reply directly
to you instead.

Oops - Sorry about that. It seems Evolution didn't really do what I
thought. Pressing the "Reply to Sender" button posted back to the
newsgroup instead of mailing to the original author, as I had expected
it to.

Sorry for posting off-topic to the group.

Fredrik Tolf
 
Richard Pennington

Flash Gordon wrote:
[snip]
Try doing it with Microsoft Visual C++ either in C or C++ in a normal
application and watch it crash. Try doing it with gcc on either a Unix
derivative or a Windows NT derivative and again watch it crash.

Most modern operating systems prevent applications from accessing
arbitrary addresses (including memory mapped hardware) for very good
reasons. So most code running on a computer does not and cannot use
pointers to arbitrary locations.

Software which deals directly with the hardware (such as the OS, device
drivers, or parts of embedded applications) is written using documented
extensions to C or in some other language, where one documented
extension is how to access the HW directly.

I do it all the time. You do it all the time.
Linux, NetBSD, FreeBSD, Windows (I suspect) are all mostly or completely
written in C. Chances are you're running at least one of those OSs.

The code you run every day.

There are no extensions being used (for memory access at least). Just
implementation defined behavior.

I understand your point about memory protection. Even in a memory
protected system the compiler is doing the "right" thing. It is the
OS that is trapping the illegal access. The compiler still happily
attempts it.

-Rich
 
Joona I Palaste

Richard Pennington said:
Flash Gordon wrote:
[snip]
Try doing it with Microsoft Visual C++ either in C or C++ in a normal
application and watch it crash. Try doing it with gcc on either a Unix
derivative or a Windows NT derivative and again watch it crash.

Most modern operating systems prevent applications from accessing
arbitrary addresses (including memory mapped hardware) for very good
reasons. So most code running on a computer does not and cannot use
pointers to arbitrary locations.

Software which deals directly with the hardware (such as the OS, device
drivers, or parts of embedded applications) is written using documented
extensions to C or in some other language, where one documented
extension is how to access the HW directly.
I do it all the time. You do it all the time.
Linux, NetBSD, FreeBSD, Windows (I suspect) are all mostly or completely
written in C. Chances are you're running at least one of those OSs.

Not completely. Parts of them are written in assembly language. It is
pretty much impossible to write a real-world (in contrast to simulated)
OS in pure C.
The code you run every day.
There are no extensions being used (for memory access at least). Just
implementation defined behavior.
I understand your point about memory protection. Even in a memory
protected system the compiler is doing the "right" thing. It is the
OS that is trapping the illegal access. The compiler still happily
attempts it.

This still does not change the fact that using arbitrary absolute memory
addresses in C causes undefined behaviour. This means that the C
language does not define anything about the behaviour. The underlying
implementation can define this behaviour, but it does not have to.
If you are really upset about arbitrary absolute memory access causing
undefined behaviour, take the issue up at comp.std.c and submit a
change proposal to the standard. Here at comp.lang.c we stick by what
the standard says.
 
Richard Pennington

Joona said:
Richard Pennington scribbled the following: [snip]
I do it all the time. You do it all the time.
Linux, NetBSD, FreeBSD, Windows (I suspect) are all mostly or completely
written in C. Chances are you're running at least one of those OSs.


Not completely. Parts of them are written in assembly language. It is
pretty much impossible to write a real-world (in contrast to simulated)
OS in pure C.

The examples I gave (with the possible exception of Windows, I haven't
seen the source) have very little assembly language used in their
implementation. Usually startup code, context switching, etc. and
very little else.

I think all of them, with the possible exception of Windows, are
real-world OSs.
This still does not change the fact that using arbitrary absolute memory
addresses in C causes undefined behaviour. This means that the C
language does not define anything about the behaviour. The underlying
implementation can define this behaviour, but it does not have to.
If you are really upset about arbitrary absolute memory access causing
undefined behaviour, take the issue up at comp.std.c and submit a
change proposal to the standard. Here at comp.lang.c we stick by what
the standard says.

I'm not upset about anything. I think that many people don't understand
how the real world works. We compiler and OS writers will wink and nod
at each other and continue to rely on undefined behavior.

I do agree that this is off topic here. I'll try to let it drop.

-Rich
 
Joona I Palaste

Richard Pennington said:
Joona said:
Richard Pennington scribbled the following: [snip]
I do it all the time. You do it all the time.
Linux, NetBSD, FreeBSD, Windows (I suspect) are all mostly or completely
written in C. Chances are you're running at least one of those OSs.

Not completely. Parts of them are written in assembly language. It is
pretty much impossible to write a real-world (in contrast to simulated)
OS in pure C.
The examples I gave (with the possible exception of Windows, I haven't
seen the source) have very little assembly language used in their
implementation. Usually startup code, context switching, etc. and
very little else.

If they contain any assembly language at all, they're not pure C. Pure C
means you can't even use non-standard libraries that have been written
in assembly language.
I think all of them, with the possible exception of Windows, are
real-world OSs.

Even Windows is a real-world OS.
I'm not upset about anything. I think that many people don't understand
how the real world works. We compiler and OS writers will wink and nod
at each other and continue to rely on undefined behavior.

If you are writing your own OS, or your own compiler for a specific OS,
you can rely on undefined behaviour all you want; no one, not even any
of us here, will stop you or think you are doing anything wrong. After
all, when you are doing so, you are effectively working under *two*
definitions: the C language one and your own in addition to that. Your
own definition is free to define anything the C language does not
define.
Once again, undefined behaviour does not mean that nothing anywhere may
ever define the behaviour. All it means is that the C language does not
define it, but anything else, even the underlying implementation, is
allowed to.
I do agree that this is off topic here. I'll try to let it drop.

You are correct, discussing specific implementations is off-topic here.
This is why we answer questions like "How do I use VGA graphics in C?"
with "By using non-standard libraries which are off-topic here".
This means that it is impossible in pure ISO standard C, but if you
use non-standard libraries, it may be possible. These non-standard
libraries cause undefined behaviour, but this is not necessarily a bad
thing, if your own implementation defines this undefined behaviour and
you accept that your code is now non-portable.
 
Patrik Huber

Hello Fredrik

Thank you very much, you were really helpful!
So do I understand correctly that if I write code with the asm() command in
gcc it compiles that with gas?
Second, that code assumes that your program is running in real mode,
which it won't - GCC only compiles 32-bit code. After that, how to do
what you want depends on the operating system the program will be
running on.

This really explains the trouble I have...
Can I get GCC to compile my code in 16-bit for running in realmode?
Or is there another C-compiler that can do that?

I do not want to run this on any OS like Linux, WinNT. I'm trying to do this
with my own bootsector (in real-mode). So the direct memory-access shouldn't
be a problem so far.
So basically I want to write my own little real-mode app in C.

For more info on this, I suggest reading "Intel Architecture
Software Developer's Manual, Volume 3: System Programming", published by
Intel, order number 243192, downloadable as PDF through Intel's website.

I'm currently reading this one, thank you anyway for pointing it out :)


To the others who said this is off-topic: You may be right, but I posted
this here because I thought it's a C problem and not an asm-one, since the
asm-code itself works. Sorry about that!

Thanks again

Patrik
 
Fredrik Tolf

Jack Klein wrote:

[snip]
Or can I also do this directly in C with pointers and don't use the asm() at
all? I couldn't get this to work either...


C allows assigning an arbitrary address, represented as an integer
type, to a pointer with an appropriate cast. The result is
implementation-defined. Attempting to use the pointer to read or
write is completely undefined. So you need to try the assembly
language group.

It may be undefined, but I suspect every computer system you use
is doing it constantly.

Hard to believe something so undefined is relied on to work.
Programmers must be a gullible lot. ;-)

I suspect the guys in comp.arch.embedded or an OS group could
give the OP a hand to convert his program to C (undefined, of course!)

I agree that it is implementation defined, but any implementation
that didn't do something reasonable would be considered broken.

I think I have to disagree. What you say may be OK for kernels and what
not, but it is certainly not true for application code on systems with
memory management. You cannot assign an absolute address (a hardware
address, if you will) to a pointer and then dereference it.

Correct me if I'm wrong, but I have to think that loading an absolute
address into a pointer cannot be wrong, no matter what code. The
impression I have of a pointer is that of a number which points out an
address. Just because you load an absolute address doesn't mean that
address has to be a hardware address. If the code is running under
memory management, segmentation, paging and what not, that absolute
address just points out an address in the process' address space, is
that not true?

Truly, if you just load an arbitrary pointer and dereference it, you're
likely to generate a page or segmentation fault, but that's an OS issue,
not a compiler issue, right?

I could imagine that the strictest of C standards would leave that
undefined, since some architectures (x86 real mode, as in the original
message of this thread, for one) don't have a concept of truly linear
addresses. However, correct me if I'm wrong - I really
don't know - but isn't ISO C defined on architectures with a linear
address space (whether or not it happens to be segmented, paged and what
not)?

Fredrik Tolf
 
Mark A. Odell

To the others who said this is off-topic: You may be right, but I posted
this here because I thought it's a C problem and not an asm-one, since
the asm-code itself works. Sorry about that!

Say, there are GCC newsgroups where I'd bet people could help you with
this issue, why not try there? See gnu.gcc.help to start maybe.
 
Dan Pop

Richard Pennington said:
Joona said:
Richard Pennington scribbled the following: [snip]
I do it all the time. You do it all the time.
Linux, NetBSD, FreeBSD, Windows (I suspect) are all mostly or completely
written in C. Chances are you're running at least one of those OSs.

Not completely. Parts of them are written in assembly language. It is
pretty much impossible to write a real-world (in contrast to simulated)
OS in pure C.

The examples I gave (with the possible exception of Windows, I haven't
seen the source) have very little assembly language used in their
implementation. Usually startup code, context switching, etc. and
very little else.

OTOH, they abound with C code invoking undefined behaviour.

Dan
 
Dan Pop

Fredrik Tolf said:
Correct me if I'm wrong, but I have to think that loading an absolute
address into a pointer cannot be wrong, no matter what code.

1. This is downright impossible in user code running on systems with
virtual memory: all the addresses in the program are interpreted as
virtual addresses.

2. There are platforms where loading an arbitrary address in an address
register generates a fault. If the compiler decides to store the
pointer in an address register or if loading something into a pointer
involves storing the data first into an address register...

Dan
 
