Dynamic (as in Reflective) Programming in C?

Edoules

Dear All,

I'm curious to know if there have been any efforts to allow for the
byte-order run-time editing of compiled C code.

This request is informational only, sparked by curiosity.

Hypothetical solutions I've discussed with my colleagues include
casting a function pointer to a void or unsigned char pointer
(illegal, but oddly allowed by most compilers), then buffering all the
bytes up to a known return value-- although how to write back onto the
protected page of code has yet to be solved by us, and is likely
something requiring an "unsafe"/"custom" kernel.

This is clearly a system-specific problem, though an abstract overview
of how it has been approached in the past would be nice.


Thanks,
Eddie Ma
B.Sc. University of Guelph
 
Eric Sosman

Edoules said:
Dear All,

I'm curious to know if there has been any efforts to allow for the
byte-order run-time editing of compiled C code.

This request is informational only, sparked by curiousity.

Hypothetical solutions I've discussed with my colleagues include
casting a function pointer to a void or unsigned char pointer
(illegal, but oddly allowed by most compilers), then buffering all the
bytes up to a known return value-- although how to write back onto the
protected page of code has yet to be solved by us, and is likely
something requiring an "unsafe"/"custom" kernel.

This is clearly a system specific problem, though an abstract overview
of how this was approached in the past would be nice.

Others' experiences may differ, but I myself haven't run
across a C program that modifies its own code in this way.
I've run across one that treated its code as read-only data,
checksumming it periodically to detect whether someone was
running it under a debugger and had inserted breakpoints to
try to defeat the licensing, but even that's fairly rare.

A much more common practice is to deposit some bytes in
an array, convert the array pointer to a function pointer, and
call via the pointer, thus running the array's bytes as code.
Even this usually requires system-specific help beyond just
knowing the instruction set and calling conventions and so on:
you may need to synchronize instruction and data caches, change
memory access permission bits, and the like. All of this is
well outside the realm of C -- C doesn't even guarantee that
you can convert between data and function pointers meaningfully.
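For contrast, here is a minimal sketch of a conversion that C does define: a function pointer may be converted to a different function pointer type and back, and the result compares equal to the original. No data pointer is involved at any point. The names generic_fn, add_one and call_via_round_trip are made up for illustration.

```c
#include <assert.h>

/* Hypothetical "generic" function pointer type, used only as a holder. */
typedef void (*generic_fn)(void);

static int add_one(int x)
{
    return x + 1;
}

/* Converts add_one to another function pointer type and back, then calls it.
 * The round trip between function pointer types is defined by the C standard;
 * calling through g itself (the wrong type) would not be. */
int call_via_round_trip(int x)
{
    generic_fn g = (generic_fn)add_one;   /* function pointer -> function pointer */
    int (*f)(int) = (int (*)(int))g;      /* back to the original type */

    assert(f == add_one);                 /* compares equal to the original */
    return f(x);
}
```

Going through void * or unsigned char * instead of generic_fn is the part the language leaves to the implementation.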
 
jacob navia

Edoules said:
Dear All,

I'm curious to know if there has been any efforts to allow for the
byte-order run-time editing of compiled C code.

I would understand the above sentence if the "byte order" term was
absent.

Do you want to change the "byte order", or do you want to edit and run
the code?

This request is informational only, sparked by curiousity.

Hypothetical solutions I've discussed with my colleagues include
casting a function pointer to a void or unsigned char pointer
(illegal, but oddly allowed by most compilers), then buffering all the
bytes up to a known return value-- although how to write back onto the
protected page of code has yet to be solved by us, and is likely
something requiring an "unsafe"/"custom" kernel.

I have written a module that takes an obj file and loads
it, links it with the running program, and then it runs it.

If you are interested, reply to me by email.
 
Anonymous

C doesn't even
guarantee that you can convert between data and function pointers
meaningfully.

[Slightly OT]
And on modern systems, it will likely not work, since the sections of
memory where the data is stored are typically marked non-executable by
the OS for security reasons.
 
santosh

Anonymous said:
C doesn't even
guarantee that you can convert between data and function pointers
meaningfully.

[Slightly OT]
And on modern systems, it will likely not work, since the sections of
memory where the data is stored are typically marked non-executable by
the OS for security reasons

What wouldn't work? I suspect that on many modern systems the pointer
conversion itself will succeed, but as you say, the code may not be
writable.
 
Richard Tobin

Anonymous said:
And on modern systems, it will likely not work, since the sections of
memory where the data is stored are typically marked non-executable by
the OS for security reasons

On such modern operating systems there is bound to be a way to change
that protection. By definition no general-purpose operating system
would make it impossible to generate code on the fly.

-- Richard
 
Richard

On such modern operating systems there is bound to be a way to change
that protection. By definition no general-purpose operating system
would make it impossible to generate code on the fly.

-- Richard

Even something as basic as exec'ing a newly generated bash script,
which compiles and then runs a new executable, would count!
 
George Peter Staplin

Edoules said:
Dear All,

I'm curious to know if there has been any efforts to allow for the
byte-order run-time editing of compiled C code.

This request is informational only, sparked by curiousity.

Hypothetical solutions I've discussed with my colleagues include
casting a function pointer to a void or unsigned char pointer
(illegal, but oddly allowed by most compilers), then buffering all the
bytes up to a known return value-- although how to write back onto the
protected page of code has yet to be solved by us, and is likely
something requiring an "unsafe"/"custom" kernel.

I don't see why you need an "unsafe"/"custom" kernel. Some programmers
do things like this often. I have done this several times for a runtime
assembler, and a compiler.

It goes something like this (assuming you're using Windows or a
Unix-like system):

In Windows:

1. VirtualAlloc() the number of executable pages you will need for
the function's instructions.
2. Generate the CPU instructions into that executable memory.
3. Flush any CPU caches if needed. This is very important with some
processors.
4. If, say, you have a void *p that points to the start of the
VirtualAlloc memory, you just convert it like this:
void (*myfunc) (void); myfunc = p;
Note: IIRC ANSI C doesn't allow converting a void * to a function
pointer type, but POSIX and other standards do.
5. Call myfunc() somewhere.

In Unix-like systems:

1. mmap() the required executable pages with PROT_EXEC.
Steps 2-5 are the same as on Windows.

If you really want to edit live code I would suggest allocating your own
pages like I suggested. However, in most cases you can change the
permissions on the pages of existing functions with mprotect(),
though I'm not sure if Windows has an equivalent.
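The Unix-like steps above might look roughly like this. It is only a sketch under several assumptions: an x86 or x86-64 CPU, a POSIX system that provides MAP_ANONYMOUS, and a kernel policy that permits turning a writable page into an executable one. Instead of mapping with PROT_EXEC from the start, it maps the page writable and then uses mprotect() to swap write permission for execute permission. The names const_fn and make_const_fn are invented for the example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* asks glibc to expose MAP_ANONYMOUS in strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef int (*const_fn)(void);   /* hypothetical name for the generated function's type */

/* Builds a tiny function at run time that returns `value`.
 * Returns NULL on failure, or on a CPU this sketch has no bytes for. */
const_fn make_const_fn(int value)
{
#if defined(__x86_64__) || defined(__i386__)
    /* x86 machine code: mov eax, imm32 ; ret */
    unsigned char code[] = { 0xB8, 0, 0, 0, 0, 0xC3 };
    long page = sysconf(_SC_PAGESIZE);
    size_t len;
    void *p;

    assert(sizeof value == 4);
    memcpy(code + 1, &value, sizeof value);   /* x86 is little-endian, as imm32 expects */

    if (page <= 0)
        return NULL;
    len = (size_t)page;

    /* Step 1: get a private, writable page of our own. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Step 2: deposit the instructions. */
    memcpy(p, code, sizeof code);

    /* Step 3: x86 needs no explicit cache flush here; other CPUs may.
     * Then swap write permission for execute permission. */
    if (mprotect(p, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, len);
        return NULL;
    }

    /* Step 4: data pointer -> function pointer. ISO C does not define
     * this conversion; POSIX systems support it. */
    return (const_fn)p;
#else
    (void)value;   /* no machine code provided for this architecture */
    return NULL;
#endif
}
```

A caller would check the result for NULL and then call it like any other function, for example make_const_fn(42)(). The page is never unmapped in this sketch, so a real program would keep the address and length around for munmap().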

Self-modifying code can be difficult to debug, especially if you're not
familiar with the assembly language of the processor.

This is something I wrote mostly in C that generates C functions from
machine code for the x86: http://wiki.tcl.tk/20273

Here is a more generic scripted assembler I wrote in Tcl and C that
produces C functions at runtime: http://wiki.tcl.tk/20286

This is clearly a system specific problem, though an abstract overview
of how this was approached in the past would be nice.

I hope this makes the system specifics more clear.


Have fun,

George
 
Edoules

Dear all,

Sorry for taking so long to reply, but all of your insights are very
interesting indeed.

I'm going to take a look at the posted code.

Thanks,
Eddie Ma.
 
Tomás Ó hÉilidhe

All of this is
well outside the realm of C -- C doesn't even guarantee that
you can convert between data and function pointers meaningfully.

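#include <string.h>

extern void Func(void);

/* RoundTrip is only an illustrative wrapper name */
void RoundTrip(void)
{
    void (*pFunc)(void) = Func;
    unsigned char buf[sizeof pFunc];

    /* Copy the pointer's bytes into the buffer */
    memcpy(buf, &pFunc, sizeof pFunc);

    /* And then back again */
    memcpy(&pFunc, buf, sizeof pFunc);
}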
 
Eric Sosman

Tomás Ó hÉilidhe said:
All of this is
well outside the realm of C -- C doesn't even guarantee that
you can convert between data and function pointers meaningfully.

extern void Func(void);
void (*pFunc)(void) = Func;

char unsigned buf[sizeof pFunc];
memcpy(buf,&pFunc,sizeof pFunc);

/* And then back again */

memcpy(&pFunc,buf,sizeof pFunc);

The English language's habit of overloading tokens, when
coupled with its wavering rules of binding strength, sometimes
produces an ambiguity. By "data and function pointers" I did
not mean "{noun:data} and {adjective:function noun:pointers},"
but "{adjective:data} and {adjective:function} {noun:pointers},"
parsing as "(data and function) pointers," not as "data and
(function pointers)." Or, in that tricksy English, "data
pointers and function pointers."

Sorry for the confusion.
 
Tomás Ó hÉilidhe

     The English language's habit of overloading tokens, when
coupled with its wavering rules of binding strength, sometimes
produce an ambiguity.  By "data and function pointers" I did
not mean "{noun:data} and {adjective:function noun:pointers},"
but "{adjective:data} and {adjective:function} {noun:pointers},"
parsing as "(data and function) pointers," not as "data and
(function pointers)."  Or, in that tricksy English, "data
pointers and function pointers."

     Sorry for the confusion.


Strangely enough, I myself would have used your "tricksy" form, both
in written and spoken language.
 
Ian Collins

Tomás Ó hÉilidhe said:
Strangely enough, I myself would have used your "tricksy" form, both
in written and spoken language.

I see you also like tautological forms :)
 
