OT: Dynamic-CC, GC, Dynamic Types, and Prototype OO in C

C

cr88192

for my own dynamic C compiler effort, I have been specifying, and have
already largely implemented, a mass of library functionality offering
Garbage Collection (conservative), Dynamic Typing (through the "magically
typed void pointer" approach), and a Prototype Object System (similar to
that of Self or JavaScript).

for general context (not intended as SPAM, but may come off this way I
guess...):
this is a C compiler which serves to behave more like a traditional VM or
interpreter, namely in that it is capable of compiling and linking code into
the running app at runtime. unlike a traditional VM, I am using C as the
primary language (both for writing the implementation and for the scripts),
and this VM is capable of linking against and directly using existing
statically compiled/linked libraries (for example, stdio, OpenGL, the Win32
API, ...).

the goal is in part to have, in C, many of the capabilities traditionally
available to languages such as LISP, Smalltalk, or Python, while still
retaining the capabilities and raw performance of C (and hopefully
eventually C++ as well, but a C++ compiler looks like a horrid beast to
implement at present).

at present, 32 bit windows is my primary development target.
most of this is being developed under the LGPL.

at present, the compiler still needs a little more work (buggy, incomplete,
and non-standards-conformant in some places).

well, LLVM is another project people can look into (for anyone that wants
this kind of functionality, this project is a lot more usable at present).
my effort is not intended to preclude the efforts of the LLVM folks, just
doing my own thing mostly...


as for the GC, Dynamic Types, and Prototype OO in C:

in all this, the C language itself is left unaffected (this is library
functionality, and does not involve use of custom compiler extensions). as
such, it will not be the "nicest" possible option, but should be workable
(basically, since I am stalling out on my ideas for implementing a scripting
language based on my compiler core, I decided instead to implement all this
as library functionality and stick to more or less "ordinary" C for the
implementation/interface).

neither the spec nor the library are "complete" at this point (I may specify
a lot more than this, and at present have only implemented the GC and
dynamic type system).

I may also include the following other features in this framework:
exceptions; multithreading; file-management/VFS, ...

to a large degree, this is being derived from existing code of mine,
however, in the process I am making fairly extensive modification to the
code in which I am reusing (basically, I have about a 300 kloc existing
codebase that I am mining for pieces in implementing all this). if taken
"all the way", this will also include GL wrapping, a GL based GUI, 3D engine
code, ... however, all this is more likely to undergo much more minor
modification.

basically, in all this, the dynamic C compiler is the core (however, it will
not be a requirement for using any of this functionality).


it is possible, however, that some people may find some of this
"interesting".


a few of the lib specs:
----

GC Basics:
This will be a conservative GC.
Pointers may be freely stored on the stack, in global variables, or in other
garbage-collected objects. The pointer may point anywhere within the
allocated object.

Caution needs to be excersized when mixing GC'ed data with non-GC'ed data
(such as memory gained from malloc, mmap, ...). Likewise when "mutilating"
pointers. In these cases, it is no longer required that the GC be able to
correctly locate or identify these pointers (resulting in 'accidental'
collection).


GC API:
void *gcalloc(size_t sz);
void *gctalloc(char *type, size_t sz);

void *gcatomic(size_t sz);
void *gctatomic(char *type, size_t sz);

void *gcrealloc(void *p, size_t sz);
void *gcfree(void *p);
char *gcgettype(void *p);
int gctypep(void *p, char *ty);

int gcsetfinal(char *ty, void (*fcn)(void *p));

Uncertain Features:
void *gctallocu(char *type, size_t sz); //unaligned alloc
void *gcmalloc(size_t sz); //alloc initially-locked object
void gclock(void *p); //prevent collection
void gcunlock(void *p); //allow collection


gcalloc():
Allocates an object on the garbage-collected heap.
The type of such a value will be viewed as being NULL.

gctalloc():
Allocates a typed object on the GC heap.
The type is a string whose structure is not specified by the GC. For custom
types, effort should be used to ensure that the string does not clash with a
builtin type or one provided by another library.

The creation of a new type may consist simply of allocating an object of the
particular type.

gcatomic()/gctatomic():
Allocates an 'atomic' object on the garbage collected heap.
An atomic object will differ from a normal object primarily in that the
contents of an atomic object will not be scanned for references.
As a result, it will be more suited for safely handling buffers and other
types known not to contain memory references.

gcrealloc():
Reallocate an object to a larger or smaller size.
The type will be be retained from the parent object.

gcfree():
Forcibly free an object. This should only be done when it is known that this
object will no longer be accessed.

gcgettype():
Return the type associated with an object.
If the pointer is not part of the GC's heap, or was gained from gcalloc,
then the return value is NULL.

gctypep():
Returns a nonzero value if the type of p is the same as that given in ty.
This determination will be based on string equality.

gcsetfinal():
Set the finalizer called for a specific type.
This function will be called while freeing objects of this type.


gcmalloc():
If included, would create objects by default resembling those from malloc
(assuming manual memory management, ...).

However, unlike malloc(), the memory returned by gcmalloc() may be searched
for pointers.


----

Dynamic Type System:

GC type names beginning with '_' will be reserved for the implementation.

General Conventions:
void *dy<type>(...);
Will create an object of the type.

int dy<type>p(void *p);
Will return a nonzero value if p is the correct type.

<type> dy<type>v(void *p);
Will get the value for a specific type.

char *dygettype(void *p);
int dytypep(void *p, char *ty);
Will get or check the type of an object.
These will exist as there is no strict requirement that some of these
builtin types will have the same GC type (or even necessarily be on-heap).
In these cases, this function may need to do additional work in order to
locate the correct type string.

If only a single type needs to be checked, the provided specialized
predicate functions will be viewed as the preferred means.


Note:
For Complex Types (such as arrays, conses, and hashes), it is acceptable to
use any kind of pointer (it does not have to be from this type system, or
necessarily even the GC). However, that is the intent that this be used
mostly for dynamic types and GC'ed data, and other GC restrictions apply.


Primitive Types: int; long; long long; float; double; fcomplex; dcomplex.
Basic Types: string; wstring; symbol; keyword.
Complex Types: array; cons; hash.

....

so:
void *dyint(int v);
void *dylong(long v);
void *dylonglong(long long v);
void *dyfloat(float v);
void *dydouble(double v);
void *dyfcomplex(float _Complex v);
void *dydcomplex(double _Complex v);

char *dystring(char *s);
char *dysymbol(char *s);
char *dykeyword(char *s);
wchar_t *dywstring(wchar_t *s);

....

int dyintp(void *p);
int dylongp(void *p);
int dylonglongp(void *p);

....

int dyintv(void *p);
long dylongv(void *p);
long long dylonglongv(void *p);

....


Integer Type Disjointness
It will not be required that integer types be disjoint.
As a result, a value created with dyint() may still return a nonzero value
with dylonglongp().

Likewise, a value created by dylonglong() may be nonzero with dyintp() in
the case where dylonglong() was called with a value sufficiently small to
fit into an integer.

As such, these predicates will mostly serve to indicate range requirements,
rather than a strict type.

Float and Double
These will, however, be regarded as disjoint types.

Real will exist as a psuedo type, which will be true for values decodable as
a double.

int dyrealp(void *p);
double dyrealv(void *p);


Arithmetic
Functions will be provided for arithmetic on dynamic types.

void *dyadd(void *a, void *b);
void *dysub(void *a, void *b);
void *dymul(void *a, void *b);
void *dydiv(void *a, void *b);
void *dymod(void *a, void *b);
void *dyand(void *a, void *b);
void *dyor(void *a, void *b);
void *dyxor(void *a, void *b);

void *dyneg(void *a);
void *dynot(void *a);

int dyeqp(void *a, void *b);
int dyneqp(void *a, void *b);
int dygtp(void *a, void *b);
int dyltp(void *a, void *b);
int dygep(void *a, void *b);
int dylep(void *a, void *b);

int dytruep(void *a);
int dyfalsep(void *a);


String:
Will behave much the same as a GC'ed dynamicly typed strdup.
Additionally, string merging may be performed, so the result should be
regarded as immutable.

Symbol and Keyword:
These will be 2 different types, but each has essentially the same
semantics.

Beyond the different types, they will have another important property:
Any 2 calls to the same function with the same input string produces the
same output (allowing '==' to be used for equality checks).


Array:

void *dyarray(int cnt);
Create an array with initially cnt pointers.

void *dyarrayidx(void *p, int idx);
void dyarraysetidx(void *p, int idx, void *q);
Get and Set array index.

void **dyarrayv(void *p);
Get a 'void **' array representing the contents of the array.


Cons:

void *dycons(void *car, void *cdr);
void *dycar(void *p);
void *dycdr(void *p);

void *dycaar(void *p);
void *dycadr(void *p);
void *dycdar(void *p);
void *dycddr(void *p);
(up to 4 levels).
....

void *dylist1(void *a);
void *dylist2(void *a, void *b);
void *dylist3(void *a, void *b, void *c);
void *dylist4(void *a, void *b, void *c, void *d);
void *dylist5(void *a, void *b, void *c, void *d, void *e);
void *dylist6(void *a, void *b, void *c, void *d, void *e, void *f);
(up to 10).
....


Hash:

void *dyhash(int cnt);
Create a hash with initially cnt slots in the table. Calling this with '0'
will specify that the implementation use a default value.

void *dyhashget(void *p, char *s);
void *dyhashset(void *p, char *s, void *q);
Get or Set the value associated with a given string index.
Set will return the previous value (or NULL).

The hash may be resized if it is full.


----

Dynamically Typed Object Orientation

Purpose:
An dynamic object system is provided.
This will provide a general prototype-based OO setup.
The purpose will not to be "high performance", or to replace more
traditional object systems or limit custom types/data, however, a goal will
be that performance remain "acceptable" for many tasks.

Objects will use a Prototype Object System.

In this system, each object itself is viewed primarily as a bag of slots,
with additional behaviors being gained by 'delegation'. In this system, a
failure to locate a slot in a given object causes the slot to be looked up.
Within this system, multiple delegates are allowed per-object (however,
"parent" is the default). Additionally, the delegation graph will be allowed
to be cyclic.

Some calls may return a special value 'UNDEFINED', which will signal an
error condition (unlike NULL, which simply means 'no value'). It will be
considered invalid to use this value otherwise (this value is only returned,
in error conditions, and is not to be passed around or stored). Later this
value may also be used as a means of raising exceptions.

When calling a method, a special method 'doesNotUnderstand' may be called
prior to returning with an error condition. The 'doesNotUnderstand' method
will be called with the same arguments as the original method call, but the
args list will be prefixed with the method name.


Conventions:
DyOO will use a variant camelCase naming scheme.


DyOO will have 2 major interfaces:

Generating code to interface with existing dynamic types. This will be done
by registering certain callbacks with the types. This will be intended
mostly for types that are implemented in plain C (as structs, ...), but for
whatever reason it is useful to be able to them access dynamically.

A full prototype system, where methods are objects in themselves.

For the former interface, these functions are provided.
void dyTypeGetSlot(char *ty, dyt (*fcn)(dyt obj, dyt sym));
void dyTypeSetSlot(char *ty, dyt (*fcn)(dyt obj, dyt sym, dyt val));
void dyTypeNextSlot(char *ty, dyt (*fcn)(dyt obj, dyt sym));
void dyTypeCallMethod(char *ty, dyt (*fcn)(dyt obj, dyt sym, dyt *args, int
nargs));
void dyTypeApply(char *ty, dyt (*fcn)(dyt obj, dyt *args, int nargs));


For the latter interface:
dyt dyFunc(dyt (*fcn)(dyt *args, int nargs));
dyt dyMethod(dyt (*fcn)(dyt self, dyt *args, int nargs));
Create a function or method.
The primary difference between them will be the existence of the 'self'
argument.

Additional forms will be provided for functions with a fixed number of args:
dyt dyFunc0(dyt (*fcn)());
dyt dyFunc1(dyt (*fcn)(dyt));
dyt dyFunc2(dyt (*fcn)(dyt, dyt));
dyt dyFunc3(dyt (*fcn)(dyt, dyt, dyt));
dyt dyFunc4(dyt (*fcn)(dyt, dyt, dyt, dyt));
dyt dyFunc5(dyt (*fcn)(dyt, dyt, dyt, dyt, dyt));
dyt dyFunc6(dyt (*fcn)(dyt, dyt, dyt, dyt, dyt, dyt));
dyt dyFunc7(dyt (*fcn)(dyt, dyt, dyt, dyt, dyt, dyt, dyt));
dyt dyFunc8(dyt (*fcn)(dyt, dyt, dyt, dyt, dyt, dyt, dyt, dyt));

dyt dyMethod0(dyt (*fcn)(dyt self));
dyt dyMethod1(dyt (*fcn)(dyt self, dyt a));
dyt dyMethod2(dyt (*fcn)(dyt self, dyt a, dyt b));
.... (up to 8)
dyt dyMethodN(dyt (*fcn)(dyt self, dyt rest));
dyt dyMethod1N(dyt (*fcn)(dyt self, dyt a, dyt rest));
dyt dyMethod2N(dyt (*fcn)(dyt self, dyt a, dyt b, dyt rest));
.... (up to 8)


Objects:

dyt dyObject();
dyt dyObjectParent(dyt parent);
Create a new prototype object. In the former case, the object is empty, and
in the a latter, a parent is defined. The parent will be used in a 'parent'
delegate slot.

dyt dyGet(dyt obj, char *sym);
Get a slot from an object.
Returns UNDEFINED if the slot was not found.
May recurse into delegates in order to locate the slot.

void dySet(dyt obj, char *sym, dyt val);
Set a slot in an object.
Recursion may be used in order to locate the slot, and if found, it is set
in the object in which it is defined. If not found, it is bound within the
current object.

void dyBind(dyt obj, char *sym, dyt val);
This is similar to dySet(), but differs primarily in the way in which slots
are assigned. In the case of dyBind, if the slot does not exist within the
current object, it is created (rather than recursing and assigning).

void dyBindMethod(dyt obj, char *sym, dyt mth);
This will bind a method in the current object. This will behave differently
than the normal 'set' or 'bind' operations, in that multiple versions of a
method may be bound in an object. In this case, they will be merged into a
kind of 'multi-method', which will dispatch based on args count. As such,
Set() and Bind(), will not perform such merging (but will simply replace the
method, including multi-methods).

void dyDefMethod(dyt obj, char *sym,
dyt (*fcn)(dyt self, dyt *args, int nargs));
dyt dyDefMethod0(dyt obj, char *sym,
dyt (*fcn)(dyt self));
dyt dyDefMethod1(dyt obj, char *sym,
dyt (*fcn)(dyt self, dyt a));
dyt dyDefMethod2(dyt obj, char *sym,
dyt (*fcn)(dyt self, dyt a, dyt b));
.... (up to 8)

void dyDefMethodN(dyt obj, char *sym,
dyt (*fcn)(dyt self, dyt rest));
dyt dyDefMethod1N(dyt obj, char *sym,
dyt (*fcn)(dyt self, dyt a, dyt rest));
dyt dyDefMethod2N(dyt obj, char *sym,
dyt (*fcn)(dyt self, dyt a, dyt b, dyt rest));
.... (up to 8)

Define a method for an object.
This will be similar to the above, but will accept a raw function pointer
instead of a method object.

The N-forms will fold the rest of the args into a list.


dyt dyGetDelegate(dyt obj, char *sym);
void dySetDelegate(dyt obj, char *sym, dyt val);
Will get or set the value of an object's delegate.

dyt dyGetParent(dyt obj);
void dySetParent(dyt obj, dyt val);
Special cases of the above, but are specialized to the 'parent' delegate.

dyt dyCall(dyt obj, char *sym, dyt *args, int nargs);
dyt dyCall0(dyt obj, char *sym);
dyt dyCall1(dyt obj, char *sym, dyt a);
dyt dyCall2(dyt obj, char *sym, dyt a, dyt b);
.... (up to 8)
dyt dyCallN(dyt obj, char *sym, dyt args);
dyt dyCall1N(dyt obj, char *sym, dyt a, dyt args);
dyt dyCall2N(dyt obj, char *sym, dyt a, dyt b, dyt args);
.... (up to 8)


Call a method within an object, and return the result.
Returns UNDEFINED if the method was not found.
Otherwise, this will return the return value of the method.


A simple example:

dyt testMath_doesNotUnderstand(dyt self, dyt sym, dyt rest)
{
if(!strcmp(sym, "pyth"))
return(dysqrt(dyadd(
dysqr(dycar(rest)),
dysqr(dycadr(rest)))));
return(NULL);
}

....
dyt obj, v;

obj=dyObjectParent(dyTop());
dySetDelegate(dyTop(), "testMath", obj);
dyDefMethod1N(obj, "doesNotUnderstand", testMath_doesNotUnderstand);

v=dyCall2(dyTop(), "pyth", dyint(3), dyint(4));
 
C

Chris Thomasson

[...]
a few of the lib specs:
----

GC Basics:
This will be a conservative GC.
Pointers may be freely stored on the stack, in global variables, or in
other garbage-collected objects. The pointer may point anywhere within the
allocated object.

[...]

I hope that this is not a stop-the-world and/or collect-the-world scheme?
 
C

cr88192

Chris Thomasson said:
[...]
a few of the lib specs:
----

GC Basics:
This will be a conservative GC.
Pointers may be freely stored on the stack, in global variables, or in
other garbage-collected objects. The pointer may point anywhere within
the allocated object.

[...]

I hope that this is not a stop-the-world and/or collect-the-world scheme?

sadly, at present, it is...

however, I have it to be able to collect a 1GB heap in about 400ms on my
current main computer (Athlon 64 X2 4400+). it is a bit slower (about 10s)
if I fill the heap with random garbage, but this is outside its "reasonable"
operating range (a smaller heap is much faster).

however, my intention is to fix this when I get to it (when I implement
threading...).


I had written an incremental collector at one point in the past (2002 or
2003), but the main project in which I had been using this particular
collector (only slightly modified since about 2004, until recently), I have
been using single-threaded development (back then I had decided
multithreading was too much hassle and too error prone to be worthwhile).

but, now, I have been looking into implementing write barriers, ...

however, before this point, I had considered a half-assed approach:
I use SuspendThread for all the GC-managed threads on starting GC, and
ResumeThread on them when the collector finishes...

(worst case option: it would be possible to use Boehm in the background, and
simply dump the whole dynamic typesystem on top of Boehm...).


sad but, yeah...
 
C

cr88192

Chris Thomasson said:
[...]

Why does the Prototype OO need to be more complicated than the following
technique:

http://groups.google.com/group/comp.lang.c/browse_frm/thread/1b106926ba5db19f

?

this Prototype OO is also Dynamically Typed...

the kind of informal OO mentioned in the referenced thread, is not
ammendable to really any kind of dynamic behavior.

now, more "manual" methods are still considered reasonable in my effort,
they are just handled a little differently...


ok, I have written several script VMs in the past, and the design I have
used, is based largely on the way I typically approach implementing
Prototype OO in a script VM...

the extra API complexity (all the variations for differing numbers and
configurations of arguments) is due to trying to make the interface more
usable from C code (getting a flat array or a list as the sole source of
function/method arguments, is, lame...).


the extra complexity in the 'call' variations is for a similar reason.

dyCall2(obj, "foo", bar, baz);
is slightly less cumbersome than:
dyCall(obj, "foo", dylist2(bar, baz));

and so on...


likewise, I may well have some intention of implementing global
object/method hashing (where the object and method name are used to
hash-optimize method lookups), and this requires a lot of what I have here.

I will argue what I have, for what it is worth, is likely to pay for
itself...

or such...
 
M

Marco van de Voort

["Followup-To:" header set to comp.lang.misc.]
sadly, at present, it is...
however, I have it to be able to collect a 1GB heap in about 400ms on my
current main computer (Athlon 64 X2 4400+). it is a bit slower (about 10s)
if I fill the heap with random garbage, but this is outside its "reasonable"
operating range (a smaller heap is much faster).

(I'd say the number of heap objects, and the way they are connected (the avg
distance from a root) would be the rate determinating factor, not the heap total size))
 
C

cr88192

Chris Thomasson said:
[...]
a few of the lib specs:
----

GC Basics:
This will be a conservative GC.
Pointers may be freely stored on the stack, in global variables, or in
other garbage-collected objects. The pointer may point anywhere within
the allocated object.

[...]

I hope that this is not a stop-the-world and/or collect-the-world scheme?

digging around the Win32 API, I have found that windows has built-in support
for write barriers.
this is, pretty damn useful...

it is actually seemingly nicer than the mmap/mprotect approach (no signal
catching, ...).


so, yeah, in my MM/GC, I will probably go and implement write barriers, and
probably go and try to make the thing thread safe.
as such, will also make it into an incremental collector, which may or may
not involve drastic restructuring. not like this has not already been done
in this (modifying it to support 'large objects' and make use of a
marking-stack were hardly minor either...).


of course, going from malloc to VirtualAlloc, may require some alteration to
the large object allocator (namely, the implementation of yet another
specialized memory manager). since it will need to do a lot of memory-object
queries, I may not implement a traditional block and free-list allocator,
but maybe base the allocator on AVL trees (combining AVL trees and
allocation of fragmentary underlying memory could be difficult though,
unless each major chunk had a semi-disjoint tree).

alternatively, I could use virtual alloc as an allocator directly
(questionable).
or, implement a good old block and free-list allocator (downsides: various
sized free lists, and the need to merge adjacent blocks on freeing).
another possibility is the implementation of a much larger cell allocator
(each cell being 256 bytes, or maybe 1kB, or even a full page...).


AVL trees:
pros: allow very fast pointer/memory lookups; adjacency queries; ...
cons: given the absence of free lists, the options amount to either a linear
best fit, or first fit (best fit is O(n log2 n));
memory lookups are unlikely to be the limiting factor in this case (already
done in the 'large object' GC code).

Direct VirtualAlloc:
pros: simple.
cons: 4kB alloc granularity, likely to do bad things to the address space.

Blocks and Free Lists:
pros: can look up reasonably well-sized memory blocks fairly quickly
(depends, but O(1) is reasonable);
cons: the need to explicitly split and merge memory blocks; absent
additional structures, memory object lookups are an O(n) operation (may be
acceptable, since these will only occure during the sweep phase).

Large Cells:
pros: fast lookups, no need for explicit merging when freeing;
cons: cell-based chunking; need to search for free memory; memory allocation
is almost always first-fit.


it looks like, in this case, and the usage pattern (as the back end
allocator/freer for the large-object part of the GC), the blocks and
free-list approach is probably the best option.


well, I will see how much time I have (I have other stuff that needs to be
done as well, groan...).
 
C

Chris Thomasson

cr88192 said:
Chris Thomasson said:
[...]
a few of the lib specs:
[...]
I hope that this is not a stop-the-world and/or collect-the-world scheme?

digging around the Win32 API, I have found that windows has built-in
support for write barriers.
this is, pretty damn useful... [...]
so, yeah, in my MM/GC, I will probably go and implement write barriers,
and probably go and try to make the thing thread safe.
as such, will also make it into an incremental collector, which may or may
not involve drastic restructuring. [...]
of course, going from malloc to VirtualAlloc, may require some alteration
to the large object allocator (namely, the implementation of yet another
specialized memory manager).
[...]

There are some advantages to using VirtualAlloc in the context of an
alignment trick using in lock/wait-free programming:

http://groups.google.ca/group/comp.programming.threads/msg/e70d09027a9dd9e2

http://groups.google.ca/group/comp.programming.threads/browse_frm/thread/dbb6d6fc3de1c5c3

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/80573a4efc77608d



Also IMHO, you simply HAVE to read all of the following:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/22b2736484af3ca6

http://blogs.sun.com/dave/resource/Asymmetric-Dekker-Synchronization.txt

http://www.trl.ibm.com/people/kawatiya/Kawachiya05phd.pdf
(chapter 6)

http://appcore.home.comcast.net/vzoom/round-1.pdf
(page 3/section 3)

http://appcore.home.comcast.net/vzoom/round-2.pdf
(page 2/section 1.2)

https://coolthreads.dev.java.net/servlets/ProjectForumMessageView?forumID=1797&messageID=11068
(lock-free reader pattern...)

http://en.wikipedia.org/wiki/Read-copy-update
(classic RCU...)


Follow the links at the end of the following post as well:

https://coolthreads.dev.java.net/servlets/ProjectForumMessageView?messageID=11001&forumID=1797


Also, (in a C++ context):

http://groups.google.com/group/comp.lang.c++/browse_frm/thread/95007ffdf246d50e

http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/86a5a3f804c84ea4

http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/ddd7fcd780dc467c



All of that information can be of service wrt "low-overhead" scaleable
multi-threading.
 
C

Chris Thomasson

Chris Thomasson said:
[...]

[...]

These were drafts. Sorry for any typos in there. For the scope of this
discussion, I would focus on page 3/section 3 of round-1... The logic is
correct wrt the given scenario.

Ahh shit!

I should not of linked to the pre-alpha one. Acutally, there is a logic
error concerining the following paragraph:


Bugged paragraph:
__________________________________________________
Lets briefly compare the per-node solution to a lock-free reader pattern
using the distributed proxy reference counting algorithm. Now the reader
threads could setup a dynamic ratio of searches to synchronization
operations. Lets say the programmer created a synchronization schedule that
allowed for reader threads to execute only 1 synchronization operation for
every, lets say, thousand searches. How does that compare with the per-node
locking scheme that can expose threads to possibly hundreds of lock/unlock
operations per-search? Well, by the time a reader thread in the per-node
locking setup finally completes a thousand searches with an average of
six-hundred lock/unlock operations per-search, it will have executed
approximately sixhundred thousand atomic operations and six-hundred thousand
StoreLoad style memory barriers. When a reader thread in the lock-free
reader pattern completes a thousand searches it will have only executed
four, or less, atomic operations and/or StoreLoad style memory barriers. It
can boil down to an average of twelve-million versus four, or less,
expensive operations perthread, per-thousand-searches. In my option, the
choice is clear: The lock-free reader pattern is a winner.
__________________________________________________


The above paragraph says that 1000 lock-based searches, with an average of
600 lock and unlock's per-search is only 600,000 atomic-ops and 600,000
membars... Well, that is simply NOT true. Since a single lock or unlock
operation usually takes 1 atomic-op and 1 membar; the figure '600,000'
should really be '1,200,000'."


So, the following statement is correct:

A 1000 lock-based searches, with an average of 600 lock/unlock per-search is
usually 1,200,000 atomic ops and 1,200,000 membars, for a total of 2,400,000
operations per-1000-lock-based-searches.




This means that the following statement:

"It can boil down to an average of twelve-million versus four, or less,
expensive operations per-thread, per-thousand-searches."


is wrong. Here is the fix:

It can boil down to an average of 2,400,000 expensive operations per-1000
searches when using lock-based, versus four, or less, expensive operations
using vZOOM for the same load.



Sorry!
 
C

cr88192

Chris Thomasson said:
cr88192 said:
Chris Thomasson said:
[...]

a few of the lib specs:
----

GC Basics:
This will be a conservative GC. [...]
I hope that this is not a stop-the-world and/or collect-the-world
scheme?

digging around the Win32 API, I have found that windows has built-in
support for write barriers.
this is, pretty damn useful... [...]
so, yeah, in my MM/GC, I will probably go and implement write barriers,
and probably go and try to make the thing thread safe.
as such, will also make it into an incremental collector, which may or
may not involve drastic restructuring. [...]
of course, going from malloc to VirtualAlloc, may require some alteration
to the large object allocator (namely, the implementation of yet another
specialized memory manager).
[...]

There are some advantages to using VirtualAlloc in the context of an
alignment trick using in lock/wait-free programming:

All of that information can be of service wrt "low-overhead" scaleable
multi-threading.

ok, a little skimming, not much of comment thus far...


but, mostly I am using VirtualAlloc because windows allows me to do certain
things I can't do with malloc'ed memory (such as detecting memory write
accesses).

this is necessary for an incremental garbage collector in order to try to
avoid accidentally losing references to memory objects due to actions being
performed in other threads.


ended up largely writing a kind of 'malloc replacement' for the purpose of
supporting large objects (all the smaller objects are handled with a cell
allocator).

this allocator is based on blocks and free lists.


I will probably also specify and implement code for wrapping the threads
interface.

or such...
 
C

Chris Thomasson

cr88192 said:
ok, a little skimming, not much of comment thus far...

I think the most interesting one for you to read first would be:

http://blogs.sun.com/dave/resource/Asymmetric-Dekker-Synchronization.txt

Because it talks about many different methods of biased-locking with
synchronization based on page-protections (mprotect), JNI execution
barriers, VirtualAlloc, QRL algorithms and more... You could use some of
those schemes in your garbage collector logic.


[...]
 
C

cr88192

Chris Thomasson said:
I think the most interesting one for you to read first would be:

http://blogs.sun.com/dave/resource/Asymmetric-Dekker-Synchronization.txt

Because it talks about many different methods of biased-locking with
synchronization based on page-protections (mprotect), JNI execution
barriers, VirtualAlloc, QRL algorithms and more... You could use some of
those schemes in your garbage collector logic.

yes, ok.

misc: earlier, I went and built my GC and compiler on linux (I had rebooted
into linux, mostly to reformat a new external HD into Fat32, rather than
NTFS as it came as), and was faced with the case that linux has (at least in
this case) __data_start and __bss_start, but not __data_end or __bss_end.
this was relied on in the windows version (mostly, this is needed to
garbage-collect the statically-linked code, but is not needed for the
dynamically compiled code).

eventually, for the linux build I just commented out the lines for
collecting the data and bss sections...

(I had thought I had remembered these symbols being around when I built for
linux before, but oh well...).

for now I am back in windows, where I have the needed symbols (just with
different names...).


(cleaned up a bunch of other build-related issues as well).

note that at present, my C compiler will not work for linux-x86_64 (sadly,
even though this is what I build on), mostly because the x86-64 support is
as of yet incomplete (the major remaining hurdle is actually the calling
convention, which will require something drastic somewhere or another).

the assembler core and so on work, so one could use dynamic assembler on
x86-64, just not C as of yet.

(had they just passed everything on the stack, or at least allowed 'holes'
where the values would go, that would have worked...).


partly spec'ed out an idea for a threading interface as well.
mostly unsure on some things, namely, where there are various differneces
between the Win32 API and pthreads (for example, should threads return ints
or pointers?, ...).

the API is at present representing more of a least common denominator.


but, things are slow, and I also have a lot of homework to do...

or such...

 
R

Rod Pemberton

cr88192 said:
... was faced with the case that linux has (at least in
this case) __data_start and __bss_start, but not __data_end or __bss_end.
this was relied on in the windows version

You may have some problems with that being portable. I read that *nix
linkers historically define etext, edata, end for the end of text, data, bss
respectively. I was trying to use them in my OS. DJGPP seems to define
some form of those and quite a few more. OpenWatcom AFAICT only defines
_edata and _end around bss.


Rod Pemberton
 
C

cr88192

Rod Pemberton said:
You may have some problems with that being portable. I read that *nix
linkers historically define etext, edata, end for the end of text, data,
bss
respectively. I was trying to use them in my OS. DJGPP seems to define
some form of those and quite a few more. OpenWatcom AFAICT only defines
_edata and _end around bss.

ok, I saw those, just wasn't sure what there were...

as for portability, yeah, but do I have much better?...
for all I am doing recently I have moved a little outside the realm of what
is portable...

or such...
 
C

cr88192

cr88192 said:
for my own dynamic C compiler effort, I have been specifying, and have
already largely implemented, a mass of library functionality offering
Garbage Collection (conservative), Dynamic Typing (through the "magically
typed void pointer" approach), and a Prototype Object System (similar to
that of Self or JavaScript).

next thing of note, the C Compiler API.
this is intended to provide a nifty frontend interface to the C Compiler
core.
basically, I wrap up some of the ugliness and internals, and try to make an
interface that is hopefully actually usable.

note, this is more presented as an 'idea' than anything else. for example,
is this sane? is the API good? ...


this frontend is threads-aware, or at least to the extent that it prevents
re-entrency into the compiler core (could very likely mess things up
something hard...).

at present, one can load things, such as source files (via
'ccLoadModule()'), and one can retrieve values and function pointers.

an example:
int main()
{
...
void *p;
...
ccLoadLibrary("opengl32"); //load a system library (say,
'opengl32.dll')
ccLoadModule("scripts/entry.c"); //load a C script
p=ccGetAddr("scripts_init"); //get a function's address
if(p)((int(*)())p)(); //if got something, call the function
...
}

note that library loading currently works for both DLLs and GNU-AR + COFF
style static libraries (static libraries currently only work on windows due
to lacking ELF support, then again, static libraries are much less needed in
linux than they are in windows...).

a basic spec beaten together describing the present interface:
----

BSCC API

Attempt to spec a general abstract interface to the compiler core.

void ccAddInclude(char *path);
void ccAddLibrary(char *path);
void ccAddSource(char *path);
Add a path. Include paths are searched for headers, library paths for
libraries, and source paths for source modules.

void ccLoadModule(char *file);
Load a source module or library (depends on file extension).

void ccLoadLibrary(char *name);
Loads a library. May go through extra magic in locating the lib, such that
name is a generic abstract name (rather than a filename as is the case for
ccLoadModule).

void ccPrecompileModule(char *file);
Precompile a module. File may be either a header or source module, and may
be under either the source or header paths.

void *ccGetAddr(char *sym);
Get the address of a given symbol.

void ccSetAddr(char *sym, void *ptr);
Set the address of a given symbol (or define a new symbol). This may
forcibly relink parts of the image if modifying a pre-existing symbol
(though this is not required, and may not be uniformly carried out).


void *ccGetPtr(char *sym)
void ccSetPtr(char *sym, void *v)
Get/Set global interpreted as a pointer.

int ccGetInt(char *sym)
void ccSetInt(char *sym, int v)
Get/Set global interpreted as an integer.

s64 ccGetLongLong(char *sym)
void ccSetLongLong(char *sym, s64 v)
Get/Set global interpreted as a long long.

float ccGetFloat(char *sym)
void ccSetFloat(char *sym, float v)
Get/Set global interpreted as a float.

double ccGetDouble(char *sym)
void ccSetDouble(char *sym, double v)
Get/Set global interpreted as a double.
 
J

jacob navia

cr88192 said:
next thing of note, the C Compiler API.
this is intended to provide a nifty frontend interface to the C Compiler
core.
basically, I wrap up some of the ugliness and internals, and try to make
an interface that is hopefully actually usable.

note, this is more presented as an 'idea' than anything else. for
example, is this sane? is the API good? ...


this frontend is threads-aware, or at least to the extent that it
prevents re-entrency into the compiler core (could very likely mess
things up something hard...).

at present, one can load things, such as source files (via
'ccLoadModule()'), and one can retrieve values and function pointers.

an example:
int main()
{
...
void *p;
...
ccLoadLibrary("opengl32"); //load a system library (say,
'opengl32.dll')
ccLoadModule("scripts/entry.c"); //load a C script
p=ccGetAddr("scripts_init"); //get a function's address
if(p)((int(*)())p)(); //if got something, call the function
...
}

note that library loading currently works for both DLLs and GNU-AR +
COFF style static libraries (static libraries currently only work on
windows due to lacking ELF support, then again, static libraries are
much less needed in linux than they are in windows...).

a basic spec beaten together describing the present interface:
----

BSCC API

Attempt to spec a general abstract interface to the compiler core.

void ccAddInclude(char *path);
void ccAddLibrary(char *path);
void ccAddSource(char *path);
Add a path. Include paths are searched for headers, library paths for
libraries, and source paths for source modules.

void ccLoadModule(char *file);
Load a source module or library (depends on file extension).

void ccLoadLibrary(char *name);
Loads a library. May go through extra magic in locating the lib, such
that name is a generic abstract name (rather than a filename as is the
case for ccLoadModule).

void ccPrecompileModule(char *file);
Precompile a module. File may be either a header or source module, and
may be under either the source or header paths.

void *ccGetAddr(char *sym);
Get the address of a given symbol.

void ccSetAddr(char *sym, void *ptr);
Set the address of a given symbol (or define a new symbol). This may
forcibly relink parts of the image if modifying a pre-existing symbol
(though this is not required, and may not be uniformly carried out).


void *ccGetPtr(char *sym)
void ccSetPtr(char *sym, void *v)
Get/Set global interpreted as a pointer.

int ccGetInt(char *sym)
void ccSetInt(char *sym, int v)
Get/Set global interpreted as an integer.

s64 ccGetLongLong(char *sym)
void ccSetLongLong(char *sym, s64 v)
Get/Set global interpreted as a long long.

float ccGetFloat(char *sym)
void ccSetFloat(char *sym, float v)
Get/Set global interpreted as a float.

double ccGetDouble(char *sym)
void ccSetDouble(char *sym, double v)
Get/Set global interpreted as a double.

Hi

Maybe can you explain a bit more what this API is doing?

When I write
ccGetPtr("myfn")
You will get me the pointer to a function in some dll?
Or you will get me the pointer to a function in some compiler
structure?
Or you will get me the pointer to a function in some executable
not currently loaded?
 
C

cr88192

jacob navia said:
Hi

Maybe can you explain a bit more what this API is doing?

When I write
ccGetPtr("myfn")
You will get me the pointer to a function in some dll?
Or you will get me the pointer to a function in some compiler
structure?
Or you will get me the pointer to a function in some executable
not currently loaded?

'GetAddr' would be used for getting a function pointer.
'GetPtr' will try to interpret whatever is there as a pointer.

so:
ccGetAddr("myfn");

will load it from wherever it is, DLL, or in a loaded C source file, ...

or such...
 
C

cr88192

Hi

Maybe can you explain a bit more what this API is doing?

maybe.

When I write
ccGetPtr("myfn")

as noted, in this case, we want ccGetAddr...

ccGetPtr("foo");
is technically, more like:
*(void **)ccGetAddr("foo");

You will get me the pointer to a function in some dll?

yes, if this function was declared in a DLL.
note that it will also fetch pointers to functions defined within the main
app (the exe which is using this library), any loaded static libraries, or
any dynamically compiled code.

on windows, at least part of the mechanics are implemented via LoadLibrary
and GetProcAddress (the rest is done via the linker core, and by manually
loading and processing the EXE file for the app).

on linux, for the most part (apart from the stuff the dynamic linker did
itself), I use dlopen and dlsym (which generally work a little better than
the stuff provided by windows...).

Or you will get me the pointer to a function in some compiler
structure?

well, it would be more termed an "assembler/linker" structure.
basically, the linker core maintains something resembling a pair of heaps,
where one is used for '.text' and the other for '.data' and '.bss' and
similar.

when an object file is linked in, space is allocated to hold the various
sections (pretty much everything non-text goes in the data/bss area).

also, compiler output is run through the assembler, which is also passed to
the linker and linked into the image in a similar manner to object files
(actually, object files are loaded into a form more resembling the assembler
output).

in this case, the returned pointer will point into this memory (it makes
sense to point to a linked-in place in memory, but little sense to point to
somewhere in an object file or similar...).

Or you will get me the pointer to a function in some executable
not currently loaded?

I don't get what you mean by this...


the pointers are only to resident code or data...

for example: 'ccGetAddr("ccGetAddr");' will presumably return the same value
as: '((void *)(&ccGetAddr));'...
 
B

Blake McBride

I added OO capabilities to C including a full MOP, multiple inheritance,
threads, and GC. It is very highly portable to Linux, Windows, Mac,
etc. I call it Dynace and it is open source. See:

http://blake.mcbride.name
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,754
Messages
2,569,528
Members
45,001
Latest member
Kendra00E1

Latest Threads

Top