What's the connection between objects and threads?

Gerhard Fiedler

That's not the only guarantee it gives. It specifies the contract that
you have to respect. In other words, it is completely thread safe.

Using Wikipedia as one of the repositories that reflect "common use", it
seems that there is at least one use of "thread safety" that contradicts
you. From <http://en.wikipedia.org/wiki/Thread-safe>: "A piece of code is
thread-safe if it functions correctly during simultaneous execution by
multiple threads." Most STL implementations need additional implementation
elements, usually what the same article refers to as "mutual exclusion", to
achieve this goal -- which makes them not thread safe in the sense of this
article. At least that's how I read it.

(Your later comments about returned references notwithstanding...)

Gerhard
 
James Kanze

On 2008-05-21 05:47:37, James Kanze wrote:
Using Wikipedia as one of the repositories that reflect
"common use",

Common use by whom? The Wikipedia isn't exactly what I would
call a reference.
it seems that there is at least one use of "thread safety"
that contradicts you.

There are a lot of people who seem to think that "thread safety"
means that the programmer doesn't have to do anything, and that
everything will just magically work. Such thread safety doesn't
exist in the real world, however; such a definition is useless;
and I don't know of any competent person who would use it.
From <http://en.wikipedia.org/wiki/Thread-safe>: "A piece of
code is thread-safe if it functions correctly during
simultaneous execution by multiple threads."

Which is rather ambiguous. What is meant by "simultaneous
execution"? The code in the allocators of an STL vector cannot
normally be executed simultaneously by multiple threads, but
certainly you wouldn't consider STL vector to not be thread safe
because it uses some sort of protection internally to ensure that
it doesn't execute simultaneously. And if you follow the rules
laid down by SGI (which are basically those of Posix), all of
the functions in their implementation work in a multithreaded
environment. (Roughly speaking, SGI uses the same definition as
Posix. Except that I don't think that any of the functions in
their implementation of the library are not thread safe, whereas
in Posix, things like localtime, etc. are not thread safe.)
Most STL implementations need additional implementation
elements, usually what the same article refers to as "mutual
exclusion", to achieve this goal -- which makes them not
thread safe in the sense of this article.

If that's what the article actually says, then it's off base.
One can talk of different degrees of thread safety (although I
don't like the expression), with different types of guarantees,
but when you say that a function is not thread-safe, at all, you
mean something like localtime, which isn't thread safe and
cannot be used at all in a multithreaded environment. And if
you call the SGI implementation of std::vector "not
thread-safe", what do you call something like localtime?
 
Jerry Coffin

[ ... ]
If that's what the article actually says, then it's off base.
One can talk of different degrees of thread safety (although I
don't like the expression), with different types of guarantees,
but when you say that a function is not thread-safe, at all, you
mean something like localtime, which isn't thread safe and
cannot be used at all in a multithreaded environment. And if
you call the SGI implementation of std::vector "not
thread-safe", what do you call something like localtime?

Umm...."incompetent"? :)

The Posix version of localtime is completely unsafe in a multithreaded
environment. There are other multithreaded environments (e.g. Win32) in
which one can use localtime quite safely. Of course, it's thread-safe
only in a manner roughly similar to SGI's STL -- you certainly _can_
cause problems if you abuse it sufficiently.
 
kwikius

Any thoughts?

Well, as I said various times before I'm pretty green when it comes to
threads.

My only thought was that it would be cool if I had control of the thread
scheduler and all threads communicated with the scheduler. The scheduler
keeps a data structure for each thread which is basically read only as far
as the scheduler is concerned. Each thread has a yield_to_scheduler(
my_thread_id, mystate) primitive, which means that when it has done some
significant work it suspends and its data structure is updated (the data
structure per thread would probably be as simple as "I have now completed
step 3 of my process" or whatever). The scheduler can then block the progress
of other threads or start other threads depending on the current state it sees.
The scheduler acts somewhat like a synchronous clock, and each thread executes
one step or clock cycle then suspends. Essentially the idea would be to try
to make a synchronous state machine rather than an asynchronous system.

What do you think... Cool or naff?

regards
Andy Little
 
gpderetta

Well, as I said various times before I'm pretty green when it comes to
threads.

My only thought was that it would be cool if I had control of the thread
scheduler and all threads communicated with the scheduler. The scheduler
keeps a data structure for each thread which is basically read only as far
as the scheduler is concerned. Each thread has a yield_to_scheduler(
my_thread_id, mystate) primitive, which means that when it has done some
significant work it suspends and its data structure is updated (the data
structure per thread would probably be as simple as "I have now completed
step 3 of my process" or whatever). The scheduler can then block the progress
of other threads or start other threads depending on the current state it sees.
The scheduler acts somewhat like a synchronous clock, and each thread executes
one step or clock cycle then suspends. Essentially the idea would be to try
to make a synchronous state machine rather than an asynchronous system.

What do you think... Cool or naff?

This is cooperative multithreading, which is orthogonal to preemptive
multithreading. The first is used to simplify event-driven programs;
the second is used to take advantage of the natural parallelism of the
platform you are running on. The first is important for simplifying
systems that have to deal with a very large number of tasks (for
example a web service), most of which are idle most of the time, while
the second is for speeding up parallel CPU-bound programs.

Really, they serve two completely different purposes, and can even
be used at the same time! In fact, many platforms for years used the
so-called M:N model for implementing threads, i.e. M cooperatively
scheduled user-space threads running on N preemptively scheduled
kernel threads. Most of these implementations have since moved to pure
kernel-based threads, though, because the complexity of a user-space
scheduler wasn't justified by the ease of spawning threads for very
fine-grained tasks.

I think that the perfect solution is having preemptive threads as the
usual threading primitive, and, for event driven programs,
cooperatively scheduled threads on top of an application specific
scheduler.
 
Szabolcs Ferenczi

[...]
Well, as I said various times before I'm pretty green when it comes to
threads.

You have been begging for a course before:
http://groups.google.com/group/comp.lang.c++/msg/8b50cee90665066b

Now, here is your mini-course:
My only thought was that it would be cool if I had control of the thread
scheduler and all threads communicated with the scheduler. The scheduler
keeps a data structure for each thread which is basically read only as far
as the scheduler is concerned. Each thread has a yield_to_scheduler(
my_thread_id, mystate) primitive, which means that when it has done some
significant work it suspends and its data structure is updated (the data
structure per thread would probably be as simple as "I have now completed
step 3 of my process" or whatever). The scheduler can then block the progress
of other threads or start other threads depending on the current state it sees.
The scheduler acts somewhat like a synchronous clock, and each thread executes
one step or clock cycle then suspends. Essentially the idea would be to try
to make a synchronous state machine rather than an asynchronous system.

What do you think... Cool or naff?

I think you are re-inventing the wheel here. If you learnt about
concurrent programming, you would know that something like this is
called non-preemptive scheduling. It simplifies a lot but on the other
hand it serialises the whole system at the global level. So it works
well on a uni-processor but it just simulates a uni-processor on a
multi-processor system.

However, there were attempts to partition the global system state
while keeping non-preemptive scheduling. It is the programming model
for which I have called your attention so many times in this
discussion thread too. Brinch Hansen, in the Distributed Processes
programming model, just partitions the global state and says that one
instance of the Distributed Processes (DP) works this way (with
preemptive scheduling, called interleaving in DP) while the Distributed
Processes themselves are active simultaneously and they are
communicating with each other via Remote Procedure Calls (called
external requests in DP).

http://brinch-hansen.net/papers/1978a.pdf

You were considering something like DP but, of course, at a much lower
level.

We have just taken a step during this discussion toward unifying
objects and threads. The resulting active object is similar
to a Distributed Process except that it does not use non-preemptive
scheduling. See the result here from Chris:

http://groups.google.com/group/comp.lang.c++/msg/20c14991f9e88482

It would even make sense for the C++0x committee to adopt it into
the draft proposal as an optional higher level construction:
structured parallelism.

Best Regards,
Szabolcs
 
kwikius

This is cooperative multithreading, which is orthogonal to preemptive
multithreading. The first is used to simplify event-driven programs;
the second is used to take advantage of the natural parallelism of the
platform you are running on. The first is important for simplifying
systems that have to deal with a very large number of tasks (for
example a web service), most of which are idle most of the time, while
the second is for speeding up parallel CPU-bound programs.

OK. So we are differentiating here between concurrency within an
application (user space) and running several unrelated applications
concurrently (kernel space).

What's different about the two? IIRC, Windows 3.1 used cooperative
multitasking in kernel space. An application (essentially a list of
callback functions called by the OS when user input occurred over
their patch) was supposed to call 'yield' periodically if it
was doing a lengthy operation.

The problem is that kernel space is hostile. The kernel just allocates
memory and services requests but has no real interest in what the
application is doing. The application is competing with other
applications and has no real incentive to call yield; it only makes
the app look slower. IOW, cooperative multitasking doesn't work in
kernel space; the only solution then is preemptive multitasking, which
is a blunt instrument, but pragmatic.

Within an application, however, cooperative multitasking seems to make a
lot of sense. The application adds up to a common whole and its parts
are designed to cooperate. It's a friendly environment, not a hostile
one. In this environment preemptive multitasking doesn't seem to make
much sense, although it seems to be what we have, with a messy fight by
threads to take control of some slice of shared memory at some
arbitrary time and lock out other threads that might try to grab it
at any time.

regards
Andy Little
 
gpderetta

OK. So we are differentiating here between concurrency within an
application (user space) and running several unrelated applications
concurrently (kernel space).

What's different about the two? IIRC, Windows 3.1 used cooperative
multitasking in kernel space. An application (essentially a list of
callback functions called by the OS when user input occurred over
their patch) was supposed to call 'yield' periodically if it
was doing a lengthy operation.

The problem is that kernel space is hostile. The kernel just allocates
memory and services requests but has no real interest in what the
application is doing. The application is competing with other
applications and has no real incentive to call yield; it only makes
the app look slower. IOW, cooperative multitasking doesn't work in
kernel space; the only solution then is preemptive multitasking, which
is a blunt instrument, but pragmatic.

yes, more or less this is the situation.
Within an application, however, cooperative multitasking seems to make a
lot of sense. The application adds up to a common whole and its parts
are designed to cooperate. It's a friendly environment, not a hostile
one. In this environment preemptive multitasking doesn't seem to make
much sense, although it seems to be what we have, with a messy fight by
threads to take control of some slice of shared memory at some
arbitrary time and lock out other threads that might try to grab it
at any time.

It still makes sense if one or more of these applies:

- you need to take advantage of multiple CPUs for parallelism (no way
to do cooperative multitasking there and still get any reasonable
parallelism),

- get bounded worst case latencies (with a real time scheduler)

- your OS doesn't support an asynchronous variant of the blocking
system call you need.
 
Gerhard Fiedler

In this environment preemptive multitasking doesn't seem to make much
sense, although it seems to be what we have, with a messy fight by
threads to take control of some slice of shared memory at some arbitrary
time and lock out other threads that might try to grab it at any time.

There are situations where one application needs to serve several
asynchronous events (for example a web server). In a cooperative
multitasking scheme the application is required to periodically poll each
device that may create an event. With preemptive multitasking, threads can
just wait for an event and leave the actual communication with the
asynchronous event source up to the OS (which may poll a device, or react
to a hardware interrupt, or whatever).

Gerhard
 
gpderetta

There are situations where one application needs to serve several
asynchronous events (for example a web server). In a cooperative
multitasking scheme the application is required to periodically poll each
device that may create an event. With preemptive multitasking, threads can
just wait for an event and leave the actual communication with the
asynchronous event source up to the OS (which may poll a device, or react
to a hardware interrupt, or whatever).

But see http://www.kegel.com/c10k.html for a critique of using
preemptive threads for scalable event-driven applications (in
particular web servers).
 
kwikius

yes, more or less this is the situation.

It still makes sense if one or more of these applies:

- you need to take advantage of multiple CPUs for parallelism (no way
to do cooperative multitasking there and still get any reasonable
parallelism),

Well, here's my current insight:

Separate nonshared data by space -> execute in parallel
Separate shared data by time -> execute sequentially

That requires a top down approach, IOW a scheduler, but a scheduler can
separate processes either by space or by time intelligently only if the
scheduler knows about the application semantics, IOW runs in application
space.

Currently the usual option is a scheduler that runs in kernel space and
there is no cooperation with the application.

regards
Andy Little
 
kwikius

kwikius said:
Well, here's my current insight:

Separate nonshared data by space -> execute in parallel
Separate shared data by time -> execute sequentially

That requires a top down approach, IOW a scheduler, but a scheduler can
separate processes either by space or by time intelligently only if the
scheduler knows about the application semantics, IOW runs in application
space.

Currently the usual option is a scheduler that runs in kernel space and
there is no cooperation with the application.

hmm maybe not ...

http://msdn.microsoft.com/en-us/library/aa175393.aspx

hadn't read that when I wrote the original :)

regards
Andy Little
 
Jerry Coffin

[ ... ]
My only thought was that it would be cool if I had control of the thread
scheduler and all threads communicated with the scheduler. The scheduler
keeps a data structure for each thread which is basically read only as far
as the scheduler is concerned. Each thread has a yield_to_scheduler(
my_thread_id, mystate) primitive, which means that when it has done some
significant work it suspends and its data structure is updated (the data
structure per thread would probably be as simple as "I have now completed
step 3 of my process" or whatever). The scheduler can then block the progress
of other threads or start other threads depending on the current state it sees.

If you want to play with this under Windows, look up fibers. I doubt
it's used much anymore, but at one time there was a package named
cthreads for Unix and similar systems. I'm pretty sure with a little bit
of looking, you can probably find it.
 
kwikius

Jerry Coffin said:
[ ... ]
My only thought was that it would be cool if I had control of the thread
scheduler and all threads communicated with the scheduler. The scheduler
keeps a data structure for each thread which is basically read only as far
as the scheduler is concerned. Each thread has a yield_to_scheduler(
my_thread_id, mystate) primitive, which means that when it has done some
significant work it suspends and its data structure is updated (the data
structure per thread would probably be as simple as "I have now completed
step 3 of my process" or whatever). The scheduler can then block the progress
of other threads or start other threads depending on the current state it sees.

If you want to play with this under Windows, look up fibers. I doubt
it's used much anymore, but at one time there was a package named
cthreads for Unix and similar systems. I'm pretty sure with a little bit
of looking, you can probably find it.

Actually I think I have wandered way O.T. for c.l.c++
Apologies...

regards
Andy Little
 
gpderetta

Humm... How is using 60,000 user-threads on a 8-CPU box different from using
highly efficient state-machine and thread-pool algorithms driven by an
average of 16 or so kernel-threads? Before you answer, please realize that
IMVHO, state-machines and shared memory multi-threading are not all that
complicated...

There is nothing complicated about shared memory or state machines. I
object to the gratuitous use of preemption.

IMHO there isn't much difference between using _cooperative_ threads
or a pure callback-based approach on top of a state machine. Sometimes
the former makes more sense, sometimes the latter. More often, it is
better to use both of them and choose on a case-by-case basis.

Certainly I wouldn't even think of using 60000 preemptive threads.
 
Szabolcs Ferenczi

Do you have any other suggestions on how to further improve the `active<T>'
helper object?

Well, it is getting better and better.

The next problem is how you can handle active objects that have
arguments in their constructors. Just think about a simple
producer-consumer problem. Suppose you have the classes Buffer,
Producer, and Consumer. Then you want to define the access rights
among these objects: make the instance of class Buffer a passive
object, while the instances of classes Producer and Consumer should
be active objects:

{
Buffer b;
active<Producer> p(b);
active<Consumer> c(b);
}

Best Regards,
Szabolcs
 
Szabolcs Ferenczi

Sorry about that nonsense!

Never mind, getting used to it.

Now, can you think
of anything else to further enhance my `active<T>' helper template?

I am afraid the parameter specification is not a general solution but
it is better than nothing. I guess it can be improved by some kind of
variadic templates.

I think the last issue to think about is how you can specify a larger
number of active objects so that each has its own id. For instance,
how can you specify N producers and M consumers for a bounded buffer?

If this is solved, it should really be considered by the C++0x
committee, since it would be a huge improvement over the existing
solution with respect to process creation. Having active objects is
the genuine way of combining objects and processes. Besides, it could
be just a higher level construction next to the existing lower level
proposal (similarly to the higher level wait() on the condition
variable).

Still, it is a library level solution and not a language level one, so
everybody can be happy.

Best Regards,
Szabolcs
 
gpderetta

std::string x[3] = { "One", "Two", "Three" };

AFAICT, that is not valid C++...

It is. In fact you can actually omit the '3':

std::string x[] = { "One", "Two", "Three" };

HTH,
 
Szabolcs Ferenczi

Humm. Could you give a pseudo-code example?

I mean there are situations where you want to have many entities of
the same kind. E.g. you want to have 5 producers and 3 consumers. As
long as the number is small, you can copy-paste it like this:

<code_0>
{
Buffer b;
active<Producer> p0(0, b);
active<Producer> p1(1, b);
active<Producer> p2(2, b);
active<Producer> p3(3, b);
active<Producer> p4(4, b);
active<Consumer> c0(0, b);
active<Consumer> c1(1, b);
active<Consumer> c2(2, b);
}
</code_0>

Since these are RAII objects, you cannot simply put them into a for or
a while loop, can you? However, obviously we should find some C++
compatible notation to have multiple RAII objects with different
arguments, e.g.:

<pseudocode_1>
{
Buffer b;
active<Producer, 5> p(i, b);
active<Consumer, 3> c(i, b);
}
</pseudocode_1>

Or not C++ notation:

<pseudocode_2>
{
Buffer b;
foreach int i (0..4) active<Producer> p(i, b);
foreach int i (0..2) active<Consumer> c(i, b);
}
</pseudocode_2>

Additionally, some general solution would be nice that makes it
possible to have any kind and number of arguments for the active
object and not just an integer id.
Are you talking about
integrating some sort of conditional waits into the mix?

That would come next when we finish with the structured parallel
construction part.

Best Regards,
Szabolcs
 
