TCP posix thread buffer question across threads

Bob · Jul 5, 2009

I have an odd situation that I didn't expect, so I probably don't
understand enough, which is why I'm asking here.

I have created a multithreaded TCP server which has the
following buffer in each thread:

tcpThread.h:

unsigned char in_buffer[MAX_TCP_BUF];

tcpThread.cpp:

ssize_t n = read(fd, in_buffer, MAX_TCP_BUF);  // read() returns -1 on error, so n must be signed

PROBLEM:
This is under Linux. I only have 2 clients connected, so each has its
own pthread running the above, but I see in_buffer[] occasionally
getting characters and/or leftover characters from the other client.

I changed the code to make in_buffer[] a static array in the .cpp file
instead and I believe that has solved the problem, but I would like to
further understand the problem.

I had thought that having it in the .h file just meant that as each
thread was created, it got its own copy of the array, but apparently
that's not the case?

Thanks in advance.

Alf P. Steinbach · Jul 5, 2009

* Bob:

> I had thought that having it in the .h file just meant that as each
> thread was created, it had its own address of the array, but
> apparently that's not the case?

No, the buffer you declare in the header has external linkage and is
global: there is a single array, shared by every thread.

You need one buffer for each thread.

Cheers & hth.,

- Alf
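
A minimal sketch of the difference, assuming Linux with POSIX threads. The names Seen, worker and run_two_threads, and the value of MAX_TCP_BUF, are made up for illustration and are not taken from Bob's code. Each thread records the address of the namespace-scope in_buffer and of a buffer declared inside the thread function; a barrier keeps both threads alive until both have recorded their addresses.

```cpp
#include <cassert>
#include <pthread.h>
#include <stdint.h>

const int MAX_TCP_BUF = 1024;            // assumed size, for illustration only

unsigned char in_buffer[MAX_TCP_BUF];    // namespace scope: one object per process

struct Seen {                            // hypothetical record of what one thread saw
    uintptr_t shared;                    // address of the global in_buffer
    uintptr_t local;                     // address of the thread's own buffer
};

static pthread_barrier_t both_running;

extern "C" void* worker(void* arg) {
    unsigned char my_buffer[MAX_TCP_BUF];    // automatic: lives on this thread's stack
    Seen* seen = static_cast<Seen*>(arg);
    seen->shared = reinterpret_cast<uintptr_t>(&in_buffer[0]);
    seen->local = reinterpret_cast<uintptr_t>(&my_buffer[0]);
    pthread_barrier_wait(&both_running);     // hold both stacks alive at the same time
    return 0;
}

// Starts two threads and fills in what each of them observed.
void run_two_threads(Seen& a, Seen& b) {
    pthread_t ta, tb;
    pthread_barrier_init(&both_running, 0, 2);
    int rc = pthread_create(&ta, 0, worker, &a);
    assert(rc == 0);
    rc = pthread_create(&tb, 0, worker, &b);
    assert(rc == 0);
    (void)rc;
    pthread_join(ta, 0);
    pthread_join(tb, 0);
    pthread_barrier_destroy(&both_running);
}
```

After run_two_threads(a, b), a.shared and b.shared should be equal (both threads see the same global array), while a.local and b.local should differ (each thread has its own stack). Note that a file-scope static array in the .cpp file is still one object per process, so it is shared between threads in the same way.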

Bob · Jul 5, 2009

Thanks Alf, so that means that any of my tracking variables in the
.h file also need to be inside each .cpp file. Time to go fix them.

James Kanze · Jul 5, 2009

There is no thread local storage in C++ (at least not
currently). Objects can have static, auto or dynamic lifetimes,
but any given instance of an object is the same everywhere. (In
some ways, this is the very definition of what is meant by
threading, as opposed to separate processes.) Linux and other
Unix (and I believe Windows as well, and probably most other
systems) do have a provision for thread local storage, but it's
relatively complicated to use, and not necessarily very
performant, see pthread_getspecific and pthread_setspecific.
(Basically, one legal implementation would be to use something
like std::map, indexed on the thread id and the object key.)
The usual solution is to only use local variables, since each
time you enter the block, you get a new instance of any local
variables. Be careful with regard to stack size if you do this
for buffers, however; you'd probably be better off using
std::vector< unsigned char >, which keeps its storage off the
stack, rather than a C-style array.
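
For reference, a minimal sketch of the pthread_getspecific / pthread_setspecific approach mentioned above, assuming POSIX threads. The names Buffer, thread_buffer, make_buffer_key, destroy_buffer and tag_buffer, and the 4096-byte size, are hypothetical. Each thread lazily allocates its own std::vector the first time it asks for one, and the key's destructor frees it when that thread exits.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

typedef std::vector<unsigned char> Buffer;

static pthread_key_t buffer_key;
static pthread_once_t buffer_key_once = PTHREAD_ONCE_INIT;

// Called by the pthread library when a thread that owns a buffer exits.
extern "C" void destroy_buffer(void* p) {
    delete static_cast<Buffer*>(p);
}

// Creates the key exactly once for the whole process.
extern "C" void make_buffer_key() {
    int rc = pthread_key_create(&buffer_key, destroy_buffer);
    assert(rc == 0);   // sketch only: real code should handle failure
    (void)rc;
}

// Returns the calling thread's own buffer, creating it on first use.
Buffer& thread_buffer() {
    pthread_once(&buffer_key_once, make_buffer_key);
    Buffer* buf = static_cast<Buffer*>(pthread_getspecific(buffer_key));
    if (buf == 0) {
        buf = new Buffer(4096);                 // assumed size
        pthread_setspecific(buffer_key, buf);
    }
    return *buf;
}

// Thread entry point for a quick check: writes one byte into this
// thread's buffer. The argument points to the byte to write.
extern "C" void* tag_buffer(void* tag) {
    thread_buffer()[0] = *static_cast<unsigned char*>(tag);
    return 0;
}
```

If the main thread writes 'M' into thread_buffer()[0] and then runs tag_buffer in a second thread with a different byte, the main thread's byte should still be 'M' after the join. As the post says, a plain local variable in the thread function is usually simpler; this pattern is mainly useful when the buffer has to be reachable from code that cannot be handed a pointer to it.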

Also, depending on what you're doing, a better solution is often
to use a separate process, rather than a separate thread, for
each connection. I know that threads are in, and separate
processes aren't, but unless the connections are sharing a lot
of data, the separate process model is a lot more robust. And
since you have a separate program execution per process, you
have separate instances of everything in each process. The Unix
process model works particularly well for this one particular
case.
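
A small sketch of the "separate instances of everything" point, assuming a Unix system with fork(). The function name parent_buffer_survives_child and the 64-byte buffer are made up for illustration. The child overwrites a global buffer, and the parent then checks that its own copy is untouched.

```cpp
#include <cassert>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

unsigned char in_buffer[64];   // global, but each process has its own copy after fork()

// Returns true if the parent's buffer is unchanged after a child
// process has overwritten its own copy of the same global.
bool parent_buffer_survives_child() {
    std::memset(in_buffer, 'P', sizeof in_buffer);

    pid_t pid = fork();
    assert(pid >= 0);          // sketch only: real code should handle fork() failure

    if (pid == 0) {
        // Child: this write affects only the child's address space.
        std::memset(in_buffer, 'C', sizeof in_buffer);
        _exit(0);
    }

    // Parent: wait for the child, then inspect our own copy.
    int status = 0;
    waitpid(pid, &status, 0);
    return in_buffer[0] == 'P' && in_buffer[sizeof in_buffer - 1] == 'P';
}
```

In a per-connection server the child would typically handle the accepted socket and the parent would go back to accepting connections; that part is left out here.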

Bible Trivia Extreme · Jul 6, 2009

Thank you, James, for that detailed explanation. Since I haven't
done production-level multithreading yet, I'm wondering: are separate
processes more common in full-on production servers, or is
multithreading?

I could go with either, as this particular application won't exceed 250
threads/processes, but I know that in the past I worked on something
that could potentially have 40k+ threads, and I'm assuming processes
are heavier than threads overall for resource consumption etc.

Jorgen Grahn · Jul 6, 2009

> Thank you, James, for that detailed explanation. Since I haven't
> done production-level multithreading yet, I'm wondering: are separate
> processes more common in full-on production servers, or is
> multithreading?

I am not James, but I'll reply anyway.

It probably has a *lot* to do with the application: do the connections
need to share read/write data? and so on. And on whether the
programmer is the kind of Unix programmer who dislikes threads strongly:

http://catb.org/~esr/writings/taoup/html/ch07s03.html#id2923889

Also note that these are not the only two alternatives. One
thread/process per client is not the only alternative, either.

Apache httpd implements several different models; you may want to
google for information and opinions on those.

> I could go with either, as this particular application won't exceed 250
> threads/processes, but I know that in the past I worked on something that
> could potentially have 40k+ threads

40000 threads seems like a bad idea, but it might be OK if it's some
special language-specific kind of thread.

See for example this discussion over at comp.protocols.tcp-ip:

Message-ID:

> and I'm assuming processes are heavier
> than threads overall for resource consumption etc.

If you are on Unix (which you appear to be), you shouldn't assume
that. A fork(2) can be surprisingly fast, and after the fork you
don't have to care about synchronization, rare deadlock bugs and
tedious stuff like that.

/Jorgen
