How to reuse a TCP listening socket immediately after it was connected at least once?

Igor Katson · May 24, 2009

I have written a socket server and some arbitrary clients. When I
shut down the server and do socket.close(), I cannot immediately start
it again because it has some open sockets in the TIME_WAIT state. It throws
an "address already in use" exception at me. I have searched for that on
Google but haven't found a way to solve it.

Tried
setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
but that does not help.

Is there a nice way to overcome this?

Lawrence D'Oliveiro · May 24, 2009

Igor Katson said:
I have written a socket server and some arbitrary clients. When I
shut down the server and do socket.close(), I cannot immediately start
it again because it has some open sockets in the TIME_WAIT state. It throws
an "address already in use" exception at me.

There's a reason for that. It's to ensure that there are no leftover packets
floating around the Internet somewhere, that you might mistakenly receive
and think they were part of a new connection, when they were in fact part of
an old one.

The right thing to do is try to ensure that all your connections are
properly closed at shutdown. That may not be enough (if your server crashes
due to bugs), so the other thing you need to do is retry the socket open,
say, at 30-second intervals, until it succeeds.
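
The retry approach described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the function name `bind_with_retry` and the parameters `interval`, `max_attempts` and `sleep` are made up for the example, and the 30-second default simply mirrors the interval suggested above.

```python
import errno
import socket
import time


def bind_with_retry(host, port, interval=30.0, max_attempts=None, sleep=time.sleep):
    """Keep trying to bind a listening socket until the address is free.

    Hypothetical helper: `sleep` is injectable so the wait can be
    replaced in tests; `max_attempts=None` means retry forever.
    """
    attempt = 0
    while True:
        attempt += 1
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            srv.bind((host, port))
            srv.listen(5)
            return srv
        except OSError as exc:
            srv.close()
            # Only "address already in use" is worth waiting for;
            # anything else (e.g. permission denied) is re-raised.
            if exc.errno != errno.EADDRINUSE:
                raise
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(interval)
```

A server would call something like `bind_with_retry("0.0.0.0", 8000)` at startup (the port number is arbitrary here) and only start accepting once it returns.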

Дамјан Георгиевски · May 24, 2009

I have written a socket server and some arbitrary clients. When I
shut down the server and do socket.close(), I cannot immediately start
it again because it has some open sockets in the TIME_WAIT state. It throws
an "address already in use" exception at me. I have searched for that on
Google but haven't found a way to solve it.

Tried
setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
but that does not help.

This should work. AFAIK, you only need to set it before you call .bind(...)
on the accepting (listening) socket.
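
A minimal sketch of that ordering, assuming a plain IPv4 TCP server. The helper name `make_listener` and its defaults are invented for the example; the point is only that setsockopt() comes before bind():

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=5):
    """Create a listening TCP socket with SO_REUSEADDR set before bind().

    Hypothetical helper; port=0 asks the OS for any free port.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the option on the listening socket *before* bind(); setting it
    # afterwards does not change the bind that has already happened.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(backlog)
    return srv
```

If the original attempt called setsockopt() after bind(), or on the per-client socket returned by accept() rather than on the listening socket, that would explain why it appeared not to help.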

--
дамјан ( http://softver.org.mk/damjan/ )

Give me the knowledge to change the code I do not accept,
the wisdom not to accept the code I cannot change,
and the freedom to choose my preference.

Roy Smith · May 24, 2009

Lawrence D'Oliveiro said:
There's a reason for that. It's to ensure that there are no leftover packets
floating around the Internet somewhere, that you might mistakenly receive
and think they were part of a new connection, when they were in fact part of
an old one.

In theory, that is indeed the reason for the TIME_WAIT state. In practice,
however, using SO_REUSEADDR is pretty safe, and common practice.

You've got several things working in your favor. First, late delivery of
packets is pretty rare. Second, if some late packet were to arrive, the
chances of it having the same local and remote port numbers as an
existing connection are slim. And, finally, the TCP sequence number won't
line up.

One thing to be aware of is that SO_REUSEADDR isn't 100% portable. There
are some systems (ISTR HP-UX) which use SO_REUSEPORT instead of
SO_REUSEADDR. The original specifications weren't very clear, and some
implementers read them in strange ways. Some of that old code continues in
use today. I only mention this because if you try SO_REUSEADDR and it's
not doing what you expect, it's worth trying SO_REUSEPORT (or both) to see
what happens on your particular system.
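
For experimenting with both options, something along these lines might do. It is a sketch under assumptions: the helper name `set_reuse_options` is made up, and since SO_REUSEPORT is not defined in the socket module on every platform it is looked up defensively. Note also that on some systems SO_REUSEPORT carries additional meaning (several sockets sharing one port), so treat this as a diagnostic aid rather than a recommended default.

```python
import socket


def set_reuse_options(sock):
    """Set SO_REUSEADDR, plus SO_REUSEPORT where the platform offers it.

    Hypothetical helper; returns the names of the options actually applied.
    Call it before bind().
    """
    applied = []
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    applied.append("SO_REUSEADDR")
    # SO_REUSEPORT is missing from the socket module on some platforms.
    reuseport = getattr(socket, "SO_REUSEPORT", None)
    if reuseport is not None:
        try:
            sock.setsockopt(socket.SOL_SOCKET, reuseport, 1)
            applied.append("SO_REUSEPORT")
        except OSError:
            # The constant exists but the system rejected it; carry on.
            pass
    return applied
```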

The right thing to do is try to ensure that all your connections are
properly closed at shutdown. That may not be enough (if your server crashes
due to bugs), so the other thing you need to do is retry the socket open,
say, at 30-second intervals, until it succeeds.

That may be a reasonable thing to do for production code, but when you're
building and debugging a server, it's a real pain to not be able to restart
it quickly whenever you want (or need) to.

Igor Katson · May 24, 2009

Roy said:
In theory, that is indeed the reason for the TIME_WAIT state. In practice,
however, using SO_REUSEADDR is pretty safe, and common practice.

You've got several things working in your favor. First, late delivery of
packets is pretty rare. Second, if some late packet were to arrive, the
chances of it having the same local and remote port numbers as an
existing connection are slim. And, finally, the TCP sequence number won't
line up.

One thing to be aware of is that SO_REUSEADDR isn't 100% portable. There
are some systems (ISTR HP-UX) which use SO_REUSEPORT instead of
SO_REUSEADDR. The original specifications weren't very clear, and some
implementers read them in strange ways. Some of that old code continues in
use today. I only mention this because if you try SO_REUSEADDR and it's
not doing what you expect, it's worth trying SO_REUSEPORT (or both) to see
what happens on your particular system.

That may be a reasonable thing to do for production code, but when you're
building and debugging a server, it's a real pain to not be able to restart
it quickly whenever you want (or need) to.

Thanks for a great answer, Roy!

Lawrence D'Oliveiro · May 24, 2009

That may be a reasonable thing to do for production code, but when you're
building and debugging a server, it's a real pain to not be able to
restart it quickly whenever you want (or need) to.

On the contrary, I run exactly the same logic--and that includes socket-
handling logic--in both test and production servers. How else can I be sure
it'll work properly in production?

Roy Smith · May 25, 2009

Lawrence D'Oliveiro said:
On the contrary, I run exactly the same logic--and that includes socket-
handling logic--in both test and production servers. How else can I be sure
it'll work properly in production?

If running without SO_REUSEADDR works for you, that's great. I was just
pointing out how it can be useful in cases such as the OP's, where he's
getting bind errors when he restarts his server.

Lawrence D'Oliveiro · May 27, 2009

I was just pointing out how it can be useful in cases such as the OP's,
where he's getting bind errors when he restarts his server.

And I was pointing out how important it was to make sure your code deals
gracefully with those errors.

Thomas Bellman · May 28, 2009

Roy Smith said:
That may be a reasonable thing to do for production code, but when you're
building and debugging a server, it's a real pain to not be able to restart
it quickly whenever you want (or need) to.

Speaking as a sysadmin, running applications for production,
programs not using SO_REUSEADDR should be taken out and shot.

You *can't* ensure that TCP connections are "properly closed".
For example, a *client* crashing, or otherwise becoming
unreachable, will leave TCP connections unclosed, no matter
what you do.

Not using SO_REUSEADDR means forcing a service interruption of
half an hour (IIRC) if for some reason the service must be
restarted, or having to reboot the entire machine. No thanks.
I have been in that situation.

Lawrence D'Oliveiro · May 28, 2009

Speaking as a sysadmin, running applications for production,
programs not using SO_REUSEADDR should be taken out and shot.

Not using SO_REUSEADDR means forcing a service interruption of
half an hour (IIRC) if for some reason the service must be
restarted, or having to reboot the entire machine.

No, you do not recall correctly. And anybody wanting to reboot a machine to
work around a "problem" like that should be taken out and shot.

Thomas Bellman · May 30, 2009

No, you do not recall correctly.

*Tests* It seems to be 100 seconds in Fedora 9 and 60 seconds in
Solaris 10. OK, that amount of time is not totally horrible, in
many cases just annoying. Still, it is much longer than an interruption
of service that could have been just 1-2 seconds.

However, I *have* used systems where it took much longer. It was
slightly more than ten years ago, under an earlier version of
Solaris 2, probably 2.4. It may be that it only took that long
under certain circumstances that the application we used always
triggered, but we did have to wait several tens of minutes. It
was way faster to reboot the machine than waiting for the sockets
to time out.

And anybody wanting to reboot a machine to
work around a "problem" like that should be taken out and shot.

We weren't exactly keen on rebooting the machine, but it was the
fastest way of getting out of that situation that we could figure
out. How *should* we have dealt with it in your opinion?

Lawrence D'Oliveiro · May 31, 2009

We weren't exactly keen on rebooting the machine, but it was the
fastest way of getting out of that situation that we could figure
out. How *should* we have dealt with it in your opinion?

Remember, the TIME_WAIT timeout is there for a reason, and trying to defeat
it could reduce the reliability of your application--that's why cutting
corners is a bad idea.

If you want to minimize the effect of the timeout, then just use different
ports, and have the clients find them via DNS SRV records.
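
The server side of the "different ports" idea might look roughly like this. It is an illustrative sketch only: the helper name `bind_first_free` and the idea of a fixed candidate list are assumptions, and publishing the chosen port to clients (for example via DNS SRV records) is outside what the standard library covers and is not shown.

```python
import errno
import socket


def bind_first_free(host, candidate_ports):
    """Bind to the first port in candidate_ports that is not in use.

    Hypothetical helper; returns (listening_socket, port).
    """
    for port in candidate_ports:
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            srv.bind((host, port))
        except OSError as exc:
            srv.close()
            # Skip ports that are still busy; re-raise anything else.
            if exc.errno != errno.EADDRINUSE:
                raise
            continue
        srv.listen(5)
        return srv, port
    raise OSError(errno.EADDRINUSE, "no candidate port is free")
```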
