logging.handlers.SocketHandler

W

writeson

Hi everyone,

I wrote a logging server that receives messages from
logging.handlers.SocketHandler objects in client Python programs. This
works well so long as the client programs are start/stop affairs.
However, if the client is also a long running daemon a problem shows
up. If the logging server is restarted log messages from the daemon
client go no where, and no errors are thrown. To correct the problem
the daemon client has to be restarted, then it will reconnect with the
logging server and all is well. My reading of the
logging.handlers.SocketHandler documentation makes me think this isn't
how things are supposed to work. Failures of the emit() method (how
log messages are sent) will close the connection and retry it again at
the next log message.

By the way, we're running this with Python 2.4 on a CentOS Linux
server.

Does anyone have any ideas, pointers or suggestions about how to
address this problem?

Thanks in advance!
Doug
 
V

Vinay Sajip

Hi everyone,

I wrote aloggingserver that receives messages fromlogging.handlers.SocketHandler objects in client Python programs. This
works well so long as the client programs are start/stop affairs.
However, if the client is also a long running daemon a problem shows
up. If theloggingserver is restarted log messages from the daemon
client go no where, and no errors are thrown. To correct the problem
the daemon client has to be restarted, then it will reconnect with theloggingserver and all is well. My reading of thelogging.handlers.SocketHandler documentation makes me think this isn't
how things are supposed to work. Failures of the emit() method (how
log messages are sent) will close the connection and retry it again at
the next log message.

By the way, we're running this with Python 2.4 on a CentOS Linux
server.

Does anyone have any ideas, pointers or suggestions about how to
address this problem?

Thanks in advance!
Doug

It may be platform-related. I don't have access to your specific
platform, but I tried with ActivePython 2.5.2.2 on Windows and also
Python 2.5.2 on Ubuntu 8.04 (Hardy Heron). I used the scripts
described in

http://docs.python.org/library/logging.html#sending-and-receiving-logging-events-across-a-network

but modified the client script to place the logging statements in a
"while True:" loop. What I observed on both Windows and Ubuntu was
this: when I killed the socket receiver and restarted it, data from
the client was received by the new receiver process after a short
delay (of a few seconds). I had no need to restart the client to
achieve this.

Can you try testing with these specific scripts in your environment?

Regards,

Vinay Sajip
 
W

writeson

Vinay,

Thanks for the quick response, very much appreciated. I tried the two
scripts you pointed to, modifying the client so it produces log
messages in an endless loop, and it worked fine. If I left the client
running and stopped and started the server, the client would reconnect
and messages would start coming through. This is what you observed and
what is stated in the documentation. Looking at the SocketHandler
Python code shows this would be the behavior as well.

To further complicate my original question, here is more information.
The logging server I created uses Twisted (2.5.0) as it's network
framework and gets network log messages from the clients using Twisted
code. However, once a message is received it is parsed the same way as
the example you pointed, in fact that's what I used as a model to
build the code in my def dataReceived() method of my logging server
Twisted protocol. I don't think there is anything unusual about the
network handling in the Twisted framework.

The client daemons are also Twisted processes, but in their cases they
don't use any Twisted code to send log messages. The log system is
entirely based on the Python logging module, and the
logging.handlers.SocketHandler system specifically. So the clients
should try to reconnect if the server goes down and comes back up. So
I'm a little confused what to try next to resolve the problem I'm
seeing.

By the way, the Python logging system is great, really nice work!
Again, thank you for your quick response and help.

Doug
 
V

Vinay Sajip

To further complicate my original question, here is more information.
Theloggingserver I created uses Twisted (2.5.0) as it's network
framework and gets network log messages from the clients using Twisted
code. However, once a message is received it is parsed the same way as
the example you pointed, in fact that's what I used as a model to
build the code in my def dataReceived() method of myloggingserver
Twisted protocol. I don't think there is anything unusual about the
network handling in the Twisted framework.

The client daemons are also Twisted processes, but in their cases they
don't use any Twisted code to send log messages. The log system is
entirely based on the Pythonloggingmodule, and thelogging.handlers.SocketHandler system specifically. So the clients
should try to reconnect if the server goes down and comes back up. So
I'm a little confused what to try next to resolve the problem I'm
seeing.

This would appear to indicate that the problem is to do with how
Twisted is being used. A couple of avenues to explore would be:

1. Try the simple test client I suggested with your Twisted server and
see what happens. If stuff comes through when the server is restarted,
that would appear to isolate the problem to somewhere in your client
daemons (at least as a working hypothesis).

2. Try the Twisted client daemons with the simple server from the
logging docs. If stuff comes through when the server is restarted,
then the problem might be with your Twisted server daemon.

Of course, it may be necessary to modify the simple seem-to-work test
scripts to accommodate whatever specific wire format is being used in
your problem scenario.

If either of these experiments yield some more information, it's
probably worth taking that to the Twisted community, perhaps the
twisted-python mailing list.

Regards,

Vinay Sajip
 
W

writeson

Vinay,

I did as you suggested and everything seemed to work; client programs
were able to reconnect to the servers and log messages started showing
up soon after the logging server was running again. I did this with my
Twisted client/server setup and it showed the same behavior; clients
reconnected to the server after a short delay. Messages were dropped
while the server was down, but that was what I expected. This makes it
look like I raised a "false alarm" and the
logging.handlers.SocketHandler is behaving as expected, and it is more
likely there is a problem in my application.

My manager is suggesting that the underlying problem is using TCP
rather than UDP (SocketHandler vs DatagramHandler) for logging from
clients to the logging server. His assertion is that using TCP would
guarantee the loss of 2 messages at the logging server from an
attached daemon before a reconnect was established. I don't know
enough about network protocols to determine if this is true or not,
but the reading I've done about UDP talks about UDP being an
unreliable protocol, so I'm not sure how using it would change the
loss of 2 or more messages while a reconnect occurs. Perhaps because
it is stateless and doesn't have to re-establish a connection? I'm not
sure, what are your thoughts?

Again, thanks for your help and support,
Doug
 
V

Vinay Sajip

My manager is suggesting that the underlying problem is using TCP
rather than UDP (SocketHandler vs DatagramHandler) forloggingfrom
clients to theloggingserver. His assertion is that using TCP would
guarantee the loss of 2 messages at theloggingserver from an
attached daemon before a reconnect was established. I don't know
enough about network protocols to determine if this is true or not,
but the reading I've done about UDP talks about UDP being an
unreliable protocol, so I'm not sure how using it would change the
loss of 2 or more messages while a reconnect occurs. Perhaps because
it is stateless and doesn't have to re-establish a connection? I'm not
sure, what are your thoughts?

UDP makes no guarantees about delivery or ordering of packets which
are delivered, so moving to UDP could mean that you lose packets even
when the server is always up. If a server has been brought down and
restarted, a number of messages (typically more than 2) will have been
lost during the period of unavailability of the server, so I am not
sure why the loss of some messages during server start-up cannot be
tolerated. If it is important not to lose messages, then you will
probably have to use some other handler which buffers log events and
uses an always-up store-and-forward mechanism to guarantee delivery
across periods of downtime.

Regards,

Vinay Sajip
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,576
Members
45,054
Latest member
LucyCarper

Latest Threads

Top