Subtle bug: Telnet / socket / thread?

Mark Probert

Hi.

I was wondering if anyone has advice on how to debug the following
problem. I am using Ruby 1.8.1, establishing telnet sessions whilst in
a thread. Most of the time, it is fine. Every now and then, the
socket.syswrite() in telnet seems to send characters bound for another
socket / thread. Or so it seems.

The trouble is that the issue appears to be timing related. If I turn on
the dump log in Telnet, then the error, and the extra characters on
input, go away (a short-term fix). So, I am not sure how to isolate
the issue.

The problem is reproducible in the sense that I can get the rubbish output
almost every time. Unfortunately, the combination of factors is pretty
complex and the test setup is not easily reproducible.

Any thoughts?

-mark.
 
Bill Kelly

Hi,

Mark Probert said:
I was wondering if anyone has advice on how to debug the following
problem. I am using Ruby 1.8.1, establishing telnet sessions whilst in
a thread. Most of the time, it is fine. Every now and then, the
socket.syswrite() in telnet seems to send characters bound for another
socket / thread. Or so it seems.

Sorry I can't be of more help.... I just wanted to mention that
I have an application that is a telnet/VT100 server that has
worked reliably on 1.6.8, 1.8.0, 1.8.1, and 1.8.2 now.. However,
I have never used socket.syswrite()... Only #send and #recv...

(I have had an unexpected issue with select() saying "ok" and
then UDPSocket#recv hanging, but I've been able to work around that
in what seems to be a reliable way. (Hundreds of days uptime.))

So anyway - if it's not a drastic change to your code to try
using *only* #send and #recv, it might be worth a try, just to
see whether the problem disappears? Keep in mind that
send and recv only transmit "up to" the number of bytes you
request, so you'll need to loop if you want to be sure
to transmit/recv the full amount...
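
The looping mentioned above might look roughly like the sketch below. It is written for a current Ruby rather than 1.8.1 (it relies on String#byteslice), and the helper names send_all and recv_exactly are made up for illustration; they are not part of Ruby or of Net::Telnet. A local socket pair stands in for a real telnet connection.

```ruby
require 'socket'

# Hypothetical helper: keep calling #send until every byte has gone out.
def send_all(sock, data)
  sent = 0
  while sent < data.bytesize
    # BasicSocket#send returns the number of bytes actually sent,
    # which may be fewer than the number requested.
    sent += sock.send(data.byteslice(sent, data.bytesize - sent), 0)
  end
  sent
end

# Hypothetical helper: keep calling #recv until `count` bytes have arrived.
def recv_exactly(sock, count)
  buffer = String.new
  while buffer.bytesize < count
    chunk = sock.recv(count - buffer.bytesize)
    # Depending on the Ruby version, a closed connection yields "" or nil.
    raise EOFError, "peer closed the connection" if chunk.nil? || chunk.empty?
    buffer << chunk
  end
  buffer
end

# Usage: a connected pair of local sockets stands in for the telnet link.
left, right = UNIXSocket.pair
send_all(left, "show version\n")
reply = recv_exactly(right, 13)   # "show version\n" is 13 bytes long
puts reply
```

On a stream socket the receiver usually does not know the length in advance, so a real client would more likely read until a prompt or delimiter than until a fixed byte count.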


Hope this helps,

Regards,

Bill
 
Mark Probert

Hi ..

Bill Kelly said:
Sorry I can't be of more help.... Only #send and #recv...

Thanks, Bill.

The @sock.syswrite() is the base call in all of Telnet, underlying the
print(), puts(), cmd() and so on. It is the primitive that is called to
send data to the host.

As an update, I managed to get the system to 'fail' with dump-log turned
on. The dump log records that the command is corrupted prior to it
being sent. The code flow looks like:

    puts "sending cmd -- #{c}"   # c is correct here
    @conn.write(c)               # @conn is a Telnet object --> calls Telnet.write()

    def write(string)
      length = string.length
      while 0 < length
        IO::select(nil, [@sock])
        # <--- string is bad here!
        @dumplog.log_dump('>', string[-length..-1]) if @options.has_key?("Dump_log")
        length -= @sock.syswrite(string[-length..-1])
      end
    end

So, I can only assume that one of the other threads is writing to
this string. Is this possible? Or is there some other way that the
passed string can become corrupt?

Perplexed ...

-mark.
 
James Edward Gray II

Mark Probert said:
So, I can only assume that one of the other threads is writing to
this string. Is this possible? Or is there some other way that the
passed string can become corrupt?

This sounds like the Threads are sharing this String resource. Did it
exist outside of the Threads when you created them? If so, did you
pass it into the Thread with something like:

    Thread.new(outer_string) do |thread_local_string|
      # ...
    end

?
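
One thing worth checking with this pattern: passing a String through Thread.new gives the block its own variable, but that variable still refers to the same String object, so an in-place change made elsewhere remains visible. A rough sketch of the difference, using made-up helper names (same_object_in_thread? and mutate_copy_in_thread are illustrations only), might be:

```ruby
# Hypothetical helper: does the thread see the very same String object?
def same_object_in_thread?(string)
  Thread.new(string) { |local| local.equal?(string) }.value
end

# Hypothetical helper: hand the thread a private copy and mutate only that.
def mutate_copy_in_thread(string)
  Thread.new(string.dup) { |local| local << " | no more" }.value
end

shared = "show version"

puts same_object_in_thread?(shared)   # true: same object, new variable name
puts mutate_copy_in_thread(shared)    # show version | no more
puts shared                           # show version (the original is untouched)
```

If the string must stay independent per thread, passing string.dup (or building the string inside the thread) is what actually isolates it.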

James Edward Gray II
 
Mark Probert

Hi ..

James Edward Gray II said:
This sounds like the Threads are sharing this String resource. Did it
exist outside of the Threads when you created them? If so, did you
pass into the Thread with something like:

I don't think so. The telnet object is completely wrapped in a class and
each instance of the class is unique to each thread (it is constructed
inside the thread). There are no class variables. And the variables
leading into the @conn.write() call are all local.

This is tricky 'cause the problem is intermittent.

Thanks,

-mark.
 
Sam Roberts

Quoting (e-mail address removed), on Tue, Nov 23, 2004 at 12:27:25PM +0900:
I had a bug recently with strings being corrupted; it was because
a string was shared and was modified by another piece of code.

I don't know about your app, but perhaps you could freeze the strings
either before you pass them to telnet, or inside telnet, which would
allow you to detect the modifier, if that's what's happening.
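
As a rough illustration of the freeze idea: once a String is frozen, any in-place modification raises an exception whose backtrace points at the code that tried to change it. The sketch below uses made-up names (rogue_modifier stands in for whatever code is altering the command, find_modifier is a hypothetical wrapper). The exception class depends on the Ruby version: TypeError on 1.8, FrozenError on 2.5 and later, so the sketch rescues StandardError to cover both.

```ruby
# Hypothetical stand-in for the code that wrongly alters the command.
def rogue_modifier(string)
  string << "garbage"
end

# Hypothetical wrapper: freeze the command, run the suspect code, and
# return the exception raised by any attempt to modify it (nil if none).
def find_modifier(command)
  command.freeze
  yield command
  nil
rescue StandardError => e
  # The backtrace carries the file and line of the attempted write.
  e
end

error = find_modifier("show version\n".dup) { |c| rogue_modifier(c) }
puts error.class
puts error.backtrace.first
```

Code that only reads the frozen string keeps working unchanged, so freezing just before the call to write is a fairly low-risk way to flush out the modifier.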

Cheers,
sam
 
Bill Kelly

Hi,

Mark Probert said:
The @sock.syswrite() is the base call in all of Telnet, underlying the
print(), puts(), cmd() and so on. It is the primitive that is called to
send data to the host.

To the best of my (limited) knowledge, #send and #recv
are lower level than syswrite(). *Assuming* #send and #recv
translate in Ruby to the Berkeley socket system functions
of the same names. (I have not looked at ruby's implementation
of #syswrite and #send / #recv... so.. I could be mistaken. I'm
just judging by their name and corresponding behavior.)

Mark Probert said:
As an update, I managed to get the system to 'fail' with dump-log turned
on. The dump log records that the command is corrupted prior to it
being sent. The code flow looks like:

    puts "sending cmd -- #{c}"   # c is correct here
    @conn.write(c)               # @conn is a Telnet object --> calls Telnet.write()

    def write(string)
      length = string.length
      while 0 < length
        IO::select(nil, [@sock])
        # <--- string is bad here!
        @dumplog.log_dump('>', string[-length..-1]) if @options.has_key?("Dump_log")
        length -= @sock.syswrite(string[-length..-1])
      end
    end

So, I can only assume that one of the other threads is writing to
this string. Is this possible? Or is there some other way that the
passed string can become corrupt?

What if you try freezing "c" ? Maybe c.freeze before the
printout at the top verifying it's correct... Perhaps another
thread is unexpectedly modifying it?



HTH,

Regards,

Bill
 
James Edward Gray II

Bill Kelly said:
To the best of my (limited) knowledge, #send and #recv
are lower level than syswrite(). *Assuming* #send and #recv
translate in Ruby to the Berkeley socket system functions
of the same names. (I have not looked at ruby's implementation
of #syswrite and #send / #recv... so.. I could be mistaken. I'm
just judging by their name and corresponding behavior.)

My big confusion stems from both of these approaches. Isn't one of the
big advantages of using a thread design that you can use the higher
level IO calls without fear of blocking?

Bill Kelly said:
What if you try freezing "c" ? Maybe c.freeze before the
printout at the top verifying it's correct... Perhaps another
thread is unexpectedly modifying it?

This is really the key to the solution, whether through freezing or
not. If the String is being modified externally, you have to isolate
the chunk of your code doing it. Ruby doesn't randomly modify the
contents of your variables, we hope.

James Edward Gray II
 
