threads within a thread

Brad Tilley

I have a program that uses threads to quickly check class B networks (65,536
hosts) for public web servers. It works great. I'd like to check for other
servers as well. Basically, I've got the hosts threaded. Now I'd like to
thread the ports too, so that while the hosts are being probed concurrently,
the ports on each host are probed concurrently as well. It might look like
this:

A Thread
  host
    A Thread
      port 80
      port 443
      port 25
  host
    A Thread
      port 445
      port 139
  host
    A Thread
      ...

I can show actual code (with only threaded hosts) if that would be
helpful, but I'd rather keep it abstract and discuss how to handle threads
within a thread... hope that makes sense.
 
Gary Wright

Since you're writing this in Ruby, I have to suggest that you just write
this single-threaded. Ruby uses green threads, meaning it handles its own
threading and not the system, leading to on average decreased execution
time vs single-threaded.

If you were talking about CPU bound jobs that might be true, but
probing networks has lots of inherent I/O latency. Green threads should
be just fine for this sort of thing as you are basically waiting on
various packets to return from the probed hosts.

Gary Wright
 
Brad Tilley

Gary Wright said:
If you were talking about CPU bound jobs that might be true, but
probing networks has lots of inherent I/O latency. Green threads should
be just fine for this sort of thing as you are basically waiting on
various packets to return from the probed hosts.

Gary Wright

I've found Ruby to work quite well for this. I can process 65,536 hosts in about
15 minutes in Ruby or Python. They both take about the same amount of time.
That's probing only one port.
 
Brad Tilley

Francis Cianfrocca said:
What's the nature of the probe that you're doing? A TCP connect on a given
port? If so, look at EventMachine (on Rubyforge). You may be able to
significantly improve on your runtime.

That's the extent of it (TCP connect). I tried EventMachine, but I found plain
old threads to be simpler for me to use. I could not get my head wrapped around
EventMachine. Can't teach an old dog new tricks :)
 
Robert Klemme

I have a program that uses threads to quickly check class B networks (65,536
hosts) for public web servers. It works great. I'd like to check for other
servers as well. Basically, I've got the hosts threaded. Now I'd like to
thread the ports too, so that while the hosts are being probed concurrently,
the ports on each host are probed concurrently as well. It might look like
this:

I can show actual code (with only threaded hosts) if that would be
helpful, but I'd rather keep it abstract and discuss how to handle threads
within a thread... hope that makes sense.

Basically there is no such thing as a thread within a thread. You can
start threads from a thread - actually that's the only way since the
main program is run in a thread as well.
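
To make "starting threads from a thread" concrete, here is a minimal sketch of the layout from the original post: one thread per host, each of which starts one thread per port and waits for them. The names `port_open?` and `probe_hosts` are made up for illustration and are not from anyone's actual program; the two-second connect timeout is likewise an assumption.

```ruby
require "socket"

# Hypothetical helper (not from the thread): true if a TCP connect succeeds.
# Any failure (refused, timed out, unresolvable) is treated as "closed".
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  false
end

# One thread per host; each host thread starts one thread per port.
# Returns a hash of host => list of open ports.
def probe_hosts(hosts, ports)
  host_threads = hosts.map do |host|
    Thread.new do
      port_threads = ports.map do |port|
        Thread.new { port if port_open?(host, port) }
      end
      # Thread#value joins the thread and returns its block's result.
      [host, port_threads.map(&:value).compact]
    end
  end
  host_threads.map(&:value).to_h
end
```

Note that the port threads are not "inside" the host thread in any special sense. They are ordinary threads that the host thread happens to create and join. With 65,536 hosts and several ports each, this layout creates a very large number of threads at once.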

In your case I would probably not want one thread per port per host. The
reason is that the overhead of a thread is not insignificant, and you can
generate a huge number of threads that way.

I would rather use EventMachine (as mentioned, this would be a great
opportunity to learn it) or a fixed number of threads, as in a typical
farmer/worker scenario. Basically you create N threads and feed them
tasks via a queue. A task would be to check one port on one host. The
advantage is that you can control concurrency and find the optimal
level. Another advantage is that you save the overhead of repeatedly
creating and destroying threads.
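
A minimal sketch of that farmer/worker setup, using Ruby's built-in `Queue`, might look like the following. The names `port_open?` and `scan`, the default of 50 workers, and the two-second timeout are all assumptions for illustration, not anything from the original program.

```ruby
require "socket"

# Hypothetical helper (not from the thread): true if a TCP connect succeeds.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  false
end

# Farmer/worker: a fixed pool of threads pulls [host, port] tasks off a queue.
# Returns the list of [host, port] pairs that accepted a connection.
def scan(hosts, ports, workers: 50)
  tasks = Queue.new
  hosts.each { |host| ports.each { |port| tasks << [host, port] } }
  tasks.close # once closed and drained, pop returns nil and workers exit

  results = Queue.new
  pool = Array.new(workers) do
    Thread.new do
      while (task = tasks.pop)
        host, port = task
        results << [host, port] if port_open?(host, port)
      end
    end
  end
  pool.each(&:join)

  found = []
  found << results.pop until results.empty?
  found
end
```

Changing `workers:` is how you would experiment to find the level of concurrency that works best on a given machine and network.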

Kind regards

robert
 
Logan Capaldo

If you were talking about CPU bound jobs that might be true, but
probing networks has lots of inherent I/O latency. Green threads should
be just fine for this sort of thing as you are basically waiting on
various packets to return from the probed hosts.

Actually (appealing to authority here :) ), I believe Francis has
stated in the past that the thread overhead kills the advantage even for
IO-bound tasks (in Ruby specifically) and that a select loop is better.
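
For comparison, here is a rough sketch of what a single-threaded select loop could look like for the TCP-connect probe, using non-blocking connects and `IO.select`. It is not Francis's code or EventMachine's approach; the name `select_scan` is made up, and it assumes the target is given as an IPv4 address string and that a two-second overall timeout is acceptable.

```ruby
require "socket"

# Sketch: probe several ports on one IPv4 address from a single thread.
# Returns the sorted list of ports that accepted a connection.
def select_scan(ip, ports, timeout: 2)
  pending = {}     # socket => port, for connects still in progress
  open_ports = []

  ports.each do |port|
    sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
    begin
      result = sock.connect_nonblock(Socket.sockaddr_in(port, ip), exception: false)
      if result == :wait_writable
        pending[sock] = port # connect started, not finished yet
      else
        open_ports << port   # connected immediately
        sock.close
      end
    rescue SystemCallError
      sock.close             # refused or unreachable right away
    end
  end

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until pending.empty?
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    break if remaining <= 0
    ready = IO.select(nil, pending.keys, nil, remaining)
    break unless ready
    ready[1].each do |sock|
      port = pending.delete(sock)
      # SO_ERROR is 0 when the connect completed successfully.
      err = sock.getsockopt(Socket::SOL_SOCKET, Socket::SO_ERROR).int
      open_ports << port if err.zero?
      sock.close
    end
  end

  pending.each_key(&:close) # whatever is left timed out
  open_ports.sort
end
```

Every pending connect holds an open file descriptor, so for a large scan you would feed this hosts or ports in batches rather than all at once.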
 
Gary Wright

Actually (appealing to authority here :) ), I believe Francis has
stated in the past that the thread overhead kills the advantage even for
IO-bound tasks (in Ruby specifically) and that a select loop is better.

I'd definitely defer to Francis on this but, under the hood, Ruby
uses select to multiplex IO from multiple threads so I'd be
surprised if a single Ruby thread using Kernel#select explicitly
would do better than the builtin multiplexing.

I suppose it depends what kind of IO you are talking about. Disk
I/O is going to be faster than network IO. I would be *really*
surprised to find that Ruby's green threads overhead is high enough
to swamp the effects of network I/O.

Gary Wright
 
