Odd result when attempting to use Mechanize in parallel with Threads


Richard Conroy

I wrote a simple tool to iterate a network to try and find web servers
running on specific ports. We have a lot of devices & software with a
web UI, and I thought that this would be a handy way to find them,
and even tell what they are.

I thought this would be a handy coding project too, and a good way to
cut my teeth on Ruby threads, and build up some usage with
Mechanize.

BTW I am running this on *Windows XP*.

However my code is quite obviously executing this serially. Is there
something obviously wrong with my code below? (results after
code snippet). I am aware this could make my machine choke from
thread overkill, but I wanted to get it working in parallel first.
Perhaps Mechanize instances have some shared elements?

============================
require 'mechanize'

threads = Array.new
puts "sweep of 153.200.72.* segment http ports"
(1..254).each do |ran|
  threads << Thread.new(ran) { |r|
    agent = WWW::Mechanize.new
    agent.user_agent_alias = 'Windows Mozilla'
    ports = [80,8080]
    ports.each do |p|
      begin
        page = agent.get("http://153.200.72."+r.to_s+":"+p.to_s)
        puts "153.200.72."+r.to_s+":"+p.to_s+" - "+page.title
      rescue
        puts "153.200.72."+r.to_s+":"+p.to_s+" - NOTHING"
      end
    end
  }
  threads.each { |aThread| aThread.join }
end
============================

153.200.72.10:80 - NOTHING
153.200.72.10:8080 - NOTHING
153.200.72.11:80 - NOTHING
153.200.72.11:8080 - NOTHING
153.200.72.12:80 - NOTHING
153.200.72.12:8080 - NOTHING
153.200.72.13:80 - NOTHING
153.200.72.13:8080 - NOTHING
153.200.72.14:80 - NOTHING
153.200.72.14:8080 - NOTHING
153.200.72.15:80 - NOTHING
153.200.72.15:8080 - NOTHING
153.200.72.16:80 - NOTHING
153.200.72.16:8080 - NOTHING
153.200.72.17:80 - NOTHING
153.200.72.17:8080 - NOTHING
 

ara.t.howard

I wrote a simple tool to iterate a network to try and find web servers
running on specific ports. We have a lot of devices & software with a
web UI, and I thought that this would be a handy way to find them,
and even tell what they are.

I thought this would be a handy coding project too, and a good way to
cut my teeth on Ruby threads, and build up some usage with
Mechanize.

BTW I am running this on *Windows XP*.

However my code is quite obviously executing this serially. Is there
something obviously wrong with my code below? (results after
code snippet). I am aware this could make my machine choke from
thread overkill, but I wanted to get it working in parallel first.
Perhaps Mechanize instances have some shared elements?

============================

require 'mechanize'

threads = Array.new

puts "sweep of 153.200.72.* segment http ports"

(1..254).each do |ran|
  threads << Thread.new(ran) { |r|
    agent = WWW::Mechanize.new
    agent.user_agent_alias = 'Windows Mozilla'
    ports = [80,8080]
    ports.each do |p|
      begin
        page = agent.get("http://153.200.72."+r.to_s+":"+p.to_s)
        puts "153.200.72."+r.to_s+":"+p.to_s+" - "+page.title
      rescue
        puts "153.200.72."+r.to_s+":"+p.to_s+" - NOTHING"
      end
    end
  }
end

threads.each { |aThread| aThread.join } # THIS MUST BE OUTSIDE THE LOOP!



fyi. starting a thread, and then immediately joining it is the same as not
using a thread at all!
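
A minimal sketch of that difference, assuming a reasonably modern Ruby. It does not touch the network: `fake_probe`, `probe_serially` and `probe_in_parallel` are hypothetical names, and a short `sleep` stands in for the Mechanize request.

```ruby
require 'benchmark'

# Hypothetical stand-in for a slow HTTP probe; sleeps instead of using the network.
def fake_probe(host)
  sleep 0.1
  host
end

# Join inside the loop: each thread is awaited before the next one is started.
def probe_serially(hosts)
  hosts.each do |h|
    t = Thread.new(h) { |host| fake_probe(host) }
    t.join
  end
end

# Join after the loop: every thread is started first, then all are awaited.
def probe_in_parallel(hosts)
  threads = hosts.map { |h| Thread.new(h) { |host| fake_probe(host) } }
  threads.each(&:join)
end

hosts = (1..5).to_a
serial_time   = Benchmark.realtime { probe_serially(hosts) }
parallel_time = Benchmark.realtime { probe_in_parallel(hosts) }

puts format('join inside loop:  %.2fs', serial_time)
puts format('join outside loop: %.2fs', parallel_time)
```

With five fake probes of 0.1s each, the first figure should come out several times larger than the second, since the sleeps overlap only when the join happens after the loop.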

another fyi - threads and io (even socket io) are a deadly combination on
windows. run this on linux/mac if possible.

regards.

-a
 

Richard Conroy

threads.each { |aThread| aThread.join } # THIS MUST BE OUTSIDE THE LOOP!

fyi. starting a thread, and then immediately joining it is the same as not
using a thread at all!

Ah yes, cutting & pasting a line too high ....
another fyi - threads and io (even socket io) are a deadly combination on
windows. run this on linux/mac if possible.

Has to be windows, but this isn't mission critical code - just a
development tool that may eventually post the results to a wiki or
something. I can break this up a bit so it doesn't kill my laptop later.

Thanks. I knew it had to be a WTF.
 

Richard Conroy

Hi, Richard,

Actually, in Ruby it is the call to ".new" that makes threads
run in parallel rather than serially, and I think it can meet your
requirement. Please see the program <multithreads_ProbingHttp.rb> I post at
the end of this mail, plus the running results.
First, please note the following points: 1) The method ".new"
means "Creates and runs a new thread to execute the instructions given
in block". 2) The method ".join" means "The calling thread will suspend
execution and run the called thread. Does not return until the called
thread exits or until limit seconds have passed".
So ".new" doesn't only mean "creates"; it means both "creates" and
"runs", which is why ".new" can make child threads run in parallel. ".join"
has to wait for the called thread to exit, so it gives you the
illusion that the threads are running serially, but in fact ".join" just
wraps up the threads. It is not accurate to say that ".join"
makes threads run in parallel or serially. Rather, ".join"
serially waits for the exit of threads that might already be running
in parallel. :) :)
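
The two points above can be checked with a small sketch (an illustration only, not the multithreads_ProbingHttp.rb program mentioned in the mail). The variable names are made up for the example, and a `sleep` stands in for a slow request. The child thread signals through a `Queue` that it is already running before any `join` is called, and `join` with a limit returns `nil` if the thread has not exited in time.

```ruby
started = Queue.new          # thread-safe signal sent by the child thread

t = Thread.new do
  started << true            # executed as soon as the new thread gets to run
  sleep 0.2                  # stand-in for a slow HTTP request
  :done                      # the block's last value becomes the thread's value
end

started.pop                  # returns only once the child has begun; no join so far
early = t.join(0.01)         # limit expires while the child is still sleeping: nil
final = t.join               # blocks until the child exits, then returns the thread

puts "join(0.01) returned #{early.inspect}"
puts "thread value: #{t.value.inspect}"
```

So the thread does its work from the moment `Thread.new` returns; `join` only decides how long the caller waits for it.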

This is what I noticed. When I join up 5 threads at a time, the output jumps
up in batches of 5. This does slow down the algorithm, especially
if there are a lot of positive results - most of these threads are
waiting for the http connection to time out.

But I run this thing at night anyway.

As an aside, I have had difficulty getting more than ~ 5 joined threads
to work at all in windows.
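
One alternative to joining in batches of 5 is a small fixed pool of worker threads that pull targets from a shared `Queue`. This is a different technique from the batching described above, not what the script in this thread does. A slow host then holds up only one worker rather than a whole batch. The following is a sketch under assumptions: `check` and `sweep` are hypothetical names, `check` sleeps instead of calling Mechanize, and the pool size of 5 is arbitrary.

```ruby
# Hypothetical stand-in for the Mechanize request; always reports NOTHING.
def check(host, port)
  sleep 0.01
  "#{host}:#{port} - NOTHING"
end

# Run check over all targets using a fixed number of worker threads.
def sweep(targets, worker_count)
  jobs = Queue.new
  targets.each { |target| jobs << target }
  results = Queue.new

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        begin
          host, port = jobs.pop(true)   # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break                         # no work left, so this worker exits
        end
        results << check(host, port)
      end
    end
  end

  workers.each(&:join)                  # join once, after all workers are running
  Array.new(results.size) { results.pop }
end

# 10 hosts x 2 ports, mirroring the address scheme used in the thread.
targets = (1..10).flat_map { |r| [80, 8080].map { |p| ["153.200.72.#{r}", p] } }
results = sweep(targets, 5)
puts results.sort
```

At most `worker_count` probes are in flight at any moment, so the thread count stays small no matter how many addresses are swept.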
 
