Ruby Threads question


Cd Cd

The following code is a slight modification of the code found at the top
of page 136 in the book "Programming Ruby: The Pragmatic Programmers'
Guide" by Dave Thomas with Chad Fowler and Andy Hunt:

#!/usr/bin/ruby -w

require 'net/http'

pages = %w(www.google.com www.slashdot.org www.mit.edu)
threads = []

for page_to_fetch in pages
  threads << Thread.new(page_to_fetch) do |url|
    h = Net::HTTP.new(url, 80)
    puts "Fetching: #{url}"
    resp = h.get('/', nil)
    puts "Got #{url}: #{resp.message}"
  end
end

threads.each {|thr| thr.join }

How come when I run this code, the following output goes line by line?
Fetching: www.google.com
Fetching: www.slashdot.org
Fetching: www.mit.edu
Got www.slashdot.org: Moved Permanently
Got www.google.com: OK
Got www.mit.edu: OK


Shouldn't the output be all done at once since the code is running in
parallel?
 

Rick DeNatale

Shouldn't the output be all done at once since the code is running in
parallel?

I'm not sure what you mean by "all done at once".

If the code weren't running in parallel threads, then I'd expect to see:

Fetching: www.google.com
Got www.google.com: OK
Fetching: www.slashdot.org
Got www.slashdot.org: Moved Permanently
Fetching: www.mit.edu
Got www.mit.edu: OK

What's really happening is something like:

Thread 1                          Thread 2                            Thread 3
opens HTTP
prints Fetching: www.google.com
issues get
                                  opens HTTP
                                  prints Fetching: www.slashdot.org
                                  issues get
                                                                      prints Fetching: www.mit.edu
                                                                      issues get
                                  get finishes
                                  prints Got www.slashdot.org:
                                    Moved Permanently
get finishes
prints Got www.google.com: OK
                                                                      get finishes
                                                                      prints Got www.mit.edu: OK

So the output order is evidence that there is indeed parallel activity.
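
To make the two orderings above concrete without touching the network, here is a minimal sketch that swaps the real h.get call for a sleep. The DELAYS table, fake_fetch, run_sequential and run_threaded are names invented for this illustration, and the delay values are made up; only the shape of the output matters.

```ruby
# Made-up latencies in seconds, standing in for real response times.
DELAYS = {
  "www.google.com"   => 0.2,
  "www.slashdot.org" => 0.1,
  "www.mit.edu"      => 0.3
}

# Hypothetical stand-in for the fetch in the original script:
# log, block for a while (as h.get would), then log again.
def fake_fetch(url, log)
  log << "Fetching: #{url}"
  sleep DELAYS[url]
  log << "Got #{url}"
end

# One fetch after another: each "Got" directly follows its "Fetching".
def run_sequential
  log = Queue.new
  DELAYS.each_key { |url| fake_fetch(url, log) }
  Array.new(log.size) { log.pop }
end

# One thread per fetch: every thread blocks in sleep, so the other
# threads get to print their "Fetching" line before any "Got" appears.
def run_threaded
  log = Queue.new
  threads = DELAYS.keys.map do |url|
    Thread.new(url) { |u| fake_fetch(u, log) }
  end
  threads.each { |thr| thr.join }
  Array.new(log.size) { log.pop }
end

puts run_sequential
puts "---"
puts run_threaded
```

The sequential run pairs each "Fetching" with its "Got"; in the threaded run the "Got" lines should come out in order of the simulated delay rather than in start order, which is the same pattern as in the real output.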
 

ara.t.howard

Shouldn't the output be all done at once since the code is running in
parallel?

what would that look like exactly though? ;-)

seriously, there is more than one reason that cannot happen.

1) the console being printed to can only allow one thing to write to
it at a time. generally speaking the console is line buffered up to
some limit so, as long as a program writes a chunk of chars less than
that limit, the lines will not appear intermixed. things get more
complicated when programs are not writing to the console, but to a
file instead.

2) threads are never run in parallel from the perspective of the
computer: the cpu can only run one program at once. it simply does
very intelligent switching very quickly to make you think it's doing
more than one thing ;-). with ruby's threads, which are known as
'green' threads, ruby itself does this switching for you. with
'native' threads the switching is done by the operating system. in
either case the concept of 'parallel' is really from the perspective
of the programmer. this isn't strictly true as sometimes a program
might be able to write to disk or the network but not need the cpu -
in that case it might do two things at once. the central concept is
basically that a thread is a programming abstraction for
*programmers* to imagine themselves getting more done. sometimes
it's a useful one.

now, in a multi-cpu machine this gets even murkier - threads (native
ones not green ones) are actually very close to a whole process and
the operating system may in fact allow two bits of your program to run
on two cpus. in the end you must assume that only one piece of a
program is using a given piece of computer hardware at once, since we
simply cannot defy physics, but there are various approaches and
abstractions that help us let the computer help us, like threads, by
structuring our program such that the operating system *might* be
able to run two bits of our code on two bits of hardware at once.

3) threads are evil and best understood via meditation - not actual
thinking.
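
as a rough illustration of point 1, here is a minimal sketch of one way to keep lines from several threads from intermixing: build each line completely, then write it in a single call while holding a lock. the write_lines helper and the labels are invented for this example, and a StringIO stands in for the console so the result can be inspected.

```ruby
require 'stringio'

# Hypothetical helper: several threads write whole lines to one IO.
# Each line is built first and written in one call under a Mutex,
# so no other thread can write in the middle of it.
def write_lines(io, labels, per_thread)
  lock = Mutex.new
  threads = labels.map do |label|
    Thread.new do
      per_thread.times do |i|
        line = "#{label}: line #{i}\n"
        lock.synchronize { io.write(line) }
      end
    end
  end
  threads.each { |thr| thr.join }
  io
end

out = write_lines(StringIO.new, %w(a b c), 5)
puts out.string
```

the order of the lines can still vary from run to run; the lock only guarantees that each line arrives in one piece.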

kind regards.

a @ http://drawohara.com/
 

Charles Oliver Nutter

Cd said:
How come when I run this code, the following output goes line by line?
Fetching: www.google.com
Fetching: www.slashdot.org
Fetching: www.mit.edu
Got www.slashdot.org: Moved Permanently
Got www.google.com: OK
Got www.mit.edu: OK

You could probably explain this as follows:

Thread 1 goes for Google. Waiting for response is a blocking call, so it
yields.
Thread 2 goes for Slashdot. Waiting for response is a blocking call, so
it yields.
Thread 3 goes for MIT. Waiting for response is a blocking call, so it
yields.

From there the order largely depends on which response comes back first
and which thread gets scheduled next. Given that Ruby uses only green
threads that must generally reach an appropriate point to switch, this
seems perfectly logical.

I would also expect the result would remain mostly the same for those
first three lines, and mostly random for the next three lines, as long
as you run under Ruby 1.8. Other implementations that use all native
threads will be largely unpredictable for any sequence. When I tested
this code under JRuby, it produced a different sequence every time (and
required joining on the threads so they'd run to completion).
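
On the point about joining: the sketch below shows why the final threads.each {|thr| thr.join } line in the original script matters. The worker thread and the results array are made up for this example, and a short sleep stands in for the slow network call. Until join returns there is no guarantee the thread's work has happened, and a script that reaches its end without joining may exit before the thread finishes.

```ruby
results = []

# Hypothetical worker: pretends to do something slow, then records it.
worker = Thread.new do
  sleep 0.05
  results << :done
end

# At this point the worker has very likely not finished yet.
puts "before join: #{results.inspect}"

# join blocks the main thread until the worker has run to completion.
worker.join
puts "after join:  #{results.inspect}"
puts "worker alive? #{worker.alive?}"
```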

- Charlie
 
