net/http performance question

Bjorn Borud

I have the following code:

def fetch_into(uri, name)
  http = Net::HTTP.new(uri.host, uri.port)
  req = Net::HTTP::Get.new(uri.path)
  req.basic_auth(USERNAME, PASSWORD)
  start_time = Time.now.to_f
  File.open(name, "w") do |f|
    print " - fetching #{name}"
    http.request(req) do |result|
      f.write(result.body)
      f.close()
      elapsed = Time.now.to_f - start_time
      bps = (result.body.length / elapsed) / 1024
      printf ", at %7.2f kbps\n", bps
    end
  end
end

This is run in a very simple loop that doesn't do anything that
requires much CPU. The files downloaded are about 10 MB, and since the
connection is not that fast (about 15 Mbit/sec) I would expect this to
consume little CPU, but in fact it *gobbles* up CPU. On a 2 GHz AMD it
eats 65% CPU on average (the job runs for hours on end).

Where are the cycles going? I assumed this would be a somewhat
suboptimal way of doing it, since there might be some buffer resizing
in there, but not *that* bad.

Anyone care to shed some light on this?

(I would assume there is a way of performing an HTTP request such that
you can read chunks of the response body at a time?)

-Bjørn

Jan Svitok

Hi,
there seems to be Net::HTTPResponse#read_body, which can provide the
chunks as they come (not tested, copied from the docs):

# using iterator
http.request_get('/index.html') {|res|
  res.read_body do |segment|
    print segment
  end
}

BTW, you could move the File.open call inside the request block,
saving the f.close() call. You could also try fiddling with the GC:
GC.disable while receiving might help (or not). Don't forget to
re-enable it between requests.
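One way to keep the disable/enable calls paired even if a request raises
is an ensure block. A minimal sketch (the helper name `without_gc` is
just illustrative, not anything from Net::HTTP):

```ruby
# Run a block with GC disabled, always re-enabling it afterwards,
# even if the block raises an exception.
def without_gc
  GC.disable
  yield
ensure
  GC.enable
end

# usage: without_gc { http.request(req) { |result| ... } }
```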

So, something like:

def fetch_into(uri, name)
  http = Net::HTTP.new(uri.host, uri.port)
  req = Net::HTTP::Get.new(uri.path)
  req.basic_auth(USERNAME, PASSWORD)
  start_time = Time.now.to_f
  print " - fetching #{name}"
  # GC.disable # optional
  http.request(req) do |result|
    bytes = 0
    File.open(name, "w") do |f|
      result.read_body do |segment|
        bytes += segment.length
        f.write(segment)
      end
    end
    elapsed = Time.now.to_f - start_time
    # result.body isn't the full string after streaming with read_body,
    # so count the bytes as they are written instead
    bps = (bytes / elapsed) / 1024
    printf ", at %7.2f kbps\n", bps
  end
  # GC.enable
end