Reading web service responses in chunks?

Ben Johnson

I am making a web service call and getting back very large responses
(sometimes 5 GB). Reading such a response eats all of my RAM. I need
to read the response in chunks so I can store it in a file, and I have no
idea how to do this. Any help is greatly appreciated.

Here is my code:


require 'soap/wsdlDriver'

soap = SOAP::WSDLDriverFactory.new("some url").create_rpc_driver
soap.wiredump_file_base = "soapfile"

response = soap.GetWhatever(:whatever => "whatever")


Ironically, when reading the response it doesn't dump it into the file
until it has the entire response in memory, and that is what's killing my
server. Is there a more efficient way of doing this?

Thanks for your help and time.
 

Brian Candler

You just want to get the whole response into a file? Then I'd suggest:

1. build the SOAP XML request as a string

2. connect to the server using HTTP

3. post the XML you built in step 1

4. read the response as a stream and write it to a file.
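
For step 1, something like the following might do. It is only a sketch: the
operation name, parameter names and namespace are placeholders, and the real
envelope layout depends on the service's WSDL, so compare it against a
request that is known to work. build_soap_request is a made-up helper, not
part of soap4r.

```ruby
# Sketch only: builds a SOAP 1.1 envelope as a plain string.
# operation, params and namespace are hypothetical values that must be
# taken from the real service's WSDL.
def build_soap_request(operation, params, namespace)
  # Escape &, < and > in values so the result stays well-formed XML.
  fields = params.map do |name, value|
    "<#{name}>#{value.to_s.encode(xml: :text)}</#{name}>"
  end.join

  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <#{operation} xmlns="#{namespace}">#{fields}</#{operation}>
      </soap:Body>
    </soap:Envelope>
  XML
end

puts build_soap_request('GetWhatever', { 'whatever' => 'whatever' }, 'urn:example')
```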

To get the response as a stream, you can probably still use Net::HTTP for
this. If the response from the server is chunked (use tcpdump to check
this), you can call HTTPResponse#read_body with a block, and you will get
the chunks passed to you in turn. The following example is given in the
documentation:

# using block
http.request_post('/cgi-bin/nice.rb', 'datadatadata...') {|response|
  p response.status
  p response['content-type']
  response.read_body do |str|   # read body now
    print str
  end
}
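
Adapted to the original problem, the same pattern can write each piece
straight to a file instead of printing it. This is a rough sketch, not tested
against a real SOAP service: post_and_save is a made-up helper, and the
Content-Type and SOAPAction values are assumptions that depend on the
service.

```ruby
require 'net/http'
require 'uri'

# Sketch only: POST an XML string and stream the response body to a file.
# Returns the number of bytes written. soap_action is a hypothetical value.
def post_and_save(url, xml, path, soap_action = '')
  uri = URI(url)
  bytes = 0
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    request = Net::HTTP::Post.new(uri.request_uri)
    request['Content-Type'] = 'text/xml; charset=utf-8'
    request['SOAPAction'] = %("#{soap_action}")
    request.body = xml

    # With a block, the body has not been read yet when the block runs.
    http.request(request) do |response|
      response.value # raises unless the status is 2xx
      File.open(path, 'wb') do |file|
        # Each piece is written out and can then be garbage collected.
        response.read_body { |chunk| bytes += file.write(chunk) }
      end
    end
  end
  bytes
end
```

Only one piece of the body is held in memory at a time, so memory use should
stay roughly flat regardless of the response size.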

If the response is not chunked, then just pull out the @socket from the
object and read(65536) it in a loop.
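
The read loop itself could look roughly like this. The sketch takes any
IO-like object rather than Net::HTTP's @socket, since that is an
undocumented internal and may not behave exactly like a plain IO;
copy_in_blocks is a made-up helper name.

```ruby
# Sketch only: copy from one IO-like object to another in fixed-size
# blocks, so at most block_size bytes are held in memory at a time.
def copy_in_blocks(source, dest, block_size = 65_536)
  total = 0
  # IO#read(n) returns nil at end of stream, which ends the loop.
  while (block = source.read(block_size))
    total += dest.write(block)
  end
  total
end
```

For plain IO objects, IO.copy_stream(source, dest) in core Ruby does the same
job without the explicit loop.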

If you want to *parse* the response on the fly, then you could use REXML in
stream parsing mode: see
http://www.germane-software.com/software/XML/rexml/docs/tutorial.html and
scroll down to "Stream Parsing".

You then may need an IO.pipe or similar object which accepts the HTTP chunks
on one side and gives a readable stream on the other.
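
Here is a rough sketch of that idea, assuming the chunks arrive from some
enumerable source (in practice they would come from the read_body block).
RecordCounter, parse_chunks and the <record> element name are invented for
illustration.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: counts <record> elements without building a tree.
class RecordCounter
  include REXML::StreamListener
  attr_reader :count

  def initialize
    @count = 0
  end

  def tag_start(name, _attrs)
    @count += 1 if name == 'record'
  end
end

# Sketch only: a background thread writes each chunk into one end of a
# pipe while REXML reads and parses from the other end.
def parse_chunks(chunks, listener)
  reader, writer = IO.pipe
  feeder = Thread.new do
    begin
      chunks.each { |chunk| writer.write(chunk) }
    ensure
      writer.close # end of input for the parser
    end
  end
  REXML::Document.parse_stream(reader, listener)
  feeder.join
  listener
ensure
  reader.close if reader && !reader.closed?
end

# Chunk boundaries do not have to line up with tag boundaries.
counter = parse_chunks(['<list><rec', 'ord/><record>a</rec', 'ord></list>'],
                       RecordCounter.new)
puts counter.count
```

The feeding thread is needed because a pipe has a limited buffer: writing and
parsing in the same thread would block once the buffer fills up.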

But this may still be a problem if your 5GB response consists mainly of a
single element, <some-tag>...5GB of data...</some-tag>. I'm not sure if
REXML will call text() with blocks, or will try to slurp the whole 5GB in
before calling text() once.
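
One way to find out is to measure it on a smaller document. The sketch below
records how much data each text() call receives; TextSizes and
text_call_sizes are made-up names, and the result may differ between REXML
versions, so I have not stated an expected outcome.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'
require 'stringio'

# Hypothetical listener: remembers the size of every text() callback.
class TextSizes
  include REXML::StreamListener
  attr_reader :sizes

  def initialize
    @sizes = []
  end

  def text(data)
    @sizes << data.bytesize
  end
end

# Sketch only: stream-parse an IO and report the text() call sizes.
def text_call_sizes(xml_io)
  listener = TextSizes.new
  REXML::Document.parse_stream(xml_io, listener)
  listener.sizes
end

# 200,000 bytes of text inside a single element, as a scaled-down test.
sizes = text_call_sizes(StringIO.new("<some-tag>#{'x' * 200_000}</some-tag>"))
puts "text() called #{sizes.size} time(s), largest piece #{sizes.max} bytes"
```

If this reports a single call carrying all the text, the parser is buffering
the whole element, and a 5 GB element would not fit in memory this way.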

HTH,

Brian.
 
