What's the Best Way to Mimic an HTTP Request?

  • Thread starter Daniel Miessler

Daniel Miessler

I'm trying to write a tool that will take a domain as an argument and
make a request to http://onsamehost.com and then capture the list of
domains that share that same IP. I want to parse out those domains and
put them into an array that I can print to a file later.

Here's the code I'm trying to use:

--
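require 'net/http'
require 'uri'

PATH = '/query.jsp'
USERAGENT = 'Opera'
HOST = 'onsamehost.com'

http = Net::HTTP.new(HOST, 80)

# On current Ruby versions get returns a single response object;
# the body is read from that object rather than returned separately.
resp = http.get(PATH, { 'User-Agent' => USERAGENT })

puts resp
puts resp.body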
--

The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn't happen when I
make the request from a regular browser.

So I sniffed the regular request with wireshark, and a browser sends a
bunch of additional headers when it makes the request. Cookies,
referrer, etc.

Are any of these headers more necessary than others, and is there a
preferred way to send the headers using Ruby?

Thanks for any thoughts...
 
James Herdman

Is there a Ruby front end for Curl?

James
 
Hassan Schroeder

Daniel said:
The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn't happen when I
make the request from a regular browser.

Actually, it does -- you just don't see it.

When you request e.g. `http://example.com` most servers will send
a redirect to the default page, e.g. `http://example.com/index.html`.

You need to either handle it or pass the default page's full URL.

HTH,
 
Michael Libby

Daniel said:
The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn't happen when I
make the request from a regular browser.

That site makes heavy use of redirects. Watch closely while running
queries or check your browser history.

Daniel said:
So I sniffed the regular request with wireshark, and a browser sends a
bunch of additional headers when it makes the request. Cookies,
referrer, etc.

Are any of these headers more necessary than others, and is there a
preferred way to send the headers using Ruby?

Headers probably have no effect here.

What you probably want is code like this:

require 'net/http'
require 'uri'

def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0

  response = Net::HTTP.get_response(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess then response
  # Request the URL named in the Location header, one level deeper.
  when Net::HTTPRedirection then fetch(response['location'], limit - 1)
  else
    # Raises an exception for any other (non-2xx, non-3xx) response.
    response.error!
  end
end

resp = fetch('http://www.ruby-lang.org')
puts resp.body

(from http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/index.html --
"Following Redirection")

regards,
Michael Libby
 
Daniel Miessler

Thanks, much, Michael. Unfortunately I'm not quite tracking on why that
was necessary. It just seems a bit elaborate given what I thought was a
simple problem.

But I totally appreciate it...I just wish it were something simpler.
 
Michael Libby

Daniel said:
Thanks, much, Michael. Unfortunately I'm not quite tracking on why that
was necessary. It just seems a bit elaborate given what I thought was a
simple problem.

But I totally appreciate it...I just wish it were something simpler.

The site you're hitting makes heavy use of redirects (and not really
for their intended purpose). What this means is that you submit your
request for a given URL and the server responds with a redirect and a
new URL. If you are working in a browser, your browser automatically
requests that URL, and the server again responds with a redirect and a
new URL. Again, a web browser handles requesting that next URL
automatically. This URL is the actual results page with the data you
want. It's the web site making you jump through hoops to get where you
want to go.

Net::HTTP does not have a built-in facility for following redirects
the way your browser does. So you have to write code to follow
redirects by submitting new requests until you get to one that is not
a redirect, which is what the fetch() method from the Net::HTTP
example does.

-Michael
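
The loop described above can also be written iteratively. This is only a
minimal sketch under a few assumptions: the helper names
resolve_location and fetch_following_redirects are made up for
illustration, the default limit of 5 is arbitrary, and nothing here is
specific to how onsamehost.com actually behaves. It differs from the
documentation example in two ways: it passes your own headers along on
every hop, and it resolves a relative Location value against the URL
that produced it.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: resolve a Location header against the URL that
# produced it, so a relative redirect target also works.
def resolve_location(current_uri, location)
  URI.join(current_uri.to_s, location)
end

# Hypothetical helper: GET uri_str with the given headers, following at
# most `limit` requests and returning the first non-redirect response.
def fetch_following_redirects(uri_str, headers = {}, limit = 5)
  uri = URI.parse(uri_str)

  limit.times do
    response = Net::HTTP.start(uri.host, uri.port,
                               use_ssl: uri.scheme == 'https') do |http|
      http.get(uri.request_uri, headers)
    end

    # Anything that is not a redirect with a Location header is final.
    unless response.is_a?(Net::HTTPRedirection) && response['location']
      return response
    end

    uri = resolve_location(uri, response['location'])
  end

  # You should choose a better exception.
  raise ArgumentError, 'HTTP redirect too deep'
end
```

Calling it would look something like
fetch_following_redirects('http://onsamehost.com/query.jsp', 'User-Agent' => 'Opera'),
after which resp.body holds the page that the last redirect pointed to.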
 
Daniel Miessler

Michael said:
The site you're hitting makes heavy use of redirects (and not really
for their intended purpose). What this means is that you submit your
request for a given URL and the server responds with a redirect and a
new URL. If you are working in a browser, your browser automatically
requests that URL, and the server again responds with a redirect and a
new URL. Again, a web browser handles requesting that next URL
automatically. This URL is the actual results page with the data you
want. It's the web site making you jump through hoops to get where you
want to go.

Ah, I see.

You appear, by my estimation, to rock.

: Daniel :
 
Daniel Miessler

Avdi said:
You may want to look into using Mechanize rather than straight-up
Net::HTTP.

Mechanize for Ruby? Interesting. I didn't know Ruby had an
implementation. Thanks, Avdi.
 
Uwe Petschke

Daniel said:
The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn't happen when I
make the request from a regular browser.

So I sniffed the regular request with wireshark, and a browser sends a
bunch of additional headers when it makes the request. Cookies,
referrer, etc.

Are any of these headers more necessary than others, and is there a
preferred way to send the headers using Ruby?

We have had similar issues where we didn't see a redirect when sniffing
the browser, but our own code got one. The reason was HTTP/1.1.
HTTP/1.1 requires the request to name the host you expect to be talking
to, in a Host header (as more than one virtual host may be served by one
server):

  GET / HTTP/1.1
  Host: www.apache.org

(see http://www.apacheweek.com/features/http11 for reference)

Hope that helps in avoiding the redirect ;-)

Uwe
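
On the original question of how to send headers from Ruby, here is a
small sketch, not a definitive recipe. It assumes a reasonably recent
Ruby (2.0 or later), where Net::HTTP::Get.new accepts a URI object and
fills in the Host header from it. The helper name build_request and the
Referer value are made up for illustration. Nothing is sent over the
network; the request is only built and its headers printed.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build a GET request carrying extra headers.
# Passing a URI (rather than a bare path) lets Net::HTTP derive Host.
def build_request(uri, headers = {})
  Net::HTTP::Get.new(uri, headers)
end

uri = URI.parse('http://onsamehost.com/query.jsp')
request = build_request(uri,
                        'User-Agent' => 'Opera',
                        'Referer'    => 'http://onsamehost.com/')

# Show every header the request would carry, including the defaults
# that Net::HTTP adds on its own.
request.each_header { |name, value| puts "#{name}: #{value}" }
```

To actually send it, something like
Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
should do. Note that a request sent through Net::HTTP gets a Host header
even if you never set one, so a missing Host is unlikely to be the cause
of the redirect here.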
 
