Faking the referring page with Mechanize

ehudros

Hi,
I am trying to access a certain web page from my server, but I want the
referring page to be a different one.
It seems that the get method accepts both a url and a "ref" parameter
that is supposed to be the referring page, but when I try to send
that url as follows:
agent.get(url_to_get, "http://www.mininova.org")
I get the following error:
undefined method `uri' for "http://www.mininova.org":String.

Does anyone know the correct way to do this?

Thanks!
Ehud
 
ehudros

Konrad said:
Quoth Ehud Rosenberg:

What you need to do is create a Mechanize::URI object (or something like
that) from the string "http://www.mininova.org", and call #get() with that.

HTH,

Hi Konrad,
Thanks for the quick reply :)
I managed to get around that error by doing the following:
ref = WWW::Mechanize::Page.new(URI.parse("http://www.mininova.org"),
{'content-type'=>'text/html'})
doc = agent.get(url_to_get, ref)

Problem is, when I sniff the packets it seems that the referring url is
ignored and it is still set to localhost:3000. I've also tried
specifying a nil page (as I've seen in the mechanize get code itself)
but to no avail...

Any help would be appreciated.
BTW - removing the referrer altogether would also work for me, although
it's a less preferred option.

Thanks!
 
Cédric Finance

Note: parts of this message were removed by the gateway to make it a legal Usenet post.

I tried it and it seems that the header is correct.

require 'mechanize'
require 'logger'
logger = Logger.new $stdout
agent = WWW::Mechanize.new
agent.log = logger
ref = WWW::Mechanize::page.new("http://www.mininova.org',{ 'content-type' =>
'text/html' })
agent.get url, ref

output from logger ==>
I, [2007-11-07T14:34:06.401219 #23290] INFO -- : Net::HTTP::Get: /
D, [2007-11-07T14:34:06.401585 #23290] DEBUG -- : request-header: accept-language => en-us,en;q0.5
D, [2007-11-07T14:34:06.401664 #23290] DEBUG -- : request-header: accept => */*
D, [2007-11-07T14:34:06.401733 #23290] DEBUG -- : request-header: accept-encoding => gzip,identity
D, [2007-11-07T14:34:06.401801 #23290] DEBUG -- : request-header: user-agent => WWW-Mechanize/0.6.5 (http://rubyforge.org/projects/mechanize/)
D, [2007-11-07T14:34:06.401869 #23290] DEBUG -- : request-header: referer => http://www.mininova.org
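
For readers who want to check an outgoing referer header without a packet sniffer, here is a minimal sketch. It uses Ruby's standard Net::HTTP instead of Mechanize, so it is a stand-in technique and not what was run in this thread. The target URL is a placeholder, and the request is only built and inspected, never sent.

```ruby
require 'net/http'
require 'uri'

# Placeholder target (an assumption); mininova.org is the referrer from the thread.
target  = URI.parse('http://example.com/')
referer = 'http://www.mininova.org'

# Build the GET request and set the Referer header explicitly.
request = Net::HTTP::Get.new(target.request_uri)
request['Referer'] = referer

# Print the headers that would go out with this request.
request.each_header { |name, value| puts "#{name} => #{value}" }
```

Header lookups on the request object are case-insensitive, so request['referer'] returns the same value.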
 
Ehud Rosenberg

Wow, that's strange.
Thanks for taking the time to check this, I'll try and run it again to
see if I can figure out what I'm doing wrong.

Thanks :)
 
Ehud Rosenberg

In the following line:
ref = WWW::Mechanize::page.new("http://www.mininova.org',{'content-type'=>'text/html'})
the second apostrophe after mininova.org does not match the first one ("
vs. ').

When I try to run it with both sides enclosed in double quotes, I get
a message saying:
NoMethodError: undefined method `path' for
"http://www.mininova.org":String

That's why I used the URI.parse method before.
This is really strange; are you sure that's the exact syntax you used?

Thanks!
Ehud
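
The NoMethodError above comes down to a String not having the methods a parsed URI has. A small sketch using only Ruby's standard uri library (the variable names are illustrative, not from Mechanize):

```ruby
require 'uri'

referer_string = "http://www.mininova.org"
referer_uri    = URI.parse(referer_string)

# A plain String has no #path method, which matches the error message.
string_has_path = referer_string.respond_to?(:path)

# A parsed URI object does respond to #path, #host and so on.
uri_has_path = referer_uri.respond_to?(:path)

puts string_has_path   # false
puts uri_has_path      # true
puts referer_uri.host  # www.mininova.org
```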
 
Ehud Rosenberg

OK, I'm stupid :)
I don't need Mechanize to fake the referring page at all; I need Rails
itself to do it, in a redirect_to statement.
The flow is as follows:
a. Mechanize retrieves a page and scrapes it for a url
b. a redirect_to to that url is sent back to the client
c. the client goes and fetches the url (only right now he gets redirected,
since his referrer is wrong).

I am trying to do the following to set the referrer:
request.env['HTTP_REFERER'] = 'http://www.something.com'
redirect_to "http://www.site.com"

but when I sniff the headers, it's still set to my own site...

Any ideas?
Thanks!
 
Cédric Finance

#irb
irb(main):001:0> require 'mechanize'
=> true
irb(main):002:0> require 'logger'
=> true
irb(main):003:0> logger = Logger.new $stdout ; ""
=> ""
irb(main):004:0> agent = WWW::Mechanize.new ; ""
=> ""
irb(main):005:0> agent.log = logger ; ""
=> ""
irb(main):006:0> ref = WWW::Mechanize::Page.new("http://www.mininova.org", { 'content-type' => 'text/html' }) ; ""
=> ""
irb(main):007:0> agent.get "http://google.com", ref ; ""
I, [2007-11-08T19:25:06.144484 #21091] INFO -- : Net::HTTP::Get: /
D, [2007-11-08T19:25:06.144937 #21091] DEBUG -- : request-header: accept-language => en-us,en;q0.5
D, [2007-11-08T19:25:06.145032 #21091] DEBUG -- : request-header: accept => */*
D, [2007-11-08T19:25:06.145106 #21091] DEBUG -- : request-header: accept-encoding => gzip,identity
D, [2007-11-08T19:25:06.145178 #21091] DEBUG -- : request-header: user-agent => WWW-Mechanize/0.6.5 (http://rubyforge.org/projects/mechanize/)
D, [2007-11-08T19:25:06.145253 #21091] DEBUG -- : request-header: referer => http://www.mininova.org


Which version of mechanize do you use?
 
Ehud Rosenberg

Which version of mechanize do you use?
0.6.10.

But as it seems, I don't need to set the referring page with mechanize
but with rails itself...
 
Cédric Finance

I don't think that you can do what you want this way: the referer is set by
your browser, not by the website you are browsing. (Moreover, you are changing
the referer of the request, not of the response.) And I don't see why a server
response would have a referer header.
If you want to load a page with another referer from your Rails app, try to
use some JavaScript to change the referer of the browser, or make an Ajax page
request with the right referer (I think it should be possible to set the header
from the Ajax request).
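
To illustrate the point about requests versus responses, here is a rough sketch in plain Ruby. It builds a Rack-style [status, headers, body] triple by hand; build_redirect is a hypothetical helper, and neither Rails nor Rack is loaded. Overwriting HTTP_REFERER only changes the server's own copy of the incoming request, while the redirect that goes back to the browser carries just a status and a Location header.

```ruby
# Hypothetical helper: builds a Rack-style [status, headers, body] triple
# for a redirect. Plain Ruby only; this is not the Rails redirect_to code.
def build_redirect(env, location)
  # env describes the incoming request; nothing from it is copied
  # into the response.
  [302, { 'Location' => location }, []]
end

# The incoming request environment, as the server sees it.
env = { 'HTTP_REFERER' => 'http://localhost:3000/' }

# This only changes the server's local view of the request.
env['HTTP_REFERER'] = 'http://www.something.com'

status, headers, _body = build_redirect(env, 'http://www.site.com')
puts status
puts headers.inspect
```

The browser then issues the follow-up request on its own and decides for itself which referer to send.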
 
Ehud Rosenberg

Cédric Finance said:
I don't think that you can do what you want this way: the referer is set by
your browser, not by the website you are browsing. (Moreover, you are changing
the referer of the request, not of the response.) And I don't see why a server
response would have a referer header.
If you want to load a page with another referer from your Rails app, try to
use some JavaScript to change the referer of the browser, or make an Ajax page
request with the right referer (I think it should be possible to set the header
from the Ajax request).

Yes, it has finally dawned on me that what I'm trying to do is not
really possible this way. I doubt it can be done using Ajax, though it's
worth a bit of research.
Anyway, thanks for putting me on the right track.
 
Cédric Finance

The XMLHttpRequest object has a setRequestHeader method, so we can change
the headers of the request, but I'm not sure that we can make an
XMLHttpRequest to a different domain than the current website.
And in JavaScript, document.referrer is read-only.
 
