http-proxy in Ruby?

Michael Schuerig

I'm thinking of implementing a http-proxy in Ruby that processes the
retrieved HTML before passing it on. Ideally, I'd like to rely on a
small existing framework or example code that does most of the work for
me. Does anything like that exist?

Michael
 
Daniel Berger

Michael said:
I'm thinking of implementing a http-proxy in Ruby that processes the
retrieved HTML before passing it on. Ideally, I'd like to rely on a
small existing framework or example code that does most of the work for
me. Does anything like that exist?

Michael

There's an httpproxy.rb as part of the webrick library. Will that work
for you?

Regards,

Dan
 
Tanner Burson

Michael said:
I'm thinking of implementing a http-proxy in Ruby that processes the
retrieved HTML before passing it on. Ideally, I'd like to rely on a
small existing framework or example code that does most of the work for
me. Does anything like that exist?

I've attempted the same thing and found there is very little base to
work from. You can take a look at WEBrick's httpproxy.rb but I found
it hard to determine where I would place my "hooks" to reprocess the
content. I've got a partially functional proxy that I wrote from the
ground up, but it has issues displaying certain pages. If you're
interested I can get the code up somewhere that it can be seen.


--
===Tanner Burson===
(e-mail address removed)
http://tannerburson.com <---Might even work one day...
 
Michael Schuerig

Daniel said:
There's an httpproxy.rb as part of the webrick library. Will that
work for you?

Thanks for pointing this out. It might do what I need, I'll have a
closer look.

Michael
 
Paul Battley

Tanner said:
I've attempted the same thing and found there is very little base to
work from. You can take a look at WEBrick's httpproxy.rb but I found
it hard to determine where I would place my "hooks" to reprocess the
content. I've got a partially functional proxy that I wrote from the
ground up, but it has issues displaying certain pages. If you're
interested I can get the code up somewhere that it can be seen.

Hmm... I actually did this last week, and I found some example code on
the web pretty quickly (it was in Japanese, admittedly...). Here's
the simple AdBlock proxy I ran up whilst playing around (it uses the
pierceive adblock list). It returns an empty document for disallowed
addresses, and removes all img tags, just as an example of processing.
It's not meant to be feature-rich or even high-quality code, but it
does most of what you seem to want.

Paul.

#!/usr/bin/env ruby

require 'webrick/httpproxy'
require 'stringio'
require 'zlib'

# Loads the block list from adblock.txt and tests URIs against it
class AdBlocker
  def initialize
    reload
  end

  def reload
    bl = []
    File.foreach('adblock.txt') do |line|
      line.strip!
      # Skip blank lines, the [Adblock] header and ! comments
      next if line.empty? || line =~ /\[Adblock\]/ || line =~ /^!/
      if %r!^/.*/$! =~ line
        # /.../ entries are regular expressions; drop both slashes
        bl << Regexp.new(line[1...-1])
      else
        bl << line
      end
    end
    @block_list = bl
  end

  def blocked?(uri)
    @block_list.any? { |rx| uri.match(rx) }
  end
end

module WEBrick
  # Proxy that consults :ProxyURITest before forwarding a request
  class RejectingProxyServer < HTTPProxyServer
    def service(req, res)
      if @config[:ProxyURITest].call(req.unparsed_uri)
        super(req, res)
      else
        blank(req, res)
      end
    end

    # Answer a blocked request with an empty plain-text document
    def blank(req, res)
      res.header['content-type'] = 'text/plain'
      res.header.delete('content-encoding')
      res.body = ''
    end
  end
end

class ProxyServer
  #
  # Handler that is called by the proxy to process each page
  #
  def handler(req, res)
    #p res.header
    # Inflate content if it's gzipped
    if 'gzip' == res.header['content-encoding']
      res.header.delete('content-encoding')
      res.body = Zlib::GzipReader.new(StringIO.new(res.body)).read
    end
    res.body.gsub!(%r!<img[^>]*>!im, '[image]')
  end

  def uri_allowed(uri)
    b = @adblocker.blocked?(uri)
    #puts("--> URI #{b ? 'blocked' : 'allowed'}: #{uri}")
    return !b
  end

  def initialize
    @server = WEBrick::RejectingProxyServer.new(
      :BindAddress => '0.0.0.0',
      :Port => 8181,
      :ProxyVia => false,
      # :ProxyURI => URI.parse('http://localhost:8118/'),
      :ProxyContentHandler => method(:handler),
      :ProxyURITest => method(:uri_allowed)
    )
    @adblocker = AdBlocker.new
  end

  def start
    @server.start
  end

  def stop
    @server.shutdown
  end
end

#
# Create and start the server
#
ps = ProxyServer.new
%w[INT HUP].each { |signal| trap(signal) { ps.stop } }
ps.start
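
A caveat on the block list, offered as a sketch rather than a correction to the script above: String#match converts a plain string argument into a regular expression, so a list entry that begins with * would raise a RegexpError. Assuming the list uses * as a wildcard, as Adblock filters do, one option is to escape each plain entry and expand * by hand. The names pattern_for and blocked_by? are hypothetical, and the sample entries are made up for illustration.

```ruby
# Hypothetical helper: turn one block-list line into a Regexp,
# or return nil for lines that should be skipped.
def pattern_for(line)
  line = line.strip
  return nil if line.empty? || line.start_with?('!') || line.include?('[Adblock]')

  if line.length > 2 && line.start_with?('/') && line.end_with?('/')
    # /.../ entries are already regular expressions
    Regexp.new(line[1...-1])
  else
    # Escape everything, then let '*' stand for any run of characters
    Regexp.new(Regexp.escape(line).gsub('\*', '.*'))
  end
end

# Hypothetical helper: true if any pattern matches the URI
def blocked_by?(patterns, uri)
  patterns.any? { |rx| rx.match?(uri) }
end

# Made-up sample entries, not taken from a real list
sample = ['[Adblock]', '! a comment', '', '/banner\d+/', '*/ads/*', 'doubleclick.net']
patterns = sample.map { |l| pattern_for(l) }.compact

puts blocked_by?(patterns, 'http://example.com/ads/x.gif')
puts blocked_by?(patterns, 'http://example.com/index.html')
```

Because every entry ends up as a Regexp, the matching code no longer depends on how String#match treats its argument.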
 
James Britt

Paul said:
Hmm... I actually did this last week, and I found some example code on
the web pretty quickly (it was in Japanese, admittedly...). Here's
the simple AdBlock proxy I ran up whilst playing around (it uses the
pierceive adblock list). It returns an empty document for disallowed
addresses, and removes all img tags, just as an example of processing.
It's not meant to be feature-rich or even high-quality code, but it
does most of what you seem to want.



Thanks; this is super handy.


James
 
Michael Schuerig

Paul said:
Hmm... I actually did this last week, and I found some example code on
the web pretty quickly (it was in Japanese, admittedly...). Here's
the simple AdBlock proxy I ran up whilst playing around (it uses the
pierceive adblock list). It returns an empty document for disallowed
addresses, and removes all img tags, just as an example of processing.
It's not meant to be feature-rich or even high-quality code, but it
does most of what you seem to want.

Thanks, that's great. My Japanese is severely lacking unfortunately.
Your code appears to be very close to what I'm intending to do. I don't
want to remove stuff from pages, rather I want to insert. Specifically,
I want to insert Greasemonkey (-> http://greasemonkey.mozdev.org/)
scripts in the hope of using them with browsers other than
Mozilla/Firefox.

Michael
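
For the insertion case described above, a minimal sketch of a content handler that adds a script tag instead of removing markup might look like the following. It has the same (req, res) shape as the handler in Paul's script, so it could presumably be passed as the content handler there. The helper name inject_script, the constant INJECT_HANDLER, the script URL, and the FakeResponse stand-in are all assumptions made for illustration; nothing here is Greasemonkey-specific.

```ruby
# Hypothetical helper: return a copy of the page with a <script> tag
# placed just before the last </body>, or appended if there is none.
def inject_script(html, src)
  tag = %(<script type="text/javascript" src="#{src}"></script>)
  idx = html.rindex(%r{</body\s*>}i)
  idx ? html.dup.insert(idx, tag) : html + tag
end

# Handler with the (req, res) shape used in the thread's proxy code.
# The script URL is a placeholder.
INJECT_HANDLER = lambda do |_req, res|
  type = res.header['content-type'].to_s
  # Only touch HTML pages that actually have a string body
  next unless type.start_with?('text/html') && res.body.is_a?(String)
  res.body = inject_script(res.body, 'http://localhost:8181/user.js')
end

# Stand-in for a proxy response, so the sketch runs without a server
FakeResponse = Struct.new(:header, :body)

res = FakeResponse.new({ 'content-type' => 'text/html; charset=utf-8' },
                       '<html><body><p>hi</p></body></html>')
INJECT_HANDLER.call(nil, res)
puts res.body
```

Gzipped responses would need to be inflated first, as Paul's handler does, before the body can be edited.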
 
