Another Wiki/Spam Update

Jim Weirich

During a question/answer session at the NoFluff/JustStuff conference in
Cincinnati this summer, someone asked: since there are so many things in the
IT world to learn, how does one tell which technologies to investigate and
which to put on the back burner? The general answer from the panel was to
wait until you hear about something six times. At that point it is probably
worth investigating.

So, I'm jumping the gun here because I have only heard of the following
twice, but it was twice in a two-day period and it does have bearing on the
wiki spam issue.

I first heard about this from Austin Ziegler in an IM message about Ruwiki.
Austin told me that Ruwiki will not link to external sites directly, but will
go through a PageRank-stripping redirect service supported by Google.
Hmmm ... interesting, I thought.

Maybe I'm missing something, but why not pass all outgoing links through the
Google redirect, thereby denying the spammers their all-important PageRank?

http://www.google.com/url?sa=D&q=URL
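In Ruby terms the rewrite is just a prefix on each outgoing link. A
minimal sketch (rewrite_external is a hypothetical name, and the naive
concatenation is deliberate; see observation (2) below):

    # Prefix every outgoing link with the PageRank-stripping Google
    # redirect.
    GOOGLE_REDIRECT = 'http://www.google.com/url?sa=D&q='

    def rewrite_external(url)
      GOOGLE_REDIRECT + url
    end

    puts rewrite_external('http://rubygarden.org/')
    # => http://www.google.com/url?sa=D&q=http://rubygarden.org/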

OK, that's two references. LeoO also provided a link to
http://simon.incutio.com/archive/2004/05/11/approved where you can read more
details.

So, I went ahead and enabled the Google redirect for external links on the
RubyGarden wiki. I'll leave it there for a few days and see how it works.
If anyone has problems, feel free to drop me a line at (e-mail address removed).

Just a couple of observations:

(1) Although it denies spammers the benefits of their activities, I'm not
convinced that it will prevent spamming in anything but the most indirect
ways. However, denying them those benefits still makes me feel all tingly
inside.

(2) As currently implemented, URLs with CGI parameters in them might have
problems (a short demonstration follows these observations). For example,
in the link:

http://rubygarden.org/ruby?action=browse&id=RubyDiscussions

everything from "&id=" to the end will be ignored when translated to

http://www.google.com/url?sa=D&q=http://rubygarden.org/ruby?action=browse&id=RubyDiscussions

A workaround is to use something like http://tinyurl.com (e.g. the above link
is equivalent to http://tinyurl.com/5jmyb).

(3) As I mentioned, if there is pushback on this change, it can easily be
backed out.
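To make observation (2) concrete, here is what a standard query-string
parser does with the translated link above. Everything after the "&"
binds to the Google URL itself instead of to the q parameter:

    require 'cgi'

    redirect = 'http://www.google.com/url?sa=D&q=' +
               'http://rubygarden.org/ruby?action=browse&id=RubyDiscussions'
    p CGI.parse(redirect.split('?', 2).last)
    # => {"sa"=>["D"],
    #     "q"=>["http://rubygarden.org/ruby?action=browse"],
    #     "id"=>["RubyDiscussions"]}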

Thanks for listening.
 

James Britt

Jim Weirich wrote:

...
(2) As currently implemented, URLs with CGI parameters in them might have
problems. For example, in the link:

http://rubygarden.org/ruby?action=browse&id=RubyDiscussions

everything from "&id=" to the end will be ignored when translated to

http://www.google.com/url?sa=D&q=http://rubygarden.org/ruby?action=browse&id=RubyDiscussions

A workaround is to use something like http://tinyurl.com (e.g. the above link
is equivalent to http://tinyurl.com/5jmyb).

As practical as they may be, I'm less than enthused about passing my
links through tinyurl. I have much more faith in Google, and expect
that redirection through tinyurl will ultimately lead to some business
plan I may not care for.

Implementing the same behavior in Ruby should be trivial, and I would be
far more comfortable seeing links go through a Ruby-oriented site run by
a known member of the Ruby community (e.g., www.rubyurl.com, which
appears to be free).
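For what it's worth, the core of such a redirector really is trivial. A
sketch as a plain Ruby CGI script, where SHORTCUTS stands in for whatever
real storage a service like this would need:

    #!/usr/bin/env ruby
    require 'cgi'

    # Stand-in lookup table; a real service would use a file or database.
    SHORTCUTS = {
      '5jmyb' => 'http://rubygarden.org/ruby?action=browse&id=RubyDiscussions'
    }

    cgi = CGI.new
    if (target = SHORTCUTS[cgi['code']])
      # Issue an HTTP redirect to the stored target URL.
      print cgi.header('status' => 'REDIRECT', 'location' => target)
    else
      print cgi.header('status' => 'NOT_FOUND')
    end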

Interesting idea, though, passing through Google.

James
 

gabriele renzi

James Britt wrote:

As practical as they may be, I'm less than enthused about passing my
links through tinyurl. I have much more faith in Google, and expect
that redirection through tinyurl will ultimately lead to some business
plan I may not care for.

Implementing the same behavior in Ruby should be trivial, and I would be
far more comfortable seeing links go through a Ruby-oriented site run by
a known member of the Ruby community (e.g., www.rubyurl.com, which
appears to be free).

qurl.net runs on Ruby, FWIW.
 

Eric Hodel

(2) As currently implemented, URLs with CGI parameters in them might have
problems. For example, in the link:

http://rubygarden.org/ruby?action=browse&id=RubyDiscussions

everything from "&id=" to the end will be ignored when translated to


http://www.google.com/url?sa=D&q=http://rubygarden.org/ruby?action=browse&id=RubyDiscussions

You just need to escape all [^a-zA-Z]:

http://www.google.com/url?sa=D&q=http%3a%2f%2frubygarden.org%2fruby%3faction%3dbrowse%26id%3dRubyDiscussions

pull the code out of cgi.rb and you're done!
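In other words, something like this (CGI.escape being the relevant bit
of cgi.rb):

    require 'cgi'

    target = 'http://rubygarden.org/ruby?action=browse&id=RubyDiscussions'
    puts 'http://www.google.com/url?sa=D&q=' + CGI.escape(target)
    # => http://www.google.com/url?sa=D&q=http%3A%2F%2Frubygarden.org%2Fruby%3Faction%3Dbrowse%26id%3DRubyDiscussions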
 

Jim Weirich

(2) As currently implemented, URLs with CGI parameters in them might
have problems. [...]

You just need to escape all [^a-zA-Z]:

Actually, I tried this, but then Google barfed on the resulting URL. Perhaps
I encoded incorrectly. I'll give it another try when I get a chance.

Thanks.
 

Jim Weirich

(2) As currently implemented, URLs with CGI parameters in them might
have problems. [...]

You just need to escape all [^a-zA-Z]:

Actually, I tried this, but then Google barfed on the resulting URL.
Perhaps I encoded incorrectly. I'll give it another try when I get a
chance.

Got it working now. I must have fat-fingered it earlier. Thanks.
 

Jim Weirich

Can't you just use CGI.escape for this?

You know, it's funny how the brain works. I saw this comment and thought to
myself, "Of course! It would be much nicer just to use the CGI module
directly. That's what I will do."

So I bring up the editor and actually enter the code "CGI.escape($url)" into
the program, save it, and run a quick test.

But now I get the error:
Bareword "CGI" not allowed while "strict subs" in use at [...]

Now I'm sure most everybody who has been following this thread probably
realizes what is going on, but I still didn't see it. Half of my brain is
processing the problem that Perl doesn't like a bare CGI stuck into its
code, and the other half is trying to figure out why perfectly legal Ruby
code is causing an error. All of a sudden, the two halves of my brain
decided to talk to each other: "Duh! You're writing Ruby code in a Perl
program! Of course it doesn't work. Sheesh!"

After my brain got done rsyncing itself, I tried the code "$q->escape($url);"
and that works great.

Austin... it's become imperative that you get Ruwiki released soon [1]. I'm
afraid if I spend much more time in this Perl code I will become permanently
brain damaged.
 

Belorion

I don't mean to dredge up an old thread unnecessarily, but I encountered
this today:
http://www.google.com/googleblog/2005/01/preventing-comment-spam.html.
Basically, it looks like Google is trying to do something to help stop
comment/wiki spam. Implementing something like this won't *stop* spammers
(unless they know the site uses it), but if enough people start doing it,
maybe this sort of spam will decrease in the long run.
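The mechanism the linked post describes is the rel="nofollow" link
attribute. A minimal sketch of applying it when rendering user-supplied
links (nofollow_link is a hypothetical helper name):

    require 'cgi'

    # Render a user-supplied link with rel="nofollow" so search engines
    # skip it when computing rankings.
    def nofollow_link(url, text)
      %Q(<a href="#{CGI.escapeHTML(url)}" rel="nofollow">#{CGI.escapeHTML(text)}</a>)
    end

    puts nofollow_link('http://example.com/', 'Example')
    # => <a href="http://example.com/" rel="nofollow">Example</a>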
 
