RubyForge has been slow today because...


Charles Hixson

gabriele said:
Markus wrote:



imho they could be even simpler, just have an array of
question/answer things like
"enter 2 plus 2" "4"
"the color of a white horse" "white"
"4 letters, Read The Fricking Manual" "RTFM"
"the thing after 1 and 2 " "3"
no standard way for a bot to guess, and totally dumb for a user
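
A minimal Ruby sketch of the question/answer array described above. The names `QUESTIONS`, `pick_question` and `correct?` are made up for illustration and do not come from any existing wiki code; the pairs are the ones listed in the post.

```ruby
# Hand-written question/answer pairs, copied from the examples above.
QUESTIONS = [
  ["enter 2 plus 2", "4"],
  ["the color of a white horse", "white"],
  ["4 letters, Read The Fricking Manual", "RTFM"],
  ["the thing after 1 and 2", "3"]
].freeze

# Pick a random pair; the caller shows the question and remembers the index.
def pick_question
  index = rand(QUESTIONS.size)
  [index, QUESTIONS[index][0]]
end

# Compare the reply with the stored answer, ignoring case and outer spaces.
def correct?(index, reply)
  QUESTIONS[index][1].casecmp(reply.to_s.strip).zero?
end
```

Since the list is tiny and fixed, this probably only helps while nobody bothers to target the site specifically.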




I recall a t-shirt over at ThinkGeek that said "if you're with a halfling
and a dragon, remember, you don't have to outrun the dragon, you have
to outrun the halfling" :)
 

Charles Hixson

...
Sorry, that was supposed to be:
The color of a white horse is grey!

But I slipped and not only replied too soon, but even to the wrong parent.
 

David Ross

Richard said:
Some freaking dork at the following IP address(es) was continually
downloading ruby182-14_RC8a.exe from here:

200.98.63.142

Then from here...

200.98.136.108

How is this for an example log:

200.98.63.142 - - [23/Oct/2004:17:41:34 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:17:53:18 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:17:56:34 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:00:47 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:06:31 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:10:56 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:11:14 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:11:28 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:11:41 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:19:10 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 9190167
200.98.63.142 - - [23/Oct/2004:18:19:12 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:19:18 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:23:16 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:23:55 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:26:32 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:26:36 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:27:46 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:28:32 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:29:58 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:31:51 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136
200.98.63.142 - - [23/Oct/2004:18:32:07 -0400] "GET /frs/download.php/1205/ruby182-14_RC8a.exe HTTP/1.1" 200 11613136

And I mean continually. Those IP addresses are now officially blocked. If we
find the perp who did this, they are going to be NAILED. We realize that
this is probably a DSL line or cable modem. If someone wants to help track
down who is doing this it would be great. It seems to be coming from Brazil
(www.uol.com.br). RubyForge is a community resource and this screws the
whole community.

I can only assume this was a denial of service attack. I will block the
entire 200.98 subnet and every other subnet owned by uol.com.br if these
things continue (which may negatively affect innocent people...and I don't
want to do that).
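
For anyone wanting to repeat this kind of check on their own logs, here is a rough Ruby sketch. It assumes Apache-style access log lines like the ones quoted above, and treats "the entire 200.98 subnet" as 200.98.0.0/16, which is my reading rather than a stated netmask. The method names are invented for the example.

```ruby
require "ipaddr"

# Count requests per client IP. Only lines that start with an IPv4
# address are counted, so wrapped continuation lines are skipped.
def hits_per_ip(lines)
  counts = Hash.new(0)
  lines.each do |line|
    ip = line[/\A(\d{1,3}(?:\.\d{1,3}){3}) /, 1]
    counts[ip] += 1 if ip
  end
  counts
end

# Assumed meaning of "the entire 200.98 subnet". Blocking a /16 also
# hits every unrelated host in that range.
BLOCKED = IPAddr.new("200.98.0.0/16")

def blocked?(ip)
  BLOCKED.include?(IPAddr.new(ip))
end
```

Sorting the counts, for example with `hits_per_ip(File.foreach("access_log")).sort_by { |_, n| -n }`, would put the heaviest clients first (the file name here is hypothetical).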

Best,

Rich
Team RubyForge

Solution: check RBL lists.
http://rbls.org/?q=200.98.136.108

Implement checks against these, which I use as well:
opm.blitzed.org, /* Remember, this is a hijacked-IP
range domain, so it's your choice to use. Questions, ask me. */
list.dsbl.org,
bl.spamcop.net,
sbl-xbl.spamhaus.org,
dnsbl.njabl.org,
http.dnsbl.sorbs.net,
socks.dnsbl.sorbs.net,
misc.dnsbl.sorbs.net,
smtp.dnsbl.sorbs.net,
web.dnsbl.sorbs.net,
spam.dnsbl.sorbs.net,
block.dnsbl.sorbs.net,
zombie.dnsbl.sorbs.net,
rhsbl.sorbs.net,
dnsbl.ahbl.org

You were attacked, yes. The solution is to implement RBLs. This is what to
do if you are going to be under attack. Some people, like big sites, don't
care. I know RubyForge isn't HUGE and doesn't have 100000 terabytes of
transfer a month, so it's best to implement RBLs.
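
A hedged sketch of what such a check could look like in Ruby, using the standard `resolv` library. DNS blocklists are conventionally queried by reversing the IPv4 octets and appending the list's zone; any answer means the address is listed. The method names are illustrative, and the zone names are simply the ones from the list above, not verified to still be in service.

```ruby
require "resolv"

# Build the DNS name to query: octets reversed, then the list's zone.
def dnsbl_query_name(ip, zone)
  octets = ip.split(".")
  raise ArgumentError, "not an IPv4 address: #{ip}" unless octets.size == 4
  "#{octets.reverse.join('.')}.#{zone}"
end

# True when the list returns an address record for the client IP.
# With the default resolver this performs a live DNS lookup.
def listed?(ip, zone, resolver = Resolv::DNS.new)
  resolver.getaddress(dnsbl_query_name(ip, zone))
  true
rescue Resolv::ResolvError
  false
end
```

For example, `listed?("200.98.136.108", "bl.spamcop.net")` would look up `108.136.98.200.bl.spamcop.net`. In practice the results should probably be cached, since one lookup per list per request adds latency.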


That's the solution. Thanks, and have a nice day.

David Ross
 

Brian Candler

Think creatively. You could fairly easily come up with a text
based captcha system that was screen reader friendly and had no external
dependencies. For example test riddles / story problems that would be
dirt simple for a human but next to impossible for a program "in the
general case" could be rather easily generated in pure ruby.

For example:

Three things that go "quack" landed in a circular pond
that was 10 meters across. They found fourteen early shoes
and each of them ate as many as he wanted. How many shoes
were left?

Most humans could get this on their first try

I disagree; I'd be surprised if more than 10% of the general population
could get past that barrier (although maybe if you gave them two or three
tries you'd get more).

I was listening to a (UK) breakfast radio show this morning, and they
decided to give some questions from a pub quiz. The question "what is two
cubed divided by two squared?" had the *presenters* stumped, let alone the
audience.

Perhaps that duck example is not typical of what you were proposing, but in
any case I find it highly ambiguous. Are we supposed to assume that ducks
don't eat shoes? (=> Answer 14). Or to assume that all the shoes would be
eaten? (=> Answer 0). Since we are clearly in a nonsense world here, with
such things as "early shoes" which don't exist in the real world, we can
apply whatever rules we like to come up with an answer.

Or, given that there were 14 shoes, perhaps 7 were right and 7 were left.
(Argh!!)

Regards,

Brian.
 

Brian Schröder

For example:

Three things that go "quack" landed in a circular pond
that was 10 meters across. They found fourteen early shoes
and each of them ate as many as he wanted. How many shoes
were left?

I have to admit that I don't get it. It reminds me of the old joke:

Six men get into the bus,
three leave,
how old is the driver?

Regards,

Brian
 

trans. (T. Onoma)

| >
| > Three things that go "quack" landed in a circular pond
| > that was 10 meters across. They found fourteen early shoes
| > and each of them ate as many as he wanted. How many shoes
| > were left?
|
| I have to admit that I don't get it. It reminds me of the old joke:
|
| Six men get into the bus,
| three leave,
| how old is the driver?

:) I think it was just a demo of the potential. And I think we sort of narrowed
it to simple word scrambles. But you are right, I don't think any of it is a
great idea.

But I wonder how effective a simple javascript combination lock would be. For
example, imagine three buttons labelled 1..3.

Click on buttons in labelled order to save:

[ 2 ] [ 3 ] [ 1 ]


It might seem silly, but I wonder how well spammers would be able to adapt to
having to interact with a javascript like this?
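
To make the idea concrete, here is a minimal sketch of the server-side half in Ruby. It assumes the page's script reports the positions of the buttons in the order they were clicked; that wiring, and the names `new_lock` and `unlocked?`, are assumptions for illustration. The browser-side script is not shown.

```ruby
# Shuffle the labels 1..count to decide which label each button shows.
def new_lock(count = 3, rng = Random.new)
  (1..count).to_a.shuffle(random: rng)
end

# clicks lists button positions (0-based, left to right) in click order.
# The lock opens when the clicked buttons read 1, 2, 3, ... in that order.
def unlocked?(layout, clicks)
  clicks.size == layout.size &&
    clicks.all? { |pos| pos.is_a?(Integer) && pos.between?(0, layout.size - 1) } &&
    clicks.map { |pos| layout[pos] } == (1..layout.size).to_a
end
```

With the layout `[2, 3, 1]` shown above, the accepted click order is third button, first button, second button, i.e. `unlocked?([2, 3, 1], [2, 0, 1])`.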

T.
 

Phlip

Brian said:
Six men get into the bus,
three leave,
how old is the driver?

If a chicken and a half can lay an egg and a half in a day and a half, how
long does it take a grasshopper to kick the seeds out of a watermelon?
 

Ruby Noob

Phlip said:
Brian Schröder wrote:




If a chicken and a half can lay an egg and a half in a day and a half, how
long does it take a grasshopper to kick the seeds out of a watermelon?

Five pounds of flax.
 

Markus

I disagree; I'd be surprised if more than 10% of the general population
could get past that barrier (although maybe if you gave them two or three
tries you'd get more).

*smile* Less than that, because the majority of the general
population does not know English. As someone else noted, this was a
contrived example (and, I gather, not a very good one). But note also
that I suggested having a quiz/contest to come up with better ideas--the
important point here being the general structure, not the particulars.
Perhaps that duck example is not typical of what you were proposing, but in
any case I find it highly ambiguous. Are we supposed to assume that ducks
don't eat shoes? (=> Answer 14). Or to assume that all the shoes would be
eaten? (=> Answer 0). Since we are clearly in a nonsense world here, with
such things as "early shoes" which don't exist in the real world, we can
apply whatever rules we like to come up with an answer.

I tried to cram all of the proposed mechanisms into one example
(all except the spelling errors, which I hadn't thought of yet); I see
now that this was a very bad way to present them. Imagine instead of my
duck problem a whole collection of questions that require different
sorts of answers but look structurally similar to some of the other
questions in the collection. In other words:

* If I have seven pairs of shoes how many shoes do I have?

* What goes "quack"?

* In "The tired man gave algebraic his wife a kiss" what word
doesn't belong?

* In "Bananas are generally yellow or greed?" what word is
misspelled?

* If you could have all the shoes you wanted, how many would you
eat?

* ...and so forth


If these questions were "randomly" generated by a heterogeneous
collection of simple plug-ins, it would be easy for us and hard for
them.
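
One possible shape for such plug-ins, sketched in Ruby. Everything here is hypothetical: each plug-in is a lambda that returns a question plus a checker for the reply, and the three shown are modelled on the sample questions above.

```ruby
# Each plug-in takes a random number generator and returns
# [question, checker]; the checker says whether a reply is acceptable.
PLUGINS = [
  lambda do |rng|
    pairs = rng.rand(2..9)
    ["If I have #{pairs} pairs of shoes how many shoes do I have?",
     ->(reply) { reply.strip == (pairs * 2).to_s }]
  end,
  lambda do |_rng|
    ['What goes "quack"?',
     ->(reply) { reply.strip.downcase.match?(/\A(a |the )?ducks?\z/) }]
  end,
  lambda do |rng|
    words = %w[The tired man gave his wife a kiss]
    intruder = "algebraic"
    # Drop the out-of-place word somewhere after the first word.
    words.insert(rng.rand(1...words.size), intruder)
    [%(In "#{words.join(' ')}" what word doesn't belong?),
     ->(reply) { reply.strip.downcase == intruder }]
  end
].freeze

# Pick a plug-in at random and let it build a question.
def generate_question(rng = Random.new)
  PLUGINS.sample(random: rng).call(rng)
end
```

A caller would do `question, check = generate_question`, show the question, and later run `check.call(reply)`. The digit-only answer check and the fixed intruder word are simplifications; a real plug-in would need more variety to be hard to reverse.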
Or, given that there were 14 shoes, perhaps 7 were right and 7 were left.
(Argh!!)

Yes, putting all the obfuscation techniques in one example was
definitely dumb on my part. Sorry.

-- Markus
 

dross

Yes, putting all the obfuscation techniques in one example was
definitely dumb on my part. Sorry.

-- Markus

I don't know if anyone read the [SOLUTION] RubyGarden thread among all the
other emails. The best solution is to implement checks against the RBL
servers, then have cleanup. Over 85% (maybe more) would be blocked, and for
the rest there could be a way to reverse any changes made by IPs which were
not in the RBLs (which is unlikely but possible).

David Ross
 

Brian Candler

I tried to cram all of the proposed mechanisms into one example
(all except the spelling errors, which I hadn't thought of yet); I see
now that this was a very bad way to present them. Imagine instead of my
duck problem a whole collection of questions that require different
sorts of answers but look structurally similar to some of the other
questions in the collection. In other words:

* If I have seven pairs of shoes how many shoes do I have?

* What goes "quack"?
...etc

I can only think of a few different ways to generate the pool of questions
to show to the user and the expected answers:

1. Someone writes a database of questions and answers and distributes it
to whoever wants it (but then the spammers get it as well)

2. Someone writes a knowledgebase which can be used to generate the
questions algorithmically (but that means it will be easy to reverse
the process, given the same knowledgebase and the program which generates
the questions)

So you would also need some obfuscation: enough to be hard to reverse for a
computer program, but not enough to confuse a human.

Alternatively:

3. Each wiki author writes their own questions and answers (too much work)

4. Each visitor to the site can contribute a new question and answer
(risks pollution with questions that don't work; probably needs
moderating)

Others?

Regards,

Brian.
 

Eivind Eklund

[ ... on using question based "captchas" with questions a human can
recognize, but a computer cannot ...]
I can only think of a few different ways to generate the pool of questions
to show to the user and the expected answers:

[ ... ways to get questions deleted ... ]

The idea here is really using the ability of humans to recognize
patterns and the set of facts that humans know to distinguish a human
from a computer.

An example of such a question is "Is the following page data from a
legitimate page in this Wiki or from a spoof?", then showing a random
page (or part of one) either from the legitimate Wiki pages in that
Wiki, or from another Wiki.

Unfortunate issue: it may be possible to overcome this by mirroring
the entire site and doing matching. I initially thought that that
could be defended against by linking in the "fake data" and showing
that as part of the real wiki (not much bother for visitors, noticeable
bother for bots), but unfortunately it may be possible to look at the
interconnectedness of the two sets of pages (one from the real
on-topic Wiki, the other from the "shadow" Wiki) and thus make a good
guess on whether the page is real or fake. It DOES add significant
hurdles, though.

Eivind.
 
