My multipost-detecting usenet bot (David Filmer)


usenet

Ben said:
> Something you could perhaps try is keeping a database of all your bot's posts

already do that...

> and cancelling them if the original article gets cancelled.

Not sure how to do that... can I query for canceled jobs? Or would I
need to just check every post and see if it was still available?

Of course, in this particular case, apparently a message was canceled
on John's newsserver but not on mine (GigaNews). The bot runs on
GigaNews; it has no way to know about cancels on other servers.

I have never heard anything negative about GigaNews (such as that they
don't honor cancels). One thing I can say for sure about GN - it's
darn fast. But, if I find that they routinely don't honor cancels that
other servers do, I'll regroup.
 
Ben Morrow

Quoth (e-mail address removed):
> already do that...
>
> Not sure how to do that... can I query for canceled jobs? Or would I
> need to just check every post and see if it was still available?

I admit I don't really know, but I would have thought that the control
message would just come down to you with the rest of the news. Then you
can grab the msgid and cancel your replies.
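A rough sketch of that idea, assuming the bot keeps a mapping from each original Message-ID to its own replies. The `bot_replies` dictionary, the function names, and the example Message-IDs are all hypothetical; only the `cancel <message-id>` form of the Control header is standard.

```python
import re

# Hypothetical in-memory "database": original Message-ID -> the bot's reply IDs.
bot_replies = {
    "<orig1@example.invalid>": ["<botreply1@example.invalid>"],
}

# A cancel control message carries a header value of the form: cancel <message-id>
CANCEL_RE = re.compile(r"^cancel\s+(<[^<>\s]+>)\s*$", re.IGNORECASE)

def cancelled_msgid(control_header):
    """Return the Message-ID named by a cancel Control header value, or None."""
    match = CANCEL_RE.match(control_header.strip())
    return match.group(1) if match else None

def replies_to_cancel(control_header, replies=bot_replies):
    """List the bot's own replies that should be cancelled for this control message."""
    msgid = cancelled_msgid(control_header)
    if msgid is None:
        return []
    return list(replies.get(msgid, []))
```

This only covers the lookup; actually issuing the cancels, and deciding whether a given cancel is trustworthy, is left out.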
> Of course, in this particular case, apparently a message was canceled
> on John's newsserver but not on mine (GigaNews). The bot runs on
> GigaNews; it has no way to know about cancels on other servers.

Well, this is Usenet :).

Ben
 
John Bokma

> already do that...
>
> Not sure how to do that... can I query for canceled jobs?

yes, those are posted in control.cancel

> Of course, in this particular case, apparently a message was canceled
> on John's newsserver but not on mine (GigaNews). The bot runs on
> GigaNews; it has no way to know about cancels on other servers.

I am not sure about that one. I know a bit about Usenet, but I have no idea
whether cancels that are not honored might show up in control.cancel anyway.

> I have never heard anything negative about GigaNews (such as that they
> don't honor cancels). One thing I can say for sure about GN - it's
> darn fast. But, if I find that they routinely don't honor cancels
> that other servers do, I'll regroup.

Newsservers are connected and have up and down feeds, and each has its
own cancel policy. If A sends cancels to B, and B propagates them to C,
and C decides not to propagate them further, they never reach D (for
example).
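The A, B, C, D example can be modelled as a small graph walk. This is only a toy illustration: the `feeds` map and the `propagates` set are made-up inputs, and real servers apply far more detailed policies. Note that a server "reached" by a cancel may still choose not to honor it.

```python
from collections import deque

def servers_reached(feeds, propagates, origin):
    """Return the set of servers a cancel reaches, starting from origin.

    feeds: server -> list of downstream servers it feeds.
    propagates: servers that pass cancels on to their downstream peers.
    """
    reached = {origin}
    queue = deque([origin])
    while queue:
        server = queue.popleft()
        if server not in propagates:
            continue  # receives the cancel but does not forward it
        for peer in feeds.get(server, []):
            if peer not in reached:
                reached.add(peer)
                queue.append(peer)
    return reached

feeds = {"A": ["B"], "B": ["C"], "C": ["D"]}
# C is not in the propagating set, so the cancel stops there and D never sees it.
print(sorted(servers_reached(feeds, {"A", "B"}, "A")))
```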

individual.net used to be free, but for the past 2 years it has cost
10 euro/year (about 10 USD).
 
usenet

I recently wrote and deployed a usenet 'bot which identifies
multiposted messages.
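The post does not say how the bot detects multiposts, so the following is only one plausible approach, not the bot's actual logic: fingerprint each article body and flag bodies that appear under different Message-IDs in different groups (a crosspost, by contrast, is a single Message-ID in several groups). The dictionary keys and function names are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

def body_fingerprint(body):
    """Hash the body after collapsing whitespace and case differences."""
    normalized = " ".join(body.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_multiposts(articles):
    """Group articles by body; return Message-ID lists for suspected multiposts.

    articles: iterable of dicts with 'msgid', 'newsgroups' and 'body' keys.
    """
    by_body = defaultdict(list)
    for article in articles:
        by_body[body_fingerprint(article["body"])].append(article)
    multiposts = []
    for group in by_body.values():
        msgids = {a["msgid"] for a in group}
        newsgroups = {a["newsgroups"] for a in group}
        # Same text, separate postings, separate groups.
        if len(msgids) > 1 and len(newsgroups) > 1:
            multiposts.append(sorted(msgids))
    return multiposts
```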

Per several helpful suggestions (thanks), the bot has been considerably
refined. Several people who had originally expressed reservations have
given positive feedback on the changes.

The big problem with the original bot (as I now realize) was the long
and heavy message. That issue was fixed a couple of days ago, and the
message text has now been further refined per an additional suggestion.

As several folks also suggested, the message cross-references now
include groupnames (indented for clarity).

The bot now ignores messages which contain e-mail addresses and certain
URIs. I believe this should be very effective in preventing the bot
from flagging spam (without introducing many false negatives). Keyword
filtering (inclusive) was suggested, but I believe the e-mail/URI
approach would be more robust, based on a bit of research into old
multiposts.
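A minimal sketch of such a filter. The post does not say which URIs count as "certain URIs", so the patterns below (http, https, and bare www. links) are assumptions, as are the names `EMAIL_RE`, `URI_RE`, and `looks_like_spam`.

```python
import re

# Deliberately loose patterns; assumptions, not the bot's actual rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URI_RE = re.compile(r"\b(?:https?://|www\.)\S+", re.IGNORECASE)

def looks_like_spam(body):
    """True if the body contains an e-mail address or a web link."""
    return bool(EMAIL_RE.search(body) or URI_RE.search(body))
```

A filter like this would also skip genuine multiposts that happen to include a link or an address in the body, which is the false-negative trade-off mentioned above.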

The "References" headers have been tweaked so the In-Reply-To (I-R-T)
message ID is always the last item listed. Some newsreaders weren't
properly threading the bot's reply; there was some speculation that
re-ordering the References in this manner might help (verification
appreciated).
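One way to build such a header, sketched under the usual Usenet convention that a followup's References is the parent's References with the parent's Message-ID appended. The function name and the example IDs are hypothetical.

```python
def build_references(parent_references, parent_msgid):
    """Return a References value whose last item is the replied-to Message-ID."""
    # Drop any earlier occurrence of the parent ID so it appears once, at the end.
    ids = [m for m in parent_references.split() if m != parent_msgid]
    ids.append(parent_msgid)
    return " ".join(ids)
```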

It was also suggested that the bot not reply to messages which already
have a reply. I'm looking into that - it would require some significant
changes to program logic; it's not a quick-n-easy thing to do (as were
these other things). And I'm not 100% convinced it's even a good idea
(multiposts with other replies are often flagged manually, right?).
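For what it's worth, the check itself could be as simple as scanning the References headers of the other articles the bot has already fetched. This is only a sketch: the `articles` structure is hypothetical, and it says nothing about the changes to program logic that fetching those articles would require.

```python
def has_reply(msgid, articles):
    """True if any article lists msgid in its References header.

    articles: iterable of dicts with an optional 'references' key.
    """
    return any(msgid in a.get("references", "").split() for a in articles)
```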

I have not had a chance to look into ignoring control.cancel items yet.
But I've observed that cancels on some servers don't ever show up at
(or are not honored by) my provider (GigaNews), so even that would not
be guaranteed effective for all servers, given the oddities of Usenet
(however, I believe that most spam-related (non-)cancels would be
ignored by the e-mail/URI filtering anyway).

You may see an example of the current behavior of the bot at:
http://tinyurl.com/oll3u
or see recent "Lorem Ipsum" postings in alt.test.test

To tell you the truth, if I knew then what I know now, I don't think I
would have ever written this bot in the first place. But I *have*
written it, and it's getting pretty well refined (thanks to many
suggestions), and it seems to have settled into something that is
favorable (or at least not patently objectionable) to many folks. John
Bokma has suggested some sort of vote, and I like that idea (though I'm
still not sure how to conduct it), but before attempting anything like
that, I'd like to let the bot run for 30 days or so to prove it out (so
folks have a more informed idea of exactly what they're voting
for/against). And September is just around the corner, so there should
be some good test cases popping up soon...

Further input, of course, is always welcomed and appreciated.
 
Dr.Ruud

(e-mail address removed) wrote:
> I've observed that cancels on some servers don't ever show
> up at (or are not honored by) my provider (GigaNews)

Also, the same provider can have 10 types of newsservers: ones that do
and ones that don't honor cancels.
 
