My multipost-detecting usenet bot (David Filmer)

usenet

(Note: This message is crossposted to the following newsgroups, as
these groups are affected by the subject bot: comp.lang.perl.misc,
perl.beginners, comp.lang.perl.modules, perl.dbi.users,
perl.beginners.cgi, alt.perl)

Greetings. As many of you are doubtless aware, I recently wrote and
deployed a usenet 'bot which identifies multiposted messages. After
manually flagging such messages for some time, it occurred to me that I
could let Perl do the work for me, and laziness took over.

FIRST OF ALL, I would like to apologize to the usenet community for
having done this unilaterally. I had genuinely not anticipated that
many folks would object or even care about this - it was a very minor
project for me to save me a bit of trouble now and then. I now know
that I should have posted an RFC before deploying my bot, and I would
have done so had I realized the level of interest it would generate.

If I've angered or annoyed anyone, I do apologize. I had no such
intent.

This topic is presently being discussed in a number of threads:
http://tinyurl.com/rdedx, http://tinyurl.com/m2e2r, and
http://tinyurl.com/oubbn (and possibly others), and the topic is
certainly OT to the first two threads (and the third thread is postured
as an attack article). Multiple threads are an ineffective way to
discuss a topic, and I hope that by opening this thread, I can
consolidate the discussion (rather than contribute to the mess). I don't want to
re-hash these threads here; I hope interested folks will read those
messages but continue the discussions here (with good quoting, of
course, so others will be able to follow along).

I have read comments from many respected posters which were both
supportive and critical of my bot. In both cases, however, there was
often a strong sentiment that the bot message was too long and too
harsh.

I had a lot of temporary introductory text in the first couple of
messages that was never intended to be part of the regular bot
messages. That, however, was a mistake (as it led folks to believe that
I really intended to post such a long reply to every multipost). I
should have posted the messages without that additional explanatory
verbiage and perhaps included that additional information in a reply.

HOWEVER, reading many comments has led me to believe that it may not be
a good idea to include very much more than a very basic reply and a
link for more info. I argued against this idea (because I thought the
reply would not be very effective, as novice OPs don't often appear
to follow links) but I have reconsidered my opinion (due to what seems
to be a rough consensus, and because I realize the various strengths of
the other position).

I will therefore modify the bot to something per the suggestions that
John & Sinan made in http://tinyurl.com/oubbn. I have also changed the
bot's handle to my personally-named domain (so it's not anonymous).
It's a different handle than I'm posting under now (for folks who
may wish to killfile the bot but not killfile me). Those who have
already killfiled the bot will need to do so again (sorry) - you may
killfile (e-mail address removed) if you wish to killfile the bot.

Opinions have been expressed in roughly four categories:
1 - The whole idea of a bot sucks
2 - The idea is OK, but the implementation (auto-message) sucks
3 - Rock on
4 - Indifference (posted messages without expressing an opinion)

So far, most opinion seems to fall in the second or third category
(though opinions of the first category have been somewhat more vocal).
I believe I have taken measures to address many of the concerns of the
second category. As the discussion develops, if it seems the group
consensus does generally oppose the idea, I have no problem with
shutting it down and I will readily do so.
 
usenet

> (Note: This message is crossposted to the following newsgroups

Hmmm. I'm posting from GoogleGroups (it's all I have access to at the
moment). Apparently GG must have some sort of crosspost limitation,
because the message didn't go to the comma-separated list that I
provided. Grrr - darn Google Groups.

Well, this is really the group whose members' opinions I respect the
most... so maybe it's for the best anyway.
 
John Bokma

> Hmmm. I'm posting from GoogleGroups (it's all I have access to at the
> moment). Apparently GG must have some sort of crosspost limitation,
> because the message didn't go to the comma-separated list that I
> provided. Grrr - darn Google Groups.

You didn't set a follow up to, so I am quite thankful that the crosspost
failed.

Probably my last remark regarding this bot: personally, I think a CfV
should be held. For example, at least 50 votes, and the majority must vote
yes for this project to continue. That is better than guessing where
opinions fall.
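
For illustration only, the threshold described above (at least 50 votes cast, with a majority voting yes) could be checked as in the Python sketch below. The function name `cfv_passes` and the assumption that only yes/no votes are counted (no abstentions) are hypothetical and not part of any formal CfV procedure.

```python
def cfv_passes(yes_votes, no_votes, min_votes=50):
    """Return True if the vote meets the rule sketched in the post above.

    Assumption: only yes and no votes exist; abstentions are not modeled.
    """
    total = yes_votes + no_votes
    # Rule as stated: at least `min_votes` cast, and more yes than no.
    return total >= min_votes and yes_votes > no_votes


print(cfv_passes(30, 25))  # enough votes, yes majority
print(cfv_passes(30, 10))  # yes majority, but only 40 votes cast
```

A tie fails under this reading, since a majority requires strictly more yes votes than no votes.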
 
usenet

John said:
> You didn't set a follow up to,

Ha - as if you can set a follow-up in GG... If there's a way to do
that, I don't know what it is. You just list your groups and hope GG
figures it out.

> Probably my last remark regarding this bot, personally
> I think a CfV should be held.

If I could trouble you for one more thing... Is there a
generally-accepted procedure for issuing (or voting in) such a call?
I've heard of this, but I don't actually believe I've ever seen it
done. I could envision such a call becoming (yet) another discussion.

I think a CfV is a good idea (but don't know how it should be handled).
I would feel better about continuing or canning the bot if I had a firm
idea of what the consensus is.

BTW, I would like to thank you... you were obviously very peeved at me,
but you were kind enough to provide helpful and constructive input when
I asked you for it.
 
John Bokma

> Ha - as if you can set a follow-up in GG... If there's a way to do
> that, I don't know what it is. You just list your groups and hope GG
> figures it out.

> If I could trouble you for one more thing... Is there a
> generally-accepted procedure for issuing (or voting in) such a call?
> I've heard of this, but I don't actually believe I've ever seen it
> done. I could envision such a call becoming (yet) another discussion.

Technically (and IIRC) there should first be an RFD which contains the
proposal for such a bot, what it should and shouldn't do, what to use
for "From" etc.

Once such a discussion ends the starter of the RFD could create a
summary and post a second RFD.

If nothing new comes out of that one, a Call for Votes can take place.
Normally a CfV will not cause new discussions.

My experience with RFDs and CFVs is limited to the creation of new groups
in the nl.* hierarchy. IIRC the 50 votes, and majority must be Y comes
from my memories of the documents I once read on the topic.

Did some googling:

<http://users.tkk.fi/~jpatokal/uvv/vote-faq.html>
<http://www.faqs.org/faqs/usenet/creating-newsgroups/part1/>

Not sure if the UVV wants to handle this vote.

> I think a CfV is a good idea (but don't know how it should be
> handled). I would feel better about continuing or canning the bot if I
> had a firm idea of what the consensus is.
>
> BTW, I would like to thank you... you were obviously very peeved at
> me, but you were kind enough to provide helpful and constructive input
> when I asked you for it.

Yes, I am well known for overreacting, I guess what triggered me the
most was that there was no contact info associated with the bot (it ran
anonymously), and the message was way too lengthy.

I still disagree with the whole idea, but have a few tips:

Remove all whitespace before you calculate the MD5SUM, this way you
might find posts that have been made by copy + paste and have additional
trailing/leading whitespace.

Make sure that the bot posts with a From that is easy to recognize.

Make sure that you provide a contact email address.

Only post a reply if one hasn't been made yet.

Especially regarding the latter: due to how Usenet works, your bot might
post the 2nd, 3rd, or even a later reply to a multipost.
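
A minimal Python sketch of the "reply only once" check, assuming the bot can see the headers of recent articles in the group. It treats any article whose References header lists the target Message-ID as an existing follow-up. The function name, the dict-based header representation, and the example Message-IDs are all hypothetical. Because of propagation delays, a reply may exist that the bot's server has not yet received, so this would reduce duplicate replies rather than rule them out.

```python
def has_followup(target_id, articles):
    """Return True if any article lists target_id in its References header.

    `articles` is assumed to be a list of dicts mapping header names to
    raw header values (a hypothetical representation for this sketch).
    """
    for headers in articles:
        # References is a whitespace-separated list of Message-IDs.
        references = headers.get("References", "").split()
        if target_id in references:
            return True
    return False


# Hypothetical headers, for illustration only.
articles = [
    {"Message-ID": "<reply1@example.invalid>",
     "References": "<orig@example.invalid>"},
    {"Message-ID": "<other@example.invalid>", "References": ""},
]
print(has_followup("<orig@example.invalid>", articles))       # True
print(has_followup("<unrelated@example.invalid>", articles))  # False
```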

Also, some people who multipost understand the issue, and cancel the
wrong post. Cancels always run after the facts. What you really want to
avoid is having your bot reply to a message that has been canceled a few
seconds earlier.
 
Sherm Pendley

> Opinions have been expressed in roughly four categories:
> 1 - The whole idea of a bot sucks
> 2 - The idea is OK, but the implementation (auto-message) sucks
> 3 - Rock on
> 4 - Indifference (posted messages without expressing an opinion)
>
> So far, most opinion seems to fall in the second or third category
> (though opinions of the first category have been somewhat more vocal).

My opinion of the bot itself is somewhat indifferent. I don't see a need
for it and I don't believe it will significantly reduce the "problem" -
which I don't see as such - of multi-posting.

On the other hand, I dislike the way the message chastises anyone who's
thinking of replying to a multi-posted message, and attempts to "burn the
thread" by encouraging others to ignore it. Not everyone agrees with your
idea that a multi-posted message should receive no reply other than "please
don't multi-post".

For myself, I'd prefer to answer the posted question, and include a comment
in the answer about multi-posting, netiquette, and the group guidelines. If
constructive criticism of that sort is given *along with* an answer to the
posted question, it's more likely to be taken seriously. If it's given on
its own, the receiver is (IMO) more likely to dismiss the sender as a crank
and ignore the advice.

sherm--
 
John Bokma


I probably would word it as follows:

You have posted the same message to several news groups in a form that
is called multiposting:

group1
<
group2
<
Multiposting is generally considered impolite. For an
explanation, please see:

http://www.cs.tut.fi/~jkorpela/usenet/xpost.html

which also explains crossposting, which is the recommended way to post
a single message to more than one group, if such is really needed.


(I left out Usenet, because most people consider Usenet "Google").
 
Brian Greenfield

> If I've angered or annoyed anyone, I do apologize. I had no such
> intent.

As a long-time lurker, and very occasional poster to clpm, I do find
your bot to be both angering and annoying. Please stop. Now.
 
usenet

John said:

Thanks. I'll read up. Does anyone know: Has this group ever conducted
such a vote (such as for the Posting Guidelines, etc?) or is stuff like
that done by an informal consensus?

> I still disagree with the whole idea, but have a few tips:
>
> Remove all whitespace before you calculate the MD5SUM, this way you
> might find posts that have been made by copy + paste and have additional
> trailing/leading whitespace.

Actually, the script has always done this:
$body =~ s/\W//g;
(I have observed several multiposts with extra leading spaces, and even
trailing ...'s)
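
As a rough illustration of the normalize-then-hash idea discussed here, the Python sketch below strips every non-word character (mirroring the `s/\W//g` line above) before taking an MD5 digest, then groups Message-IDs that share a digest. This is not the bot's actual code: the function names, the `(message_id, body)` tuple layout, and the example IDs are hypothetical, and Python's Unicode-aware `\W` may differ from Perl's in edge cases.

```python
import hashlib
import re
from collections import defaultdict


def body_fingerprint(body):
    """MD5 hex digest of the body with all non-word characters removed."""
    # Drops whitespace and punctuation, so extra leading spaces or a
    # trailing "..." do not change the fingerprint.
    stripped = re.sub(r"\W", "", body)
    return hashlib.md5(stripped.encode("utf-8")).hexdigest()


def find_duplicates(posts):
    """Group Message-IDs whose bodies share a fingerprint.

    `posts` is assumed to be an iterable of (message_id, body) pairs.
    """
    groups = defaultdict(list)
    for message_id, body in posts:
        groups[body_fingerprint(body)].append(message_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical posts, for illustration only.
posts = [
    ("<a@example.invalid>", "How do I sort a hash?"),
    ("<b@example.invalid>", "   How do I sort a hash?...\n"),
    ("<c@example.invalid>", "A different question"),
]
print(find_duplicates(posts))
```

Stripping punctuation as well as whitespace makes the match more tolerant, at the cost of occasionally treating two genuinely different short messages as identical.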

> Make sure that the bot posts with a From that is easy to recognize.

That has now been done (see alt.test.testing or
http://tinyurl.com/qvvqh for an example of the new-and-improved
cop-bot).

> Make sure that you provide a contact email address.

That has also been done. It's my catch-all domain - I'll probably
spam-safe it like I do with (e-mail address removed) (which is a
blackhole with an informative autoresponder)

> Only post a reply if one hasn't been made yet.

That's probably a good idea (although it's not uncommon for manual
flagging to be done subsequent to other replies). Making such a change,
however, would require some significant changes to the flow of the
program...

> Also, some people who multipost understand the issue, and cancel the
> wrong post. Cancels always run after the facts. What you really want to
> avoid is having your bot reply to a message that has been canceled a few
> seconds earlier.

I agree that would be an undesirable situation (though generally
unlikely, IMHO), but I'm not sure how to avoid it. Even posting
manually, I believe it's possible something like this could happen.
I'm pretty sure I've replied (manually) to posts that got pulled out
from under my feet, and only my reply remained (one such post, if I
recall, was in German, but I answered it anyway only to find the
original was gone - probably in favor of a .de group). I don't know if
it's possible to avoid this situation programmatically any more than it
is manually (but I'm open to ideas!)
 
usenet

Sherm said:
> On the other hand, I dislike the way the message chastises anyone who's
> thinking of replying to a multi-posted message, and attempts to "burn the
> thread" by encouraging others to ignore it.

I've modified the bot (which addresses your perfectly valid
objections); you may find the new version more acceptable. See recent
messages in alt.test.test or http://tinyurl.com/qvvqh
 
Matt Garrish

> I've modified the bot (which addresses your perfectly valid
> objections); you may find the new version more acceptable. See recent
> messages in alt.test.test or http://tinyurl.com/qvvqh


Any chance of posting the group names along with the ids to simplify
lookups? For example:

This message has been multiposted as indicated by these message IDs:
alt.test.test : <

Other than that, the simplified message is a drastic improvement.

Matt
 
axel

> Greetings. As many of you are doubtless aware, I recently wrote and
> deployed a usenet 'bot which identifies multiposted messages. After
> manually flagging such messages for some time, it occurred to me that I
> could let Perl do the work for me, and laziness took over.
>
> This topic is presently being discussed in a number of threads:
> http://tinyurl.com/rdedx, http://tinyurl.com/m2e2r, and
> http://tinyurl.com/oubbn (and possibly others), and the topic is
> certainly OT to the first two threads (and the third thread is postured
> as an attack article). Multiple threads are an ineffective way to

Now you have hit *my* pet annoyance... posting URLs in Usenet
postings without good cause... sorry, I'm not firing up a browser
to read them.

Axel
 
John Bokma

Sherm Pendley said:
> For myself, I'd prefer to answer the posted question, and include a
> comment in the answer about multi-posting, netiquette, and the group
> guidelines. If constructive criticism of that sort is given *along
> with* an answer to the posted question, it's more likely to be taken
> seriously. If it's given on its own, the receiver is (IMO) more likely
> to dismiss the sender as a crank and ignore the advice.

AOL.
 
John Bokma

> John Bokma wrote:
> [..]
>> Make sure that you provide a contact email address.
>
> That has also been done. It's my catch-all domain - I'll probably
> spam-safe it like I do with (e-mail address removed) (which is a
> blackhole with an informative autoresponder)

What seems (or seemed) to work is usenet+bot@

Spam-harvesting bots seem to get only the bot@ :-D (The + is allowed in
email addresses).

> That's probably a good idea (although it's not uncommon for manual
> flagging to be done subsequent to other replies). Making such a
> change, however, would require some significant changes to the flow of
> the program...

A programming challenge :-D

> I agree that would be an undesirable situation (though generally
> unlikely, IMHO), but I'm not sure how to avoid it. Even posting
> manually, I believe it's possible something like this could happen.

Yes. I am sure that I have replied to canceled messages more than once
in the past years.

> I'm pretty sure I've replied (manually) to posts that got pulled out
> from under my feet, and only my reply remained (one such post, if I
> recall, was in German, but I answered it anyway only to find the
> original was gone - probably in favor of a .de group). I don't know
> if it's possible to avoid this situation programmatically any more
> than it is manually (but I'm open to ideas!)

You could check control.cancel, but it might be overkill.
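
One way the control.cancel idea might look, as a hedged Python sketch: collect the Message-IDs named in recent cancel control messages and skip any target that appears among them. It assumes the bot has already fetched the Control header values from that group (the fetching is not shown), and the function names and example IDs are hypothetical. A cancel can still arrive after the check, so this narrows the race rather than closing it.

```python
def canceled_ids(control_headers):
    """Collect Message-IDs from Control header values of the form
    'cancel <message-id>'."""
    canceled = set()
    for value in control_headers:
        parts = value.split()
        # Ignore other control types (newgroup, rmgroup, ...) and
        # malformed values.
        if len(parts) == 2 and parts[0].lower() == "cancel":
            canceled.add(parts[1])
    return canceled


def should_reply(target_id, control_headers):
    """Return False if the target article has a known cancel."""
    return target_id not in canceled_ids(control_headers)


# Hypothetical Control header values, for illustration only.
controls = [
    "cancel <dup2@example.invalid>",
    "newgroup example.test",
]
print(should_reply("<dup1@example.invalid>", controls))  # True
print(should_reply("<dup2@example.invalid>", controls))  # False
```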
 
Mumia W.

> [...]
> If I've angered or annoyed anyone, I do apologize. I had no such
> intent.
> [...]

Thank you Mr. Filmer. I can see how the 'bot would reduce a
lot of work, but, as you've acknowledged, its message was a
little long and harsh.

Whatever you do, please don't release the code. Hip-ç-rime
would make usenet a nightmare with it.
 
usenet

Mumia said:
> Whatever you do, please don't release the code. Hip-ç-rime
> would make usenet a nightmare with it.

Ya know, that type of thing really hadn't occurred to me. Egads, what a
real nightmare that could be.
 
John Bokma

Michele Dondi said:
> > which also explains crossposting, which is the recommended way to post
> ^^^^^
> ^^^^^ [*]
> > a single message to more than one group, if such is really needed.
>
> [*] is also frowned upon, but

Not really, but it's abused a lot, and it's the abuse that's frowned upon.
Note the "really needed", a lot of crossposters get that wrong ;-)

But you're right, maybe it should be made a bit stronger.

Personally I don't have a problem with a crosspost if it's really needed
*and* has the follow-up to header set to the most appropriate group. Also
the number of groups should be limited in most cases.
 
