Quick and dirty article filter

Discussion in 'Perl Misc' started by Chris Nehren, May 29, 2012.

  1. Chris Nehren

    As I've been spending more time on Usenet, I've grown rather annoyed
    at the "features" of Google Groups. Yes, it sucks, it makes things a
    mess, and so on. But! Instead of just whining about the problem, I've
    put together a small, simple Perl program to filter Google Groups (and
    other) articles into something resembling sanity. I wrote this for
    slrn, and I'm including the slrn macro that calls it as well.

    A quick word of caution about this code: yes, it uses Email::Simple. For
    the most part, email messages and news articles are compatible, but
    that's not what prompts the caution. The code uses Email::Simple
    to stick its fingers in its ears and pretend that encodings other than
    US-ASCII do not exist. For Usenet, this is still mostly okay (at least
    for the parts of it I read, i.e. the Big 8 and a few groups in alt.*).
    It will very likely break on multibyte messages, but I tend not to read
    or encounter those. You have been warned.

    This requires a new-ish version of libslang (2.2.4 works) due to a
    recently-fixed bug in process.sl that I stumbled over while developing
    the slang macro. It was fixed in git at the time I found it, but not in
    the released version of libslang.

    Patches are, of course, welcome. If someone wants to do proper encoding
    handling, that'd be pretty awesome. If you really care about licenses,
    I'll release it under the same terms as perl 5.16.0 or any later version.
    Provided as-is, no warranty, yadda yadda.
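
As a possible starting point for the encoding handling mentioned above, here is a rough sketch that uses only core modules (Encode, MIME::Base64, MIME::QuotedPrint). The helper name `decode_body` and the fallback choices (US-ASCII when no charset is given, ISO-8859-1 for unknown charset names) are assumptions made for illustration and are not part of the posted script. It ignores multipart messages entirely.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Encode qw(decode encode find_encoding);
use MIME::Base64 qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper: turn raw body bytes into Perl characters, guided by
# the Content-Type and Content-Transfer-Encoding header values.
sub decode_body {
    my ($bytes, $content_type, $transfer_encoding) = @_;

    # Undo the transfer encoding first, if there is one.
    my $cte = lc($transfer_encoding // '');
    $bytes = decode_qp($bytes)     if $cte eq 'quoted-printable';
    $bytes = decode_base64($bytes) if $cte eq 'base64';

    # Pull charset=... out of the Content-Type value; assume US-ASCII if absent.
    my ($charset) = ($content_type // '') =~ /charset\s*=\s*"?([^";\s]+)/i;
    $charset = 'US-ASCII' unless defined $charset;

    # Unknown charset names fall back to ISO-8859-1, where every byte decodes.
    my $enc = find_encoding($charset) || find_encoding('ISO-8859-1');
    return decode($enc->name, $bytes);
}

# Usage: decode, clean up the text as characters, then re-encode for output.
my $text = decode_body("caf=C3=A9\n", 'text/plain; charset="UTF-8"',
                       'quoted-printable');
print encode('UTF-8', $text);
```

In the filter itself, the two header values would come from `$article->header('Content-Type')` and `$article->header('Content-Transfer-Encoding')`, and the headers would need updating if the body is written back in a different encoding.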

    fix_article.pl:

    === cut ===
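    #!/usr/bin/env perl
    use strict;
    use warnings;

    use IO::All;
    use Email::Simple;
    use Text::Autoformat;

    # Read the whole article from STDIN and split it into headers and body.
    my $raw = io('-')->all;
    my $article = Email::Simple->new($raw);
    my $body = $article->body;
    # Collapse spaced quote markers ("> > >") into a solid run (">>>"),
    # one level per pass so the nesting depth is kept.
    1 while $body =~ s/^(>+) >/$1>/mg;
    # Add a space between a run of quote markers and the quoted text.
    $body =~ s/^(>+)(\w)/$1 $2/mg;
    # Rewrap every paragraph; without all => 1 only the first one is reflowed.
    $body = autoformat $body, { all => 1 };
    $article->body_set($body);
    print $article->as_string;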
    === cut ===
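
To see what the quote-marker clean-up is aiming for without installing anything, here is a small self-contained sketch using only core Perl. The helper name `normalize_quotes` and the sample lines are made up for illustration. This variant keeps the nesting depth when it collapses spaced markers, and it adds the space before any non-blank character rather than only before `\w`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of fix_article.pl): tidy Usenet quote markers.
sub normalize_quotes {
    my ($text) = @_;

    # Collapse spaced markers ("> > >") into a solid run (">>>"),
    # one level per pass so the nesting depth is preserved.
    1 while $text =~ s/^(>+) >/$1>/mg;

    # Put a single space between the marker run and the quoted text.
    $text =~ s/^(>+)(?=[^>\s])/$1 /mg;

    return $text;
}

my $sample = <<'END';
> > >deeply quoted
>>no space here
> already fine
plain line
END

print normalize_quotes($sample);
```

On the sample input this should print `>>> deeply quoted`, `>> no space here`, `> already fine`, and `plain line`, each on its own line.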

    fix_article.sl:

    === cut ===
    require("process");
    define fix_article_stupidity ()
    {
        variable a = article_as_string();
        variable p = new_process(["/path/to/fix_article.pl"]; write={1,2}, read=0);
        fputslines(a, p.fp0);
        fflush(p.fp0);
        fclose(p.fp0);
        variable r = fgetslines(p.fp1);
        p.wait();
        variable n = strjoin(r, "");
        replace_article(n);
    }
    !if(register_hook("read_article_hook", "fix_article_stupidity"))
        message("Warning: Could not register fix_article_stupidity" +
                " for read_article_hook");
    === cut ===


    --
    Thanks and best regards,
    Chris Nehren