Suggestion For Useful Script -- Google Groups Search and Archive

Discussion in 'Perl Misc' started by EdwardATeller, Sep 6, 2008.

  1. The search function on Google Groups was recently broken. More info
    here:

    http://groups.google.com/group/Is-S...27893/4ffc98ca7b9eaca6?hl=en#4ffc98ca7b9eaca6

    Made me realize how important this is. I thought someone with way
    more talent than me might write a Perl script that takes as input a
    Google Groups search for oranges (for example):

    http://groups.google.com/groups/search?qt_s=1&q=oranges

    and returns most of the posts found as a series of linked HTML
    documents.

    Seems like a non-trivial problem, but maybe it's simple to one of you
    Perl gods. Based on the level of concern when the search function
    went away, I'd say people would be interested in archiving some of
    their favorite searches. I know I am. Thanks for taking the time to
    read this post.
    EdwardATeller, Sep 6, 2008

  2. >>>>> "EdwardATeller" == EdwardATeller writes:

    EdwardATeller> Made me realize how important this is. I thought someone with
    EdwardATeller> way more talent than me might write a Perl script that takes as
    EdwardATeller> input a Google Groups search for oranges (for example):

    EdwardATeller> http://groups.google.com/groups/search?qt_s=1&q=oranges

    EdwardATeller> and returns most of the posts found as a series of linked HTML
    EdwardATeller> documents.

    You're not allowed to scrape the HTML[1]. And it looks like
    <http://code.google.com/apis/ajaxsearch/documentation/> doesn't (currently)
    have a Google Groups searching component.

    So, it's not a matter of being talented. It's a matter of respecting Google's
    request that you not be a robot when you hit their site, just as they
    respect the robots.txt you put up on yours. It's about ethics.

    print "Just another Perl hacker,"; # the original

    [1] Section 5.3 of [http://www.google.com/accounts/TOS] says:

    5.3 You agree not to access (or attempt to access) any of the Services by
    any means other than through the interface that is provided by Google,
    unless you have been specifically allowed to do so in a separate agreement
    with Google. You specifically agree not to access (or attempt to access)
    any of the Services through any automated means (including use of scripts
    or web crawlers) and shall ensure that you comply with the instructions
    set out in any robots.txt file present on the Services.

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <URL:http://www.stonehenge.com/merlyn/>
    Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
    See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
    Randal L. Schwartz, Sep 6, 2008

  3. EdwardATeller wrote:
    >The search function on Google Groups was recently broken. More info
    >here:
    >
    >http://groups.google.com/group/Is-S...27893/4ffc98ca7b9eaca6?hl=en#4ffc98ca7b9eaca6
    >
    >Made me realize how important this is. I thought someone with way
    >more talent than me might write a Perl script that takes as input a
    >Google Groups search for oranges (for example):
    >
    >http://groups.google.com/groups/search?qt_s=1&q=oranges
    >
    >and returns most of the posts found as a series of linked HTML
    >documents.


    Should be fairly simple: use LWP to get the page with the search
    results, then one of the HTML parser modules to extract the links
    (mostly copy-and-paste; AFAIR that task is even used as an example in
    the documentation), and LWP again to get the target pages.

    jue
    Jürgen Exner, Sep 6, 2008
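
The three steps described in reply #3 (fetch the results page, extract its links, fetch each target) could be sketched roughly as below. This is only an illustration under assumptions, not a tested archiver: the names `fetch` and `extract_links`, the user-agent string, the `page-NNN.html` file names, and the example.com URL are all hypothetical, and HTML::LinkExtor is just one of several parser modules that would do. As reply #2 points out, the Google terms quoted there forbid automated access, so the sketch assumes a site whose terms and robots.txt permit it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;    # ships with the HTML-Parser distribution

# Fetch one URL and return its decoded body; dies on any HTTP error.
sub fetch {
    my ($url) = @_;
    require LWP::UserAgent;    # loaded lazily, only when we actually fetch
    my $ua  = LWP::UserAgent->new(agent => 'archive-sketch/0.1', timeout => 30);
    my $res = $ua->get($url);
    die "GET $url failed: " . $res->status_line . "\n" unless $res->is_success;
    return $res->decoded_content;
}

# Return the absolute URLs of every <a href="..."> found in $html.
# $base is the URL the page came from; relative links are resolved against it.
sub extract_links {
    my ($html, $base) = @_;
    my @links;
    my $parser = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            return unless $tag eq 'a' && defined $attr{href};
            push @links, "$attr{href}";    # stringify the URI object
        },
        $base,
    );
    $parser->parse($html);
    $parser->eof;
    return @links;
}

# Hypothetical usage: perl archive_sketch.pl http://www.example.com/results.html
if (@ARGV) {
    my $start = shift @ARGV;
    my @links = extract_links(fetch($start), $start);
    my $n     = 0;
    for my $link (@links) {
        my $page = eval { fetch($link) };
        if (!defined $page) { warn $@; next; }    # skip pages that fail
        my $file = sprintf 'page-%03d.html', ++$n;
        open my $out, '>:encoding(UTF-8)', $file
            or die "Cannot write $file: $!\n";
        print {$out} $page;
        close $out;
        print "$link -> $file\n";
        sleep 1;    # pause between requests to go easy on the server
    }
}
```

The sketch saves each target page as-is; it does not filter out navigation links or rewrite the saved pages so that they link to one another, which the original request for "linked HTML documents" would also need.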
