Google's MapReduce

Discussion in 'Python' started by bearophileHUGS@lycos.com, Dec 8, 2004.

  1. Guest

    (This Google article suggestion comes from the CleverCS site):

    http://labs.google.com/papers/mapreduce-osdi04.pdf

    "MapReduce is a programming model and an associated implementation for
    processing and generating large data sets. Users specify a map function
    that processes a key/value pair to generate a set of intermediate
    key/value pairs, and a reduce function that merges all intermediate
    values associated with the same intermediate key."

    This looks like something that can be (nicely) done with Python too.
    Bye,
    Bearophile
     
    , Dec 8, 2004
    #1

  2. Terry Reedy Guest

    <> wrote in message
    news:...
    > (This Google article suggestion comes from the CleverCS site):
    >
    > http://labs.google.com/papers/mapreduce-osdi04.pdf
    >
    > "MapReduce is a programming model and an associated implementation for
    > processing and generating large data sets. Users specify a map function
    > that processes a key/value pair to generate a set of intermediate
    > key/value pairs, and a reduce function that merges all intermediate
    > values associated with the same intermediate key."


    Summarizing data by group (and sometimes simultaneously by subgroup) has
    been a standard operation for decades in both statistics packages and
    database report generators. Python has its own versions of the map and
    reduce operations that the paper uses, and its dicts can easily be used to
    group items with the same key. What is somewhat specific to Google is the
    need to generate *multiple* key-value pairs from each input document. What
    MapReduce does that is somewhat innovative is automatically parallelize the
    computation so that it runs fault-tolerantly on a cluster of thousands of
    machines, where machine slowdowns and failures are 'common'. Indeed, that
    is the reason to use the system even with trivial map or reduce functions,
    as Google sometimes does.
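
    As a rough illustration of the point about dicts, here is a minimal
    single-process sketch of the map / group-by-key / reduce pattern. The
    names map_words, sum_counts and map_reduce are hypothetical, made up for
    this example, and the word-count task and sample documents are assumed.
    It does none of the parallelization or fault tolerance that the paper is
    really about.

```python
from collections import defaultdict
from functools import reduce


def map_words(doc_name, text):
    """Map step (hypothetical): emit one (word, 1) pair per word,
    so a single input document yields multiple key-value pairs."""
    for word in text.lower().split():
        yield word, 1


def sum_counts(word, counts):
    """Reduce step (hypothetical): merge all values sharing one key."""
    return reduce(lambda a, b: a + b, counts, 0)


def map_reduce(inputs, map_fn, reduce_fn):
    """Run map, group intermediate pairs by key in a dict, then reduce.

    Everything happens sequentially in one process.
    """
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}


# Assumed sample input: (document name, document text) pairs.
docs = [("a.txt", "the cat sat"), ("b.txt", "the dog sat down")]
word_counts = map_reduce(docs, map_words, sum_counts)
```

    The dict does the grouping that the quoted abstract describes as merging
    "all intermediate values associated with the same intermediate key".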

    What this does show is that Google could be regarded as a cluster
    supercomputing company, with Web search as the visible development
    application.

    Terry J. Reedy
     
    Terry Reedy, Dec 8, 2004
    #2