mrjob v0.2.5 released

Discussion in 'Python' started by Jimmy Retzlaff, Apr 30, 2011.

    What is mrjob?
    --------------

    mrjob is a Python package that helps you write and run Hadoop Streaming jobs.

    mrjob fully supports Amazon's Elastic MapReduce (EMR) service, which
    allows you to buy time on a Hadoop cluster on an hourly basis. It also
    works with your own Hadoop cluster.

    Some important features:

    * Run jobs on EMR, your own Hadoop cluster, or locally (for testing).
    * Write multi-step jobs (one map-reduce step feeds into the next)
    * Duplicate your production environment inside Hadoop
      * Upload your source tree and put it in your job's $PYTHONPATH
      * Run make and other setup scripts
      * Set environment variables (e.g. $TZ)
      * Easily install Python packages from tarballs (EMR only)
      * Setup handled transparently by mrjob.conf config file
    * Automatically interpret error logs from EMR
    * SSH tunnel to Hadoop job tracker on EMR
    * Minimal setup
      * To run on EMR, set $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY
      * To run on your Hadoop cluster, install simplejson and make
        sure $HADOOP_HOME is set.
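
To illustrate what "one map-reduce step feeds into the next" means, here is a minimal in-memory sketch in plain Python. It does not use mrjob's API; run_step, run_job and the mapper/reducer names are hypothetical helpers made up for this example. Step 1 counts words, and its output becomes the input of step 2, which groups words by their count.

```python
from collections import defaultdict


def run_step(records, mapper, reducer):
    """Run one map-reduce step in memory: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    output = []
    for key in sorted(grouped):  # mimic the sort between map and reduce
        output.extend(reducer(key, grouped[key]))
    return output


# Step 1: count how often each word occurs.
def split_words(_, line):
    for word in line.lower().split():
        yield word, 1


def sum_counts(word, counts):
    yield word, sum(counts)


# Step 2: group words by the count computed in step 1.
def key_by_count(word, count):
    yield count, word


def collect_words(count, words):
    yield count, sorted(words)


def run_job(lines):
    """Chain the two steps: the output of step 1 is the input of step 2."""
    records = [(None, line) for line in lines]
    steps = [(split_words, sum_counts), (key_by_count, collect_words)]
    for mapper, reducer in steps:
        records = run_step(records, mapper, reducer)
    return records


if __name__ == "__main__":
    for count, words in run_job(["a b a", "b c a"]):
        print(count, words)
```

In a real job the framework handles the grouping and sorting between steps; the sketch only shows how the data flows from one step to the next.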

    More info:

    * Install mrjob: python setup.py install
    * Documentation: http://packages.python.org/mrjob/
    * PyPI: http://pypi.python.org/pypi/mrjob
    * Discussion: http://groups.google.com/group/mrjob
    * Development is hosted at GitHub: http://github.com/Yelp/mrjob


    What's new?
    -------------------

    v0.2.5, 2011-04-29 -- Hadoop input and output formats
    * Added hadoop_input_format and hadoop_output_format options
    * You can now specify a custom Hadoop streaming jar (hadoop_streaming_jar)
    * Extra args to Hadoop now come before -mapper/-reducer on EMR, so
      that e.g. -libjars will work (this has worked in hadoop mode since v0.2.2)
    * hadoop mode now supports s3n:// URIs (Issue #53)
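
The argument-ordering change matters because Hadoop's generic options (such as -libjars) have to appear before the streaming-specific options. The sketch below shows that ordering only. It is not mrjob's internal code: build_streaming_args is a hypothetical helper, and the jar paths and s3n:// URIs are placeholders.

```python
def build_streaming_args(streaming_jar, extra_args, mapper, reducer,
                         input_uri, output_uri):
    """Build a Hadoop Streaming command line as a list of arguments.

    Extra (generic) arguments are placed right after the jar so that they
    come before -mapper/-reducer and the other streaming options.
    """
    args = ["hadoop", "jar", streaming_jar]
    args.extend(extra_args)  # e.g. ["-libjars", "my-formats.jar"]
    args.extend([
        "-input", input_uri,
        "-output", output_uri,
        "-mapper", mapper,
        "-reducer", reducer,
    ])
    return args


if __name__ == "__main__":
    # All paths and URIs here are placeholders for illustration only.
    cmd = build_streaming_args(
        streaming_jar="/path/to/hadoop-streaming.jar",
        extra_args=["-libjars", "my-formats.jar"],
        mapper="python my_job.py --mapper",
        reducer="python my_job.py --reducer",
        input_uri="s3n://example-bucket/input/",
        output_uri="s3n://example-bucket/output/",
    )
    print(" ".join(cmd))
```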
     
