questions about multiprocessing

Discussion in 'Python' started by Vincent Ren, Mar 5, 2011.

  1. Vincent Ren

    Vincent Ren Guest

    Hello everyone. I have recently been trying to learn Python's
    multiprocessing module, but as a beginner I got confused.

    If I run the code below:

    from multiprocessing import Pool
    import urllib2
    otasks = [
        'http://www.php.net'
        'http://www.python.org'
        'http://www.perl.org'
        'http://www.gnu.org'
    ]

    def f(url):
        return urllib2.urlopen(url).read()

    pool = Pool(processes = 2)
    print pool.map(f, tasks)


    I get this error:

    Traceback (most recent call last):
    File "<stdin>", line 14, in <module>
    File "/usr/lib/python2.6/multiprocessing/pool.py", line 148, in map
    return self.map_async(func, iterable, chunksize).get()
    File "/usr/lib/python2.6/multiprocessing/pool.py", line 422, in get
    raise self._value
    httplib.InvalidURL: nonnumeric port: ''



    I'm running Python 2.6 on Ubuntu 10.10.


    Regards
    Vincent
     
    Vincent Ren, Mar 5, 2011
    #1

  2. On Mar 4, 2011, at 11:08 PM, Vincent Ren wrote:

    > Hello, everyone, recently I am trying to learn python's
    > multiprocessing, but
    > I got confused as a beginner.
    >
    > If I run the code below:
    >
    > from multiprocessing import Pool
    > import urllib2
    > otasks = [
    > 'http://www.php.net'
    > 'http://www.python.org'
    > 'http://www.perl.org'
    > 'http://www.gnu.org'
    > ]
    >
    > def f(url):
    > return urllib2.urlopen(url).read()
    >
    > pool = Pool(processes = 2)
    > print pool.map(f, tasks)


    Hi Vincent,
    I don't think that's the code you're running, because that code won't run. Here's what I get when I run the code you gave us:

    Traceback (most recent call last):
    File "x.py", line 14, in <module>
    print pool.map(f, tasks)
    NameError: name 'tasks' is not defined


    When I change the name of "otasks" to "tasks", I get the nonnumeric port error that you reported.

    Me, I would debug it by adding a print statement to f():
    def f(url):
        print url
        return urllib2.urlopen(url).read()


    Your problem isn't related to multiprocessing.

    Good luck
    Philip




     
    Philip Semanchuk, Mar 5, 2011
    #2

  3. On Fri, 4 Mar 2011 20:08:21 -0800 (PST), Vincent Ren
    declaimed the following in gmane.comp.python.general:

    > Hello, everyone, recently I am trying to learn python's
    > multiprocessing, but
    > I got confused as a beginner.
    >
    > If I run the code below:
    >
    > from multiprocessing import Pool
    > import urllib2


    > otasks = [
    > 'http://www.php.net'
    > 'http://www.python.org'
    > 'http://www.perl.org'
    > 'http://www.gnu.org'
    > ]
    >

    You've just defined a list with ONE element -- a string of:

    "http://www.php.nethttp://www.python.orghttp://www.perl.orghttp://www.gnu.org"


    Python concatenates adjacent string literals -- and that includes
    literals on separate lines when they sit inside an open ( [ { structure.

    You need to put commas after the closing quotes on those lines.
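
To see the concatenation in isolation, here is a minimal sketch comparing the list with and without commas. It is written for Python 3 (hence `print(...)`), and the names `broken` and `fixed` are illustrative, not from the original post.

```python
# Adjacent string literals are joined at compile time, even across lines
# inside brackets, so this list has ONE element.
broken = [
    'http://www.php.net'
    'http://www.python.org'
    'http://www.perl.org'
    'http://www.gnu.org'
]

# With a comma after each closing quote, the list has four elements.
fixed = [
    'http://www.php.net',
    'http://www.python.org',
    'http://www.perl.org',
    'http://www.gnu.org',
]

print(len(broken))  # 1
print(len(fixed))   # 4
```

A trailing comma after the last element is optional, but it makes the same mistake harder to repeat when another URL is appended later.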

    > def f(url):
    > return urllib2.urlopen(url).read()
    >
    > pool = Pool(processes = 2)
    > print pool.map(f, tasks)


    And I'm presuming the others are correct -- and that should be

    (f, otasks)

    > httplib.InvalidURL: nonnumeric port: ''


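    No surprise... URL syntax expects a port number after the second :
    in a URL, and with concatenation you've got four : in a single URL.
    The host part comes out as "www.php.nethttp:", and the port after
    that last colon is the empty string, hence the ''.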
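
The empty port can be made visible with the standard library's URL splitter. This is only a sketch, written for Python 3 (`urllib.parse`; on Python 2 the same function lives in the `urlparse` module), and it uses just the first two of the four URLs to keep the string short. The variable names are illustrative.

```python
from urllib.parse import urlsplit

# Two adjacent literals, joined exactly as in the list without commas.
joined = 'http://www.php.net' 'http://www.python.org'
parts = urlsplit(joined)

# Everything between '//' and the next '/' is treated as host[:port].
print(parts.netloc)  # www.php.nethttp:

# Whatever follows the last colon is the port candidate.
host, _, port = parts.netloc.rpartition(':')
print(repr(port))    # ''
```

That empty string is the value quoted at the end of the `nonnumeric port: ''` message.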
    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Mar 5, 2011
    #3
  4. Vincent Ren

    Vincent Ren Guest

    Got it.
    After adding the commas, it works. (The 'o' was a typo I made when
    posting, sorry about that.)

    Thanks to all of you :)


     
    Vincent Ren, Mar 5, 2011
    #4
  5. Vincent Ren

    Vincent Ren Guest

    I've run into a new problem, and searching on Google turned up
    nothing useful.


    I want to download some images with multiprocessing.Pool.
    In my class named Renren, I defined two methods:

    def getPotrait(self, url):
        # get the current portrait of a friend on Renren.com
        try:
            r = urllib2.urlopen(url)
        except urllib2.URLError:
            print "Time out"
            return  # nothing to save if the download failed

        # file name starting with large_ and ending in .jpg, taken from the URL
        tmp = re.search(r'large_[\d\D]*\.jpg', url)
        image_name = tmp.group()

        img = r.read()
        output = open(image_name, 'wb')
        output.write(img)
        output.close()

    def getLargePotraits(self):
        tasks = self.makeTaskList()
        pool = Pool(processes = 3)
        pool.map(self.getPotrait, tasks)


    tasks is a list of image URLs. I want to download these images and
    save them locally.

    In another Python file, I wrote this:

    from renren import Renren

    # get username and password for RenRen.com
    username = raw_input('Email: ')
    password = raw_input('Password: ')
    print


    a = Renren(username, password)
    a.login()
    a.getLargePotraits()



    However, when I run this file, I get an error message:

    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.6/threading.py", line 484, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/usr/lib/python2.6/multiprocessing/pool.py", line 225, in _handle_tasks
        put(task)
    PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
     
    Vincent Ren, Mar 7, 2011
    #5
  6. Vincent Ren wrote:
    > Hello, everyone, recently I am trying to learn python's
    > multiprocessing, but
    > I got confused as a beginner.


    > [SNIP]
    > httplib.InvalidURL: nonnumeric port: ''
    >
    > Regards
    > Vincent
    >
    >

    It's a mistake many beginners make. I don't understand why, but it's a
    very common thing. RTFM should stand for "Read The Formidable (error)
    Message" as well.
    Your URL is invalid; check your URL definition.

    JM
     
    Jean-Michel Pichavant, Mar 7, 2011
    #6
  7. Vincent Ren

    Vincent Ren Guest

    On Mar 7, 9:21 pm, Jean-Michel Pichavant wrote:

    > It's a mistake many beginners make. I don't understand why, but it's a
    > very common thing. RTFM should stand for "Read The Formidable (error)
    > Message" as well.
    > Your URL is invalid; check your URL definition.
    >
    > JM


    I've fixed that problem, but now I've got a new one:

    PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

    The details are in my previous post in this thread.
    Thanks for your reply :)
     
    Vincent Ren, Mar 7, 2011
    #7
  8. Robert Kern

    Robert Kern Guest

    On 3/7/11 3:27 PM, Vincent Ren wrote:
    > On Mar 7, 9:21 pm, Jean-Michel Pichavant wrote:
    >
    >> It's a mistake many beginners make. I don't understand why, but it's a
    >> very common thing. RTFM should stand for "Read The Formidable (error)
    >> Message" as well.
    >> Your URL is invalid; check your URL definition.
    >>
    >> JM
    >
    > I've fixed that problem, but now I've got a new one:
    >
    > PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup
    > __builtin__.instancemethod failed
    >
    > The details are in my previous post in this thread.
    > Thanks for your reply :)


    I'm afraid his response applies to this as well: you can't pass methods to
    pool.map() or any other such communication channel to your subprocesses.
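
One common way around this, sketched below under stated assumptions, is to move the per-item work into a module-level function. `Pool` pickles the callable to send it to the worker processes, and a plain function defined at module level pickles by name. The function `image_name`, the method `image_names`, and the example URLs are hypothetical; the real download is left out so the sketch runs without network access. It is written to run on Python 3.

```python
import re
from multiprocessing import Pool


def image_name(url):
    """Module-level function: workers can look it up by name, so it pickles."""
    match = re.search(r'large_.*\.jpg', url)
    return match.group() if match else None


class Renren(object):
    def __init__(self, urls):
        self.urls = urls  # hypothetical stand-in for makeTaskList()

    def image_names(self, processes=2):
        pool = Pool(processes=processes)
        try:
            # Pass the plain function, not a bound method such as self.getPotrait.
            return pool.map(image_name, self.urls)
        finally:
            pool.close()
            pool.join()


if __name__ == '__main__':
    urls = ['http://example.com/a/large_1.jpg',
            'http://example.com/b/large_2.jpg']
    print(Renren(urls).image_names())
```

The class can still own the task list and the pool; only the callable handed to `pool.map()` has to live outside it.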

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Mar 8, 2011
    #8
  9. Vincent Ren

    Vincent Ren Guest

    Got it, thanks.
    But what should I do if I want to improve the efficiency of my
    program?

    On Mar 8, 11:37 am, Robert Kern wrote:

    > I'm afraid his response applies to this as well: you can't pass methods to
    > pool.map() or any other such communication channel to your subprocesses.
     
    Vincent Ren, Mar 8, 2011
    #9
  10. On Mon, Mar 7, 2011 at 7:47 PM, Vincent Ren wrote:
    > Got it, thanks.
    > But what should I do if I want to improve the efficiency of my
    > program?
    >


    Is there any particular reason you're using processes and not threads?
    Functions that wait for stuff to happen in C land, such as I/O calls,
    release the GIL so threads can be run in parallel. It's only stuff
    that happens in Python land (i.e. manipulating Python objects) that
    can't be run concurrently.
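
A thread-based version might look like the sketch below. It uses `multiprocessing.pool.ThreadPool`, which offers the same `map()` interface as `Pool` but runs the work in threads, so nothing is pickled and a bound method can be passed directly. The `Downloader` class and its methods are hypothetical, and `time.sleep()` stands in for the blocking network call so the example runs offline.

```python
import time
from multiprocessing.pool import ThreadPool


class Downloader(object):
    """Hypothetical class; fetch() sleeps instead of doing real network I/O."""

    def fetch(self, url):
        # Stands in for urlopen(url).read(); sleeping releases the GIL,
        # just as a blocking socket read would.
        time.sleep(0.1)
        return len(url)

    def fetch_all(self, urls, threads=3):
        pool = ThreadPool(threads)
        try:
            # Threads share memory, so a bound method is fine here.
            return pool.map(self.fetch, urls)
        finally:
            pool.close()
            pool.join()


urls = ['http://www.php.net', 'http://www.python.org', 'http://www.perl.org']
print(Downloader().fetch_all(urls))  # [18, 21, 19]
```

For work that is mostly waiting on the network, this keeps the class-based design intact; process pools remain the better fit for CPU-bound work.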
     
    Benjamin Kaplan, Mar 8, 2011
    #10
  11. Vincent Ren

    Vincent Ren Guest

    I'm just learning Python. After changing it to a non-OOP program, it
    works.
    Thank you all for the suggestions :)

    On Mar 8, 1:38 pm, Benjamin Kaplan wrote:

    > Is there any particular reason you're using processes and not threads?
    > Functions that wait for stuff to happen in C land, such as I/O calls,
    > release the GIL so threads can be run in parallel. It's only stuff
    > that happens in Python land (i.e. manipulating Python objects) that
    > can't be run concurrently.
     
    Vincent Ren, Mar 8, 2011
    #11