Rookie Speaks

Discussion in 'Python' started by William S. Perrin, Jan 7, 2004.

  1. I'm a python rookie, anyone have any suggestions to streamline this
    function? Thanks in advance.....


    # imports the function relies on
    import urllib
    from xml.dom import minidom

    def getdata(myurl):
        sock = urllib.urlopen(myurl)
        xmlSrc = sock.read()
        sock.close()

        xmldoc = minidom.parseString(xmlSrc)

        def getattrs(weatherAttribute):
            a = xmldoc.getElementsByTagName(weatherAttribute)
            return a[0].firstChild.data

        currname = getattrs("name")
        currtemp = getattrs("fahrenheit")
        currwind = getattrs("wind")
        currdew = getattrs("dewpoint")
        currbarom = getattrs("relative_humidity")
        currhumid = getattrs("barometric_pressure")
        currcondi = getattrs("conditions")

        print "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (currname, currtemp,
            currwind, currbarom, currdew, currhumid, currcondi)
     
    William S. Perrin, Jan 7, 2004
    #1

  2. |Thus Spake William S. Perrin On the now historical date of Wed, 07 Jan
    2004 17:37:57 -0600|

    > I'm a python rookie, anyone have any suggestions to streamline this
    > function? Thanks in advance.....


    Please define "streamline" in this context.

    Do you mean:
    faster
    smaller
    easier to read
    etc.

    Sam Walters.

    --
    Never forget the halloween documents.
    http://www.opensource.org/halloween/
    """ Where will Microsoft try to drag you today?
    Do you really want to go there?"""
     
    Samuel Walters, Jan 7, 2004
    #2

  3. Sorry, I guess I mean: is it efficient? That is, if I called it 1000 times.....

     
    William S. Perrin, Jan 7, 2004
    #3
  4. |Thus Spake William S. Perrin On the now historical date of Wed, 07 Jan
    2004 17:45:38 -0600|

    > Sorry, I guess is it efficient? That is if I called it 1000 times.....


    Ponders... Even "efficient" has a loose meaning here. Since you describe
    running it a thousand times, I'll assume you mean speed of execution.

    There's a saying "Premature optimization is the root of all evil." Which
    is another way of saying "Try it, and if it's too slow, figure out what
    the hold up is. If it's not too slow, don't mess with it." So, try
    running it in the context you need it in. Nothing about your code screams
    "bad implementation." In fact, it's quite clearly written. Still, you
    won't know if it's too slow until you try it.

    There's a way to get a definitive answer on how fast it's running:
    the profile module in python. Do some research on that module. If
    you're confused about it, come back and ask more questions then.

    Take into consideration that it may not be your code that's slow, but
    rather the way you're getting your information. This is called being "I/O
    bound." The holdup might not be the program, but instead the disk or the
    network. After all, you can't process information until you have the
    information. The profile module will help you to see if this is the
    problem.
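
    For anyone reading this on a current Python 3, here is a minimal sketch
    of that kind of check, using cProfile and pstats from the standard
    library. The functions fake_fetch, parse and profile_report are
    hypothetical stand-ins, not the code from this thread: fake_fetch
    simulates a slow network read with time.sleep (the 50 ms figure is an
    assumption), so the report should show most of the cumulative time
    there rather than in parse.

```python
import cProfile
import io
import pstats
import time


def fake_fetch():
    """Hypothetical stand-in for a slow network read (assumed ~50 ms)."""
    time.sleep(0.05)
    return "<weather><fahrenheit>41</fahrenheit></weather>"


def parse(text):
    """Hypothetical stand-in for the cheap in-memory processing step."""
    return text.count("<")


def getdata():
    """Fetch, then process: the same shape as the function under discussion."""
    return parse(fake_fetch())


def profile_report(func):
    """Run func under cProfile and return the stats report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.strip_dirs().sort_stats("cumulative").print_stats()
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(getdata))
```

    If the sleep inside fake_fetch dominates the cumulative column, the
    code is I/O bound and tuning parse would not buy much.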

    One of the slicker solutions to slow code is the psyco module. It can
    give an amazing speed boost to many processing-intensive functions, but it
    can sometimes even slow down your program.

    If you'd like to see an example of both the psyco and profile modules in
    action, let me know and I'll give you some more understandable code that I
    once wrote to see what types of things psyco is good at optimizing.

    HTH

    Sam Walters.

     
    Samuel Walters, Jan 8, 2004
    #4
  5. William S. Perrin wrote:
    > I'm a python rookie, anyone have any suggestions to streamline this
    > function? Thanks in advance.....
    >
    >... currname = getattrs("name")
    > currtemp = getattrs("fahrenheit")
    > currwind = getattrs("wind")
    > currdew = getattrs("dewpoint")
    > currbarom = getattrs("relative_humidity")
    > currhumid = getattrs("barometric_pressure")
    > currcondi = getattrs("conditions")
    >
    > print "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (currname, currtemp, currwind,
    > currbarom, currdew, currhumid, currcondi)


    How about:
    name, temp, wind, dew, barom, humid, condi = map(getattrs,
        "name fahrenheit wind dewpoint relative_humidity "
        " barometric_pressure conditions".split())

    print "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (name, temp, wind,
        barom, dew, humid, condi)


    -Scott David Daniels
     
    sdd, Jan 8, 2004
    #5
  6. Anytime you find yourself repeating the same pattern of
    code (i.e. the getattrs bit), there's usually a more elegant
    way of doing it.

    def getdata(myurl):
        sock = urllib.urlopen(myurl)
        xmlSrc = sock.read()
        sock.close()

        xmldoc = minidom.parseString(xmlSrc)

        def getattrs(weatherAttribute):
            a = xmldoc.getElementsByTagName(weatherAttribute)
            return a[0].firstChild.data

        attributes = ['name', 'fahrenheit', 'wind',
                      'dewpoint', 'relative_humidity',
                      'barometric_pressure', 'conditions']

        current = {}

        for a in attributes:
            current[a] = getattrs(a)

        format_str = "%13s"+"\t%s"*(len(attributes)-1)
        print format_str % tuple([current[a] for a in attributes])


    OR, if all you want is to print your numbers, skip the dictionary:

    attributes = ['name', 'fahrenheit', 'wind',
                  'dewpoint', 'relative_humidity',
                  'barometric_pressure', 'conditions']

    format_str = "%13s"+"\t%s"*(len(attributes)-1)
    print format_str % tuple([getattrs(a) for a in attributes])
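
    The same list-driven idea can be tried without a network connection.
    Below is a minimal sketch for a current Python 3 (print as a function,
    no urllib call). SAMPLE_XML is a made-up document, and its layout is
    only an assumption about what the real feed looks like; format_row is
    a hypothetical helper name.

```python
from xml.dom import minidom

# Hypothetical sample document; the real feed's layout is an assumption.
SAMPLE_XML = """<weather>
  <name>Springfield</name>
  <fahrenheit>41</fahrenheit>
  <wind>NW 5</wind>
  <dewpoint>30</dewpoint>
  <relative_humidity>65</relative_humidity>
  <barometric_pressure>30.1</barometric_pressure>
  <conditions>Cloudy</conditions>
</weather>"""

ATTRIBUTES = ["name", "fahrenheit", "wind", "dewpoint",
              "relative_humidity", "barometric_pressure", "conditions"]


def format_row(xml_text, attributes=ATTRIBUTES):
    """Return one tab-delimited row with the text of each named element."""
    doc = minidom.parseString(xml_text)
    # One lookup per tag name, driven by the list instead of repeated calls.
    values = [doc.getElementsByTagName(tag)[0].firstChild.data
              for tag in attributes]
    # The first column is right-aligned to 13 characters, as in the thread.
    return "\t".join(["%13s" % values[0]] + values[1:])


if __name__ == "__main__":
    print(format_row(SAMPLE_XML))
```

    Adding or reordering a column then only means editing ATTRIBUTES.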
     
    Lonnie Princehouse, Jan 8, 2004
    #6
  7. I think your function has a sane design :) XML is slow by design, but in
    your case it doesn't really matter, because it is probably I/O-bound, as
    already pointed out by Samuel Walters.

    Below is a slightly different approach, that uses a class:

    class Weather(object):
        def __init__(self, url=None, xml=None):
            """ Will accept either a URL or an XML string,
            preferably as a keyword argument """
            if url:
                if xml:
                    # not sure what would be the right exception here
                    # (ValueError?), so keep it generic for now
                    raise Exception("Must provide either url or xml, not both")
                sock = urllib.urlopen(url)
                try:
                    xml = sock.read()
                finally:
                    sock.close()
            elif xml is None:
                raise Exception("Must provide either url or xml")
            self._dom = minidom.parseString(xml)

        def getAttrFromDom(self, weatherAttribute):
            a = self._dom.getElementsByTagName(weatherAttribute)
            return a[0].firstChild.data

        def asRow(self):
            # this will defeat lazy attribute lookup
            return "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (self.name,
                self.fahrenheit, self.wind, self.barometric_pressure,
                self.dewpoint, self.relative_humidity, self.conditions)

        def __getattr__(self, name):
            # only called when normal attribute lookup fails
            try:
                value = self.getAttrFromDom(name)
            except IndexError:
                raise AttributeError(
                    "'%.50s' object has no attribute '%.400s'" %
                    (self.__class__, name))
            # now set the attribute so it need not be looked up
            # in the dom next time
            setattr(self, name, value)
            return value

    This has a slight advantage if you are interested only in a subset of the
    attributes, say the temperature:

    for url in listOfUrls:
        print Weather(url).fahrenheit

    Here getAttrFromDom() - the equivalent of your getattrs() - is only called
    once per URL. The possibility to print a tab-delimited row is still there,

    print Weather(url).asRow()

    but will of course defeat this optimization scheme.
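
    To make the caching behaviour visible, here is a small sketch for a
    current Python 3. LazyWeather is a hypothetical cut-down variant of the
    class above: it takes only an XML string, and the lookups counter is
    added purely for illustration. Because __getattr__ runs only when
    normal attribute lookup fails, the second access never reaches the DOM.

```python
from xml.dom import minidom


class LazyWeather:
    """Look up element text on first access, then cache it on the instance."""

    def __init__(self, xml_text):
        self._dom = minidom.parseString(xml_text)
        self.lookups = 0  # counts DOM searches, for illustration only

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            # Avoid recursing if _dom itself is missing (e.g. during copying).
            raise AttributeError(name)
        nodes = self._dom.getElementsByTagName(name)
        if not nodes or nodes[0].firstChild is None:
            raise AttributeError(name)
        self.lookups += 1
        value = nodes[0].firstChild.data
        # Store the value so the next access bypasses __getattr__ entirely.
        setattr(self, name, value)
        return value


if __name__ == "__main__":
    weather = LazyWeather("<weather><fahrenheit>41</fahrenheit></weather>")
    print(weather.fahrenheit, weather.fahrenheit, weather.lookups)
```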

    Peter
     
    Peter Otten, Jan 8, 2004
    #7
  8. What psyco is good at [Was: Rookie Speaks]

    Samuel Walters writes:

    > If you'd like to see an example of both the psyco and profile modules in
    > action, let me know and I'll give you some more understandable code that I
    > once wrote to see what types of things psyco is good at optimizing.


    I think this is generally interesting, and would be curious to see it,
    if you'd care to share.
     
    Jacek Generowicz, Jan 8, 2004
    #8
  9. Re: What psyco is good at [Was: Rookie Speaks]

    |Thus Spake Jacek Generowicz On the now historical date of Thu, 08 Jan
    2004 11:43:01 +0100|

    > Samuel Walters writes:
    >
    >> If you'd like to see an example of both the psyco and profile modules
    >> in action, let me know and I'll give you some more understandable code
    >> that I once wrote to see what types of things psyco is good at
    >> optimizing.

    >
    > I think this is generally interesting, and would be curious to see it,
    > if you'd care to share.


    Sure thing. The functions at the top are naive prime enumeration
    algorithms. I chose them because they're each of a general "looping"
    nature and I understand the complexity and methods of each of them. Some
    use list-based (and hence linearly indexed) methods and some use
    dictionary-based methods (and hence are hash bound). One of them,
    sieve_list, is commented out because it has such dog performance that I
    decided I wasn't interested in how well it optimized.

    These tests are by no means complete, nor is this probably a good example
    of profiling or the manner in which psyco is useful. It's just from an
    area where I understood the algorithmic bottlenecks to begin with.

    Sam Walters.


    from math import sqrt
    def primes_list(Limits = 1,KnownPrimes = [ 2 ]):
        RetList = KnownPrimes
        for y in xrange(2,Limits + 1):
            w = y
            p, r = 0,0
            for x in RetList:
                if x*x > w:
                    RetList.append(w)
                    break
                p,r = divmod(y,x)
                if r == 0:
                    w = p
        return RetList

    def primes_dict(Limits = 1,KnownPrimes = [ 2 ]):
        RetList = KnownPrimes
        RetDict = {}
        for x in KnownPrimes:
            RetDict[x] = 1
            w = x + x
            n = 2
            while w <= Limits + 1:
                RetDict[w] = n
                w += x
                n += 1
        p, r = 0,0
        for y in xrange(2, Limits + 1):
            for x, z in RetDict.iteritems():
                if x*x > y:
                    RetDict[y] = 1
                    break
                p,r = divmod(y,x)
                if r == 0:
                    RetDict[y] = p
                    break
        return RetList

    def sieve_list(Limits = 1, KnownPrimes = [ 2 ]):
        RetList = KnownPrimes
        CompList = [ ]
        for y in xrange(2, Limits + 1):
            if y not in CompList:
                w = y
                n = 1
                while w <= Limits:
                    CompList.append(w)
                    w += y
                    n += 1
        return RetList

    def sieve_list_2(Limits = 1, KnownPrimes = [ 2 ]):
        SieveList = [ 1 ]*(Limits )
        RetList = [ ]
        for y in xrange(2, Limits + 1):
            if SieveList[y-2] == 1:
                RetList.append(y)
                w = y + y
                n = 2
                while w <= Limits + 1:
                    SieveList[w - 2] = n
                    w += y
                    n += 1
        return RetList

    def sieve_dict(Limits = 1, KnownPrimes = [ 2 ]):
        SieveDict = { }
        RetList = KnownPrimes
        for x in KnownPrimes:
            SieveDict[x] = 1
            w = x + x
            n = 2
            while w <= Limits + 1:
                SieveDict[w] = n
                n += 1
                w += x

        for y in xrange(2, Limits + 1):
            if not SieveDict.has_key(y):
                RetList.append(y)
                w = y
                n = 1
                while w <= Limits + 1:
                    SieveDict[w] = n
                    w += y
                    n += 1
        return RetList

    if __name__ == "__main__":
        import sys
        import profile
        import pstats

        import psyco

        #this function wraps up all the calls that we wish to benchmark.
        def multipass(number, args):
            for x in xrange(1, number + 1):
                primes_list(args, [ 2 ])
                print ".",
                sys.stdout.flush()
                primes_dict(args, [ 2 ])
                print ".",
                sys.stdout.flush()
                #Do not uncomment this line unless you have a *very* long time to wait.
                #sieve_list(args)
                sieve_dict(args, [ 2 ])
                print ".",
                sys.stdout.flush()
                sieve_list_2(args, [ 2 ])
                print "\r \r%i/%i"%(x, number),
                sys.stdout.flush()
            print "\n"

        #number of times through the test
        passes = 5
        #find all primes up to maximum
        maximum = 1000000

        #create a profiling instance
        #adjust the argument based on your system.
        pr = profile.Profile( bias = 7.5e-06)

        #run the tests
        pr.run("multipass(%i, %i)"%(passes,maximum))
        #save them to a file.
        pr.dump_stats("primesprof")

        #remove the profiling instance so that we can get a clean comparison.
        del pr

        #create a profiling instance
        #adjust the argument based on your system.
        pr = profile.Profile( bias = 7.5e-06)

        #"recompile" each of the functions under consideration.
        psyco.bind(primes_list)
        psyco.bind(primes_dict)
        psyco.bind(sieve_list)
        psyco.bind(sieve_list_2)
        psyco.bind(sieve_dict)

        #run the tests
        pr.run("multipass(%i, %i)"%(passes,maximum))
        #save them to a file
        pr.dump_stats("psycoprimesprof")

        #clean up our mess
        del pr

        #load and display each of the run-statistics.
        pstats.Stats('primesprof').strip_dirs().sort_stats('cum').print_stats()
        pstats.Stats('psycoprimesprof').strip_dirs().sort_stats('cum').print_stats()
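
    For readers without psyco or Python 2 at hand, the list-versus-dictionary
    contrast can still be tried on a current Python 3. The sketch below is
    not a port of the functions above: sieve_with_list and sieve_with_dict
    are simplified, hypothetical re-implementations of the same two ideas
    (flags in a flat list versus composites stored as dictionary keys), and
    timeit is used in place of the profile module. The limit of 100000 and
    the 5 repetitions are arbitrary choices.

```python
import timeit


def sieve_with_list(limit):
    """Sieve of Eratosthenes over a flat list of boolean flags."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n squared.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def sieve_with_dict(limit):
    """Same idea, but composites are recorded as dictionary keys."""
    composites = {}
    primes = []
    for n in range(2, limit + 1):
        if n not in composites:
            primes.append(n)
            for multiple in range(n * n, limit + 1, n):
                composites[multiple] = n
    return primes


if __name__ == "__main__":
    for func in (sieve_with_list, sieve_with_dict):
        seconds = timeit.timeit(lambda: func(100000), number=5)
        print("%-16s %.3f s" % (func.__name__, seconds))
```

    Both functions return the same primes, so any timing difference comes
    from the container used for bookkeeping.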
     
    Samuel Walters, Jan 8, 2004
    #9
  10. Re: What psyco is good at [Was: Rookie Speaks]

    On Fri, 2004-01-09 at 05:25, Samuel Walters wrote:
    > |Thus Spake Jacek Generowicz On the now historical date of Thu, 08 Jan
    > 2004 11:43:01 +0100|
    >
    > > Samuel Walters writes:
    > >
    > >> If you'd like to see an example of both the psyco and profile modules
    > >> in action, let me know and I'll give you some more understandable code
    > >> that I once wrote to see what types of things psyco is good at
    > >> optimizing.

    > >
    > > I think this is generally interesting, and would be curious to see it,
    > > if you'd care to share.

    >
    > Sure thing. The functions at the top are naive prime enumeration
    > algorithms. I chose them because they're each of a general "looping"
    > nature and I understand the complexity and methods of each of them. Some
    > use list-based (and hence linearly indexed) methods and some use
    > dictionary-based methods (and hence are hash bound). One of them,
    > sieve_list, is commented out because it has such dog performance that I
    > decided I wasn't interested in how well it optimized.


    Out of curiosity I ran your code, and obtained these results:

    Fri Jan 9 08:30:25 2004 primesprof

    23 function calls in 2122.530 CPU seconds

    ....

    Fri Jan 9 08:43:24 2004 psycoprimesprof

    23 function calls in -3537.828 CPU seconds

    Does that mean that Armin Rigo has slipped some form of Einsteinian,
    relativistic compiler into Psyco? I am reminded of the well-known
    limerick:

    There once was a lady called Bright,
    Who could travel faster than light.
    She went out one day,
    In a relative way,
    And came back the previous night.

    --

    Tim C

     
    Tim Churches, Jan 8, 2004
    #10
  11. Re: What psyco is good at [Was: Rookie Speaks]

    |Thus Spake Tim Churches On the now historical date of Fri, 09 Jan 2004
    09:10:58 +1100|

    > Does that mean that Armin Rigo has slipped some form of Einsteinian,
    > relativistic compiler into Psyco?


    No, no. It means one of two things: either you didn't adjust the constant
    that tries to factor out the overhead of profiling, or the call took so
    long that the timer actually overflowed.

    This will help you set the proper constant:

    -----
    import profile
    import pprint

    tests = 20
    cycles = 10000
    pr = profile.Profile()
    proflist = []
    for x in xrange(1, tests + 1):
        proflist.append(pr.calibrate(cycles))

    pprint.pprint(proflist)
    -----

    Increase cycles until your results don't exhibit much of a spread, then
    take the lowest of those values. This is the constant you set when
    instantiating a profiling object. It is specific to each individual
    machine.

    If it *still* gives you negative times, then the timer is overflowing and
    you need to adjust the original script so that you're not running through
    such a big list of numbers.

    Then your apparent problems with causality should be solved.

    Sam Walters.

     
    Samuel Walters, Jan 9, 2004
    #11