testing machine responsiveness

Discussion in 'Python' started by Tim Arnold, Oct 6, 2006.

  1. Tim Arnold

    I have a bunch of processes that I farm out over several HP-UX machines on
    the network. There are 60 machines to choose from and I want to
    (1) find out which ones are alive (the 'ping' method below) and
    (2) sort them by their current load (the 'get' method below, using the rup
    command)

    I'm no expert--I bet what I'm doing could be done better. I'd appreciate any
    tips, caveats, etc.
    Thanks in advance for looking at the code.
    --Tim Arnold


    Say the host names are in a global list tmpList...
    #---- The final sorted list of cpus is called as:
    cpuList = [x[1] for x in Machines().get()]

    #----
    import os
    import socket

    class Machines(object):
        ' List of available, alive machines. '
        def __init__(self):
            global tmpList
            # keep only the hosts that answer the echo-port probe
            self.asList = [y for y in tmpList if self.ping(y)]
            self.asString = ' '.join(self.asList)

        def ping(self, cpu):
            ''' Determine whether a machine is alive.
            tcp connect to machine, port 7 (echo port).
            Response within 2 seconds -> return true
            '''
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(2)
            try:
                s.connect((cpu,7))
            except:
                return 0
            try:
                s.send(b'test')   # bytes literal, so this also works on Python 3
                s.recv(128)
                s.close()
                return 1
            except:
                return 0

        def get(self, maxLoad=3.0):
            ''' return sorted list of available machines, sorted on load.
            Optionally, specify maxLoad; only machines below it are kept
            (default 3.0)
            '''
            tmpDict = {}
            retList = []
            try:
                rup = os.popen('rup %s | sort -n -t, -k4 | grep day' %
                               (self.asString))
            except OSError:
                return self.asList

            for s in rup:
                # the 4th comma-separated field of a rup line is used as the load
                (name, t0) = s.split(' ',1)
                (t1,t2,t3,avgLoad,t4) = s.split(',')
                load = float(avgLoad)
                if load < maxLoad:
                    tmpDict['%s.com' % name] = load

            # build (load, name) pairs so that sorting orders by load
            for (l, n) in [(v[1], v[0]) for v in tmpDict.items()]:
                retList.append((l, n))
            retList.sort()
            return retList
     
    Tim Arnold, Oct 6, 2006
    #1
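    A possible refinement for step (1): the constructor probes the hosts one
    at a time, so with a 2-second timeout and 60 machines a run with many dead
    hosts could take a couple of minutes. The following is only a sketch,
    assuming Python 3; the helper name alive_hosts and the workers parameter
    are hypothetical, not from the original post. It runs any probe function
    (for example a ping like the one in the post) over a thread pool:

```python
from concurrent.futures import ThreadPoolExecutor


def alive_hosts(hosts, probe, workers=20):
    """Return the hosts for which probe(host) is truthy, in input order.

    alive_hosts and workers are illustrative names (assumptions).
    """
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the inputs
        results = list(pool.map(probe, hosts))
    return [host for host, ok in zip(hosts, results) if ok]
```

    Since each probe spends most of its time waiting on the network, threads
    are probably enough here; the total wait should then be closer to one
    timeout than to sixty.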

  2. Call me nervous, but I like to make my "except" clauses as specific as
    possible, to catch only the errors I'm expecting. The bare "except:"
    clauses in ping catch everything, and the point of being specific is to
    minimize the chance of some hidden bug sneaking through and causing the
    wrong behaviour. So if the rup call fails, you're returning a plain list
    of all the hostnames? Won't this cause the following parsing to fail? The
    caller takes x[1] from each entry, which expects (load, name) pairs, not
    plain strings.
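    To make the point about specific "except" clauses concrete, here is a
    minimal sketch of the echo-port check that catches only OSError. It
    assumes Python 3.3 or later, where socket timeouts, refused connections
    and name-lookup failures are all OSError subclasses. The function name
    is_alive is hypothetical, and unlike the original it also treats an empty
    reply as "not alive":

```python
import socket


def is_alive(host, port=7, timeout=2.0):
    """Return True if host sends data back on the TCP port within timeout.

    is_alive is an illustrative name (an assumption), not from the post.
    """
    try:
        # create_connection resolves the name and applies the timeout to
        # the connect as well as to the later send and recv calls
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"test")
            return bool(conn.recv(128))
    except OSError:
        # Refused, unreachable, DNS failure or timeout: treat as dead.
        # Anything else (say, a TypeError from a bug) still propagates.
        return False
```

    The "with" block closes the socket on every path, so a host that fails
    the probe does not leave a descriptor open.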

    If you're going to trigger a failure, isn't it better for it to happen as
    close as possible to the actual cause? In other words, take out the
    try/except block around the os.popen call and let that call itself
    directly signal an exception on failure.
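    One caveat, as far as I can tell: os.popen starts a shell, so a missing or
    failing rup generally does not raise at the call itself; it shows up as
    empty output and a non-zero status from close(). A sketch of one way to
    get the failure at its source, assuming Python 3.7+ and subprocess.run
    instead of os.popen. The names parse_rup and load_sorted are hypothetical,
    and the line format is only inferred from the split(',') in the original
    code, not checked against a real rup:

```python
import subprocess


def parse_rup(output, max_load=3.0):
    """Return sorted (load, host) pairs from rup-style output.

    Assumed line shape, inferred from the original parsing:
    'host up 3 days, 2:10, load average: 0.50, 0.75, 0.60'
    """
    pairs = []
    for line in output.splitlines():
        fields = line.split(",")
        if len(fields) != 5:
            continue  # e.g. hosts up for under a day have no 'N days,' field
        host = line.split()[0]
        load = float(fields[3])  # same field the original code reads
        if load < max_load:
            pairs.append((load, host))
    return sorted(pairs)


def load_sorted(hosts, max_load=3.0):
    # A missing rup binary raises FileNotFoundError and a non-zero exit
    # raises CalledProcessError, both at the point of the call.
    proc = subprocess.run(["rup", *hosts], capture_output=True,
                          text=True, check=True)
    return parse_rup(proc.stdout, max_load)
```

    Whether rup exits non-zero when a single host fails to answer is worth
    checking on the target system before relying on check=True. Keeping the
    parsing in its own function also means it can be tried on sample text
    without any machines being reachable.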

    Disclaimer: I've never actually used rup. :)
     
    Lawrence D'Oliveiro, Oct 8, 2006
    #2