testing machine responsiveness

T

Tim Arnold

I have a bunch of processes that I farm out over several HPux machines on
the network. There are 60 machines to choose from and I want to
(1) find out which ones are alive (the 'ping' method below) and
(2) sort them by their current load (the 'get' method below, using the rup
command)

I'm no expert--I bet what I'm doing could be done better. I'd appreciate any
tips, caveats, etc.
Thanks in advance for looking at the code.
--Tim Arnold


Say the host names are in a global list tmpList...
#---- The final sorted list of cpus is called as:
cpuList = [x[1] for x in Machines().get()]

#----
class Machines(object):
' List of available, alive machines. '
def __init__(self):
global tmpList
self.asList = [y for y in tmpList if self.ping(y)]
self.asString = ' '.join(self.asList)

def ping(self, cpu):
''' Determine whether a machine is alive.
tcp connect to machine, port 7 (echo port).
Response within 3 seconds -> return true
'''
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(2)
try:
s.connect((cpu,7))
except:
return 0
try:
s.send('test')
s.recv(128)
s.close()
return 1
except:
return 0

def get(self, maxLoad=3.0,):
''' return sorted list of available machines, sorted on load.
Optionally, specify maxload, default is no more than 3.0
'''
tmpDict = {}
retList = []
try:
rup = os.popen('rup %s | sort -n -t, -k4 | grep day' %
(self.asString))
except OSError:
return self.asList

for s in rup:
(name, t0) = s.split(' ',1)
(t1,t2,t3,avgLoad,t4) = s.split(',')
load = float(avgLoad)
if load < maxLoad:
tmpDict['%s.com' % name] = load

for (l, n) in [(v[1], v[0]) for v in tmpDict.items()]:
retList.append((l, n))
retList.sort()
return retList
 
L

Lawrence D'Oliveiro

try:
s.connect((cpu,7))
except:
return 0
try:
s.send('test')
s.recv(128)
s.close()
return 1
except:
return 0

Call me nervous, but I like to make my "except" clauses as specific as
possible, to catch only the errors I'm expecting. This is to minimize the
chance of some hidden bug sneaking through and causing the wrong behaviour.
try:
rup = os.popen('rup %s | sort -n -t, -k4 | grep day' %
(self.asString))
except OSError:
return self.asList

So if the rup call fails, you're returning a plain list of all the
hostnames? Won't this cause the following parsing to fail?

If you're going to trigger a failure, isn't it better for it to happen as
close as possible to the actual cause? In other words, take out the
try/except block above and let the os.popen call itself directly signal an
exception on failure.

Disclaimer: I've never actually used rup. :)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,057
Latest member
KetoBeezACVGummies

Latest Threads

Top