rbt
Hello there,
Depending on the firmware version of the HP printer and the model type,
one will encounter a myriad of combinations of the following strings
while reading the index page:
hp
HP
color
Color
Printer
Printer Status
Status:
Device:
Device Status
laserjet
LaserJet
How can I go about determining if a site is indeed the Web interface to
an HP printer? The goal is to remove all HP printers from a list of
publicly available Web sites... I've tried this approach, but it gets
messy quickly when I attempt to account for all possible combinations
that HP uses:
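    import urllib2

    f = urllib2.urlopen("http://%s" % host)
    data = f.read()
    f.close()
    # Each substring needs its own "in data" test; a bare non-empty
    # string such as 'hp' is always true, so the chained form matched every page.
    if (('hp' in data or 'HP' in data)
            and ('color' in data or 'Color' in data)
            and ('Printer' in data or 'Printer Status' in data)):
        pass  # DISREGARD THE IP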
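One possible way to keep the combinations manageable, as a rough sketch only: match case-insensitively and group the strings listed above into a "brand" set and a "device/status" set, then require a hit from each. The function name looks_like_hp_printer and the two groupings are my own assumptions, not anything HP documents, and the check works on page text that has already been fetched.

```python
import re

# Hypothetical groupings of the strings listed above (an assumption,
# not an official HP signature). IGNORECASE covers hp/HP, laserjet/LaserJet.
BRAND = re.compile(r"\bhp\b|laserjet", re.IGNORECASE)
DEVICE = re.compile(r"printer|device status|device:|status:", re.IGNORECASE)


def looks_like_hp_printer(page_text):
    """Return True if the text has both a brand term and a device/status term."""
    # \bhp\b avoids matching "hp" inside unrelated words such as "php".
    return bool(BRAND.search(page_text)) and bool(DEVICE.search(page_text))
```

Requiring one term from each group should cut down on false positives compared with matching any single string, though a page that merely discusses HP printers would still match.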
I'm sure there's a more graceful way to go about this while maintaining
a high degree of accuracy and as few false positives as possible. Any
tips or pointers?
Thanks in advance!