Brian
I have a simple script below that is causing me some problems and I am
having a hard time tracking them down. Here is the code:
import sys
import urllib
import re

def getPicLinks():
    found = []
    try:
        page = urllib.urlopen("http://continuouswave.com/whaler/cetacea/")
    except:
        print "ERROR READING PAGE."
        sys.exit()
    page1 = page.read()
    cetLinks = re.compile("cetaceaPage..\.html", page1)
    for line in page1:
        found.append(cetLinks.findall(line))
    print found
This is the error message:
"/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre_parse.py",
line 396, in _parse
if state.flags & SRE_FLAG_VERBOSE:
TypeError: unsupported operand type(s) for &: 'str' and 'int'
I am trying to extract the links on a web page that share a similar
pattern. Here is an example of the HTML source:
<HR>
<P><SMALL><A HREF="photoLog.html">PHOTO-LOG</A><br>
<A HREF="guide.html">How-To-Submit</A><BR><A
HREF="cetaceaPage01.html">01</A> | <A
HREF="cetaceaPage02.html">02</A> | <A
HREF="cetaceaPage03.html">03</A> | <A
HREF="cetaceaPage04.html">04</A> | <A
HREF="cetaceaPage05.html">05</A> | <A
HREF="cetaceaPage06.html">06</A> | <A
HREF="cetaceaPage07.html">07</A> | <A
HREF="cetaceaPage08.html">08</A> | <A
HREF="cetaceaPage09.html">09</A> | <A
HREF="cetaceaPage10.html">10</A>
<BR><A>
I can't figure out what is going wrong here, mostly because the error
message points to a file (presumably part of re) that I am unfamiliar
with, and I am fairly new to Python.
Any help is greatly appreciated, as is your patience.
Brian
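
A note on the traceback, offered as a sketch and not as a tested fix for the
script above: the second argument of re.compile is an optional integer of
flags, so passing the page text there would explain the "'str' and 'int'"
complaint raised from inside sre_parse.py. The example below compiles the
pattern on its own and searches the whole page text in one call, because
looping over a string yields single characters, not lines. SAMPLE_HTML,
CET_LINKS and find_pic_links are illustrative names, and the sample is a
shortened copy of the HTML quoted in the post, used so that the example runs
without fetching the page.

```python
import re

# Shortened copy of the HTML excerpt from the post (assumption: the real
# page follows the same pattern for every cetaceaPageNN.html link).
SAMPLE_HTML = '''<A HREF="guide.html">How-To-Submit</A><BR><A
HREF="cetaceaPage01.html">01</A> | <A
HREF="cetaceaPage02.html">02</A> | <A
HREF="cetaceaPage10.html">10</A>'''

# Only the pattern is passed to re.compile; the optional second argument
# is reserved for integer flags such as re.IGNORECASE.
CET_LINKS = re.compile(r"cetaceaPage..\.html")


def find_pic_links(html):
    """Return every cetaceaPageNN.html name found in the given HTML text."""
    # findall scans the whole string, so no per-line loop is needed.
    return CET_LINKS.findall(html)


links = find_pic_links(SAMPLE_HTML)
print(links)
```

In the original script the same idea would mean calling
cetLinks.findall(page1) once on the text returned by page.read().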