parsing OPML

Chris

I am trying to write a script that will parse my Bloglines OPML export
(see snippet below) and output an HTML blogroll.

I can get at all the actual blog entries with the code below. The
problem is that I would also like to get at the "folder" level ("Test
and Demo" and "Unfiled" below), which has just a title and no other
attributes, so that I can group the entries by category, something
like this: http://rpc.bloglines.com/blogroll?html=1&id=chrislott
My code only lists the entries themselves...

********************************blogs.opml**
<opml version="1.0">
<head>
<title>Bloglines Subscriptions</title>
<dateCreated>Sun, 4 Apr 2004 20:15:17 GMT</dateCreated>
<ownerEmail>[email protected]</ownerEmail>
</head>
<body>
<outline title="Subscriptions">
<outline title="Test and Demo">
<outline title="del.icio.us/imao/Learning"
htmlUrl="http://del.icio.us/imao/Learning" type="rss"
xmlUrl="http://del.icio.us/rss/imao/Learning"/>

<outline title="Fairbanks, Alaska Weather"
htmlUrl="http://www.rssweather.com/hw3.php?zipcode=99701" type="rss"
xmlUrl="http://rssweather.com/rss.php?hwvUT...ountry=us&county=02090&zone=AKZ222&alt=rss20a"/>
</outline>

<outline title="Unfiled">

<outline title="Boxes and Arrows"
htmlUrl="http://www.boxesandarrows.com/" type="rss"
xmlUrl="http://www.boxesandarrows.com/index.xml"/>
<outline title="CBB Plagiarism Project -"
htmlUrl="http://leeds.bates.edu/cbb/" type="rss"
xmlUrl="http://leeds.bates.edu/cbb/module.php?mod=node&op=feed"/>

</outline>


********************************************my script**

from xml.sax import make_parser
from xml.sax.handler import ContentHandler

class OPMLHandler(ContentHandler):

    def __init__(self):
        ContentHandler.__init__(self)
        self.level = 0  # current <outline> nesting depth

    def startElement(self, name, attrs):
        if name == 'outline':
            self.level += 1
            # folder outlines have no xmlUrl, so url is '' for them
            self.title = attrs.get('title', '')
            self.url = attrs.get('xmlUrl', '')

    def endElement(self, name):
        if name == 'outline':
            print('%s : %s - %s' % (self.level, self.title, self.url))
            self.level -= 1

parser = make_parser()
curHandler = OPMLHandler()
parser.setContentHandler(curHandler)
parser.parse(open('blogs.opml'))
 
Richard Morse

I am trying to write a script that will parse my Bloglines OPML export
(see snippet below) and output an HTML blogroll. [snip]
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

class OPMLHandler(ContentHandler):

    def __init__(self):
        ContentHandler.__init__(self)
        self.level = 0  # current <outline> nesting depth

    def startElement(self, name, attrs):
        if name == 'outline':
            self.level += 1
            # folder outlines have no xmlUrl, so url is '' for them
            self.title = attrs.get('title', '')
            self.url = attrs.get('xmlUrl', '')

    def endElement(self, name):
        if name == 'outline':
            print('%s : %s - %s' % (self.level, self.title, self.url))
            self.level -= 1

parser = make_parser()
curHandler = OPMLHandler()
parser.setContentHandler(curHandler)
parser.parse(open('blogs.opml'))

It looks to me like this is Python.

This is a Perl newsgroup.

Perhaps you meant to post to a Python newsgroup?

Ricky
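
For what it's worth, here is a minimal sketch of one way to get at the folder level in Python. It is not taken from the original post. It assumes that any outline with an xmlUrl attribute is a feed and any outline without one is a folder, which matches the snippet above but may not hold for every OPML file. The names BlogrollHandler, render_html and SAMPLE are made up for illustration, and the HTML layout (one heading and one list per folder) is only a guess at what a blogroll might look like. The sample is a trimmed copy of the export, using only entries whose URLs contain no "&", since a raw "&" inside an attribute is not well-formed XML and a real file would need it written as "&amp;".

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import escape, quoteattr


class BlogrollHandler(ContentHandler):
    """Collect feeds grouped under the title of their enclosing folder."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.stack = []   # one entry per open <outline>: folder title, or None for a feed
        self.groups = []  # list of (folder_title, [(feed_title, html_url), ...])

    def startElement(self, name, attrs):
        if name != 'outline':
            return
        title = attrs.get('title', '')
        if attrs.get('xmlUrl'):
            # Assumption: an outline with an xmlUrl is a feed entry.
            folder = self.stack[-1] if self.stack else ''
            if not self.groups or self.groups[-1][0] != folder:
                self.groups.append((folder, []))
            self.groups[-1][1].append((title, attrs.get('htmlUrl', '')))
            self.stack.append(None)  # placeholder so endElement pops evenly
        else:
            # Assumption: an outline without an xmlUrl is a folder.
            self.stack.append(title)

    def endElement(self, name):
        if name == 'outline':
            self.stack.pop()


def render_html(groups):
    """Render one heading and one list of links per folder."""
    lines = []
    for folder, feeds in groups:
        lines.append('<h3>%s</h3>' % escape(folder))
        lines.append('<ul>')
        for title, url in feeds:
            lines.append('<li><a href=%s>%s</a></li>'
                         % (quoteattr(url), escape(title)))
        lines.append('</ul>')
    return '\n'.join(lines)


# Trimmed, well-formed copy of the export shown earlier in the thread.
SAMPLE = b"""<opml version="1.0">
<body>
<outline title="Subscriptions">
<outline title="Test and Demo">
<outline title="del.icio.us/imao/Learning"
 htmlUrl="http://del.icio.us/imao/Learning" type="rss"
 xmlUrl="http://del.icio.us/rss/imao/Learning"/>
</outline>
<outline title="Unfiled">
<outline title="Boxes and Arrows"
 htmlUrl="http://www.boxesandarrows.com/" type="rss"
 xmlUrl="http://www.boxesandarrows.com/index.xml"/>
</outline>
</outline>
</body>
</opml>"""

handler = BlogrollHandler()
xml.sax.parseString(SAMPLE, handler)
html = render_html(handler.groups)
print(html)
```

The stack is what the original handler was missing: because SAX only reports one element at a time, the handler has to remember which folder outlines are still open when a feed outline arrives. Folders that contain no feeds directly (such as the top-level "Subscriptions") produce no group in this sketch.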
 
