I've got an IIS server with PHP, Perl, and mysql
This isn't a dumb question, but it's a hugely open-ended one and thus
hard to answer. There are certainly no neat, snappy answers. Do some
Googling first, because almost everything has already been solved for
you: with PHP, you should have no trouble finding ready-built
near-solutions.
There are two things you can do with RSS: present it or aggregate it.
- Presenting it is simpler. You connect to one entire feed, supplied
externally, and you transform the whole thing, probably into HTML. One
feed is used, and you use the whole feed.
- Aggregating it is harder (and may become really complicated). You
take more than one feed and combine them. To be really useful, you
start filtering items from each; killing duplicates (many interesting
articles soon get replicated around many feeds) and selecting items
that are "of interest" to you. Identifying "of interest" really well
could turn into a PhD thesis. If you combine without filtering, it
soon turns into the Usenet "infinite monkeys" scenario. You can even
offer your aggregator output as its own RSS feed, perhaps
"Coffee-related news compiled from commodity trading news sources
around the world, and the latest roasting recipes from old Havana".
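To make the aggregation bullet concrete, here is a minimal sketch in
Python (a PHP version would have the same shape). It merges two feeds,
drops duplicates, and keeps only items "of interest". The sample feeds,
the function names, the title-based duplicate test, and the keyword
filter are all assumptions made for illustration; real duplicate
detection and interest matching are much harder than this.

```python
import xml.etree.ElementTree as ET

# Two tiny made-up RSS 2.0 feeds; the second repeats a story from the first.
FEED_A = """<rss version="2.0"><channel><title>A</title>
<item><title>Coffee prices rise</title><link>http://example.com/a/1</link></item>
<item><title>New roasting recipe</title><link>http://example.com/a/2</link></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>B</title>
<item><title>Coffee  Prices Rise</title><link>http://example.org/b/9</link></item>
<item><title>Football results</title><link>http://example.org/b/10</link></item>
</channel></rss>"""


def parse_items(xml_text):
    """Return a list of {'title', 'link'} dicts for every <item> in a feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        }
        for item in root.iter("item")
    ]


def normalise(title):
    """Crude duplicate key: lower-case the title and collapse whitespace."""
    return " ".join(title.lower().split())


def aggregate(feeds, keywords):
    """Merge feeds, skip repeated titles, keep items matching any keyword."""
    seen = set()
    merged = []
    for xml_text in feeds:
        for item in parse_items(xml_text):
            key = normalise(item["title"])
            if key in seen:
                continue  # the same story already arrived from another feed
            seen.add(key)
            if any(word.lower() in key for word in keywords):
                merged.append(item)
    return merged


result = aggregate([FEED_A, FEED_B], ["coffee", "roast"])
for item in result:
    print(item["title"], item["link"])
```

The first feed to carry a story wins, so feed order acts as a crude
priority list.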
There's also the question of caching. You should cache feed content
that your site downloads and serve it to your users from a local copy.
If you retrieve the feed every time your site gets a request, that is
firstly contrary to the spirit of syndication, and secondly you'll
find your server soon gets locked out of many feed servers for being
"greedy".
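A rough sketch of the caching idea, in Python: keep the last downloaded
copy of each feed and only go back to the source once that copy is
older than some limit. The class name FeedCache, the one-hour default,
and the fetch_over_http helper are assumptions for illustration, not
part of any particular library.

```python
import time
import urllib.request


def fetch_over_http(url):
    """Hypothetical default fetcher: download the raw feed bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


class FeedCache:
    """Serve a local copy of each feed; refetch only when it is stale."""

    def __init__(self, fetch=fetch_over_http, max_age_seconds=3600,
                 clock=time.time):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}  # url -> (time fetched, feed body)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]  # fresh enough: no request to the feed server
        body = self._fetch(url)
        self._entries[url] = (now, body)
        return body
```

Every page request calls get(); the feed server only sees one request
per feed per max_age_seconds, however busy your own site is. The fetch
and clock parameters exist so the cache can be exercised without
touching the network.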
You can present without caching, but aggregation really needs caching
to work. In-memory caching is pretty simple with just an XML DOM, but
serious work needs a database.
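As a sketch of what the database side might look like, the following
uses Python's built-in sqlite3 module as a stand-in for MySQL. The
table layout, column names, and function names are assumptions made
for this example. The point is that a unique key per item lets the
database refuse duplicates for you each time a feed is re-read.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the item store. ':memory:' is for experiments only."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS items (
               guid       TEXT PRIMARY KEY,
               feed_url   TEXT NOT NULL,
               title      TEXT NOT NULL,
               link       TEXT NOT NULL,
               fetched_at INTEGER NOT NULL
           )"""
    )
    return db


def count_items(db):
    return db.execute("SELECT COUNT(*) FROM items").fetchone()[0]


def store_items(db, feed_url, items, fetched_at):
    """Insert items, ignoring guids already stored; return how many were new."""
    before = count_items(db)
    db.executemany(
        "INSERT OR IGNORE INTO items (guid, feed_url, title, link, fetched_at)"
        " VALUES (?, ?, ?, ?, ?)",
        [(it["guid"], feed_url, it["title"], it["link"], fetched_at)
         for it in items],
    )
    db.commit()
    return count_items(db) - before


def latest_items(db, limit=20):
    """Most recently fetched items first, as (title, link) tuples."""
    rows = db.execute(
        "SELECT title, link FROM items ORDER BY fetched_at DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

MySQL spells the duplicate-skipping insert as INSERT IGNORE rather than
INSERT OR IGNORE; otherwise the same schema idea carries over.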
Aggregators shouldn't present content, except as RSS. It's easier (and
much more flexible) to couple a presenter to the output of an
aggregator than it is to make the aggregator present the content
itself.
Presentation consists of turning unreadable RSS geek-speak into pretty
HTML (or whatever). You can do this any way you like, but it's a
natural task for simple XSLT.
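For a feel of what the presentation step involves, here is a small
Python sketch that turns an RSS 2.0 feed into an HTML list. Note that
this is not XSLT: it does by hand, with the standard library's
ElementTree, the same element-to-element mapping that an XSLT
stylesheet would express declaratively. The function name and the
chosen HTML layout are assumptions for illustration.

```python
import html
import xml.etree.ElementTree as ET


def rss_to_html(xml_text):
    """Render an RSS 2.0 feed as a heading plus a list of links."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 document: no <channel> element")

    feed_title = channel.findtext("title", default="Untitled feed")
    lines = ["<h2>%s</h2>" % html.escape(feed_title), "<ul>"]
    for item in channel.findall("item"):
        # Escape everything: feed content is untrusted input.
        title = html.escape(item.findtext("title", default="(no title)"))
        link = html.escape(item.findtext("link", default="#"), quote=True)
        lines.append('  <li><a href="%s">%s</a></li>' % (link, title))
    lines.append("</ul>")
    return "\n".join(lines)


SAMPLE = """<rss version="2.0"><channel><title>Beans &amp; Roasts</title>
<item><title>Havana dark roast</title><link>http://example.com/1</link></item>
</channel></rss>"""

page = rss_to_html(SAMPLE)
print(page)
```

Escaping on output matters here because titles and links come from
someone else's server and end up inside your own pages.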
As for feed versions, RSS 1.0 is by far the best and should be used
whenever you create a feed. However, the input stage of a presenter /
aggregator should always be widely accepting in what it can take. This
is one of the best arguments for not writing your own from scratch!
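To show what "widely accepting" means in practice, here is a Python
sketch of an input stage that reads items from RSS 2.0, RSS 1.0 (RDF,
namespaced), and Atom alike by ignoring namespaces and looking only at
local element names. The helper names are made up for this example,
and it covers only titles and links; an existing feed-parsing library
handles far more of the variation found in real feeds.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on a tag name."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Text of the first direct child with the given local name, else ''."""
    for child in elem:
        if local(child.tag) == name:
            return (child.text or "").strip()
    return ""


def child_link(elem):
    """RSS puts the URL in the element text; Atom uses an href attribute."""
    for child in elem:
        if local(child.tag) == "link":
            return child.get("href") or (child.text or "").strip()
    return ""


def read_items(xml_text):
    """Collect title/link pairs from <item> (RSS) or <entry> (Atom) elements."""
    root = ET.fromstring(xml_text)
    return [
        {"title": child_text(elem, "title"), "link": child_link(elem)}
        for elem in root.iter()
        if local(elem.tag) in ("item", "entry")
    ]
```

Matching on local names is deliberately sloppy: it trades strictness
for the ability to cope with whichever flavour a feed happens to use.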
PS - Dave Winer is mad, bad, and dangerous to host with. Shun him.
http://www.theregister.co.uk/2004/06/15/winer_weblog_wipeout/
There's a lot I haven't mentioned here: content encoding, metadata,
filtering algorithms, adaptive aggregation. Come back (or to
comp.text.xml) whenever you have more queries.
I suggest that you put an environment together on your servers that
will allow you to do simple XSLT transforms. Then try writing a few,
and make some incoming RSS appear as HTML. This isn't a useful
technique in itself, but it will give you a feel for the formats. After that,
to do it for real, look for an open-source aggregator for PHP & MySQL
and try installing that.
Aggregators are hard. _Good_ aggregators are really hard.