RSS feed clarification?

Discussion in 'XML' started by Ed Flecko, May 14, 2007.

  1. Ed Flecko

    Ed Flecko Guest

    Hi folks,
    I'm trying to figure out this whole RSS feed thing.

    I've created my .xml file to use for my feed, and my browsers
    "recognize" that I have an RSS feed, and you can subscribe, etc., etc.

    Here's why I "think" I want to use an RSS feed, and what I'm confused
    about.

    I have one file (and one file only) on my web site that changes
    frequently (weekly), but the file name is always the same. I want to
    alert people who subscribe to the feed that this file has changed.

    Here are my questions:

    1.) Will an RSS feed "work" (automatically notify the subscribers) for
    a single file whose name is always the same (although the body content
    of the file changes)?
    2.) I don't understand how RSS feeds actually work, from the client's
    perspective, i.e., how do the subscribers' RSS clients (Internet
    Explorer, Firefox, etc.) actually know that the RSS feed has changed
    and download it, etc.? Is it just simply a scheduled task, and the
    client checks the feed automatically on a schedule?
    3.) Since my feed isn't "news", per se, I don't need to bother with
    "syndicating" my feed, do I?...or would this somehow benefit me.

    Thank you,
    Ed
    Ed Flecko, May 14, 2007
    #1

  2. Andy Dingley

    Andy Dingley Guest

    On 14 May, 15:55, Ed Flecko <> wrote:

    > I've created my .xml file to use for my feed, and my browsers
    > "recognize" that I have an RSS feed, and you can subscribe, etc., etc.


    It would help if you told us the URL.

    It's also better practice (if you can arrange this with your hosting)
    to give your RSS file a ".rss" extension and, most importantly, to serve
    it with a correct content-type for RSS, not just for XML. RSS is
    robust against not doing this (most publishers can't get it right), but
    it's still good practice if you're hosted on Apache.


    > I have one file (and one file only) on my web site that changes
    > frequently (weekly), but the file name is always the same. I want to
    > alert people who subscribe to the feed that this file has changed.


    > 1.) Will an RSS feed "work" (automatically notify the subscribers) for
    > a single file whose name is always the same (although the body content
    > of the file changes)?


    Yes. RSS is all about embedding metadata, and that includes the update
    timestamps. The names themselves are just one piece of the metadata --
    so long as _something_ reflects the change, then you can make it all
    work.

    You might not be able to use the "permalink" feature of some RSS
    versions. This is useful, so if you can, then you should use it. As to
    whether it's relevant, then that depends on your particular
    application and so we don't know that yet.

    It's good practice to offer a "permalink" as a URL that will always
    retrieve a particular version of the content, even some time after
    this was first served. "Last week's news" is still interesting to many
    consumers. You can purge these over time if you wish, but it's still
    good practice to make the URL namespace a consumed resource that isn't
    re-used.

    If you can, then keep "last week's news" and "this week's news" stored
    as separate files on the web server and make the web server respond to
    specific requests for each one appropriately. Just giving a filename
    with a datestamp in it can be enough to do this. For the "latest
    news" URL, send a 302 redirect to the URL for the current file. This
    redirect's target will need to be changed as each new file is uploaded.
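
    A rough sketch of that idea in Python, in case it helps. The
    "rates-YYYY-MM-DD.pdf" naming scheme and the example.com base URL are
    made up for illustration; the point is only that the "latest" URL
    answers with a 302 pointing at the newest dated file.

```python
import re
from datetime import date

# Hypothetical naming scheme (an assumption, not from this thread):
# one file per update, named rates-YYYY-MM-DD.pdf
DATED_NAME = re.compile(r"^rates-(\d{4})-(\d{2})-(\d{2})\.pdf$")


def latest_dated_file(filenames):
    """Return the filename carrying the most recent datestamp, or None."""
    dated = []
    for name in filenames:
        match = DATED_NAME.match(name)
        if match:
            year, month, day = (int(part) for part in match.groups())
            dated.append((date(year, month, day), name))
    return max(dated)[1] if dated else None


def redirect_for_latest(filenames, base_url="http://www.example.com/documents/"):
    """Build the status code and headers for a request to the 'latest' URL."""
    target = latest_dated_file(filenames)
    if target is None:
        return 404, {}
    # 302 (temporary) rather than 301: the target changes with every upload.
    return 302, {"Location": base_url + target}


files = ["rates-2007-05-08.pdf", "rates-2007-05-15.pdf", "readme.txt"]
status, headers = redirect_for_latest(files)
print(status, headers["Location"])
```

    Because the target is worked out from the files that exist, uploading
    a new dated file is enough to move the redirect.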

    Possibly it's just not appropriate to serve "last week's news" in your
    application (I don't, and can't, know). If so, then just have the
    simple one file, one filename, one URL situation. However make sure
    that any URL you publish in RSS is _not_ labelled incorrectly as a
    "permalink".


    > 2.) I don't understand how RSS feeds actually work, from the client's
    > perspective,


    Largely you don't and can't know this. You publish the stuff, what
    happens next is up to whoever uses it. Don't try to pre-judge what
    they can and (especially) what they can't do with it.


    > how do the subscribers' RSS clients (Internet
    > Explorer, Firefox, etc.) actually know that the RSS feed has changed
    > and download it, etc.?


    They'll usually poll it regularly to see (i.e., the client decides).
    HTTP polling should be efficient, i.e. a conditional GET or HEAD request
    should quickly return a suitable HTTP 304 Not Modified if need be, or at
    least an HTTP 200 with appropriate timestamps. Good clients adjust their
    polling interval so as not to be a nuisance, to respect any hints you
    embed in the syndication information you include inside the RSS
    document, and also to fine-tune this on the basis of how often you
    actually make changes to the content.
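
    For what it's worth, here is roughly what a polite polling client
    does, sketched with Python's standard urllib. The function names are
    mine, and real feed readers differ in the details; this only shows the
    conditional GET: send back the validators from the last fetch and
    treat a 304 as "nothing new".

```python
import urllib.error
import urllib.request


def build_poll_request(url, etag=None, last_modified=None):
    """Build a conditional GET using validators saved from the previous fetch."""
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def poll(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if nothing changed."""
    request = build_poll_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: keep using the cached copy
            return None, etag, last_modified
        raise
```

    The client stores the returned ETag and Last-Modified values and
    passes them to the next call to poll(), whenever its own schedule says
    it is time to check again.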

    It's important that an RSS server can efficiently serve polling
    clients when the content _hasn't_ changed, otherwise it can soon be
    overloaded, even when it's not serving any content. This is a real
    problem for dumb-coded servers with database-generated content. If
    your RSS content is coming from static files, then Apache will get it
    right automatically. If you're generating it dynamically, then make
    sure your "last updated" timestamps are calculated and returned
    quickly, and also that they represent the "last change" not the "last
    request" timestamps.
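
    If you do end up generating the feed dynamically, the decision the
    server has to make looks something like this. It's only a sketch with
    a made-up function name: it handles If-Modified-Since but not ETags,
    and it assumes you can cheaply look up when the content last changed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_changed, if_modified_since=None):
    """Return (status, headers) for a feed whose content changed at last_changed.

    last_changed must be a timezone-aware datetime recording the last
    *change* to the content, not the time of the last request.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_changed = last_changed.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_changed, usegmt=True)}
    if if_modified_since:
        try:
            client_copy = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_copy = None  # unparseable header: send the full response
        if (
            client_copy is not None
            and client_copy.tzinfo is not None
            and last_changed <= client_copy
        ):
            return 304, headers  # no body needed
    return 200, headers
```

    The cheap path (304) never touches the content itself, which is what
    keeps a heavily polled feed from overloading the server.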


    > 3.) Since my feed isn't "news", per se, I don't need to bother with
    > "syndicating" my feed, do I?.


    You never really syndicate your own feed; you offer it up for
    syndication, and some aggregator might decide to syndicate it elsewhere
    if it wishes to. Or it might not; I can't say which.
    Andy Dingley, May 14, 2007
    #2

  3. Ed Flecko

    Ed Flecko Guest

    Hi Andy,
    Hey, thanks for the reply. I'll take all the suggestions and help I
    can get! :)

    O.K., I've changed the name of my basic RSS file so it has an .rss
    extension.

    The site is: www.fivestarbank.com, and the specific file is our CD
    rates that I know customers would like to keep current on...that's why
    I think the RSS feed would be a smart idea.

    Comments? Further suggestions?

    Thank you!
    Ed Flecko, May 14, 2007
    #3
  4. Andy Dingley

    Andy Dingley Guest

    On 14 May, 18:14, Ed Flecko <> wrote:

    > O.K., I've changed the name of my basic RSS file so it has an .rss
    > extension.


    It's now served under a content-type of text/plain when it ought to
    be application/rss+xml. Fix that if you can (Apache and .htaccess),
    otherwise it _might_ be better as .xml and at least served as text/xml
    or application/xml. Don't sweat this though: it's good practice, but
    RSS is deliberately robust against it being mis-configured.
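
    On Apache the usual fix is a one-line AddType directive mapping .rss
    to application/rss+xml. If you want to check the behaviour locally
    first, the same mapping can be sketched with Python's built-in static
    file server. The handler class name and port are arbitrary choices of
    mine, and this is for testing only, not production hosting.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class RssAwareHandler(SimpleHTTPRequestHandler):
    """Static file handler that labels .rss files as RSS, not plain text."""

    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".rss": "application/rss+xml",
    }


def serve(port=8000):
    """Serve the current directory on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RssAwareHandler).serve_forever()
```

    Calling serve() from the directory holding the feed and requesting the
    .rss file should then show the Content-Type header a feed reader
    expects.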

    Also validate it with feed validator
    http://feedvalidator.org/check.cgi?url=http://www.fivestarbank.com/fsb.rss

    As it stands, it's valid but still needs a couple of tweaks.

    You're using RSS 2.0, which is probably the best choice for you,
    although the spec is unfortunately badly written and ambiguous. Worth
    reading anyway though:
    <http://cyber.law.harvard.edu/rss/rss.html>


    Line by line:

    <title>Welcome to the Five Star Bank RSS feed</title>
    Don't welcome people, tell them what it is. It's not a web site, it's
    an RSS feed. They don't "visit" this, they have it delivered to them.
    Remember that they might be reading this on their fridge screen
    display, along with the morning's news and last night's baseball
    result.


    <link>http://www.fivestarbank.com</link>
    Good. This should point to the human-readable website, not any part of
    the feed.


    <description>Where Excellence Exceeds Expectations</description>
    Lose the marketing flannel. Put some content here. Try "Five Star Bank
    CD rates at 15th May 2007, valid for the next 5 days" or similar.


    <item>
    One item. It's all you need. Not common practice, but entirely valid
    in your application.

    <title>Current CD Rates</title>
    Be careful with words like "current" in any syndicatable protocol (it
    might not still be current when your reader gets to see it). Only use
    them with items that are clearly timestamped, otherwise you will
    confuse users.

    <link>http://www.fivestarbank.com/documents/Current_Rates.pdf</link>

    I would still be happier if this pointed to a series of files called
    "rates at 2007-05-15" etc. Delete them as soon as they're obsolete if
    you wish, but at least it avoids the confusion of mapping an old
    "current" onto a new file with a changed rate. If you don't do this
    then you are losing most of the advantages of RSS.

    You can still make "current" 302 redirect to this week's file.

    There's a separate commercial decision to be made as to whether you
    want to have your rate history visible so easily (by
    leaving the old files available). It's your call (but if you ever
    make this information publicly visible, even temporarily, someone
    will make a business out of recording it and selling histories of it).
    Obviously a single filename kills this anyway.

    <description></description>
    Put something in there. Probably (for this one-item case) a
    restatement of the channel's description.

    There are several elements missing from <channel>. Some are important.

    <pubDate>
    This is vital, because it's how an aggregator identifies the channel /
    item as having been updated. If you don't have it, and you don't
    change the item link URL, then most correct aggregators will simply
    see your content as stale and unchanging, even if the PDF contents
    themselves are changing. Put this on both channel and item -- channel
    is just the latest pubDate across all <item>s, so in your case they're
    currently the same.
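
    RSS 2.0 wants these dates in the RFC 822 style used by e-mail headers.
    A small Python sketch of producing one, and of taking the channel
    pubDate as the latest item pubDate; the 09:00 timestamp is invented
    for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rfc822(moment):
    """Format a timezone-aware datetime the way RSS 2.0 pubDate expects."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


# One entry per <item>; with a single item the channel date is the same.
item_dates = [datetime(2007, 5, 15, 9, 0, tzinfo=timezone.utc)]
channel_pub_date = rfc822(max(item_dates))
print(channel_pub_date)  # Tue, 15 May 2007 09:00:00 GMT
```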


    <skipHours> & <ttl>
    This is poorly done in RSS 2.0, but you should still use it. It's part
    of how they hint at the update schedule for the channel. Personally
    I'd use the RSS 1.0 syndication module instead, or as well.
    <http://web.resource.org/rss/1.0/modules/syndication/>
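
    As a sketch of the RSS 2.0 flavour of these hints: <ttl> is a number
    of minutes, and <skipHours> holds <hour> elements (0 to 23, GMT)
    during which clients needn't bother polling. The helper name and the
    particular values are just my guess at something sensible for a
    weekly update.

```python
import xml.etree.ElementTree as ET


def add_schedule_hints(channel, ttl_minutes, quiet_hours):
    """Append <ttl> and <skipHours> children to a <channel> element."""
    hours = sorted(set(quiet_hours))
    if any(not 0 <= hour <= 23 for hour in hours):
        raise ValueError("skipHours values must be between 0 and 23 (GMT)")
    ET.SubElement(channel, "ttl").text = str(ttl_minutes)
    skip_hours = ET.SubElement(channel, "skipHours")
    for hour in hours:
        ET.SubElement(skip_hours, "hour").text = str(hour)
    return channel


channel = ET.Element("channel")
# Illustrative values: cache for a day, and skip midnight to 05:59 GMT.
add_schedule_hints(channel, ttl_minutes=24 * 60, quiet_hours=range(0, 6))
print(ET.tostring(channel, encoding="unicode"))
```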

    <copyright>
    This can be important, particularly if you wish to indicate that
    financial information brokers can't republish your content. I suggest
    reading the Creative Commons site for advice on indicating this.

    <managingEditor>
    It's now a legal requirement for UK commercial feeds to include this
    (with some wiggle room for the technical details of "how"), so as to
    identify the legal entity publishing this business communication. I'm
    sure US retail banking laws have similar requirements.


    There are also elements missing from <item>. Some are already
    described, some important.

    Remember that many syndication / aggregation environments syndicate
    _items_, not _channels_. They'll strip out the items they want from
    several source channels, then republish them as an aggregation. If
    you want to swim in this world, make sure that your <item>s carry the
    appropriate metadata; don't just stick it once on the overall channel
    and hope.

    <guid>
    This is essential if you expect any syndication to work. It's how they
    recognise <item>s that are different or (in conjunction with pubDate)
    have been updated. Don't use isPermaLink=true though unless you've
    disambiguated between each week's set of rates (as I suggest anyway).
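
    To make that concrete, here is a sketch of building such an <item>
    with Python's ElementTree for the single-URL case. The guid string
    scheme and the timestamp are invented; what matters is that the guid
    changes with each week's rates while isPermaLink stays "false",
    because the link URL is reused.

```python
import xml.etree.ElementTree as ET


def build_item(title, link, pub_date, guid, guid_is_permalink):
    """Build an RSS 2.0 <item> carrying its own identifying metadata."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, "pubDate").text = pub_date
    guid_element = ET.SubElement(item, "guid")
    guid_element.text = guid
    guid_element.set("isPermaLink", "true" if guid_is_permalink else "false")
    return item


# Single reused URL: the guid changes each week and is not a permalink.
weekly = build_item(
    title="CD rates at 15 May 2007",
    link="http://www.fivestarbank.com/documents/Current_Rates.pdf",
    pub_date="Tue, 15 May 2007 09:00:00 GMT",
    guid="fivestarbank-cd-rates-2007-05-15",  # hypothetical identifier scheme
    guid_is_permalink=False,
)
print(ET.tostring(weekly, encoding="unicode"))
```

    With dated files instead, the guid could simply be that week's URL
    and isPermaLink could honestly be "true".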

    <enclosure>
    Your linked content is a PDF, so it's unclear as to whether it ought
    to be addressed via a <link> or via <enclosure>. It's possible to use
    either. It's better to not use a PDF at all, but to use HTML (with my
    Semantic Web pointy hat on). In that case you'd clearly use a <link>
    and we'd all start building a world of automatically machine-readable
    smart content, intelligent agents and all the rest of it.

    However you probably have a corporate brand manager who forces you to
    use a PDF so that they can control the exact choice of corporate
    typeface. This is a Bad and Wrong policy and the sooner these
    dinosaurs are put out to grass the better, but I appreciate that it
    happens. So is a PDF a piece of "web content" (use <link>) or is it a
    monstrous great piece of opaque brochureware that's only fit to be
    downloaded and printed, with no hope of ever being automatically read
    and used by agents (use <enclosure>)?
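
    If you do go the <enclosure> route, note that RSS 2.0 requires three
    attributes on it: url, length (in bytes) and type (a MIME type). A
    sketch in Python; the helper name is mine and the file size is made
    up, so substitute the real size of the PDF.

```python
import xml.etree.ElementTree as ET


def add_pdf_enclosure(item, url, size_in_bytes):
    """Attach a PDF to an <item> as an RSS 2.0 <enclosure>."""
    return ET.SubElement(
        item,
        "enclosure",
        {"url": url, "length": str(size_in_bytes), "type": "application/pdf"},
    )


item = ET.Element("item")
ET.SubElement(item, "title").text = "CD rates at 15 May 2007"
add_pdf_enclosure(
    item,
    "http://www.fivestarbank.com/documents/Current_Rates.pdf",
    48213,  # made-up size in bytes
)
print(ET.tostring(item, encoding="unicode"))
```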
    Andy Dingley, May 15, 2007
    #4
  5. Ed Flecko

    Ed Flecko Guest

    Thank you, Andy.

    I'll try your suggestions!

    :)
    Ed Flecko, May 17, 2007
    #5
  6. Joseph Kesselman

    Quick reminder, not directed only at Ed: Please remember to trim quotes!
    Reposting a hundred lines of text just to add seven words of thanks is
    not a very good use of Internet resources (or of readers' time).

    In general, your new text should be larger than what you're quoting,
    with a *bit* of leeway allowed when the quote itself is also short.
    Joseph Kesselman, May 17, 2007
    #6
