RSS Parser Help..

Gim Ick · Sep 18, 2009

I am trying to parse an RSS file, using Ruby's rss module.

Suppose this is the data file,

<item>
  <title>Singapore Airlines Asia Travel - A345 All Business Class to Asia</title>
  <pubDate>Fri, 18 Sep 2009 22:56:33 +0000</pubDate>
  <guid isPermaLink="false">http://delicious.com/url/cc78bfa8bb00f50825d7cac52339375d#galvezcreative</guid>
  <link>http://a345.singaporeair.com/</link>
  <dc:creator><![CDATA[galvezcreative]]></dc:creator>
  <comments>http://delicious.com/url/cc78bfa8bb00f50825d7cac52339375d</comments>
  <wfw:commentRss>http://feeds.delicious.com/v2/rss/url/cc78bfa8bb00f50825d7cac52339375d</wfw:commentRss>
  <source url="http://feeds.delicious.com/v2/rss/galvezcreative">galvezcreative's bookmarks</source>
  <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>
  <category domain="http://delicious.com/galvezcreative/">marketing</category>
</item>

How do I get the values of the category elements? (In the above example
they are Industry-Airlines and marketing.)

When I try rss.items[0].category, I get the entire element (in the
above case, <category
domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>).

Kouhei Sutou · Sep 21, 2009

Hi,

In "RSS Parser Help.." on Sat, 19 Sep 2009 09:21:36 +0900,
Gim Ick said:

How do I get the values of the category elements? (In the above example
they are Industry-Airlines and marketing.)

rss.items[0].categories.each do |category|
  p category.content
end

When I try rss.items[0].category, I get the entire element (in the
above case, <category
domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>).

rss.items[0].category returns a Category object, not a "<category
...>...</category>" string. (Hint: the Category object has a #to_s
method that returns the "<category ...>...</category>" string, which is
why it looks like the whole element when printed.)
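
For a fuller picture, here is a minimal sketch of the same idea as a self-contained script. The feed is inlined, and the channel title, link and description are made up for illustration, since the question only shows the <item> part. Validation is switched off on the assumption that real-world feeds are not always strictly valid.

```ruby
require "rss"

# Sample feed: the <item> is taken from the question; the surrounding
# <channel> fields are placeholders added so the document is complete.
FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>galvezcreative's bookmarks</title>
      <link>http://delicious.com/galvezcreative</link>
      <description>Placeholder channel for the example item</description>
      <item>
        <title>Singapore Airlines Asia Travel - A345 All Business Class to Asia</title>
        <link>http://a345.singaporeair.com/</link>
        <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>
        <category domain="http://delicious.com/galvezcreative/">marketing</category>
      </item>
    </channel>
  </rss>
XML

# Second argument false: parse without validating the feed.
rss = RSS::Parser.parse(FEED, false)
item = rss.items[0]

# Each entry in #categories is a Category object; #content is the text
# inside the element and #domain is the value of its domain attribute.
category_names   = item.categories.map(&:content)
category_domains = item.categories.map(&:domain)

# #category is the first Category object; #to_s serializes it back to XML.
first_category = item.category

p category_names          # → ["Industry-Airlines", "marketing"]
p first_category.content  # → "Industry-Airlines"
```

On recent Ruby versions rss ships as a separate gem, so it may need to be installed with "gem install rss" first.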

Thanks,

Gim Ick · Sep 22, 2009

Thanks! I was using regular expressions to do this task!

Richard.Williams.20 · Oct 6, 2009

An alternative approach, using a biterscripting script:

# Script category.txt
var str rss ; cat "file.rss" > $rss
while ( { sen -r -c "^^" $rss } > 0 )
do
    var str category ; stex -r -c "^<category&\>&</category\>^" $rss > $category
    stex -r -c "^<category&\>^]" $category > null ; stex -r -c "[^</category\>^" $category > null
    echo $category
done

For documentation on the stex (string extractor) command, see
http://www.biterscripting.com/helppages/stex.html

Richard
