Extract alphanumeric text from a string


Abir B.

Hello,
I have a string extracted from a news feed which contains heterogeneous
parts. I need to extract the part which represents natural-language text
(the content of the summary). I don't know how to do this, but I guess I
need to apply a regular expression?

Here are two examples from the feed:

"summary":{"direction":"ltr","content":"Voters' hopes for the Iraqi
Kurdistan
elections"},"likingUsers":[],"comments":[],"annotations":[],"origin":{"streamId":"feed/http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml","title":"BBC
News | News Front Page | World
Edition","htmlUrl":"http://news.bbc.co.uk/go/rss/-/2/hi/default.stm"}}

=> Here I want to extract "Voters' hopes for the Iraqi Kurdistan
elections"

"summary":{"direction":"ltr","content":"Microsoft's disappointing fiscal
fourth-quarter results reflect a sharp slowdown in software sales as
demand for new personal computers wanes in the recession.<p><iframe
src=\"http://feedads.g.doubleclick.net/~a...8-4DA1-9194-FDB657AA15DC%7D&siteid=rss&rss=1\"
width=\"100%\" height=\"60\" frameborder=\"0\" scrolling=\"no\"
marginwidth=\"0\" marginheight=\"0\"></iframe></p><div>\n<a
href=\"http://feeds.marketwatch.com/~ff/marketwatch/topstories?a=_AV98AmZ11c:fu-PHxqGjGM:yIl2AUoC8zA\"><img
src=\"http://feeds.feedburner.com/~ff/marketwatch/topstories?d=yIl2AUoC8zA\"
border=\"0\"></a><img
src=\"http://feeds.feedburner.com/~ff/marketwatch/topstories?d=qj6IDK7rITs\"

=> Here I want to extract "Microsoft's disappointing fiscal fourth-quarter
results reflect a sharp slowdown in software sales as demand for new
personal computers wanes in the recession."

Sometimes the summary doesn't contain any text, in which case I must
return an empty string.
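
To make the requirement concrete, here is a minimal sketch in Python of
one possible approach. It assumes the "content" value is a complete JSON
string (closing quote present) and that it is enough to drop anything that
looks like an HTML tag. The names extract_summary, SUMMARY_CONTENT and TAG
are made up for illustration. If the whole feed item is available as valid
JSON, parsing it with a JSON library would probably be more robust than a
regular expression.

```python
import html
import json
import re

# Hypothetical pattern: find the "content" string inside the "summary" object.
# The group accepts any character except a bare quote or backslash, or any
# backslash escape such as \" or \n.
SUMMARY_CONTENT = re.compile(
    r'"summary"\s*:\s*\{[^{}]*?"content"\s*:\s*"((?:[^"\\]|\\.)*)"',
    re.DOTALL,
)
# Assumption: anything between < and > is markup to discard.
TAG = re.compile(r"<[^>]*>")


def extract_summary(raw: str) -> str:
    """Return the plain text of the summary, or "" if there is none."""
    match = SUMMARY_CONTENT.search(raw)
    if match is None:
        return ""
    body = match.group(1)
    try:
        # Decode JSON escapes (\" and \n); strict=False tolerates raw
        # line breaks inside the string.
        text = json.loads('"' + body + '"', strict=False)
    except json.JSONDecodeError:
        text = body  # fall back to the undecoded text
    text = TAG.sub(" ", text)       # drop HTML tags
    text = html.unescape(text)      # turn entities such as &amp; into characters
    return " ".join(text.split())   # collapse whitespace and line breaks


sample = '"summary":{"direction":"ltr","content":"Voters\' hopes for the Iraqi\nKurdistan\nelections"},"likingUsers":[]'
print(extract_summary(sample))
```

With a summary whose content is empty, or with no summary at all, the
function returns an empty string.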

Thanks
 