best technique for detecting charset/character encoding of RSS feeds

Daniel Choi

I have a Ruby application that fetches RSS and Atom feeds. I've tried
using the chardet gem (UniversalDetector) to figure out the character
encoding of each feed. But for some strange reason this library
thinks a lot of feeds are EUC-KR (Korean) when they plainly aren't.

Can anyone suggest a better way to find the encoding of RSS and Atom
feeds (e.g. via BOM detection, etc.) with Ruby?
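
For reference, here is a rough sketch of the kind of BOM-based check I have in mind, using only core Ruby. The names BOMS and sniff_feed_encoding are made up for illustration, and the lookup order (BOM, then the HTTP Content-Type charset, then the XML declaration, then the XML default of UTF-8) is an assumption on my part, not something taken from any library:

```ruby
# Hypothetical helper, not part of chardet or any other gem.
# Longer BOMs come first so UTF-32LE is not mistaken for UTF-16LE.
BOMS = {
  "\xEF\xBB\xBF".b     => "UTF-8",
  "\xFF\xFE\x00\x00".b => "UTF-32LE",
  "\x00\x00\xFE\xFF".b => "UTF-32BE",
  "\xFF\xFE".b         => "UTF-16LE",
  "\xFE\xFF".b         => "UTF-16BE"
}.freeze

# raw:          the feed body exactly as received (bytes, not yet transcoded)
# content_type: optional HTTP Content-Type header value, e.g. "text/xml; charset=utf-8"
def sniff_feed_encoding(raw, content_type = nil)
  bytes = raw.b # binary copy, so byte comparisons ignore the string's encoding tag

  # 1. Byte order mark at the very start of the document.
  BOMS.each { |bom, name| return name if bytes.start_with?(bom) }

  # 2. charset parameter of the HTTP Content-Type header, if one was given.
  m = content_type.to_s.match(/charset\s*=\s*["']?([\w.:-]+)/i)
  return m[1].upcase if m

  # 3. encoding="..." in the XML declaration (only the first 200 bytes are inspected).
  m = bytes[0, 200].to_s.match(/\A\s*<\?xml[^>]*?encoding\s*=\s*["']([\w.:-]+)["']/i)
  return m[1].upcase if m

  # 4. Assumed fallback: XML without a BOM or declaration defaults to UTF-8.
  "UTF-8"
end
```

The returned name could then be passed to String#force_encoding on the raw body, with a rescue for names Ruby does not recognise. A statistical detector such as chardet would only be needed when all of these hints are missing or wrong.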
 
