Help with Iconv needed

Discussion in 'Ruby' started by Marcus Strube, Nov 29, 2007.

  1. Can someone tell me what I'm getting wrong here with "iconv"?
    I either get "IllegalSequence", or "äöüß" are not encoded properly when
    using Iconv.conv, while it looks good using backticks. ("IllegalSequence"
    right now with the second, ÄÖü with the first anytime...)

    require 'rss/1.0'
    require 'rss/2.0'
    require 'open-uri'
    require 'iconv'

    #source = "http://www.sueddeutsche.de/app/service/rss/alles/rss.xml"
    source = "http://www.welt.de/vermischtes/?service=Rss"

    content = ""
    open(source) { |s| content = s.read }
    rss = RSS::Parser.parse(content, false)

    rss.items.each do |item|
      converted = `'#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`
      puts(Iconv.conv('ISO-8859-1', 'UTF-8', item.title)); puts " "
    end
    --
    Posted via http://www.ruby-forum.com/.
    Marcus Strube, Nov 29, 2007

  2. MonkeeSage Guest

    On Nov 29, 6:50 am, Marcus Strube <> wrote:
    > Can someone tell me what it is that I'm getting wrong here with "iconv"?
    > I either get "IllegalSequence" or "äöüß" are not encoded properly when
    > using Iconv.conv while it looks good using backticks. ("IllegalSequence
    > right now with the second. ÄÖü with the first anytime...)
    >
    > require 'rss/1.0'; require 'rss/2.0'; require 'open-uri'; require
    > "iconv"
    >
    > #source = "http://www.sueddeutsche.de/app/service/rss/alles/rss.xml"
    > source = "http://www.welt.de/vermischtes/?service=Rss"
    >
    > content = ""; open(source) { |s| content = s.read }; rss =
    > RSS::parser.parse(content, false)
    >
    > rss.items.each do |item|
    > converted = `'#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`
    > puts(Iconv.conv('ISO-8859-1', 'UTF-8', item.title)); puts " "
    > end
    > --
    > Posted via http://www.ruby-forum.com/.


    Not sure about the error, but I see two issues. First, this is an
    error...

    `'#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`

    I think you meant to echo the value to the pipe...

    `echo -n '#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`
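
    The single quotes around the interpolated title still break as soon as a
    title contains an apostrophe. A minimal sketch of one way around that,
    assuming the standard shellwords library is acceptable here: the helper
    name iconv_command is made up for illustration, and printf '%s' is
    swapped in for echo -n because not every /bin/sh echo honours -n.

```ruby
require 'shellwords'

# Hypothetical helper (not from the thread): build the shell pipeline for one title.
# Shellwords.escape quotes the title so an apostrophe or space cannot break the command.
def iconv_command(title, from = 'ISO-8859-1', to = 'UTF8')
  "printf '%s' #{Shellwords.escape(title)} | iconv -c -f #{from} -t #{to}"
end

# Show the command that would be handed to the shell.
puts iconv_command("Bambi's night")

# Usage inside the loop from the question would look like:
#   converted = `#{iconv_command(item.title)}`
```

    This only fixes the quoting. Whether ISO-8859-1 is the right source
    encoding for the feed is a separate question.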

    Second, ISO-8859-1 to UTF-8 doesn't appear to be the right conversion
    for this feed. The following string...

    Düsseldorf: Prominentengedrängel bei der Bambi-Verleihung

    ...is encoded as...

    "D\303\203\302\274sseldorf: Prominentengedr\303\203\302\244ngel bei der Bambi-Verleihung"

    ...by iconv from the command prompt. But it should be...

    "D\303\274sseldorf: Prominentengedr\303\244ngel bei der Bambi-Verleihung"

    I'm not good with encodings and utf-8, so I can't tell you the
    problem. I just know "umlaut u" should be 0xc3bc (\303\274), but it's
    not doing that.
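
    One reading that fits the bytes shown above: the title is already UTF-8,
    and converting it again from ISO-8859-1 to UTF-8 encodes each of the two
    bytes of "ü" a second time. The sketch below reproduces both byte
    sequences with String#encode (Ruby 1.9 and later) rather than Iconv. The
    helper names double_encode and octal_bytes are made up for illustration,
    and that the feed really is UTF-8 is an assumption, not something the
    thread confirms.

```ruby
# Hypothetical helper: treat UTF-8 bytes as if they were ISO-8859-1
# and convert them to UTF-8 again (the suspected double encoding).
def double_encode(utf8_string)
  utf8_string.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Hypothetical helper: print non-ASCII bytes as octal escapes, as in the post.
def octal_bytes(string)
  string.bytes.map { |b| b < 128 ? b.chr : format("\\%o", b) }.join
end

title = "D\u00FCsseldorf" # "Düsseldorf" as a UTF-8 string

puts octal_bytes(title)                # D\303\274sseldorf
puts octal_bytes(double_encode(title)) # D\303\203\302\274sseldorf
```

    Note also that Iconv.conv takes the target encoding first, as in
    Iconv.conv(to, from, str). The Iconv.conv call in the question therefore
    converts from UTF-8 to ISO-8859-1, the opposite direction from the
    backtick command.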

    Regards,
    Jordan
    MonkeeSage, Nov 30, 2007