Resolving Unicode escapes to Unicode characters

Discussion in 'Ruby' started by Tyler, Jul 29, 2011.

  1. Tyler

    Tyler Guest

    Hi all,
    I'm trying to parse escaped Unicode characters. The basic goal is to
    read the string '\u00F3' (or "\\u00F3") as 'ó'. The workaround below
    uses eval, but I'd be grateful if anyone has a less dangerous solution
    or suggestion. In Python, you can 'import codecs' and use
    string.decode("unicode-escape"). Is something similar possible in Ruby?

    Thanks!
    Tyler


    File.open("test.txt", 'w') { |file| file.puts "Asociaci\\u00F3n Alumni \nF\\u00FAtbol" }
    File.open("test.txt", 'r') do |file|
      file.each do |line|
        puts eval("%Q{#{line}}")
        # puts line
      end
    end
    # => Asociación Alumni
    # => Fútbol
    #
    # If 'puts line' is used instead, this is the output:
    # => Asociaci\u00F3n Alumni
    # => F\u00FAtbol
    #
    # Is there a (prettier & safer) way to do this without using eval?
     
    Tyler, Jul 29, 2011
    #1

  2. irb(main):037:0> s="a\\u00fab"
    => "a\\u00fab"
    irb(main):038:0> puts s
    a\u00fab
    => nil
    irb(main):039:0> s.gsub(%r[\\u(\h{4})]) {$1.to_i(16).chr(Encoding::UTF_8)}
    => "aúb"

    Kind regards

    robert
     
    Robert Klemme, Jul 29, 2011
    #2
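
The gsub approach from reply #2 can be wrapped in a small helper and applied to each line read from the file, which avoids eval entirely. This is only a minimal sketch: the method name unescape_unicode is hypothetical (it does not appear in the thread), and it uses Array#pack("U") in place of Integer#chr as one possible way to build the character. It handles only four-digit \uXXXX escapes, so characters outside the Basic Multilingual Plane that are written as surrogate pairs are not combined.

```ruby
# Hypothetical helper (the name is an assumption, not from the thread).
# Replaces each literal backslash-u escape followed by four hex digits
# with the UTF-8 character for that code point. No eval is involved, so
# the input is never executed as Ruby code.
def unescape_unicode(text)
  text.gsub(/\\u(\h{4})/) { [$1.hex].pack("U") }
end

# Single-quoted literals keep the backslash, like text read from a file.
lines = ['Asociaci\u00F3n Alumni', 'F\u00FAtbol']
lines.each { |line| puts unescape_unicode(line) }
```

Because only the matched escapes are rewritten, a line containing something like #{...} is left as plain text, whereas the eval version would run it.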