Text encodings

Discussion in 'Ruby' started by xTRiM, Jul 10, 2006.

  1. xTRiM

    xTRiM Guest

    Hello,

    Is there any way to detect a text's encoding?
    For example, whether it is UTF-8, Windows-1251, or something else.

    Thank you.
     
    xTRiM, Jul 10, 2006
    #1

  2. Paul Battley

    Paul Battley Guest

    On 10/07/06, xTRiM wrote:
    > Is there any way to detect a text's encoding?
    > For example, whether it is UTF-8, Windows-1251, or something else.


    You can't detect one-byte-per-character encodings easily (i.e. without
    statistical analysis) but you can easily tell if something's UTF-8 or
    not:

    class String
      # Returns true if the string's bytes decode as UTF-8.
      def is_utf8?
        unpack('U*')  # raises ArgumentError on malformed UTF-8
        true
      rescue ArgumentError
        false
      end
    end

    "foo".is_utf8? #=> true
    "foo\303".is_utf8? #=> false

    Not the most efficient way, necessarily, but probably the easiest.
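
On Ruby 1.9 or later, where every string carries an encoding, the same check can probably be done without unpacking into an array of code points. The following is only a sketch under that assumption; `utf8_bytes?` is a hypothetical helper name, not something from the post above or from the standard library.

```ruby
# Sketch for Ruby 1.9 or later; utf8_bytes? is a hypothetical helper name.
def utf8_bytes?(str)
  # Relabel a copy as UTF-8 (the bytes are not changed), then ask Ruby
  # whether the byte sequence is well-formed in that encoding.
  str.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

utf8_bytes?("foo")        # plain ASCII is also valid UTF-8
utf8_bytes?("foo\xC3".b)  # a lead byte with no continuation byte is invalid
```

Working on a copy leaves the caller's string and its encoding label untouched. As with the `unpack` approach, a positive answer only says the bytes are well-formed UTF-8; short one-byte-per-character text can pass by coincidence.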

    Paul.
     
    Paul Battley, Jul 10, 2006
    #2

  3. Takashi Sano

    Takashi Sano Guest

    Hi,

    2006/7/10, xTRiM:
    > Hello,
    >
    > Is there any way to detect a text's encoding?
    > For example, whether it is UTF-8, Windows-1251, or something else.
    >


    You can use the guess or guess2 method of the standard library's NKF
    module (Ruby 1.8.2 or later) for that. See the NKF section at
    http://www.ruby-doc.org/stdlib/.
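
A minimal sketch of calling `NKF.guess` might look like the following. It assumes Ruby 1.9 or later, where `guess` returns an `Encoding` object; `guess_with_nkf` is a hypothetical wrapper name. NKF is aimed at Japanese encodings (JIS, EUC-JP, Shift_JIS, UTF-8 and similar), so it is unlikely to identify Windows-1251.

```ruby
begin
  require 'nkf'  # standard library in older Rubies, a bundled gem in recent ones
rescue LoadError
  # NKF is not installed; guess_with_nkf below then returns nil.
end

# guess_with_nkf is a hypothetical helper name, not part of NKF itself.
def guess_with_nkf(bytes)
  return nil unless defined?(NKF)
  NKF.guess(bytes)  # an Encoding object on Ruby 1.9 or later
end

p guess_with_nkf("text of unknown encoding")
```

The `LoadError` fallback is a design choice for portability only: on an installation without NKF the helper returns `nil` instead of raising.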

    Takashi Sano
     
    Takashi Sano, Jul 10, 2006
    #3