Ruby 1.9 - ArgumentError: incompatible encoding regexp match (US-ASCII regexp with ISO-2022-JP string)

Discussion in 'Ruby' started by Mikel Lindsaar, Mar 31, 2008.

  1. Hiya all,

    I am testing TMail against the latest Ruby (1.9.0 downloaded last
    night), and have come up against this problem:

    ArgumentError: incompatible encoding regexp match (US-ASCII regexp
    with ISO-2022-JP string)

    I can think of a couple of ways around this, but has anyone else
    run into this problem and found a nice, elegant solution?

    I don't really want to set the regexp to UTF-8 (or something similar)
    and then transliterate the match strings. I don't think that will
    scale when you are dealing with emails, which can contain almost
    anything, and making a regexp for every encoding type isn't the
    solution either.

    This only comes up in the 1.9.0 build from last night; the 1.9.0
    build from about January does not have this issue.

    The method that is failing is:

    def encode_value( str )
      str.gsub(TOKEN_UNSAFE) {|s| '%%%02x' % s[0] }
    end

    And TOKEN_UNSAFE is defined as:

    tspecial = %Q|()<>[];:\\,"/?=|
    lwsp = %Q| \t\r\n|
    control = %Q|\x00-\x1f\x7f-\xff|

    TOKEN_UNSAFE = /[#{Regexp.quote tspecial}#{control}#{lwsp}]/n

    Which already has the 'n' switch....
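
One possible direction, sketched here under assumptions rather than taken from TMail itself, is to re-tag a copy of the string as raw bytes before matching, so the binary ('n') regexp never meets an ISO-2022-JP-tagged string. The names `TOKEN_UNSAFE_BIN` and `encode_value_binary` are hypothetical. The sketch also uses `s.ord` instead of `s[0]`, because in Ruby 1.9 `String#[]` with an integer index returns a one-character string rather than a byte value.

```ruby
# Sketch only, not TMail's actual code.
# TOKEN_UNSAFE_BIN and encode_value_binary are hypothetical names.
#
# Same character set as TOKEN_UNSAFE in the post, written as one literal.
# \t, \r and \n are already covered by the \x00-\x1f range.
TOKEN_UNSAFE_BIN = /[()<>\[\];:\\,"\/?= \x00-\x1f\x7f-\xff]/n

def encode_value_binary(str)
  # Work on a copy tagged as raw bytes; the caller's string is untouched.
  bytes = str.dup.force_encoding(Encoding::BINARY)
  # Each match is a single byte, so ord yields its numeric value.
  bytes.gsub(TOKEN_UNSAFE_BIN) { |s| '%%%02x' % s.ord }
end
```

The result is tagged as binary; whether that is acceptable depends on what the caller does with it next, which this sketch does not address.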

    And the failing test is in test_encode.rb (for anyone with TMail
    installed) and looks like this:

    def test_s_encode
      SRCS.each_index do |i|
        assert_equal crlf(OK[i]),
                     TMail::Encoder.encode(NKF.nkf('-j', SRCS[i]))
      end
    end

    def crlf( str )
      str.gsub(/\n|\r\n|\r/) { "\r\n" }
    end

    Which is using the string:

    SRCS = ["a cde \343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212"]

    To match against:

    OK = [
    "a cde =?iso-2022-jp?B?GyRCJCIkJCQmJCgkKiQiJCQkJiQoJCokIiQkJCYkKCQqJCIbKEI=?=\n\t=?iso-2022-jp?B?GyRCJCQkJiQoJCokIiQkJCYkKCQqGyhC?=",
    #1
    ]

    Regards

    Mikel
