SOAP4R throws exception on copyright character

Discussion in 'Ruby' started by Brian Marick, Nov 8, 2003.

  1. Brian Marick

    Lots of web pages contain copyright characters (not &copy; but
    something that displays in Mozilla view source as the copyright symbol,
    in emacs as a square box, and probably says to the world "Hi! I'm an
    HTML file that was created with Word."). SOAP4r is unhappy with that
    character, as you can see in this use of the googleSearch sample:

    % ruby wsdlDriver.rb 'Mark Swanson'
    /usr/local/lib/ruby/1.8/xsd/datatypes.rb:184:in `_set':
    {http://www.w3.org/2001/XMLSchema}string: cannot accept 'Artwork by
    <b>Mark</b> <b>Swanson</b> Copyright © 2002 <b>Mark</b>
    <b>Swanson</b>. All rights reserved. '. (XSD::ValueSpaceError)
    from /usr/local/lib/ruby/1.8/xsd/datatypes.rb:125:in `set'
    from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:446:in
    `decode_textbuf'
    from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:223:in
    `decode_tag_end'

    In contrast, the Java version that comes with the Google API download
    prints the peculiar character.

    This is easy for me to work around by commenting out the check in
    XSDString#_set:

    def _set(value)
      unless XSD::Charset.is_ces(value, XSD::Charset.encoding)
        raise ValueSpaceError.new("#{ type }: cannot accept '#{ value }'.")
      end
      @data = value
    end
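
    To illustrate the kind of check being commented out, here is a minimal
    sketch. It assumes the check amounts to asking "are these bytes valid in
    the currently configured charset?". The helper valid_in_encoding? is
    hypothetical and is not SOAP4R's code. It also uses the String encoding
    API of Ruby 1.9 and later, not the Ruby 1.8 environment described in this
    thread.

```ruby
# Hypothetical stand-in for a charset check; NOT SOAP4R's actual implementation.
# Returns true when the raw bytes of `value` are valid in `encoding_name`.
def valid_in_encoding?(value, encoding_name)
  # dup so the caller's string keeps its original encoding tag
  value.dup.force_encoding(encoding_name).valid_encoding?
end

# Raw bytes as they might arrive off the wire (binary, no encoding assumed).
snippet = "Copyright \xC2\xA9 2002".b

puts valid_in_encoding?(snippet, "US-ASCII")  # false: bytes above 0x7F are rejected
puts valid_in_encoding?(snippet, "UTF-8")     # true: C2 A9 is a well-formed sequence
```

    Under that assumption, the same bytes pass or fail depending only on
    which charset the checker believes it is working in.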

    My questions:
    1) Is there a better workaround? Or something I'm misunderstanding?
    2) Is this behavior something that should be changed in SOAP4r?
    3) Is Google in error in delivering that character in that type of string?

    I am using ruby 1.8.1-preview2 and the code from soap4r-1_5_1.

    -----
    Brian Marick
    Consulting, training, contracting, and research
    Focused on the intersection of testing, programming, and design
    www.testing.com, www.visibleworkings.com
     
    Brian Marick, Nov 8, 2003
    #1

  2. Hi, good morning from the Far East.

    > From: "Brian Marick"
    > Sent: Sunday, November 09, 2003 5:57 AM


    > % ruby wsdlDriver.rb 'Mark Swanson'
    > /usr/local/lib/ruby/1.8/xsd/datatypes.rb:184:in `_set':
    > {http://www.w3.org/2001/XMLSchema}string: cannot accept 'Artwork by
    > <b>Mark</b> <b>Swanson</b> Copyright © 2002 <b>Mark</b>
    > <b>Swanson</b>. All rights reserved. '. (XSD::ValueSpaceError)
    > from /usr/local/lib/ruby/1.8/xsd/datatypes.rb:125:in `set'
    > from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:446:in
    > `decode_textbuf'
    > from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:223:in
    > `decode_tag_end'


    The Google API returns the byte sequence "\xc2\xa9", which is the
    copyright sign encoded in UTF-8. Can you try this?
    $ ruby -Ku wsdlDriver.rb 'Mark Swanson Copyright Artwork'
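
    As a quick sanity check on the byte sequence quoted above, the following
    sketch decodes the two bytes as UTF-8 and prints the resulting code
    point. It is written for Ruby 1.9 or later, an assumption that differs
    from the 1.8 setup in this thread, and the variable names are
    illustrative only.

```ruby
# The two bytes quoted in the reply, held as raw binary data.
bytes = "\xC2\xA9".b

# Reinterpret the same bytes as UTF-8 (no transcoding takes place).
text = bytes.dup.force_encoding("UTF-8")

puts text.valid_encoding?                                # well-formed UTF-8?
puts text.length                                         # number of characters
puts format("U+%04X", text.ord)                          # Unicode code point
puts text.bytes.map { |b| format("%02x", b) }.join(" ")  # original bytes in hex
```

    The code point U+00A9 is the COPYRIGHT SIGN, so the two bytes are one
    character when read as UTF-8.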

    There may be another cause, though (no iconv?): I cannot reproduce
    the error on my Linux/Cygwin boxes even when I run wsdlDriver.rb
    with "-Kn".

    Besides this, I should add '$KCODE = "UTF8"' at the top of
    wsdlDriver.rb.

    Regards,
    // NaHi
     
    NAKAMURA, Hiroshi, Nov 9, 2003
    #2
