SOAP4R throws exception on copyright character

Brian Marick

Lots of web pages contain copyright characters (not the &copy; entity but
a raw character that displays in Mozilla view source as the copyright symbol,
in emacs as a square box, and probably says to the world "Hi! I'm an
HTML file that was created with Word."). SOAP4R is unhappy with that
character, as you can see in this use of the googleSearch sample:

% ruby wsdlDriver.rb 'Mark Swanson'
/usr/local/lib/ruby/1.8/xsd/datatypes.rb:184:in `_set':
{http://www.w3.org/2001/XMLSchema}string: cannot accept 'Artwork by
<b>Mark</b> <b>Swanson</b> Copyright © 2002 <b>Mark</b>
<b>Swanson</b>. All rights reserved. '. (XSD::ValueSpaceError)
from /usr/local/lib/ruby/1.8/xsd/datatypes.rb:125:in `set'
from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:446:in
`decode_textbuf'
from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:223:in
`decode_tag_end'

In contrast, the Java version that comes with the Google API download
prints the peculiar character.

This is easy for me to work around by commenting out the check in
XSDString#_set:

def _set(value)
  unless XSD::Charset.is_ces(value, XSD::Charset.encoding)
    raise ValueSpaceError.new("#{ type }: cannot accept '#{ value }'.")
  end
  @data = value
end
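Commenting out the check turns validation off for every string, not just this one. As a rough illustration of what an encoding check of this kind decides, here is a minimal sketch for a current Ruby (1.9 or later, not the 1.8.1 preview used above). The helper name `well_formed_in?` and the sample constant are hypothetical and are not part of SOAP4R; the sketch assumes the check is meant to verify that the bytes are well formed in the configured encoding.

```ruby
# Hypothetical helper, not SOAP4R code: true when the raw bytes of `value`
# form a well-formed string in the encoding named by `encoding_name`.
def well_formed_in?(value, encoding_name)
  # dup so the caller's string keeps its original encoding tag
  value.dup.force_encoding(encoding_name).valid_encoding?
end

# Sample text containing the two bytes 0xC2 0xA9, tagged as raw binary.
COPYRIGHT_BYTES = "Copyright \xC2\xA9 2002".b

puts well_formed_in?(COPYRIGHT_BYTES, "UTF-8")     # the bytes read as UTF-8
puts well_formed_in?(COPYRIGHT_BYTES, "US-ASCII")  # the same bytes read as 7-bit ASCII
```

The same bytes pass or fail depending only on which encoding the checker assumes, so the encoding setting is worth checking before removing the validation.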

My questions:
1) Is there a better workaround? Or something I'm misunderstanding?
2) Is this behavior something that should be changed in SOAP4r?
3) Is Google in error in delivering that character in that type of string?

I am using ruby 1.8.1-preview2 and the code from soap4r-1_5_1.

-----
Brian Marick
Consulting, training, contracting, and research
Focused on the intersection of testing, programming, and design
(e-mail address removed), (e-mail address removed)
www.testing.com, www.visibleworkings.com
 
NAKAMURA, Hiroshi

Hi, good morning from far east.
From: "Brian Marick" <(e-mail address removed)>
Sent: Sunday, November 09, 2003 5:57 AM
% ruby wsdlDriver.rb 'Mark Swanson'
/usr/local/lib/ruby/1.8/xsd/datatypes.rb:184:in `_set':
{http://www.w3.org/2001/XMLSchema}string: cannot accept 'Artwork by
<b>Mark</b> <b>Swanson</b> Copyright © 2002 <b>Mark</b>
<b>Swanson</b>. All rights reserved. '. (XSD::ValueSpaceError)
from /usr/local/lib/ruby/1.8/xsd/datatypes.rb:125:in `set'
from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:446:in
`decode_textbuf'
from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:223:in
`decode_tag_end'

The Google API returns the byte sequence "\xc2\xa9", which is the copyright
sign encoded in UTF-8. Can you try this?
$ ruby -Ku wsdlDriver.rb 'Mark Swanson Copyright Artwork'
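To make the byte sequence concrete, here is a small sketch for a current Ruby (1.9 or later, where strings carry their own encoding and the -K switch matters much less). The constant and method names are made up for illustration. It shows that the two bytes 0xC2 0xA9 decode as UTF-8 to the single code point U+00A9, the copyright sign.

```ruby
# The two bytes reported in the reply above, as integers.
COPYRIGHT_UTF8_BYTES = [0xC2, 0xA9].freeze

# Decode raw bytes as UTF-8 and return the Unicode code points they represent.
def utf8_codepoints(bytes)
  # pack("C*") builds a binary string; force_encoding only re-tags it as UTF-8
  bytes.pack("C*").force_encoding(Encoding::UTF_8).codepoints
end

# Code points obtained from the two bytes
puts utf8_codepoints(COPYRIGHT_UTF8_BYTES).inspect

# Going the other way: the UTF-8 bytes of U+00A9, printed as \x escapes
puts "\u00A9".bytes.map { |b| format("\\x%02x", b) }.join
```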

There may be another reason (no iconv?) though... I cannot reproduce
the same error on my linux/cygwin boxes even when I run
wsdlDriver.rb with "-Kn".

Besides this, I should add '$KCODE = "UTF8"' at the head of
wsdlDriver.rb.

Regards,
// NaHi