JSON.parse and unicode escape?

Jonathan Rochkind

The documentation for the Ruby JSON classes (http://json.rubyforge.org/)
implies that it handles Unicode escaping fine. But I'm having trouble
parsing JSON with a Unicode escape sequence in it. I am using the
'ext' parser (JSON::Ext::Parser), not the 'pure' parser, at version 1.1.2,
which appears to still be the latest.

Here is some test JSON, which is actually an excerpt from some JSON
returned to me by a third-party web service. I finally boiled it down to
the simplest demonstration case. I saved it in a file, but here's what's
in the text file:

=====
{ "key": 'something \x26 more' }
=====

I believe that is valid JSON containing an escaped Unicode character? But
JSON.parse on that string raises an error, complaining:

JSON::ParserError: unexpected token at '{ "summary": ' \x26 ' }


I have verified it is the \x26 that's doing it. It doesn't like \x
escaped Unicode.

Am I doing something wrong? Is the JSON I am receiving from the third
party bad somehow? This is such a widely used library that I'd be
surprised if it's broken and can't parse input including unicode escape
sequences... but that's what it looks like to me. Feedback?
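
For anyone comparing the sample against the JSON grammar: string values have to be wrapped in double quotes, and the only numeric escape the grammar defines is \uXXXX (there is no \x form). Below is a minimal sketch of how the quoting part could be checked with the json library. The helper name parses? and the sample strings are made up for illustration and are not the third-party data.

```ruby
require 'json'

# Hypothetical helper: true if the text parses as JSON, false otherwise.
def parses?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

# Single-quoted and %q Ruby literals keep backslash sequences as literal text,
# so the parser sees the six characters \u0026 rather than "&".
double_quoted = '{ "key": "something \u0026 more" }'
single_quoted = %q({ "key": 'something \u0026 more' })

puts parses?(double_quoted)            # value in double quotes
puts parses?(single_quoted)            # value in single quotes
puts JSON.parse(double_quoted)["key"]  # the \u escape decoded by the parser
```

The double-quoted variant parses and yields "something & more", while the single-quoted variant is rejected with JSON::ParserError regardless of which escape it contains.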
 
pwever

I am running into what seems to be a related problem with the
following code:

irb
JSON::ParserError: source sequence is illegal/malformed near uddb0"}
from /Library/Ruby/Gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `parse'
from /Library/Ruby/Gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `parse'
from (irb):2
from :0

I don't know enough about Unicode to really understand what is being
escaped here, but the following escapes, which I assume are very close
in range, do not raise an error:
"\ucdb0", "\uedb0", "\ud7b0"

I also validated the JSON string ('{"s":"\uddb0"}') successfully at
http://www.jsonlint.com/ and in Python.

Any ideas of what might be the problem?
Are there any alternative JSON parsers for ruby?

Thank you very much // pascal
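
On the range question: \ud800 through \udfff is the block Unicode reserves for UTF-16 surrogates. \uddb0 is a low surrogate with no high surrogate in front of it, while \ucdb0, \uedb0 and \ud7b0 all sit outside that block. A small sketch of the check follows; surrogate_kind is a hypothetical helper name, not something the json library provides.

```ruby
# Hypothetical helper: classify the four hex digits of a \uXXXX escape.
def surrogate_kind(hex)
  code = hex.to_i(16)
  case code
  when 0xD800..0xDBFF then :high_surrogate  # must be followed by a low surrogate
  when 0xDC00..0xDFFF then :low_surrogate   # must be preceded by a high surrogate
  else :ordinary                            # stands on its own
  end
end

%w[ddb0 cdb0 edb0 d7b0].each do |hex|
  puts "\\u#{hex}: #{surrogate_kind(hex)}"
end
```

Only ddb0 is reported as a surrogate here, which matches the one escape that triggers the error.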
 
Pascal Wever

That makes a lot of sense. Thanks for the clarification regarding the
unicode range.

Since I don't have control over the JSON source, I would like to try to
parse the JSON even if it results in a malformed unicode string. So
today I tried switching from 'json' to the 'ruby-json' library. After
some searching online, I didn't find any documentation on how to use it
though. Primarily I don't know how to include or require it.

require 'ruby-json'
require 'rubyjson'

don't seem to work.
Any ideas are appreciated.
Thank you very much
// pascal
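
If the source can't be fixed, one possible workaround (a sketch, not a feature of either library) is to rewrite the text before parsing so that any unpaired surrogate escape becomes \ufffd, the Unicode replacement character. Note that this swaps the bad escape for a placeholder rather than preserving it. The names LONE_SURROGATE_SCAN and replace_lone_surrogates are hypothetical, and the pattern assumes the input is otherwise well-formed JSON.

```ruby
require 'json'

# Alternatives are tried in order at each position:
#   1. a high surrogate escape followed by a low one (valid pair, 12 chars)
#   2. any other surrogate escape (unpaired, 6 chars)
#   3. the first two characters of any other backslash escape, e.g. \\ or \n
# Matching escapes one at a time keeps an escaped backslash followed by the
# plain text "uddb0" from being mistaken for a \u escape.
LONE_SURROGATE_SCAN = /\\u[dD][89abAB]\h{2}\\u[dD][c-fC-F]\h{2}|\\u[dD][89a-fA-F]\h{2}|\\./m

def replace_lone_surrogates(json_text)
  json_text.gsub(LONE_SURROGATE_SCAN) do |match|
    # Only alternative 2 produces a 6-character match.
    match.length == 6 ? '\ufffd' : match
  end
end

cleaned = replace_lone_surrogates('{"s":"\uddb0"}')
puts cleaned
p JSON.parse(cleaned)
```

Valid surrogate pairs such as \ud83d\ude00 pass through unchanged, so only the escapes the parser would reject are touched.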
 
