Need help in parsing REXML::Document

Nikolay Pavlov

Hello all.
I have this XML output:

<response>
<num></num>
<token></token>
<operation id="some id" sid="some sid">
<trans></trans>
<data></data>
</operation>
<operation id="other id" sid="other sid">
<trans></trans>
<data></data>
</operation>
...
</response>

Could someone point me to a way to build a structure like this one from
the XML above:

{:num => value,
 :token => value,
 :operation => [{:id => value,
                 :sid => value,
                 :trans => value,
                 :data => value},
                {:id => value,
                 :sid => value,
                 :trans => value,
                 :data => value}]
}

The initial hash part was trivial for me, but whenever I think about the
array part my brain tells me "stack level too deep" :)

 
Phlip

[Please post raw text; my newsreader chokes on whatever you posted...]

Nikolay said:
I have this XML output:

<response>
<num></num>
<token></token>
<operation id="some id" sid="some sid">
<trans></trans>
<data></data>
</operation>
<operation id="other id" sid="other sid">
<trans></trans>
<data></data>
</operation>
...
Could someone point me to the way on how i can get construction like this
one from the XML above:

{:num => value,
:token => value,
:operation => [{:id => value,
:sid => value,

I get the idea you want a generic system to turn any stereotypical XML to a
hash. The XML is too varied to just hard-code the transformation, but the
XML doesn't contain advanced features like nodes containing both text and
sub-nodes.

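require 'rexml/document'

# Recursively turn the child elements of node into a hash of arrays.
def git_riddim(node)
  hash = {}
  REXML::XPath.each(node, '*') do |child|
    contents = (child.text || '').strip
    # contents is already stripped, so empty? is enough here
    # (blank? would need ActiveSupport). No text means: recurse.
    (hash[child.name] ||= []) << (contents.empty? ? git_riddim(child) : child.text)
  end
  hash
end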

def test_recursive_xml_reader
  xml = '<response>
           <num>v1</num>
           <token>v2</token>
           <operation id="some id" sid="some sid">
             <trans>v3</trans>
             <data>v4</data>
           </operation>
           <operation id="other id" sid="other sid">
             <trans></trans>
             <data></data>
           </operation>
         </response>'
  doc = REXML::Document.new(xml)
  hash = git_riddim(doc)
  assert_equal({"response"=>[{"token"=>["v2"], "num"=>["v1"],
    "operation"=>[{"trans"=>["v3"], "data"=>["v4"]}, {"trans"=>[{}],
    "data"=>[{}]}]}]}, hash)
end

Now here's the head-hurting part. We need each array that contains only one
element to turn into that element.

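def reduce_dimensions(hash)
  return unless hash.kind_of?(Hash)
  hash.each do |k, v|
    v.each { |q| reduce_dimensions(q) }
    # Relies on Ruby 1.8 splat assignment: [] becomes nil, [x] becomes x.
    # On Ruby 1.9 and later this assignment keeps the array.
    hash[k] = *v
  end
end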

reduce_dimensions(hash)

assert_equal({"response"=>
  {"token"=>"v2",
   "num"=>"v1",
   "operation"=>[{"trans"=>"v3", "data"=>"v4"},
                 {"trans"=>{}, "data"=>{}}]}},
  hash)

We used * to turn arrays of none or one into nil or an object.
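
For anyone reading this on a newer Ruby: the `hash[k] = *v` trick depends on Ruby 1.8, where assigning a splatted one-element array yields the element itself. From Ruby 1.9 on, the same assignment leaves an array. A minimal sketch of the same reduction without the splat, assuming a helper I am calling `collapse` here (the name is made up for illustration, it is not part of REXML or the code above):

```ruby
# Hypothetical helper: [] becomes nil, [x] becomes x, longer arrays stay as they are.
def collapse(array)
  case array.length
  when 0 then nil
  when 1 then array.first
  else array
  end
end

# Same walk as reduce_dimensions above, but with an explicit collapse step
# instead of the Ruby 1.8 splat assignment.
def reduce_dimensions(hash)
  return unless hash.kind_of?(Hash)
  hash.each do |key, values|
    values.each { |value| reduce_dimensions(value) }
    hash[key] = collapse(values)   # reassigning an existing key is safe while iterating
  end
end
```

It still mutates the hash in place, so it is called the same way as the original.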

Outside of that trick, my code is much more flabby than usual for Hash
contortions. Someone might be able to use .inject or something to get it
down to fewer lines, in one pass.
 
Nikolay Pavlov

Phlip said:
Outside of that trick, my code is much more flabby than usual for Hash
contortions. Someone might be able to use .inject or something to get it
down to fewer lines, in one pass.

Thanks, Phlip. With your suggestions I have ended up with something like
this:

def parse_to_hash(node)
  normalize(xml_to_hash_of_arrays(node))
end

def xml_to_hash_of_arrays(node)
  hash = {}
  REXML::XPath.each(node, '*') do |child|
    # String#underscore comes from ActiveSupport (Rails).
    (hash[child.name.underscore.to_sym] ||= []) << if child.has_elements?
      xml_to_hash_of_arrays(child)
    else
      (child.text || '').strip
    end
  end
  hash
end

def normalize(hash)
  return unless hash.kind_of?(Hash)
  hash.each do |key, value|
    value.each { |val| normalize(val) }
    hash[key] = *value
  end
end

But one thing is still missing. As in my original example, I want the
attributes of every node that has child elements (has_elements? == true) to
be merged into that node's hash, for example:

<operation id="some id" sid="some sid">
<trans>v3</trans>
<data>v4</data>
</operation>

Should produce:

[{:id => "some id", :sid => "some sid", :trans => "v3", :data => "v4"}]
 
Nikolay Pavlov

OK. I have arrived at a final version that does what I want.
Here it is:

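def parse_to_hash(node)
  normalize(xml_to_hash_of_arrays(node))
end

def xml_to_hash_of_arrays(node)
  hash = {}
  REXML::XPath.each(node, '*') do |child|
    # Copy the parent's attributes into the hash (repeated for each child,
    # which is redundant but harmless).
    if child.parent.has_attributes?
      child.parent.attributes.each_attribute do |attr|
        # String#underscore comes from ActiveSupport (Rails).
        hash[attr.name.underscore.to_sym] = attr.value
      end
    end
    (hash[child.name.underscore.to_sym] ||= []) << if child.has_elements?
      xml_to_hash_of_arrays(child)
    else
      (child.text || '').strip
    end
  end
  return hash
end

def normalize(hash)
  return unless hash.kind_of?(Hash)
  hash.each do |key, value|
    # Attribute values are plain strings here, not arrays; calling each on
    # them and the splat assignment below both rely on Ruby 1.8 behaviour.
    value.each { |val| normalize(val) }
    hash[key] = *value
  end
end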

Many thanks, Phlip, for your suggestions.
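
In case it helps anyone finding this thread later, here is a rough single-pass sketch of the same idea that should run on current Ruby without Rails. It is not the code above but a variation under a few assumptions: the `rexml` gem is available (it is a bundled gem since Ruby 3.0), keys are made with plain `to_sym` instead of ActiveSupport's `underscore`, attribute names do not collide with child element names, and the function name `element_to_hash` is made up for this example. Attributes are read from the element itself rather than through `parent`.

```ruby
require 'rexml/document'

# Sketch: convert an element into a hash with symbol keys. Attributes are
# merged into the hash; repeated child elements become an array, and a child
# that occurs only once is stored directly.
def element_to_hash(element)
  hash = {}
  element.attributes.each_attribute { |attr| hash[attr.name.to_sym] = attr.value }

  element.each_element do |child|
    value = child.has_elements? ? element_to_hash(child) : (child.text || '').strip
    (hash[child.name.to_sym] ||= []) << value
  end

  # Collapse one-element arrays explicitly instead of relying on the
  # Ruby 1.8 splat assignment. Only existing keys are reassigned.
  hash.each do |key, value|
    hash[key] = value.first if value.is_a?(Array) && value.length == 1
  end
  hash
end

xml = '<response>
         <num>v1</num>
         <token>v2</token>
         <operation id="some id" sid="some sid">
           <trans>v3</trans>
           <data>v4</data>
         </operation>
         <operation id="other id" sid="other sid">
           <trans></trans>
           <data></data>
         </operation>
       </response>'

result = element_to_hash(REXML::Document.new(xml).root)
```

Called on the root element, this gives :num, :token and an :operation array of two hashes, each carrying :id, :sid, :trans and :data. As with the version above, a response containing only one operation would yield a single hash rather than a one-element array, so callers that always expect an array need to wrap the value themselves.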
 
