Design suggestion for translations/mappings from xml

Brian Lonsdorf

Hi, I've got an XML file that I'm parsing and creating objects from.

The XML is neither named nor structured like my objects, so I need to
define a bunch of mappings/translations.

I'm looking for any design patterns or suggestions people might have to
solve this elegantly.

XML example:

<HotelDescriptiveContent CurrencyCode="USD" TimeZone="GMT;-06"
    BrandCode="BV" HotelCode="517"
    HotelName="Americas Best Value Inn and Suites - Downtown"
    Overwrite="true" UnitOfMeasureCode="1">
  <HotelInfo WhenBuilt="2000" HotelStatus="Bookable" HotelStatusCode="1">
    <CategoryCodes>
      <LocationCategory Code="3"></LocationCategory>
      <SegmentCategory Code="5"></SegmentCategory>
      <HotelCategory Code="20"></HotelCategory>
    </CategoryCodes>
    <Description>All the comforts of home conveniently...</Description>


object example:

class Hotel
  attr_accessor :property_name, :brand
end


There are a bunch of other classes and corresponding awkwardly named tags.

It'd be killer if I could do a to_xml as well as the parsing, but that's
not crucial...

Thanks for your help!
 
Jesús Gabriel y Galán

Here's a rough idea I thought of when reading your email:

module XMLMap
  module ClassMethods
    attr_reader :from_xml, :to_xml

    # Record the mapping in both directions.
    def map(property, xpath_to_value)
      (@from_xml ||= {})[xpath_to_value] = property
      (@to_xml ||= {})[property] = xpath_to_value
    end
  end

  def to_xml
    "implement to_xml creating all tags and attrs defined in the to_xml hash: #{self.class.to_xml.inspect}"
  end

  def from_xml(xml)
    # Alternatively you could make this a class method that
    # returns an initialized instance
    "iterate through all xpaths from from_xml (#{self.class.from_xml.inspect}) initializing the corresponding ivars"
  end

  # Add the class-level map method to any class that includes XMLMap.
  def self.included(child)
    child.extend ClassMethods
  end
end

I haven't implemented the XML stuff, but it should be straightforward
if you limit yourself to simple xpaths. Then you can use it like:

class Test
  include XMLMap
  map :this, "/root/this/tag"
  map :that, "/root/that/@attr"
end

test = Test.new
test.from_xml %q{<root><this><tag>this_value</tag></this><that attr="that_value"/></root>}
test.to_xml

You could also implement a version of the map method that receives a
hash with all the mappings or whatever.
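
A minimal sketch of that hash-based variant, under the assumption that it simply delegates to the single-pair map method. The name map_all is hypothetical and not part of the sketch above; the XML parsing and generation are still left out.

```ruby
module XMLMap
  module ClassMethods
    attr_reader :from_xml, :to_xml

    # Register one property => xpath pair, as in the sketch above.
    def map(property, xpath_to_value)
      (@from_xml ||= {})[xpath_to_value] = property
      (@to_xml ||= {})[property] = xpath_to_value
    end

    # Hypothetical helper: register several mappings from one hash.
    def map_all(mappings)
      mappings.each { |property, xpath| map(property, xpath) }
    end
  end

  def self.included(child)
    child.extend ClassMethods
  end
end

class Test
  include XMLMap
  map_all this: "/root/this/tag",
          that: "/root/that/@attr"
end

p Test.to_xml    # property => xpath
p Test.from_xml  # xpath => property
```

Keeping map as the single place that records a mapping means any validation added there later applies to both call styles.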

Hope this helps,

Jesus.
 
Tom Morris

Well, if you are going to be parsing lots of different types of XML
document, here are some possible ways you could approach the problem:

- if a schema document is available, pull it in and do something clever
with it. If it's a RelaxNG schema, you can look for oneOrMore and
zeroOrMore elements that contain element references and map those
to lists.

- you could also write an XSLT transformation to turn the document into
an intermediary XML format for which there are already intuitive
interfaces in Ruby (or whatever other language you end up using) -
RSS/Atom, for instance (maybe RDF eventually - I'm soon to release an
alpha version of a Ruby RDF gem) or even something like XML-RPC or
SOAP, both of which are designed to map to native data-types (and
objects in the case of SOAP).

- you could see if you could produce some kind of clever metrics from
the document by looking for the most-used elements in particular
places in the hierarchy.

- sometimes a separate parser/serializer class is a necessity - I took
this approach in my project simply because it was the approach the
authors of the Python and Java libraries had taken, and it seemed
pretty sensible to do it that way.
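
As an illustration of the last option, here is a rough sketch of a separate parser/serializer class for the Hotel example from the original post. The HotelMapper class and the attribute pairing (HotelName to property_name, BrandCode to brand) are assumptions made for the example, and REXML is used only because it ships with Ruby.

```ruby
require 'rexml/document'

class Hotel
  attr_accessor :property_name, :brand
end

# Hypothetical mapper: all knowledge of the XML vocabulary lives here,
# so the Hotel class itself stays free of XML details.
class HotelMapper
  # Assumed pairing of Hotel properties with attributes on the root element.
  ATTRIBUTES = { property_name: "HotelName", brand: "BrandCode" }.freeze

  # Build a Hotel from an XML string.
  def self.parse(xml)
    root = REXML::Document.new(xml).root
    hotel = Hotel.new
    ATTRIBUTES.each do |property, attribute|
      hotel.public_send("#{property}=", root.attributes[attribute])
    end
    hotel
  end

  # Turn a Hotel back into an XML string, skipping unset properties.
  def self.serialize(hotel)
    doc = REXML::Document.new
    root = doc.add_element("HotelDescriptiveContent")
    ATTRIBUTES.each do |property, attribute|
      value = hotel.public_send(property)
      root.add_attribute(attribute, value) if value
    end
    doc.to_s
  end
end

xml = '<HotelDescriptiveContent BrandCode="BV" HotelName="Example Inn"/>'
hotel = HotelMapper.parse(xml)
puts hotel.property_name
puts hotel.brand
puts HotelMapper.serialize(hotel)
```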

One day, people will get that XML is a *markup* language - and markup is
something you add to *documents*. If you just want to shunt data around,
it probably shouldn't be the first choice when compared with something
like JSON or YAML.
 
Brian Lonsdorf

Thanks so much for your response!

The map method was the perfect solution to my problem. Declarative,
organized, and maintainable. Previously, I kept trying to define them
all in one spot - I just needed to think about it differently.

One enhancement that made me smile was a call to attr_accessor in the
map method :)
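
The thread doesn't show that version of map, so the following is only a guess at its shape: map records the xpath and defines the accessor in one step. The xpaths chosen for Hotel are assumptions based on the XML example in the first post.

```ruby
module XMLMap
  module ClassMethods
    def mappings
      @mappings ||= {}
    end

    # Record the mapping and define a reader/writer pair in one step.
    def map(property, xpath_to_value)
      mappings[property] = xpath_to_value
      attr_accessor property
    end
  end

  def self.included(child)
    child.extend ClassMethods
  end
end

class Hotel
  include XMLMap
  # Assumed pairing of properties with attributes of the root element.
  map :property_name, "/HotelDescriptiveContent/@HotelName"
  map :brand,         "/HotelDescriptiveContent/@BrandCode"
end

hotel = Hotel.new
hotel.property_name = "Americas Best Value Inn and Suites - Downtown"
puts hotel.property_name
```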
 
Jesús Gabriel y Galán

> Thanks so much for your response!

Glad it helped.

> One enhancement that made me smile was a call to attr_accessor in the
> map method :)

Now *you* got me thinking a step further. I finally had some spare
time and came up with a couple of enhancements, plus an implementation
of the XML part (parsing uses nokogiri, the XML is generated by hand):

require 'nokogiri'

module XMLMap
  module ClassMethods
    # Store the xpath for a property, accepting only simple paths:
    # /tag/tag/... optionally ending in /@attr.
    def set_xml_data(property, xpath)
      re = %r{\A(/(\w+))+(/@(\w+))?\z}
      unless xpath =~ re
        raise "Invalid xpath: #{xpath} for attribute #{property}. " \
              "Only simple tags and attrs supported (#{re})"
      end
      (@mappings ||= {})[property] = xpath
    end

    def mapped_reader(property, xpath)
      set_xml_data property, xpath
      attr_reader property.to_sym
    end

    def mapped_writer(property, xpath)
      set_xml_data property, xpath
      attr_writer property.to_sym
    end

    def mapped_accessor(property, xpath)
      set_xml_data property, xpath
      attr_accessor property.to_sym
    end

    # Build a new instance, setting one ivar per mapping found in the XML.
    def from_xml(xml)
      o = new
      doc = Nokogiri::XML(xml)
      mappings.each do |attr, xpath|
        item = doc.xpath(xpath)
        o.instance_variable_set("@#{attr}", item.inner_text) unless item.empty?
      end
      o
    end

    def mappings
      @mappings ||= {}
    end
  end

  def to_xml
    # Auto-vivifying hash: missing keys create nested hashes on demand.
    xml = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
    self.class.mappings.each do |property, xpath|
      value = instance_variable_get("@#{property}")
      next unless value  # skip properties that were never set
      path, attr = xpath.split("/@")
      tags = path.split("/")
      # tags[0] is "" (leading slash); walk down to the parent of the last tag.
      parent = tags[1..-2].inject(xml) { |h, tag| h[tag] }
      if attr
        parent[tags[-1]][:attributes][attr] = value
      else
        parent[tags[-1]][:value] = value
      end
    end
    output = ""
    xml.each do |node, data|
      generate_node node, data, output
    end
    output
  end

  # Append the markup for one node (and its children) to output.
  def generate_node(node, data, output)
    value = data.delete(:value)
    attrs = data.delete(:attributes)
    output << "<#{node}"
    if attrs
      attrs.each do |name, attr_value|
        output << " #{name}=\"#{attr_value}\""
      end
    end
    if value
      output << ">#{value}</#{node}>"
    elsif !data.empty?
      output << ">"
      data.each do |child_tag, child_data|
        generate_node child_tag, child_data, output
      end
      output << "</#{node}>"
    else
      output << "/>"
    end
  end

  def self.included(child)
    child.extend ClassMethods
  end
end

The XML stuff ended up a little messy, I'd appreciate any help or
comment there. Usage:

class A
  include XMLMap
  mapped_reader :first, "/root/first"
  mapped_reader :second, "/root/second"
  mapped_accessor :attr, "/root/first/@attr"
end

a = A.from_xml %q{<root><first attr="the_attr_value">the_first_value</first><second>the_second_value</second></root>}
p a
a.attr = "changed value"
puts a.to_xml

I haven't tested very thoroughly, so there might be a bug or two in
there. Probably you will want to reimplement the to_xml method to use
a proper XML generator.
Any comment about the code or approach appreciated, for sure there's
room for improvement :)
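
One possible direction for the "proper XML generator" idea, sketched with REXML from the standard distribution instead of string concatenation. The build_xml helper is hypothetical; it takes a hash of xpath => value pairs and handles only the simple paths that set_xml_data accepts, all sharing one root tag. Letting the library build the document should also take care of escaping values, which the hand-written version does not do.

```ruby
require 'rexml/document'

# Hypothetical helper: build an XML string from { "/simple/xpath" => value }.
# Assumes every path starts with the same root tag.
def build_xml(values_by_xpath)
  doc = REXML::Document.new
  values_by_xpath.each do |xpath, value|
    next if value.nil?
    path, attribute = xpath.split("/@")
    # Walk down from the document, reusing elements that already exist
    # and creating the ones that don't.
    node = path.split("/").reject(&:empty?).inject(doc) do |parent, tag|
      parent.elements[tag] || parent.add_element(tag)
    end
    if attribute
      node.add_attribute(attribute, value.to_s)
    else
      node.text = value.to_s
    end
  end
  doc.to_s
end

puts build_xml(
  "/root/first"       => "the_first_value",
  "/root/second"      => "the_second_value",
  "/root/first/@attr" => "changed value"
)
```

Inside XMLMap, to_xml could then collect the xpath and instance variable value for each entry in self.class.mappings into such a hash and pass it to the helper.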

Regards,

Jesus.
 
Brian Lonsdorf

Wow, that's pretty kickass.

# I love this.
xml = Hash.new {|h,k| h[k] = Hash.new(&h.default_proc)}
tags[1..-2].inject(xml) {|h, tag| h[tag]}
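
For anyone reading along, a small standalone sketch of what those two lines do, using one of the xpaths from the usage example. The variable names tree and parent are only for illustration.

```ruby
# Each missing key gets a new Hash that shares the same default block,
# so nested levels are created on first access at any depth.
tree = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }

tags = "/root/first".split("/")   # => ["", "root", "first"]

# tags[1..-2] drops the leading "" and the last tag; inject walks down
# the tree, creating each intermediate level on the way.
parent = tags[1..-2].inject(tree) { |h, tag| h[tag] }
parent[tags[-1]][:value] = "the_first_value"

p tree
```

Note that merely reading a missing key creates it, so checks such as tree["root"].key?("second") are safer than tree["root"]["second"] when you only want to look.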

I actually did some of the same stuff last week, but a lot less generic
and sophisticated. I hadn't realized the potential of having a cool
little library for this stuff.

You should release it - I would have used this for sure if I'd found it
on github.
 
