REXML Speed Question

Kyle X.

Hello, I have been using REXML to extract information from an XML file,
and I am having an issue with how long it takes. If I point directly to
what I want, it is pretty fast. The issue arises when I have to grab a
reference id, then search again for that id to get another id, and so on
until I finally reach the piece of information I want. This is what a
snippet of my code currently looks like:

---------------------------------------------
result = []
wall_refs1 = XPath.match($doc,
  "doc:iso_10303_28/uos/IfcWallStandardCase//*[@pos='1']")

wall_refs1 = grab_id(wall_refs1, 'ref')
# grab_id simply collects the ref ids into an array
# Output from this would be [["i1741"]]

wall_ref2 = []
wall_refs1.each do |ref|
  x = REXML::XPath.first($doc, "//*[@id='#{ref}']//IfcExtrudedAreaSolid").attribute("ref").value
  wall_ref2 << x
end
# Output [["i1738"]]

wall_depth = []
wall_ref2.each do |ref|
  x = REXML::XPath.match($doc, "//*[@id='#{ref}']//Depth").map { |element| element.text }
  wall_depth << x
end
# Output [["120."]]

wall_depth_final = wall_depth.map do |arr|
  arr.map do |arr2|
    # this simply converts to float and rounds to 2 decimals
    arr2.to_f.round_to(2)
  end
end

wall_depth_final
# Output [["120.00"]]
-----------------------------------------

The problem with this approach is that it takes substantial time to
run: doing this for, say, 200 elements can take 25 minutes. (I am
guessing it takes so long because some of the XML files are 10,000+
lines, and I imagine it takes a while to comb through that.) I have to
start from the first location and work my way to the final one;
unfortunately, I cannot simply run a search to grab //Depth.

Is there a quicker way of accomplishing the same thing, or is time
always going to be a burden?

Thank you for your time.

This would be the xml I am reading:

<IfcWallStandardCase id="i1677">
  <Representation>
    <IfcProductDefinitionShape id="i1747">
      <Representations id="i1750" exp:cType="list">
        <IfcShapeRepresentation exp:pos="0" xsi:nil="true" ref="i1708"/>
        <IfcShapeRepresentation exp:pos="1" xsi:nil="true" ref="i1741"/>
      </Representations>
    </IfcProductDefinitionShape>
  </Representation>
</IfcWallStandardCase>
<IfcShapeRepresentation id="i1741">
  <Items id="i1746" exp:cType="set">
    <IfcExtrudedAreaSolid exp:pos="0" xsi:nil="true" ref="i1738"/>
  </Items>
</IfcShapeRepresentation>
<IfcExtrudedAreaSolid id="i1738">
  <Depth>120.</Depth>
</IfcExtrudedAreaSolid>
 
Ryan Davis

Switch to nokogiri and you'll be much much happier.
 
Mark Kremer

For larger XML documents, SAX parsing can really improve performance
(specifically because SAX parsing doesn't create an entire DOM
structure; it only extracts the bits you are interested in). Programming
with a SAX parser is very different, though :)
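
To give a feel for how different the event-driven style is, here is a minimal sketch using REXML's own stream parser, so no extra library is assumed. The `DepthListener` class and the `STREAM_SAMPLE` data are hypothetical names made up for illustration; the sample is a trimmed copy of the XML from the original post with the namespace prefixes dropped and a `<uos>` root added so that it parses on its own.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical sample, adapted from the XML in the original post.
STREAM_SAMPLE = <<~XML
  <uos>
    <IfcShapeRepresentation id="i1741">
      <Items id="i1746">
        <IfcExtrudedAreaSolid pos="0" ref="i1738"/>
      </Items>
    </IfcShapeRepresentation>
    <IfcExtrudedAreaSolid id="i1738">
      <Depth>120.</Depth>
    </IfcExtrudedAreaSolid>
  </uos>
XML

# Collects { solid id => depth } while the parser streams through the file.
# No document tree is built; only the small amount of state below is kept.
class DepthListener
  include REXML::StreamListener
  attr_reader :depths

  def initialize
    @depths = {}
    @solid_id = nil   # id of the IfcExtrudedAreaSolid we are inside, if any
    @in_depth = false # true while inside a <Depth> of such a solid
  end

  def tag_start(name, attrs)
    case name
    when 'IfcExtrudedAreaSolid'
      @solid_id = attrs['id'] # nil for elements that only carry a ref
    when 'Depth'
      @in_depth = !@solid_id.nil?
    end
  end

  def text(value)
    @depths[@solid_id] = value.to_f if @in_depth
  end

  def tag_end(name)
    case name
    when 'Depth' then @in_depth = false
    when 'IfcExtrudedAreaSolid' then @solid_id = nil
    end
  end
end

listener = DepthListener.new
REXML::Document.parse_stream(STREAM_SAMPLE, listener)
p listener.depths
```

Note that this only gathers every depth keyed by its solid's id in one pass; following the wall -> shape -> solid references would still need similar bookkeeping for the other element types.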

You can also switch to another library for handling your XML. The most
popular library (at least to my knowledge) is Nokogiri
(http://nokogiri.org/), and it is a great deal faster than REXML.
 
Kyle X.

Thanks for the info. I am going to try Nokogiri, if I can only figure
out how to get it to work in SketchUp.... There is a surprising
dearth of information on the topic, even after a few hours of searching
online.... Any chance anyone knows how?
 
