programmnig advise needed

Discussion in 'Python' started by mitsura@skynet.be, Jun 7, 2005.

  1. Guest

    Hi,

    I am writing a Python program that needs to read XML files and contruct
    a tree object from the XML file (using wxTree).
    The XML however is not an hiearchical XML file. It contains <elements>
    and <association> tags. The <assiociation> tags link the elements
    together.
    Example:
    <element1>
    ... element 1 attributes
    </element1>
    <element2>
    ... element 2 attributes
    </element2>
    <element3>
    ... element 3 attributes
    </element3>
    <association>
    <Name>element1</Name>
    <Parent>element3</Parent>
    </association>
    <association>
    <Name>element2</Name>
    <Parent>element3</Parent>
    </association>

    In this case <element3> is the parent of the two other elements.
    The XML files I receive often contain thousands of elements so the tree
    structure can be very large. Futhermore, the XML files are not
    'sorted', like in the example above, the 'root' element object entry
    might be somewhere in the middle of the XML file.

    I don't really now to code this so any suggestions are welcome.

    With kind regards,

    Kris
     
    , Jun 7, 2005
    #1
    1. Advertising

  2. On 7 Jun 2005 12:14:45 -0700, wrote:

    >I am writing a Python program that needs to read XML files and contruct
    >a tree object from the XML file (using wxTree).


    Supposing your XML file has a single top level node (so that it's a
    legal XML file) then the following code should be able to read it...

    #########
    from elementtree import ElementTree as ET

    class Node:
    def __init__(self, xmlnode):
    self.xmlnode = xmlnode
    self.children = []

    def loadtree(f):
    root = ET.parse(f).getroot()
    nodes = {}
    for x in root:
    if x.tag.startswith("element"):
    nodes[x.tag] = Node(x)
    for x in root.findall("association"):
    name = x.find("Name").text
    parent = x.find("Parent").text
    nodes[parent].children.append(nodes.pop(name))
    assert len(nodes) == 1
    return nodes.popitem()[1]
    ##########

    The idea is to create a node with an empty list of
    logical children and then for every association
    removing the child node from the global pool (a dict
    indexed by node name) to place it in its parent.
    I assumed that the nodes are in random order, but
    that the associations are sorted bottom-up in
    the tree. If this is not the case then you should
    keep TWO dicts, removing from only one of them
    the children when you process an association,
    and looking up the parent in the *other* dict that
    is not changed during processing of associations.
    The dict from who you are removing the children
    will allow you to detect logical errors in the file
    (i.e. a node having two parents - you won't find
    that node in the dict the second time - and absence
    of a single root - you won't end up with a single
    element in the dict after processing all
    associations -).

    HTH
    Andrea
     
    Andrea Griffini, Jun 7, 2005
    #2
    1. Advertising

  3. Guest

    Thanks for the feedback! It is a very good idea to use dictionaries. I
    will try to implemented it this way.

    Thanks,

    Kris


    Andrea Griffini schreef:
    > On 7 Jun 2005 12:14:45 -0700, wrote:
    >
    > >I am writing a Python program that needs to read XML files and contruct
    > >a tree object from the XML file (using wxTree).

    >
    > Supposing your XML file has a single top level node (so that it's a
    > legal XML file) then the following code should be able to read it...
    >
    > #########
    > from elementtree import ElementTree as ET
    >
    > class Node:
    > def __init__(self, xmlnode):
    > self.xmlnode = xmlnode
    > self.children = []
    >
    > def loadtree(f):
    > root = ET.parse(f).getroot()
    > nodes = {}
    > for x in root:
    > if x.tag.startswith("element"):
    > nodes[x.tag] = Node(x)
    > for x in root.findall("association"):
    > name = x.find("Name").text
    > parent = x.find("Parent").text
    > nodes[parent].children.append(nodes.pop(name))
    > assert len(nodes) == 1
    > return nodes.popitem()[1]
    > ##########
    >
    > The idea is to create a node with an empty list of
    > logical children and then for every association
    > removing the child node from the global pool (a dict
    > indexed by node name) to place it in its parent.
    > I assumed that the nodes are in random order, but
    > that the associations are sorted bottom-up in
    > the tree. If this is not the case then you should
    > keep TWO dicts, removing from only one of them
    > the children when you process an association,
    > and looking up the parent in the *other* dict that
    > is not changed during processing of associations.
    > The dict from who you are removing the children
    > will allow you to detect logical errors in the file
    > (i.e. a node having two parents - you won't find
    > that node in the dict the second time - and absence
    > of a single root - you won't end up with a single
    > element in the dict after processing all
    > associations -).
    >
    > HTH
    > Andrea
     
    , Jun 8, 2005
    #3
  4. Magnus Lycka Guest

    wrote:
    > Hi,
    >
    > I am writing a Python program that needs to read XML files and contruct
    > a tree object from the XML file (using wxTree).
    > The XML however is not an hiearchical XML file. It contains <elements>
    > and <association> tags. The <assiociation> tags link the elements
    > together.


    Are you sure that you get a tree? As I understand your description,
    the XML format could well be used to describe arbitrary directed
    graphs, even cyclic ones. This might not be a problem unless someone
    tries to expand the entire tree...but you should probably have
    some kind of cycle check, or disallow "expand all" in the GUI and
    make sure you don't create the nodes before they are expanded.

    If you really just permit trees, it's fairly simple, because a
    node should just appear as a <Name> in an association once. You
    can put those in a set, and check that they aren't already in the
    set before you accept the content in a new association. Something
    like the following semi-pseudo-code:

    nodes = set()

    for name, parent in associations:
    if name in nodes:
    raise KeyError, ("%s is already in the tree. "
    "Can't put it under %s." % (name, parent)
    nodes.add(name)
    whatever...

    If you want to allow arbitrary directed acyclic graphs (DAGs),
    you have to work a little more. E.g. if you want to allow things
    like:

    A
    ..B
    ...C
    ....D
    ....E
    ..F
    ...C
    ....D
    ....E

    (I know that you said a tree, but I've certainly met clever and
    well educated people before who called DAGs trees...)

    For DAGs, you might traverse the tree to make sure there are
    no cycles, but there are probably better solutions as well. As
    I said, if it's really trees you want, this is not a problem.

    Actually, another risk is that your document describes several
    disjunct structures... Oh well, that's another issue, and you
    can always argue how defensive your code needs to be. It obviously
    depends on how well you trust your input and the implications
    of problems. Anyway, assuming cycle checks, it's easy to find
    roots: They are the keys in the dictionary that don't occur as
    values. If there's just one, you just have one tree.

    To summarize: If you want to describe a tree structure in XML,
    it's probably a good idea to use the natural hierarchy of XML
    tags to do this. That way the file can't convey a non-tree
    structure. If you need to use a construct like you describe
    in your XML, this suggests that you need to convey non-tree
    structures, and then you probably need to handle those as well.
     
    Magnus Lycka, Jun 10, 2005
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Raaijmakers, Vincent (IndSys,GE Interlogix)

    web guru advise needed.

    Raaijmakers, Vincent (IndSys,GE Interlogix), Nov 10, 2003, in forum: Python
    Replies:
    5
    Views:
    364
    Brad Clements
    Nov 11, 2003
  2. Raaijmakers, Vincent (IndSys,GE Interlogix)

    RE: web guru advise needed.

    Raaijmakers, Vincent (IndSys,GE Interlogix), Nov 10, 2003, in forum: Python
    Replies:
    1
    Views:
    344
    Eric Williams
    Nov 13, 2003
  3. iboon.us

    Easiest Programmnig & MS Office

    iboon.us, Feb 7, 2008, in forum: Java
    Replies:
    0
    Views:
    326
    iboon.us
    Feb 7, 2008
  4. Steven W. Orr

    Advise on using logging.getLogger needed.

    Steven W. Orr, Oct 2, 2011, in forum: Python
    Replies:
    1
    Views:
    165
    Vinay Sajip
    Oct 7, 2011
  5. Jack
    Replies:
    2
    Views:
    145
Loading...

Share This Page