What is the better/more performant way to compare org.w3c.dom.DocumentFragment?

Discussion in 'Java' started by Mausam, Jan 17, 2012.

  1. Mausam

    Mausam Guest

    I have a Java class which contains a DocumentFragment.

    In the equals method of my class, I am converting the DocumentFragment to a String and calling equals on the resulting Strings.

    I know this is not the best way because, for example, attributes can appear in a different order within an Element of the DocumentFragment, or two documents can differ only in the sequence of unordered elements.

    In such cases this equality check will fail.

    Please suggest a better approach.
     
    Mausam, Jan 17, 2012
    #1

  2. Mausam

    Jeff Higgins Guest

    On 01/17/2012 10:03 AM, Mausam wrote:
    > I have a java class, whose contains a DocumentFragment.
    >
    > In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
    >
    > I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
    >
    > So in such cases this equality will fail.
    >
    > Please suggest a better approach.

    A my class is equal to another my class if and only if ...
     
    Jeff Higgins, Jan 17, 2012
    #2

  3. Mausam

    Arne Vajhøj Guest

    On 1/17/2012 10:03 AM, Mausam wrote:
    > I have a java class, whose contains a DocumentFragment.
    >
    > In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
    >
    > I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
    >
    > So in such cases this equality will fail.


    I think XML Canonicalization will solve the problem.

    It comes at a cost though.

    Arne
     
    Arne Vajhøj, Jan 17, 2012
    #3
  4. Mausam

    Mausam Guest

    On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
    > On 01/17/2012 10:03 AM, Mausam wrote:
    > > I have a java class, whose contains a DocumentFragment.
    > >
    > > In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
    > >
    > > I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
    > >
    > > So in such cases this equality will fail.
    > >
    > > Please suggest a better approach.

    > A my class is equal to another my class if and only if ...


    Thanks Jeff, I understand what you mean.

    BTW, I was checking the API documentation at http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#isEqualNode(org.w3c.dom.Node), which says:

    The attributes NamedNodeMaps are equal. This is: they are both null, or they have the same length and for each node that exists in one map there is a node that exists in the other map and is equal, although not necessarily at the same index.


    The childNodes NodeLists are equal. This is: they are both null, or they have the same length and contain equal nodes at the same index. Note that normalization can affect equality; to avoid this, nodes should be normalized before being compared.

    Here, for attributes, they take care of the "not necessarily at the same index" case, but for childNodes it is not taken care of. So if there is a sequence of unordered elements (<emp/><dept/> and <dept/><emp/>), they will be treated as NOT equal.

    So I could iterate through each node and attribute and do the comparison myself. That's the fallback. But before that, I wanted to check with the experts whether there are better options.
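
    The difference between the two rules can be checked directly. The following is a minimal sketch, assuming the XML is available as strings and is parsed with the JDK's default DocumentBuilder; the class name IsEqualNodeDemo and the helper domEqual are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
final class IsEqualNodeDemo {

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Compares two XML strings with Node.isEqualNode after normalizing both trees. */
    static boolean domEqual(String xmlA, String xmlB) {
        Element a = parse(xmlA);
        Element b = parse(xmlB);
        a.normalize(); // the Javadoc recommends normalizing before comparing
        b.normalize();
        return a.isEqualNode(b);
    }

    public static void main(String[] args) {
        // Only the attribute order differs: prints true.
        System.out.println(domEqual("<a><b c='1' d='2'/></a>", "<a><b d='2' c='1'/></a>"));
        // Only the child order differs: prints false.
        System.out.println(domEqual("<r><emp/><dept/></r>", "<r><dept/><emp/></r>"));
    }
}
```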
     
    Mausam, Jan 18, 2012
    #4
  5. Mausam

    Jeff Higgins Guest

    On 01/17/2012 08:56 PM, Mausam wrote:
    > On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
    >> On 01/17/2012 10:03 AM, Mausam wrote:
    >>> I have a java class, whose contains a DocumentFragment.
    >>>
    >>> In the equals method of my class, I am converting the DocumentFragment to a String and comparing an equals on the String.
    >>>
    >>> I know this is not the best way, because "attributes" e.g can change order in Element of DocumentFragment, or e.g documents differ only in the sequence of unordered elements.
    >>>
    >>> So in such cases this equality will fail.
    >>>
    >>> Please suggest a better approach.

    >> A my class is equal to another my class if and only if ...

    >
    > Thanks Jeff, I understand what you mean.
    >
    > BTW, I was checking the API http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#isEqualNode(org.w3c.dom.Node)
    >
    > The attributes NamedNodeMaps are equal. This is: they are both null, or they have the same length and for each node that exists in one map there is a node that exists in the other map and is equal, although not necessarily at the same index.
    >
    >
    > The childNodes NodeLists are equal. This is: they are both null, or they have the same length and contain equal nodes at the same index. Note that normalization can affect equality; to avoid this, nodes should be normalized before being compared.
    >
    > Here for attributes, they take care of "NOT necessarily at the same index" but in case of childNodes its not being taken care of. So if there is a sequence of unordered elements (<emp/><dept/> and<dept/><emp/> ) they will be treated as NOT equal.
    >
    > So either I iterate through each node and attribute and do a comparison. That's the fall back. But before that, I wanted to check the experts if there are better options.


    Yep. I based my hair trigger response upon the .equals(Object) of the
    "known implementing classes" of Node. Sorry. I'll be interested in
    finding out the "cost" associated with Arne Vajhøj's response.
     
    Jeff Higgins, Jan 18, 2012
    #5
  6. Mausam

    Arne Vajhøj Guest

    On 1/17/2012 6:38 PM, Arne Vajhøj wrote:
    > On 1/17/2012 10:03 AM, Mausam wrote:
    >> I have a java class, whose contains a DocumentFragment.
    >>
    >> In the equals method of my class, I am converting the DocumentFragment
    >> to a String and comparing an equals on the String.
    >>
    >> I know this is not the best way, because "attributes" e.g can change
    >> order in Element of DocumentFragment, or e.g documents differ only in
    >> the sequence of unordered elements.
    >>
    >> So in such cases this equality will fail.

    >
    > I think XML Canonicalization will solve the problem.
    >
    > It comes as a cost though.


    Example:

    import java.io.IOException;
    import java.io.UnsupportedEncodingException;

    import javax.xml.parsers.ParserConfigurationException;

    import org.apache.xml.security.Init;
    import org.apache.xml.security.c14n.CanonicalizationException;
    import org.apache.xml.security.c14n.Canonicalizer;
    import org.apache.xml.security.c14n.InvalidCanonicalizerException;
    import org.xml.sax.SAXException;

    public class XmlComp {
        static {
            // The Apache XML Security library has to be initialized before use.
            Init.init();
        }

        // Returns the canonical form (C14N, comments omitted) of an XML string.
        private static String canonicalize(String s) throws
                InvalidCanonicalizerException, UnsupportedEncodingException,
                CanonicalizationException, ParserConfigurationException,
                IOException, SAXException {
            Canonicalizer c14n =
                    Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_OMIT_COMMENTS);
            String res = new String(
                    c14n.canonicalize(s.getBytes(Canonicalizer.ENCODING)),
                    Canonicalizer.ENCODING);
            return res;
        }

        public static void main(String[] args) throws Exception {
            // Same content; only the attribute order differs.
            String s1 = "<a><b c='1' d='2'/></a>";
            String s2 = "<a><b d='2' c='1'/></a>";
            System.out.println(s1);
            System.out.println(s2);
            // After canonicalization both strings are identical.
            System.out.println(canonicalize(s1));
            System.out.println(canonicalize(s2));
        }
    }

    outputs:

    <a><b c='1' d='2'/></a>
    <a><b d='2' c='1'/></a>
    <a><b c="1" d="2"></b></a>
    <a><b c="1" d="2"></b></a>

    Arne
     
    Arne Vajhøj, Jan 18, 2012
    #6
  7. Mausam

    Arne Vajhøj Guest

    On 1/17/2012 9:33 PM, Jeff Higgins wrote:
    > On 01/17/2012 08:56 PM, Mausam wrote:
    >> On Tuesday, 17 January 2012 23:02:29 UTC+5:30, Jeff Higgins wrote:
    >>> On 01/17/2012 10:03 AM, Mausam wrote:
    >>>> I have a java class, whose contains a DocumentFragment.
    >>>>
    >>>> In the equals method of my class, I am converting the
    >>>> DocumentFragment to a String and comparing an equals on the String.
    >>>>
    >>>> I know this is not the best way, because "attributes" e.g can change
    >>>> order in Element of DocumentFragment, or e.g documents differ only
    >>>> in the sequence of unordered elements.
    >>>>
    >>>> So in such cases this equality will fail.
    >>>>
    >>>> Please suggest a better approach.
    >>> A my class is equal to another my class if and only if ...

    >>
    >> Thanks Jeff, I understand what you mean.
    >>
    >> BTW, I was checking the API
    >> http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#isEqualNode(org.w3c.dom.Node)
    >>
    >>
    >> The attributes NamedNodeMaps are equal. This is: they are both null,
    >> or they have the same length and for each node that exists in one map
    >> there is a node that exists in the other map and is equal, although
    >> not necessarily at the same index.
    >>
    >>
    >> The childNodes NodeLists are equal. This is: they are both null, or
    >> they have the same length and contain equal nodes at the same index.
    >> Note that normalization can affect equality; to avoid this, nodes
    >> should be normalized before being compared.
    >>
    >> Here for attributes, they take care of "NOT necessarily at the same
    >> index" but in case of childNodes its not being taken care of. So if
    >> there is a sequence of unordered elements (<emp/><dept/>
    >> and<dept/><emp/> ) they will be treated as NOT equal.
    >>
    >> So either I iterate through each node and attribute and do a
    >> comparison. That's the fall back. But before that, I wanted to check
    >> the experts if there are better options.

    >
    > Yep. I based my hair trigger response upon the .equals(Object) of the
    > "known implementing classes" of Node. Sorry. I'll be interested in
    > finding out the "cost" associated with Arne Vajhøj's response.


    The cost is CPU time. It costs a bit of CPU time to parse,
    reorganize, and serialize again.

    Arne
     
    Arne Vajhøj, Jan 18, 2012
    #7
  8. Mausam

    Mausam Guest

    Thanks Arne,

    I can achieve that using the Node.isEqualNode(Node) API, which is available since JDK 1.5.
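
    For the attribute-order case, an equals method built on isEqualNode might look like the following sketch. FragmentHolder is a hypothetical stand-in for the class from the original question, and the helpers newDocument and sample exist only to build test data. Hashing the text content is an assumption chosen to keep hashCode consistent with equals: nodes that are equal under isEqualNode have the same children in the same order, so they also have the same text content.

```java
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;

// Hypothetical wrapper class standing in for "my class" from the question.
final class FragmentHolder {
    private final DocumentFragment fragment;

    FragmentHolder(DocumentFragment fragment) {
        // Keep a normalized deep copy so later changes by the caller cannot affect equality.
        this.fragment = (DocumentFragment) fragment.cloneNode(true);
        this.fragment.normalize();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof FragmentHolder)) return false;
        // Ignores attribute order, but child order still matters.
        return fragment.isEqualNode(((FragmentHolder) other).fragment);
    }

    @Override
    public int hashCode() {
        // Text content does not depend on attribute order, so equal holders hash alike.
        return Objects.hashCode(fragment.getTextContent());
    }

    /** Creates an empty document to act as a node factory. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a one-element fragment, setting the two attributes in either order. */
    static DocumentFragment sample(Document doc, boolean cFirst) {
        Element b = doc.createElement("b");
        if (cFirst) {
            b.setAttribute("c", "1");
            b.setAttribute("d", "2");
        } else {
            b.setAttribute("d", "2");
            b.setAttribute("c", "1");
        }
        b.setTextContent("text");
        DocumentFragment fragment = doc.createDocumentFragment();
        fragment.appendChild(b);
        return fragment;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        FragmentHolder x = new FragmentHolder(sample(doc, true));
        FragmentHolder y = new FragmentHolder(sample(doc, false));
        System.out.println(x.equals(y));                  // true
        System.out.println(x.hashCode() == y.hashCode()); // true
    }
}
```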

    I am worried about the following use cases (and wondering whether they are even valid use cases):

    1)
    Are these two nodes equal? Note that one has an empty street element and the other has no street element at all. Either way the value for street is empty, so if Employee were treated as a Java object, both would be considered equal.
    <Employee company="example" xmlns="http://example.com" debug="true">
      <Employeename>mausam</Employeename>
      <email>a@example.com</email>
      <street/>
    </Employee>

    <Employee debug="true" company="example" xmlns="http://example.com">
      <Employeename>mausam</Employeename>
      <email>a@example.com</email>
    </Employee>

    2)
    Check the position of the street element. In node 1 it comes after email, and in node 2 it comes before.
    <Employee company="example" xmlns="http://example.com" debug="true">
      <Employeename>mausam</Employeename>
      <email>a@example.com</email>
      <street>Marienplatz</street>
    </Employee>

    <Employee debug="true" company="example" xmlns="http://example.com">
      <Employeename>mausam</Employeename>
      <street>Marienplatz</street>
      <email>a@example.com</email>
    </Employee>

    --

    Please note that I cannot create Java objects from the XML, as these are free-form XML fragments and do not comply with a schema. But thanks a lot for your effort and the code example.
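
    Whether either pair counts as equal depends on the equality rule chosen for the class. The sketch below assumes one possible rule: sibling order is irrelevant, whitespace-only text is ignored, and an element with no attributes and no content is treated as absent. Under that assumption both use cases compare as equal. The class UnorderedXmlCompare and its methods are hypothetical and not part of any library, and the sample data in main is simplified.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: compares DOM trees while ignoring sibling order.
final class UnorderedXmlCompare {

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Length-prefixes a string so that concatenated parts cannot run together. */
    private static String enc(String s) {
        String v = (s == null) ? "" : s;
        return v.length() + ":" + v;
    }

    /** Namespace URI plus local name; falls back to the node name for DOM Level 1 nodes. */
    private static String name(Node n) {
        String local = (n.getLocalName() != null) ? n.getLocalName() : n.getNodeName();
        return enc(n.getNamespaceURI()) + enc(local);
    }

    /** Builds an order-independent signature; "" means the node counts as absent. */
    static String signature(Node node) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            String text = node.getNodeValue().trim();
            return text.isEmpty() ? "" : "T" + enc(text);
        }
        boolean isElement = (type == Node.ELEMENT_NODE);
        if (!isElement && type != Node.DOCUMENT_FRAGMENT_NODE) {
            return ""; // comments, processing instructions, etc. are ignored
        }
        List<String> attrs = new ArrayList<>();
        if (isElement) {
            NamedNodeMap map = node.getAttributes();
            for (int i = 0; i < map.getLength(); i++) {
                Node a = map.item(i);
                String qName = a.getNodeName();
                // Namespace declarations are covered by the element's namespace URI.
                if (qName.equals("xmlns") || qName.startsWith("xmlns:")) continue;
                attrs.add("A" + name(a) + enc(a.getNodeValue()));
            }
        }
        List<String> children = new ArrayList<>();
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            String s = signature(c);
            if (!s.isEmpty()) children.add(s); // drops empty elements and blank text
        }
        if (attrs.isEmpty() && children.isEmpty()) return ""; // empty counts as absent
        Collections.sort(attrs);    // attribute order is irrelevant
        Collections.sort(children); // assumption: sibling order is irrelevant too
        String label = isElement ? "E" + name(node) : "F";
        return label + "{" + String.join(",", attrs) + "|" + String.join(",", children) + "}";
    }

    /** Compares normalized copies, so the arguments themselves are not modified. */
    static boolean equivalent(Node a, Node b) {
        Node copyA = a.cloneNode(true);
        Node copyB = b.cloneNode(true);
        copyA.normalize();
        copyB.normalize();
        return signature(copyA).equals(signature(copyB));
    }

    public static void main(String[] args) {
        String open1 = "<Employee company='example' xmlns='http://example.com' debug='true'>";
        String open2 = "<Employee debug='true' company='example' xmlns='http://example.com'>";
        String nm = "<Employeename>mausam</Employeename>";
        String st = "<street>Marienplatz</street>";
        String end = "</Employee>";
        // Use case 1: empty street element versus no street element.
        System.out.println(equivalent(parse(open1 + nm + "<street/>" + end), parse(open2 + nm + end)));
        // Use case 2: the same children in a different order.
        System.out.println(equivalent(parse(open1 + nm + st + end), parse(open2 + st + nm + end)));
        // Different content is still reported as different.
        System.out.println(equivalent(parse(open1 + nm + st + end), parse(open2 + nm + end)));
    }
}
```

    With this rule main prints true, true, false. The same signature string could also back hashCode, which keeps it consistent with an equals method built on equivalent.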
     
    Mausam, Jan 18, 2012
    #8