Unordered tree to store elements of a parenthesized expression

Discussion in 'Java' started by Kaushik, Jan 4, 2007.

  1. Kaushik

    Kaushik Guest

    I have a string like (A AND B OR (C AND D AND E OR F)) which has to be
    put into a tree. The number of children any node can have is variable.
    For the above expression, I'd like to have a tree which looks like
    this:
    OR
    AND
    A
    B
    OR
    F
    AND
    C
    D
    E

    Could someone tell me how to construct this tree and then traverse this
    tree?
    I'll be traversing this tree to put the contents into a W3C document
    later on.

    Thanks
    Kaushik
     
    Kaushik, Jan 4, 2007
    #1
    1. Advertising

  2. Kaushik

    voorth Guest

    Kaushik wrote:
    > I have a string like (A AND B OR (C AND D AND E OR F)) which has to be
    > put into a tree. The number of children any node can have is variable.
    > For the above expression, I'd like to have a tree which looks like
    > this:
    > OR
    > AND
    > A
    > B
    > OR
    > F
    > AND
    > C
    > D
    > E
    >
    > Could someone tell me how to construct this tree and then traverse this
    > tree?
    > I'll be traversing this tree to put the contents into a W3C document
    > later on.
    >
    > Thanks
    > Kaushik


    A google search on "parse boolean expression java" will get you a _lot_
    of relevant information.

    You might also want to take a look ad Dijkstra's "shunting yard
    algorithm" : http://en.wikipedia.org/wiki/Shunting_yard_algorithm
     
    voorth, Jan 4, 2007
    #2
    1. Advertising

  3. Kaushik wrote:
    > I have a string like (A AND B OR (C AND D AND E OR F)) which has to be
    > put into a tree. The number of children any node can have is variable.
    > For the above expression, I'd like to have a tree which looks like
    > this:
    > OR
    > AND
    > A
    > B
    > OR
    > F
    > AND
    > C
    > D
    > E
    >
    > Could someone tell me how to construct this tree and then traverse this
    > tree?
    > I'll be traversing this tree to put the contents into a W3C document
    > later on.


    I guess you've decided OR takes precedence over AND from the above.

    You need to build a tree so

    public class Node {
    public List<Node> children = new LinkedList<Node>();
    public Node parent;
    public String contents;
    public boolean isOrNode;

    public static Node buildTree (String x) {
    int pos = 0;
    StringBuffer wordBuilder = new StringBuffer();
    String word = null;
    boolean expectLeaf = true;
    Deque<Node> stackOr = new LinkedList<Node>();
    Deque<Node> stackAnd = new LinkedList<Node>();
    Node flag = new Node();
    Node currentOr = flag;
    Node currentAnd = null;
    Node child = null;
    currentOr.isOrNode = true;
    if (!x.startsWith('(')) x = "(" + x + ')';
    while (pos < x.length()) {
    char c = x.charAt(pos);
    ++pos;
    if (!Character.isWhitespace(c)
    && c != '(' && c != ')') {
    wordBuilder.append(c);
    continue;
    }
    if (wordBuilder.length() > 0) {
    word = wordBuilder.toString();
    wordBuilder.delete(0,
    wordBuilder.length() - 1);
    }
    if (word != null) {
    if (word.equals("AND")) {
    if (expectLeaf) throw whatever;
    child = new Node();
    currentAnd.children.add(child);
    expectLeaf = true;
    } else if (word.equals("OR")) {
    if (expectLeaf) throw whatever;
    Node cur = currentAnd;
    if (cur.children.size() == 0)
    throw whatever;
    if (cur.children.size() == 1) {
    currentOr.remove(cur);
    cur=cur.children.get(0);
    currentOr.add(cur);
    }
    currentAnd = new Node();
    currentOr.children.
    add(currentAnd);
    expectLeaf = true;
    } else {
    if (!expectLeaf) throw whatever;
    child.contents = word;
    expectLeaf = false;
    }
    }
    if (c == '(') {
    if (!expectLeaf) throw whatever;
    stackOr.push(currentOr);
    stackAnd.push(currentAnd);
    currentOr = new Node();
    currentAnd = new Node();
    child = new Node();
    currentOr.children.add(currentAnd);
    currentAnd.children.add(child);
    currentOr.isOrNode = true;
    word = null;
    continue;
    }
    if (c == ')') {
    if (expectLeaf) throw whatever;
    if (currentAnd.children.size() == 0)
    throw whatever;
    if (currentAnd.children.size() == 1) {
    currentOr.remove(currentAnd);
    currentOr.add(currentAnd.
    children.get(0));
    }
    if (currentOr.children.size() == 0)
    throw whatever;
    Node cur = currentOr;
    if (currentOr.children.size() == 1)
    cur = currentAnd;
    currentOr = stackOr.pop(); // May throw
    if (currentOr == flag) {
    if (pos != string.length())
    throw whatever;
    return cur;
    }
    currentAnd.children.add(cur);
    currentAnd = stackAnd.pop();
    word = null;
    expectLeaf = true;
    }
    }
    throw whatever; // End of string came early.
    }
    }

    Untested but should probably work with any string of the form you gave,
    producing a tree of Nodes. Only leaf Nodes have non-null contents field,
    while the rest have non-empty children list with at least 2 children and
    isOrNode set for the "OR" nodes. Whitespace is insignificant and only
    the parentheses and the words OR and AND (case-sensitive) are special.
    Any other agglomeration of non-parenthesis non-whitespace is treated as
    a leaf node. The outermost pair of parentheses are optional.

    The string you gave should produce
    OR
    AND OR
    A B AND F
    C D E

    as the tree structure. The loop starts with expectLeaf true and sees the
    initial (, wordBuilder is empty and word is null, and it pushes the node
    "flag" and currentAnd (never used). It then reads A and whitespace and
    word becomes "A", and it changes expectLeaf to false and child's
    contents to "A". Then it reads AND and makes child a new empty Node
    added to currentAnd. It reads B, assigns the child's contents field;
    reads OR, and creates a new currentAnd to add to currentOr. We now have
    the left hand side of the tree. It sees a ( next and pushes currentOr
    and currentAnd -- the former is the tree root above and the latter is
    empty. It creates new ones -- they'll end up the right hand OR and right
    hand AND above. It sees C now, setting the currently-under-construction
    child's content field and unsetting expectLeaf. (Note the pattern:
    expectLeaf goes up, then child gets its contents field filled in and
    expectLeaf goes down, then a new child begins construction and the flag
    goes back up.) Next is AND, so a new child is started. D goes into it;
    new child; E. currentAnd is now the tree's other AND node with three
    children when it sees OR. It creates an empty currentAnd and adds it to
    currentOr. This currentAnd won't last. It sees F, sets child contents,
    and then sees ). In the ) handler it sees that currentAnd has size 1 and
    so drops it from currentOr and adds its sole child directly to
    currentOr. Since currentOr has two children cur ends up currentOr;
    otherwise it would have ended up currentOr's sole child. The original
    currentAnd and currentOr get popped (the top OR and an empty currentAnd)
    and cur (the right hand OR) added to the latter. And it immediately sees
    another ). Again currentAnd has only one child so the top OR ends up
    with the right hand OR as a direct child. As pos was incremented after
    pulling the final ) from x, pos is the string's length and the top OR is
    returned from the method.

    It's trivial to verify that it handles a one-element non-rightmost AND
    correctly as well -- building an OR is only terminated by a ) but
    building an AND is terminated by any OR, so when it sees say (X OR Y)'s
    "OR" it snaps the X to being a direct child of the OR node. It also
    handles (FOO AND BAR) correctly, removing the superfluous OR node it was
    building with the AND as its only child on encountering the ).

    I don't guarantee it totally free of errors; in particular some imports
    are needed (mainly from java.util) and you should throw a new instance
    of some chosen exception class in place of each "whatever" occurrence.
    It will choke on a string with trailing whitespace, unless it didn't
    start with a (, as well. It will handle (X) properly, as the ) handler
    first eats the useless currentAnd node and then the equally useless
    currentOr node. It will choke on empty parenthesis pair (() or ( );
    some whitespace may be between them) too. It also isn't easy to extend,
    or fix if OR taking precedence wasn't your intention (though it gives
    your intended tree for the example string you posted). In the general case:

    X AND Y AND Z OR A AND B AND C OR D AND E AND F

    gets you a single OR node with three AND nodes, the first with X, Y, and
    Z, the next with A, B, and C, and the last with D, E, and F as children.

    The only limit to parenthesis nesting depth is memory and any size
    limitations LinkedList has when used as a stack via the Deque interface.
    Memory consumption is O(nesting depth) + O(string length) and time
    consumption is O(string length) as there's no backtracking for such a
    simple grammar.

    If you need more flexibility, the APIs that come with the JDK include
    some parsing stuff, much of it geared toward HTML and XML but some of it
    more general-purpose. But I personally hate mucking with StringTokenizer...

    Exceptions: besides your own parse exception, it will throw
    NoSuchElementException if there's an unbalanced ) in the input string.
    You might want to transform that into another of your choice of parse
    exception class with an added outer try block.

    Efficiency caveats: empty nodes are sometimes created and discarded.
    This includes the initial "flag" node, used to put in the OR-node deque
    to flag exit from the final parentheses and any one-child nodes that get
    created. At least it recycles the StringBuffer. :)
     
    John Ersatznom, Jan 4, 2007
    #3
  4. Kaushik

    Kaushik Guest

    How would I be able to implement this in version 4 of java?
     
    Kaushik, Jan 8, 2007
    #4
  5. "Kaushik" <> wrote in message
    news:...
    > How would I be able to implement this in version 4 of java?
    >


    Be able to implement what?

    You need to quote whatever message you are replying to, or else people won't
    know ;)

    --
    LTP

    :)
     
    Luc The Perverse, Jan 8, 2007
    #5
  6. Kaushik wrote:
    > How would I be able to implement this in version 4 of java?


    It wasn't that heavily generic. You'll have to change the List<Node>
    references and the like to drop <Node> and cast things taken out of
    collections to (Node); that's probably it, unless lists don't implement
    Deque in older versions...
     
    John Ersatznom, Jan 8, 2007
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ha
    Replies:
    1
    Views:
    863
    arnold m. slotnik
    Jan 28, 2004
  2. yann
    Replies:
    1
    Views:
    396
    C. M. Sperberg-McQueen
    Aug 19, 2005
  3. Stub

    B tree, B+ tree and B* tree

    Stub, Nov 12, 2003, in forum: C Programming
    Replies:
    3
    Views:
    10,131
  4. klikic
    Replies:
    2
    Views:
    2,512
    George Bina
    Jan 15, 2007
  5. candide
    Replies:
    10
    Views:
    511
    Simon Forman
    Jul 19, 2009
Loading...

Share This Page