Validate opening and closing of HTML tags

Discussion in 'Java' started by Pradeep, Apr 17, 2006.

  1. Pradeep

    Pradeep Guest

    Hi,

    Can anyone help me solve this problem?
    I have an example input:
    sometext<b><i>some text</i></b>
    The input may vary, e.g. a tag may be opened but not closed, or tags may be mismatched.

    To do:
    1. Check for a few HTML tags such as b, i and u.
    2. Tags must be opened and closed in the proper order, without
    overlapping.

    I have to write Java code to validate this.
    Can anyone help me?

    Thanks in Advance..

    Regards,
    Pradeep.
     
    Pradeep, Apr 17, 2006
    #1

  2. Pradeep

    Mark Thomas Guest

    Pradeep wrote:
    > Hi,
    >
    > Can anyone help me in solving this problem.
    > I have an example input:
    > sometext<b><i>some text</i></b>
    > the input may vary i.e. 1 tag is opened & not closed, some mismatches
    >
    > To do:
    > 1.check for few html tags like b,i,u
    > 2.opening and closing of tags must be in proper order without
    > overlapping.
    >
    > I have to write a java code to validate this.
    > Can anyone help me..
    >
    > Thanks in Advance..
    >
    > Regards,
    > Pradeep.
    >

    I'd use a finite state machine - googling that might get you started.

    Mark
     
    Mark Thomas, Apr 17, 2006
    #2

  3. Mark Thomas wrote:
    > Pradeep wrote:
    >> Hi,
    >>
    >> Can anyone help me in solving this problem.
    >> I have an example input:
    >> sometext<b><i>some text</i></b>
    >> the input may vary i.e. 1 tag is opened & not closed, some mismatches
    >>
    >> To do:
    >> 1.check for few html tags like b,i,u
    >> 2.opening and closing of tags must be in proper order without
    >> overlapping.
    >>
    >> I have to write a java code to validate this.
    >> Can anyone help me..
    >>
    >> Thanks in Advance..
    >>
    >> Regards,
    >> Pradeep.
    >>

    > I'd use a finite state machine - googling that might get you started.
    >

    Or a stack.

    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
     
    Martin Gregorie, Apr 17, 2006
    #3
  4. Pradeep

    Venkatesh Guest

    You can just use a stack and the Java pattern-matching package
    (java.util.regex).

    Here is code to find the tags in a given HTML string:

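    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // The class name is only illustrative; the members are from the original post.
    public class HtmlTagFinder {

        // Matches anything from '<' up to the next '>', e.g. "<b>" or "</i>".
        private static final String HTML_TAG_PATTERN = "<[^>]*>";
        private static final Pattern searchPattern =
                Pattern.compile(HTML_TAG_PATTERN);

        private Matcher m = null;
        private String m_htmlStr = null;

        private boolean m_initDone = false;

        // Must be called before getNextTag().
        public void init(String htmlStr) {
            m_htmlStr = htmlStr;
            m = searchPattern.matcher(m_htmlStr);
            m_initDone = true;
        }

        // Returns the next tag in the input, or null when no tags remain.
        private String getNextTag() {
            if (!m_initDone) {
                throw new IllegalStateException("Not yet initialized ....");
            }
            String tagToReturn = null;
            if (m.find()) {
                tagToReturn = m_htmlStr.substring(m.start(), m.end());
            }
            return tagToReturn;
        }
    }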

    So, use a stack: push each start tag, and whenever you find an end
    tag, pop the top of the stack and check that the popped start tag
    matches the end tag.
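The stack-plus-regex approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: the class name `TagValidator` and the method `isValid` are made up for the example, only the tags b, i and u are recognised (as in the original question), and tags with attributes or extra whitespace are not handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread: validates nesting of <b>, <i>, <u> only.
final class TagValidator {

    // Group 1 is "/" for an end tag (empty for a start tag); group 2 is the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([biu])>", Pattern.CASE_INSENSITIVE);

    static boolean isValid(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (m.group(1).isEmpty()) {
                open.push(name);   // start tag: remember it
            } else if (open.isEmpty() || !open.pop().equals(name)) {
                return false;      // end tag does not match the most recent start tag
            }
        }
        return open.isEmpty();     // any leftover start tag was never closed
    }
}
```

With this sketch, `TagValidator.isValid("sometext<b><i>some text</i></b>")` returns true, while overlapping input such as `<b><i>x</b></i>` returns false.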

    Hope this helps

    -Venkatesh
     
    Venkatesh, Apr 17, 2006
    #4
  5. [posted and mailed]

    "Pradeep" <> wrote in
    news::

    > To do:
    > 1.check for few html tags like b,i,u
    > 2.opening and closing of tags must be in proper order without
    > overlapping.
    >
    > I have to write a java code to validate this.
    > Can anyone help me..


    Use a stack data structure.

    Scan through the text looking for HTML tags.

    When you encounter a start tag, push it on the stack.

    When you encounter an end tag, pop the top element from the stack and
    compare it to the end tag.
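The steps above can also be written without regular expressions, and made to report what went wrong instead of returning a plain yes/no. The following is a minimal sketch, assuming that anything between `<` and `>` counts as a tag (so attributes, comments and void elements such as `<br>` are not treated specially); the names `TagChecker` and `firstProblem` are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not from the thread.
final class TagChecker {

    // Returns null if all tags nest properly, otherwise a message for the first problem found.
    static String firstProblem(String text) {
        Deque<String> stack = new ArrayDeque<>();
        int pos = 0;
        // Scan through the text looking for '<' ... '>' pairs.
        while ((pos = text.indexOf('<', pos)) >= 0) {
            int end = text.indexOf('>', pos);
            if (end < 0) {
                return "unterminated tag at offset " + pos;
            }
            String tag = text.substring(pos + 1, end);
            if (tag.startsWith("/")) {
                // End tag: pop the top element and compare it to the end tag.
                String name = tag.substring(1);
                if (stack.isEmpty()) {
                    return "</" + name + "> has no start tag";
                }
                String top = stack.pop();
                if (!top.equals(name)) {
                    return "expected </" + top + "> but found </" + name + ">";
                }
            } else {
                // Start tag: push it on the stack.
                stack.push(tag);
            }
            pos = end + 1;
        }
        return stack.isEmpty() ? null : "<" + stack.peek() + "> is never closed";
    }
}
```

For the sample input `sometext<b><i>some text</i></b>` this returns null, and for `<b><i>x</b></i>` it returns the message `expected </i> but found </b>`.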

    Cheers
    GRB


    --
    ---------------------------------------------------------------------
    Greg R. Broderick [rot13]

    A. Top posters.
    Q. What is the most annoying thing on Usenet?
    ---------------------------------------------------------------------
     
    Greg R. Broderick, Apr 17, 2006
    #5
  6. Pradeep

    Oliver Wong Guest

    "Pradeep" <> wrote in message
    news:...
    > Hi,
    >
    > Can anyone help me in solving this problem.
    > I have an example input:
    > sometext<b><i>some text</i></b>
    > the input may vary i.e. 1 tag is opened & not closed, some mismatches
    >
    > To do:
    > 1.check for few html tags like b,i,u
    > 2.opening and closing of tags must be in proper order without
    > overlapping.
    >
    > I have to write a java code to validate this.
    > Can anyone help me..


    You might be interested in HTML Tidy:
    http://www.w3.org/People/Raggett/tidy/

    - Oliver
     
    Oliver Wong, Apr 17, 2006
    #6
  7. Oliver Wong wrote:
    >
    > "Pradeep" <> wrote in message
    > news:...
    >> Hi,
    >>
    >> Can anyone help me in solving this problem.
    >> I have an example input:
    >> sometext<b><i>some text</i></b>
    >> the input may vary i.e. 1 tag is opened & not closed, some mismatches
    >>
    >> To do:
    >> 1.check for few html tags like b,i,u
    >> 2.opening and closing of tags must be in proper order without
    >> overlapping.
    >>
    >> I have to write a java code to validate this.
    >> Can anyone help me..

    >
    > You might be interested in HTML Tidy:
    > http://www.w3.org/People/Raggett/tidy/
    >

    Agreed. If you're writing HTML you should not be without it. However, I
    think you'll find the latest versions are here:

    http://tidy.sourceforge.net/

    The new C version is worth having and there's a Java version too.


    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
     
    Martin Gregorie, Apr 17, 2006
    #7
  8. Pradeep

    Tim Smith Guest

    In article <44437868$0$2526$>,
    Mark Thomas <anon> wrote:
    > > 2.opening and closing of tags must be in proper order without
    > > overlapping.

    ....
    > >

    > I'd use a finite state machine - googling that might get you started.


    Wait a second...isn't checking for closing tags being in the right order
    and for tags not overlapping equivalent to the problem of recognizing
    palindromes? And isn't that one of the classic examples of something
    that you can't do with a finite state machine?

    --
    --Tim Smith
     
    Tim Smith, Apr 18, 2006
    #8
  9. Pradeep

    Oliver Wong Guest

    "Tim Smith" <> wrote in message
    news:...
    > In article <44437868$0$2526$>,
    > Mark Thomas <anon> wrote:
    >> > 2.opening and closing of tags must be in proper order without
    >> > overlapping.

    > ...
    >> >

    >> I'd use a finite state machine - googling that might get you started.

    >
    > Wait a second...isn't checking for closing tags being in the right order
    > and for tags not overlapping equivalent to the problem of recognizing
    > palindromes? And isn't that one of the classic examples of something
    > that you can't do with a finite state machine?


    You're right. It can't be done with a finite state machine. You'd need
    an infinite state machine (or a stack machine, or something equally
    powerful, etc.)
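To make the point about machine power a little more concrete: a finite state machine has a fixed number of states, so it can only track nesting up to some fixed depth, whereas tags may nest arbitrarily deep. With a single tag type, a depth counter is already enough extra memory; with several tag types (b, i, u) the counter has to become a stack so that the order of the open tags is remembered. Below is a small sketch of the single-tag case only, with the hypothetical names `DepthCounter` and `isBalanced`, assuming the only tag of interest is `<b>`.

```java
// Hypothetical helper, not from the thread: checks nesting of <b>...</b> with a counter.
final class DepthCounter {

    static boolean isBalanced(String text) {
        int depth = 0;   // number of <b> tags currently open
        int pos = 0;
        while (pos < text.length()) {
            if (text.startsWith("<b>", pos)) {
                depth++;
                pos += 3;
            } else if (text.startsWith("</b>", pos)) {
                if (--depth < 0) {
                    return false;   // more end tags than start tags so far
                }
                pos += 4;
            } else {
                pos++;              // ordinary character, skip it
            }
        }
        return depth == 0;          // every start tag must have been closed
    }
}
```

The counter cannot tell `<b><i></b></i>` from `<b><i></i></b>`, which is why the multi-tag problem in this thread needs the stack suggested in the earlier replies.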

    - Oliver
     
    Oliver Wong, Apr 18, 2006
    #9
