Java regex can't match a lengthy substring?

Discussion in 'Java' started by hiwa, Jan 29, 2004.

  1. hiwa

    hiwa Guest

    Below is a simple demo program for a problem I encountered during
    development. java.util.regex seems unable to match when the matched
    substring is too long. If you split the long line in the test data
    into four or five shorter tags by inserting "><", the demo program
    runs fine. The test data has no < or > inside the "..." quotations.
    ---demo program---
    Code:
    import java.io.*;
    import java.util.regex.*;

    public class TagMatchTest {
        public static void main(String[] args) throws IOException {
            StringBuilder sb = new StringBuilder();

            // Read the whole file into one string. readLine() drops the line
            // terminators, so the lines are joined without separators.
            try (BufferedReader br = new BufferedReader(new FileReader("test.txt"))) {
                String line;
                while ((line = br.readLine()) != null) {
                    sb.append(line);
                }
            }

            Pattern pat = Pattern.compile("<[^>]*>"); // find tags
            Matcher mat = pat.matcher(sb.toString());
            while (mat.find()) {
                System.out.println(mat.group());
            }
        }
    }
    
    ---test data (test.txt): number of lines is 4, the meta "description"
    line is the long one---
    <meta http-equiv="content-type" content="text/html;
    charset=windows-1252">
    <meta name="keywords" content=" Java, J2EE, Enterprise Java,
    J2ME, Java 2 Micro Edition, perfect">
    <meta name="description" content="Java is like any development
    platform/language combination most developers have a love-hate
    relationship with it. Sure, for Java aficionados it's better than
    using .Net, LAMP, or (add your own particular poison here), but we
    bemoan the complexity of Swing, the bulkiness of the Enterprise
    JavaBeans (EJB) specification, performance, additional overheads
    imposed on skimpy hardware by the Java 2 Platform, Micro Edition
    (J2ME) platform, the 101 different ways to do things, and on and on.
    If we could just address Java's weak points, we might make Java that
    mythical beast?the perfect technology platform...So then, what are
    those changes? Is there such a thing as the perfect technology
    platform, and does Java have the potential to become it? (3,500 words;
    January 2, 2004)">
    <meta name="GOOGLEBOT" content="NOARCHIVE">
    <style type="text/css">
    ---end test data-------
    hiwa, Jan 29, 2004
    #1
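
One way to narrow this down is to take the file out of the picture and run the same pattern against a long tag built in memory. The sketch below is only an illustration: the class name LongTagCheck, its two helper methods, and the generated "word " filler are made up for this check and are not part of the demo program above. The pattern is the one from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for isolating the regex from the file I/O.
final class LongTagCheck {
    // Same pattern as in the demo program.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Builds one <meta ...> tag whose content attribute holds `words` copies of "word ".
    static String buildLongTag(int words) {
        StringBuilder sb = new StringBuilder("<meta name=\"description\" content=\"");
        for (int i = 0; i < words; i++) {
            sb.append("word ");
        }
        return sb.append("\">").toString();
    }

    // Returns every substring of `text` matched by the tag pattern, in order.
    static List<String> findTags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        // 20,000 words gives a tag of roughly 100,000 characters.
        String longTag = buildLongTag(20_000);
        String input = "<meta name=\"GOOGLEBOT\" content=\"NOARCHIVE\">"
                + longTag
                + "<style type=\"text/css\">";

        List<String> tags = findTags(input);
        System.out.println("tags found: " + tags.size());
        System.out.println("long tag matched in full: " + tags.contains(longTag));
    }
}
```

If this reports three tags and a full match for the long one, the pattern itself copes with long substrings, and the next things to check would be what actually ends up in the string read from test.txt (for example its length and whether it contains the expected < and > characters) and how the output is being viewed. That is a suggestion for narrowing the problem down, not a diagnosis of the original failure.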