Help with regular expression

Grost

Hi all,

I'm writing an application to perform some HTML text manipulation from
templates and I have a regex formulation problem. For example, in the
template I have a line:

<tr><td class="caption"><!--@caption--><!--<br />(@caption)--></td></tr>

where the parts I want to replace are HTML comments: <!-- ??? -->
There are two styles of comment I want to search/replace:
1) <!--@caption-->
2) <!--XXX(@caption)YYY-->, where XXX and YYY can represent other HTML

Case 1 is easy, and I just use: <!--\s*?@caption\s*?-->
Case 2 is the problem. I'm trying to use this for conditional insertion
of additional HTML, depending on whether @caption exists in the
application. If I have a value for @caption, then the following is
produced from the above example:

<tr><td class="caption">foo<br />foo</td></tr>

This seems easy enough in principle, but every regex pattern I've tried
unsurprisingly matches the <!-- from the first comment. My initial try
which of course failed was: <!--(.*?)\(@caption\)(.*?)-->

What I need is a way of saying:
Match "(@caption)" within an HTML comment, and capture the text on
either side of the tag and within the comment, but make sure there are no
other comment-like tags within that text. I'm guessing I need something
along the lines of the lookaround operators, but I have little
experience with them. Any help anyone...?

(For clarity I removed the extra escaping required for Java inline strings.)

Stan
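
To make the failure described above concrete, here is a minimal sketch that runs the initial pattern against the sample line, with the Java string escaping put back. The class and method names (CaptionMatchDemo, firstMatch) are made up for illustration and are not from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptionMatchDemo {
    // Sample template line from the post.
    static final String TEMPLATE =
        "<tr><td class=\"caption\"><!--@caption--><!--<br />(@caption)--></td></tr>";

    // The initial attempt from the post, escaped for a Java string literal.
    static final Pattern NAIVE =
        Pattern.compile("<!--(.*?)\\(@caption\\)(.*?)-->");

    // Returns the text of the first match, or null if nothing matches.
    static String firstMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // prints: <!--@caption--><!--<br />(@caption)-->
        System.out.println(firstMatch(NAIVE, TEMPLATE));
    }
}
```

The match starts at the first <!-- because the engine takes the leftmost position where a match is possible, and the reluctant (.*?) is still free to run across the first --> until it reaches "(@caption)". Making the quantifier lazy does not stop it from crossing a comment boundary.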
 
hiwa

I think your description does not formalize the requirement well
enough.
Here's a rough stab in the dark. HTH.
------------------------------------------------------------------
public class Grost {

    public static void main(String[] args) {
        // Sample template line from the original post.
        String text =
            "<tr><td class=\"caption\"><!--@caption--><!--<br />(@caption)--></td></tr>";
        String result = "<tr><td class=\"caption\">foo<br />foo</td></tr>";
        // A comment that opens with an HTML tag; group 1 captures that tag.
        String regex1 = "<!--(<[^>]+>).*-->";
        // Any remaining comment (greedy: runs to the last "-->" on the line).
        String regex2 = "<!--.*-->";

        text = text.replaceAll(regex1, "foo$1foo");
        text = text.replaceAll(regex2, "");

        if (result.equals(text)) {
            System.out.println("success");
        }
    }
}
 
Grost


I figured that formalisation may be a problem, and that's quite likely
to be the aspect for which I need the most help. Essentially I want to
allow arbitrary text (incl. HTML) either side of a caption tag:
<!--XXX(@caption)YYY-->
with the only restriction being that the text CANNOT contain an HTML comment.
XXX cannot contain <!--.*-->
YYY cannot contain <!--.*-->

In regex terms, if I use my non-working version:
<!--(.*?)\(@caption\)(.*?)-->
then neither the $1 nor the $2 capturing group in this match should contain any
HTML comments.

Any clearer?

Stan
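
One possible way to express the restriction above is a negative lookahead applied at every character, so that neither group can step over a <!-- or a -->. This is only a sketch under assumptions, not a tested solution from the thread: the names CaptionTemplate and render are hypothetical, and it assumes that both comments are simply removed when there is no value for @caption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptionTemplate {
    // Sample template line from the post.
    static final String TEMPLATE =
        "<tr><td class=\"caption\"><!--@caption--><!--<br />(@caption)--></td></tr>";

    // Case 2: each '.' is consumed only if no "<!--" or "-->" starts there,
    // so groups 1 and 2 cannot contain a comment delimiter.
    // DOTALL lets a comment span several lines.
    static final Pattern CONDITIONAL = Pattern.compile(
        "<!--((?:(?!<!--|-->).)*?)\\(@caption\\)((?:(?!<!--|-->).)*?)-->",
        Pattern.DOTALL);

    // Case 1: the plain placeholder comment.
    static final Pattern SIMPLE = Pattern.compile("<!--\\s*@caption\\s*-->");

    // value == null means "no caption" (assumption: both comments are dropped).
    static String render(String template, String value) {
        // Quote the value so '$' and '\' in it are not read as group references.
        String v = (value == null) ? null : Matcher.quoteReplacement(value);
        String out = CONDITIONAL.matcher(template)
                                .replaceAll(v == null ? "" : "$1" + v + "$2");
        return SIMPLE.matcher(out).replaceAll(v == null ? "" : v);
    }

    public static void main(String[] args) {
        // prints: <tr><td class="caption">foo<br />foo</td></tr>
        System.out.println(render(TEMPLATE, "foo"));
        // prints: <tr><td class="caption"></td></tr>
        System.out.println(render(TEMPLATE, null));
    }
}
```

With this pattern the attempt starting at the first <!-- fails, because group 1 is not allowed to cross the --> that closes <!--@caption-->, so the engine moves on and matches only the second comment.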
 
