regular expression question

Guest

I posted this in the c# newsgroup by mistake initially...

I am a newbie to regular expressions and want to extract a number from the
end of a string within an HTML document. The string would have these formats:

image/4567
image/45678
image/456789

I would also like to extract the name, if possible, from a string like this:
"image/4567">name</a>

Thanks.
 
Guest

JP,

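I think the regex pattern that you want is something like this:
image/(?<image>\d+)">(?<name>.*?)</a>

Here \d+ captures the digits after image/ into the group named "image", and .*? captures the link text up to the first closing </a> into the group named "name".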

A really nice free regex editor is Expresso. It comes with a lot of good
examples to help you get started.
http://www.ultrapico.com/Expresso.htm

Also, here's some sample code that might help.

Jason Vermillion

aspx tags....

Search in:<br />
<asp:TextBox ID="txtSearchIn" runat="server" Height="144px"
TextMode="MultiLine" Width="606px">asdf p0jasdf
asd image/4567">nameA</a> asdf
asdfas as asdimage/45678">name2</a> asdf
09823 vasd aimage/456789">n3ame</a> asdf
image/4567">name44</a> asdfasdfasd</asp:TextBox><br />
<br />
<asp:Button ID="cmdSearch" runat="server" OnClick="Button1_Click"
Text="Search" /><br />
<br />
Matches:<br />
<br />
<asp:ListBox ID="lstMatches" runat="server" Width="614px">
</asp:ListBox>



// Code-behind; needs "using System.Text.RegularExpressions;" at the top of the file.
protected void Button1_Click(object sender, EventArgs e)
{
    // \d+ needs at least one digit; .*? is lazy, so the name stops at the first </a>.
    string pat = @"image/(?<image>\d+)"">(?<name>.*?)</a>";

    this.lstMatches.Items.Clear();
    string input = this.txtSearchIn.Text;

    Regex regex = new Regex(pat, RegexOptions.IgnoreCase | RegexOptions.Compiled);
    MatchCollection mcl = regex.Matches(input);

    foreach (Match m in mcl)
    {
        // Each match has 3 groups: one for the entire match, one for the
        // image number, and one for the name.
        // Just peel off the two named groups.
        lstMatches.Items.Add("image #: " + m.Groups["image"].Value +
            " name: " + m.Groups["name"].Value);
        /*
        // Use this if you want to see all of the groups.
        foreach (Group g in m.Groups)
        {
            lstMatches.Items.Add(m.Value + " " + g.Index.ToString() + ": " + g.Value);
        }
        */
    }
}
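
If you want to try the same idea outside of a page, here is a minimal console-style sketch. The class and method names (ImageLinkParser, Extract) are made up for illustration and are not part of the page code above. It also assumes the name never contains a "<", which lets [^<]* stand in for the name so that two links on the same line come back as two separate matches.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class ImageLinkParser
{
    // \d+ requires at least one digit; [^<]* stops the name at the next tag.
    // Assumption: the link text itself never contains '<'.
    private static readonly Regex LinkPattern = new Regex(
        @"image/(?<image>\d+)"">(?<name>[^<]*)</a>",
        RegexOptions.IgnoreCase);

    // Returns one "number:name" string per link found in the input.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in LinkPattern.Matches(input))
        {
            // Read the groups by name rather than by position.
            results.Add(m.Groups["image"].Value + ":" + m.Groups["name"].Value);
        }
        return results;
    }
}
```

Calling ImageLinkParser.Extract(txtSearchIn.Text) from the button handler would give you the same pairs the loop above adds to the list box.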
 
Guest

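If you need to extract the number from both of these forms

image/456789 and "image/4567">name</a>

then I would wrap the trailing part in (?: )? to mark ">name</a> as optional. I would make it a non-capturing group, because in .NET a plain ( ) group is numbered ahead of the named groups and would shift the numeric group indexes.

In this case the final pattern is

image/(?<image>\d+)(?:">(?<name>.*?)</a>)?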
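
A rough sketch of how you might tell the two forms apart in code, assuming you check Group.Success on the "name" group to see whether the optional part took part in the match. The names OptionalNameDemo and Describe are made up for this example, and the name is again assumed not to contain a "<".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for illustration only.
public static class OptionalNameDemo
{
    // The trailing ">name</a> part sits in an optional non-capturing group.
    private static readonly Regex Pattern = new Regex(
        @"image/(?<image>\d+)(?:"">(?<name>[^<]*)</a>)?");

    // Returns "number" for the bare form, "number:name" for the link form,
    // and "no match" when there is no image/<digits> in the input.
    public static string Describe(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            return "no match";
        }

        Group name = m.Groups["name"];
        string number = m.Groups["image"].Value;

        // name.Success is false when the optional part was skipped.
        return name.Success ? number + ":" + name.Value : number;
    }
}
```

So Describe("image/456789") gives back just the number, while the link form gives back the number and the name together.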
 
Guest

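...and if you only need the number, it can be simpler:

(?<=image/)\d+

The lookbehind (?<=image/) checks that image/ comes right before the digits without making it part of the match, so the match value is the number itself.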
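
For what it's worth, a small sketch of the lookbehind version in use. NumberOnlyDemo and Numbers are names I made up for the example. Since there are no capture groups, the number is simply m.Value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class for illustration only.
public static class NumberOnlyDemo
{
    // Collects every run of digits that directly follows "image/".
    public static List<string> Numbers(string input)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(?<=image/)\d+"))
        {
            // The lookbehind is not part of the match, so Value is only the digits.
            found.Add(m.Value);
        }
        return found;
    }
}
```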
 
