Regular Expression Help

Rob Meade

Hi all,

I'm going down the road of learning the pattern matching in regular
expressions, and I'm trying to convert the characters into English in my
head so I can see what's happening...

For clarity with this one...

\[b\]([^\]]+)\[\/b\]

Ok - the first thing I did was break it up - because it looks a nightmare
like that...so now I have..3 parts..

\[b\]

([^\]]+)

\[\/b\]

In part 1 I understand that the \ is telling the expression that a special
character is coming next, thus escaping the [ and ] respectively, thus [b],
and in part 3 I understand the same and also the escaping of the /, thus
[/b] - no problems so far...

In part 2 I'm assuming that the brackets are separating a "pattern" so that
I might reference it as $1 later on. I understand that the [^] are saying
"not enclosed", so therefore it's going to ignore a ]. The + sign however
perplexes me - my VBScript book says "Matches the preceding character one
or more times" - so does that mean the ] just before it, or does it mean the
character OR characters defined within the [ ] etc?

Any help would be appreciated - thus far I've managed to 'wing' my way
through what I've needed to do - but it's getting more complicated now :)

Thanks in advance

Rob
 
Roland Hall

: Hi all,
:
: I'm going down the road of learning the pattern matching in regular
: expressions, and I'm trying to convert the characters into English in my
: head so I can see what's happening...
:
: For clarity with this one...
:
: \[b\]([^\]]+)\[\/b\]
:
: Ok - the first thing I did was break it up - because it looks a nightmare
: like that...so now I have..3 parts..
:
: \[b\]
:
: ([^\]]+)
:
: \[\/b\]
:
: In part 1 I understand that the \ is telling the expression that a special
: character is coming next, thus escaping the [ and ] respectively, thus [b],
: and in part 3 I understand the same and also the escaping of the /, thus
: [/b] - no problems so far...
:
: In part 2 I'm assuming that the brackets are separating a "pattern" so that
: I might reference it as $1 later on. I understand that the [^] are saying
: "not enclosed", so therefore it's going to ignore a ]. The + sign however
: perplexes me - my VBScript book says "Matches the preceding character one
: or more times" - so does that mean the ] just before it, or does it mean the
: character OR characters defined within the [ ] etc?

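It means the character OR characters defined within the [ ].

[^ ] is a negated character set: it matches any character not listed inside it
\] is a literal ]
+ means one or more characters matched by the character set

So [^\]]+ says it wants one or more characters that are not ], i.e. it
doesn't want a ] in between the two tags.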
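
To see that the + applies to the whole character set rather than to a single ], here is a minimal sketch. It assumes the script is run under Windows Script Host (cscript) with VBScript 5.5 or later, and the test strings are made up for illustration.

```vbscript
Option Explicit

Dim re, ms
Set re = New RegExp
re.Pattern = "\[b\]([^\]]+)\[\/b\]"

' One or more non-] characters between the tags: the group captures all of them
Set ms = re.Execute("some [b]bold text[/b] here")
If ms.Count > 0 Then WScript.Echo ms(0).SubMatches(0)   ' bold text

' Nothing between the tags: + needs at least one character, so no match
If Not re.Test("[b][/b]") Then WScript.Echo "no match for [b][/b]"

' A ] between the tags: the set stops at the ] and [/b] does not follow, so no match
If Not re.Test("[b]a]b[/b]") Then WScript.Echo "no match for [b]a]b[/b]"
```

If the + only applied to a single ], the first string could not match at all, since there is no ] between the tags.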

This looks like forum code, where forums, portals, etc. don't allow HTML
coding but do allow forum coding, and this is to capture the content in the
( ) group and replace the forum code with <b></b> to make it bold.
They could also use an inline style with: <span style="font-weight:
bold"></span>.
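
As a rough sketch of that replacement, assuming Windows Script Host and a made-up input string, the captured group can be referenced as $1 in the replacement text. Global and IgnoreCase are set here on the assumption that every tag should be converted, whatever its case.

```vbscript
Option Explicit

Dim re, html
Set re = New RegExp
re.Pattern = "\[b\]([^\]]+)\[\/b\]"
re.Global = True        ' replace every occurrence, not just the first
re.IgnoreCase = True    ' also accept [B]...[/B]

' $1 is whatever the ( ) group captured for each match
html = re.Replace("[b]one[/b] and [B]two[/B]", "<b>$1</b>")
WScript.Echo html       ' <b>one</b> and <b>two</b>
```

Swapping the replacement text for "<span style=""font-weight: bold"">$1</span>" would give the inline style version instead.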

--
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp
 
roger

\[b\]([^\]]+)\[\/b\]

I think it might be easier if you search for a general tag
then find its matching closing tag
then recursively process the string between the two.

e.g.

\[(\w+)=?([^\]]*)\]

which is...

[ followed by
a word - \w+ is the same as [A-Za-z0-9_]+
followed by one (optional) equals sign
possibly followed by some other stuff before the ]
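
A small sketch of what the two groups capture, assuming Windows Script Host, VBScript 5.5 or later, and a made-up [url] tag as input:

```vbscript
Option Explicit

Dim re, ms, m
Set re = New RegExp
re.Pattern = "\[(\w+)=?([^\]]*)\]"

Set ms = re.Execute("[url=http://example.com]link[/url]")
Set m = ms(0)                   ' first match only, since Global is False
WScript.Echo m.Value            ' [url=http://example.com]
WScript.Echo m.SubMatches(0)    ' url
WScript.Echo m.SubMatches(1)    ' http://example.com
```

The closing [/url] is not matched by this pattern, because \w+ needs a word character straight after the [ and finds a / instead. For a plain tag such as [b] the second group matches nothing and comes back empty.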

For instance...

Function fnparse(fString)
    Dim re, ms, m
    Dim token, closetag
    Dim iStart, iEnd
    Dim s, t

    Set re = New RegExp
    re.Pattern = "\[(\w+)=?([^\]]*)\]"
    Set ms = re.Execute(fString)
    If ms.Count > 0 Then
        Set m = ms(0)
        'FirstIndex is zero-based, Mid and InStr are one-based
        iStart = m.FirstIndex + m.Length + 1
        token = m.SubMatches(0) 'b, i, url etc.
        t = m.SubMatches(1) 'tag attribute (if any)
        closetag = "[/" & token & "]"
        iEnd = InStr(iStart, fString, closetag, 1) '1 = case-insensitive
        If iEnd = 0 Then
            'error: closing tag not found
            'treat the rest of the input as the tag content
            iEnd = Len(fString) + 1
        End If
        s = Mid(fString, iStart, iEnd - iStart)
        Select Case LCase(token)
            Case "url"
                If Len(t) > 0 Then t = " href='" & t & "'"
                s = "<a" & t & ">" & fnparse(s) & "</a>"
            'Case (any other special cases here)
            Case Else
                s = "<" & token & ">" & fnparse(s) & "</" & token & ">"
        End Select
        'output any string to the left of the tag
        'then the processed string
        'then process the rest of the input
        iEnd = iEnd + Len(closetag)
        s = Mid(fString, 1, m.FirstIndex) & s & fnparse(Mid(fString, iEnd))
    Else 'no tags found
        s = fString
    End If
    fnparse = s
End Function


The function uses 'SubMatches', which
needs VBScript 5.5 or later
 
roger

oops... I meant to reply to the 'bbcode parsing' thread
but clicked on the wrong post
 
