Regular Expression Help please!

Giles · Nov 1, 2009

My (VB/ASP) site parses pseudocode created by authors. For example, the
author's HTML might contain
[start small padded box] This content displays in a box [end small padded box]
The bits in square brackets are then replaced with appropriate HTML to
create a border around the text.

PageHTML = Replace(PageHTML, "[start small padded box]", "<div style='width:200px; padding:4px; border:1px solid #000'>")
PageHTML = Replace(PageHTML, "[end small padded box]", "</div>")

The problem is that, because of the authoring interface, the spaces might (or might not) be &nbsp;:
[start&nbsp;small padded&nbsp;box] This content displays in a box [end small&nbsp;padded box]

Is there a regular expression that can turn each &nbsp; (if any exist) into a space, prior to applying the pseudocode conversion?

function deNBSP(s,html)
?
?
end function
PageHTML=deNBSP("[start small padded box]", PageHTML)
PageHTML=deNBSP("[end small padded box]", PageHTML)

The pseudocode phrases can be quite long and contain a lot of spaces, so I was
hoping a RegExp would be quicker than looping through and calling replace() for
every permutation of space and &nbsp; in a phrase.

Thanks if you can help, or advise a different strategy.

Bob Barrows · Nov 1, 2009

A simple call to Replace should do this - no need for regex.
s = Replace(s, "&nbsp;", " ")

Giles · Nov 1, 2009

Thanks Bob, but that would change all the &nbsp;'s in PageHTML, not just
the ones in the pseudocode phrases. The page may contain other, necessary
&nbsp;'s. It's the pseudocode phrase that varies.
I am trying to find a way around doing this:
PageHTML=replace(PageHTML,"[start&nbsp;small padded box]","[start small padded box]")
PageHTML=replace(PageHTML,"[start small&nbsp;padded box]","[start small padded box]")
PageHTML=replace(PageHTML,"[start small padded&nbsp;box]","[start small padded box]")
PageHTML=replace(PageHTML,"[start&nbsp;small&nbsp;padded box]","[start small padded box]")
PageHTML=replace(PageHTML,"[start small&nbsp;padded&nbsp;box]","[start small padded box]")
PageHTML=replace(PageHTML,"[start&nbsp;small&nbsp;padded&nbsp;box]","[start small padded box]")
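
One way to avoid listing every permutation is to build a pattern from the canonical phrase in which each space may be either a plain space or the literal text &nbsp;. The sketch below is in JavaScript for brevity; the function names `escapeRegExp` and `deNbspPhrase` are hypothetical, and it assumes the page contains the six-character text &nbsp; rather than an actual non-breaking-space character.

```javascript
// Hypothetical helper: escape regex metacharacters such as [ and ].
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: rewrite every space/&nbsp; variant of one known
// phrase to its canonical, plain-space form. Text outside the phrase is
// left alone, so other &nbsp; entities in the page survive.
function deNbspPhrase(phrase, html) {
  // Each space in the phrase may appear in the page as ' ' or '&nbsp;'.
  var pattern = escapeRegExp(phrase).split(' ').join('(?:&nbsp;| )');
  // A function replacement avoids special handling of $ in the phrase.
  return html.replace(new RegExp(pattern, 'g'), function () {
    return phrase;
  });
}

var page = 'a&nbsp;b [start&nbsp;small padded&nbsp;box] text [end small&nbsp;padded box]';
page = deNbspPhrase('[start small padded box]', page);
page = deNbspPhrase('[end small padded box]', page);
```

This keeps the deNBSP(phrase, html) shape from the original question, at the cost of one regex per known phrase.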

Bob Barrows · Nov 1, 2009

Then you will need regex. Unfortunately, I'm not fluent in regular
expressions so all I can do is suggest you go to the documentation.
Hopefully someone else will jump in and help you out.

Evertjan. · Nov 1, 2009

Bob Barrows wrote on 01 nov 2009 in
microsoft.public.inetserver.asp.general:

Then you will need regex. Unfortunately, I'm not fluent in regular
expressions so all I can do is suggest you go to the documentation.
Hopefully someone else will jump in and help you out.

Perhaps I can help you out with regex,
but I do not know what you mean by "pseudocode phrases".

Let us just define a string called PageHTML [I am not interested in the
final purpose]. I suppose parts of that string, with well-defined starts and
ends, need to be purged of a certain substring.

Please define the start and end of such substrings.

Giles · Nov 1, 2009

Thank you, Evertjan.
Each pseudocode phrase is a substring within PageHTML that starts with
an open square bracket, [, and ends with a close square bracket, ].
It can contain any number of words, separated by spaces.
Some of the "spaces" might be &nbsp;.
It needs to be purged of &nbsp;, each occurrence being replaced by a space.

e.g.
[word1 word2 word3 word4] - is OK
[word1&nbsp;word2] - needs converting to [word1 word2]

Examples are
[podcast lecture.mp3]

[movie /flv/demo.flv width=400 height=300]

<b>Quiz</b><br />
[mcq start]
Questions here...
[mcq end]

Evertjan. · Nov 1, 2009

Could be done like this.
I use a JavaScript function for simplicity:

==============================================
<% 'vbs

' Test string: &nbsp; appears both inside and outside the [...] phrases
PageHTML = "[word1&nbsp;word2 word3&nbsp;word4]z&nbsp;z" & _
           "[word5&nbsp;word6]z&nbsp;z" & _
           "[word7 word8&nbsp;word9]"

PageHTML = replaceNbsp(PageHTML)

Response.Write PageHTML

%>

<script language='javascript' runat='server'>
// Find each [...] phrase (non-greedy) and replace &nbsp; only inside it
function replaceNbsp(s) {
  return s.replace(/(\[.*?\])/g, function (a) {
    return a.replace(/&nbsp;/g, ' ');
  });
}
</script>
==============================================

You will need view-source to see that the &nbsp;
outside the [...] are not touched.
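
The same two-step idea can be checked outside ASP in any JavaScript engine and then chained with the original box conversion. This is a sketch under assumptions, not the thread's code: `normalizeBrackets` and `renderBoxes` are hypothetical names, and the page is assumed to contain the literal text &nbsp;.

```javascript
// Hypothetical variant of replaceNbsp. [^\]]* is used instead of .*?
// so that a phrase wrapped across several lines is still matched
// (in JavaScript, . does not match a newline).
function normalizeBrackets(html) {
  return html.replace(/\[[^\]]*\]/g, function (phrase) {
    return phrase.replace(/&nbsp;/g, ' ');
  });
}

// Hypothetical wrapper: normalise first, then apply the box conversion
// from the original question. split/join does a plain-text replace-all.
function renderBoxes(html) {
  var open = "<div style='width:200px; padding:4px; border:1px solid #000'>";
  return normalizeBrackets(html)
    .split('[start small padded box]').join(open)
    .split('[end small padded box]').join('</div>');
}

var sample = 'x&nbsp;y [start&nbsp;small&nbsp;padded box]Hi&nbsp;there[end small padded&nbsp;box]';
var rendered = renderBoxes(sample);
```

Note that either pattern rewrites &nbsp; inside any bracketed text on the page, not only inside recognised pseudocode phrases, so it suits pages where square brackets are reserved for pseudocode.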

Giles · Nov 2, 2009

Perfect. Your help is very much appreciated. Thank you, Evertjan.
