Multi-line replace with string, not regexp

Brett

I'm working on a project where a paragraph of text may contain markup
such as:

[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
Intermediate Programming")

I want to replace any instance of the above markup with an HTML link.
E.g. the link text is "Dewhurst" and clicking it produces an alert
with the full citation.

I've already written code to find each markupLink and convert it to
the desired HTML. The problem I have is putting it back into the
paragraph.

Suppose I've converted

linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential Intermediate Programming")'

into

linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++ Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</a>"

I want to do a multi-line replace, replacing linkMarkup with
linkHtml.

txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
linkMarkup isn't a regexp pattern, it's just a string. Characters such
as the '++' in C++ need to be escaped.

Is there a way to convert a plain string into a regexp pattern which
matches the plain string?
 
RobG

Brett said:
I'm working on a project where a paragraph of text may contain markup
such as:

[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
Intermediate Programming")

I want to replace any instance of the above markup with an HTML link.
E.g. the link text is "Dewhurst" and clicking it produces an alert
with the full citation.

Use something that more closely approximates a real reference:

<h2>References</h2>
<ol>
<li><a name="Dewhurst"></a>Dewhurst, Stephen, C. <cite>&quot;C++
Common Knowledge: Essential Intermediate Programming&quot;</cite></li>
</ol>
<p>Here is a statement that references "&hellip;something written by
Dewhurst" <sup><a class="ref" href="#Dewhurst">[Dewhurst]</a></sup></p>


And do it all on the server - no javascript required.
 
Asen Bozhilov

Brett said:
Is there a way to convert a plain string into a regexp pattern which
matches the plain string?

/**
 * Escape the symbols that are not allowed in PatternCharacter
 * PatternCharacter ::
 *    SourceCharacter but not any of:
 *    ^ $ \ . * + ? ( ) [ ] { } |
 */
function escapeRegExp(str) {
    return str.replace(/[\^\$\\\.\*\+\?\(\)\[\]\{\}\|]/g, "\\$&");
}

escapeRegExp('[\\d]+'); //-> \[\\d\]\+
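
With a helper like that, the original problem could be handled roughly as below. This is only a sketch: `replaceAllLiteral` is a name made up for this example, the sample text is shortened, and the `#Dewhurst` link is a stand-in for the real HTML. A function is used as the replacer on the assumption that the replacement text should be inserted literally, even if it happens to contain `$&` or `$1`.

```javascript
// Escape every character that has a special meaning in a regular expression.
function escapeRegExp(str) {
  return str.replace(/[$^\\.*+?()[\]{}|]/g, "\\$&");
}

// Hypothetical helper: replace every literal occurrence of `search` in `text`.
function replaceAllLiteral(text, search, replacement) {
  var pattern = new RegExp(escapeRegExp(search), "g");
  // A function replacer returns its result as-is, so "$" is not special.
  return text.replace(pattern, function () { return replacement; });
}

var txt = 'See [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge") and again ' +
          '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge").';
var linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge")';
var linkHtml = '<a href="#Dewhurst">Dewhurst</a>';

var result = replaceAllLiteral(txt, linkMarkup, linkHtml);
console.log(result);
```

No 'm' flag is needed here; the 'g' flag is what makes every occurrence get replaced.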
 
Asen Bozhilov

Asen said:
function escapeRegExp(str) {
    return str.replace(/[\^\$\\\.\*\+\?\(\)\[\]\{\}\|]/g, "\\$&");
}

A more readable RegExp is:

/[$^\\.*+?()[\]{}|]/g

I think it would be a good idea to add a FAQ entry on this topic.
 
Ry Nohryb

Brett said:
Is there a way to convert a plain string into a regexp pattern which
matches the plain string?

txt = 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common ' +
  'Knowledge: Essential Intermediate Programming") plus some more text ' +
  'plus again [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: ' +
  'Essential Intermediate Programming") plus even more text';

linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: ' +
  'Essential Intermediate Programming")';

linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++ " +
  "Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</a>";

// A string (non-regexp) search value is matched literally, but only its
// first occurrence is replaced, so loop until none are left.
while (txt.indexOf(linkMarkup) >= 0) txt = txt.replace(linkMarkup, linkHtml);

--> "some text plus <a href="javascript:alert('Dewhurst, Stephen, C.
"C++ Common Knowledge: Essential Intermediate Programming"');">Dewhurst</a>
plus some more text plus again <a href="javascript:alert('Dewhurst,
Stephen, C. "C++ Common Knowledge: Essential Intermediate
Programming"');">Dewhurst</a> plus even more text"
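
A variation on the same idea, in case the replacement text could ever contain the search text (which would keep a loop like the one above running forever): split on the plain string and join with the replacement. This is just a sketch, `replaceEvery` is a made-up name, and the sample strings are shortened stand-ins for the real markup and HTML.

```javascript
// Hypothetical helper: replace every literal occurrence of `search`
// without building a RegExp and without looping over replace().
function replaceEvery(text, search, replacement) {
  // split() with a string separator does no pattern matching at all,
  // and join() inserts the replacement exactly as given.
  return text.split(search).join(replacement);
}

var sample = 'a [x](C++) b [x](C++) c';
var replaced = replaceEvery(sample, '[x](C++)', '<a>x</a>');
console.log(replaced);
```

Each occurrence is handled exactly once, so the result is never rescanned for new matches.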
 
SAM

On 08/08/10 17:46, Brett wrote:
Suppose I've converted
linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
Essential Intermediate Programming")'

linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. \"C++ Common Knowledge:\nEssential Intermediate Programming\")';

or maybe:

linkMarkup = /\[Dewhurst]\(Dewhurst, Stephen, C. \"C\+\+ Common Knowledge:[\n\r]*Essential Intermediate Programming\"\)/;

(both on one line, and linkHtml too)

Brett said:
into
linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++
Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</a>"

into:

linkHtml = '<a href="javascript:alert(\'(Dewhurst, Stephen, C. \\"C++ Common Knowledge: Essential Intermediate Programming\\")\')">Dewhurst</a>';

Brett said:
I want to do a multi-line replace, replacing linkMarkup with
linkHtml.

I think that is only possible with a "real" regexp (one that matches all
the characters between two tags or markers):

linkMarkup = /\[Dewhurst][^_]*\[\/Dewhurst]/;

Brett said:
txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
linkMarkup isn't a regexp pattern, it's just a string. Characters such
as the '++' in C++ need to be escaped.

txt.replace(/\[Dewhurst][^_]*\[\/Dewhurst]/g, linkHtml);

I don't think the + is the problem.
I think it's the line break that causes trouble,
and perhaps also the " and ' and ( in the replacement string.

Brett said:
Is there a way to convert a plain string into a regexp pattern which
matches the plain string?

The "plain" string must first be a valid "string" (as JS understands it).
 
Ry Nohryb

SAM said:
the "plain" string must be first a "string" (in JS understanding)

But I wonder, why the hassle when you can do it by looping an ordinary
replace? Is it because regexps are sooo cool that one should use them
as much as possible, even when they're not the right or most convenient
tool for the task at hand?
:)
 
SAM

On 09/08/10 14:49, Ry Nohryb wrote:
txt= 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common
Knowledge: Essential Intermediate Programming") plus some more text
plus again [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
Essential Intermediate Programming") plus even more text';

And what about this (which is what I think the OP wanted):

txt = 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: ' +
'\n\r' +
'Essential Intermediate Programming") plus some more text plus again ' +
'[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential ' +
'Intermediate Programming") plus even more text';

???
 
SAM

On 09/08/10 15:55, Ry Nohryb wrote:
But I wonder, why the hassle when you can do it by looping an ordinary
replace ?

Are you talking about a text editor?
Yes, with copy/paste a text editor can search for a multi-line text
(and then replace it).

Ry Nohryb said:
Is it because regexps are sooo cool that one should use them
amap even when/if they're not the right/more convenient tool for the
task at hand ?

It's not RegExp's fault if JS breaks on a line break
(a text editor's line break inside a JS string).

Even in my text editor I use RegExp for multiple replacements,
it's really too cool ;-)

search:
art: (\d+)
replace all:
article: \1 - ref: shop-\1

In JS:
texto.replace(/art: (\d+)/g,'article: $1 - ref: shop-$1');
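
For anyone who wants to try that last line, here is a small self-contained version. The sample text is invented for the example. The second part shows a related detail that matters for the OP: a `$` in the replacement string is special even when the search value is a plain string, so a literal dollar sign has to be written as `$$`.

```javascript
var texto = 'art: 12, art: 345';

// $1 in the replacement string stands for what the first group captured.
var out = texto.replace(/art: (\d+)/g, 'article: $1 - ref: shop-$1');
console.log(out);

// To get a literal dollar sign in the replacement, write it as $$.
var price = 'cost: X'.replace('X', '$$5');
console.log(price);
```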
 
Ry Nohryb

SAM said:
Even in my text-editor I use RegExp for multi-replacements,
it's really too cool ;-)

But in this case, the OP has a string that must be escaped if it's to
be used as a regexp for the search, therefore, I'd say, well, then
don't use it as a regexp, just loop using a regular search (not a //g
regexp) until done. BTW that's because I'm guessing that when he says
multiline he really means he wants to replace multiple instances, that
is, a //g regexp, which is no more than ~ a simple loop.
 
Ry Nohryb

SAM said:
And what about this (which is what I think the OP wanted):

txt = 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: ' +
'\n\r' +
'Essential Intermediate Programming") plus some more text plus again ' +
'[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential ' +
'Intermediate Programming") plus even more text';

???

He has said: "I've already written code to find each markupLink and
convert it to the desired HTML. The problem I have is putting it back
into the paragraph.". I'm hoping the "find" includes line breaks...
 
Lasse Reichstein Nielsen

Brett said:
I've already written code to find each markupLink and convert it to
the desired HTML. The problem I have is putting it back into the
paragraph.

If you have found it, you have probably also found its start position
in the original string. In that case, replacing the match at
position pos with something else is easily done as:

string = string.substring(0, pos) + something_else +
string.substring(pos + match.length);
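
Spelled out with some made-up values, that looks like the sketch below. Here `match` is assumed to be the matched text itself (a string), and `pos` is found with indexOf only because the example has no earlier "find" step to take it from.

```javascript
var string = 'before [link](C++) after';
var match = '[link](C++)';            // the text that was found
var something_else = '<a>link</a>';   // what should take its place

var pos = string.indexOf(match);      // start position of the match
if (pos !== -1) {
  // Everything before the match + replacement + everything after it.
  string = string.substring(0, pos) + something_else +
           string.substring(pos + match.length);
}
console.log(string);
```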

If you are doing multiple replacements, you shouldn't add the rest of
the string and then start splitting it apart again, but instead work
iteratively to add replacements and in-between text until you have
processed the entire string.
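
One possible shape for that iterative approach, as a sketch only: `replaceAllByIndex` is a hypothetical name, and it assumes a plain-string search with non-overlapping matches taken from left to right.

```javascript
// Hypothetical helper: collect the in-between text and the replacements
// in one pass, then join them once at the end.
function replaceAllByIndex(text, search, replacement) {
  if (search === '') return text;     // nothing sensible to search for
  var parts = [];
  var from = 0;                       // where the unprocessed text starts
  var pos;
  while ((pos = text.indexOf(search, from)) !== -1) {
    parts.push(text.substring(from, pos), replacement);
    from = pos + search.length;       // continue after this match
  }
  parts.push(text.substring(from));   // the remainder after the last match
  return parts.join('');
}

console.log(replaceAllByIndex('a [x](C++) b [x](C++) c', '[x](C++)', '<a>x</a>'));
```

Because the search always resumes after the previous match, the already-built output is never scanned again.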

Brett said:
I want to do a multi-line replace, replacing linkMarkup with
linkHtml.

txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
linkMarkup isn't a regexp pattern, it's just a string. Characters such
as the '++' in C++ need to be escaped.

You don't really want/need to use the multiline flag. All it does is
change the behavior of "^" and "$", which you don't use anyway.
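
A quick way to see this for yourself (the sample text is invented for the example):

```javascript
var text = 'first line\nsecond line';

// Without "m", ^ only matches at the very start of the whole string.
var anchoredPlain = /^second/.test(text);      // false
// With "m", ^ also matches right after a line break.
var anchoredMulti = /^second/m.test(text);     // true

// A pattern without ^ or $ matches across the line break either way.
var acrossPlain = /line\nsecond/.test(text);   // true
var acrossMulti = /line\nsecond/m.test(text);  // true

console.log(anchoredPlain, anchoredMulti, acrossPlain, acrossMulti);
```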

Brett said:
Is there a way to convert a plain string into a regexp pattern which
matches the plain string?

Others have shown how to replace all special characters with escaped
versions of themselves, but I wouldn't use RegExp for this.
Even if you don't have the start position, you can still use
string.indexOf(text) to find the text, and then use the string
operations above.

/L
 
