Multi-line replace with string, not regexp

Discussion in 'Javascript' started by Brett, Aug 8, 2010.

  1. Brett

    Brett Guest

    I'm working on a project where a paragraph of text may contain markup
    such as:

    [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
    Intermediate Programming")

    I want to replace any instance of the above markup with an HTML link.
    E.g. the link text is "Dewhurst" and clicking it produces an alert
    with the full citation.

    I've already written code to find each markupLink and convert it to
    the desired HTML. The problem I have is putting it back into the
    paragraph.

    Suppose I've converted
    linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
    Essential Intermediate Programming")'
    into
    linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++
    Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</
    a>"

    I want to do a multi-line replace, replacing linkMarkup with
    linkHtml.

    txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
    linkMarkup isn't a regexp pattern, it's just a string. Characters such
    as the '++' in C++ need to be escaped.

    Is there a way to convert a plain string into a regexp pattern that
    matches the plain string?
    Brett, Aug 8, 2010
    #1
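To make the failure concrete, here is a minimal sketch. The shortened `linkMarkup` value and the `tryCompile` helper are illustrative and not from the thread. The `++` in `C++` is read as two quantifiers in a row, so the RegExp constructor should throw before any matching happens.

```javascript
// Shortened, hypothetical version of the markup from the question.
var linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge")';

// Hypothetical helper: returns the error text, or null if the pattern compiles.
function tryCompile(source) {
    try {
        new RegExp(source);
        return null;
    } catch (e) {
        return e.name + ': ' + e.message;
    }
}

var compileError = tryCompile(linkMarkup);
console.log(compileError); // a SyntaxError; the exact message varies by engine
```

Even without the `++`, the pattern would not mean what the string says: `[Dewhurst]` would be a character class and `(...)` a capturing group.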

  2. RobG

    RobG Guest

    On Aug 9, 1:46 am, Brett <> wrote:
    > I'm working on a project where a paragraph of text may contain markup
    > such as:
    >
    > [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
    > Intermediate Programming")
    >
    > I want to replace any instance of the above markup with an HTML link.
    > E.g. the link text is "Dewhurst" and clicking it produces an alert
    > with the full citation.


    Use something that more closely approximates a real reference:

    <h2>References</h2>
    <ol>
    <li><a name="Dewhurst"></a>Dewhurst, Stephen, C. <cite>&quot;C++
    Common Knowledge: Essential Intermediate Programming&quot;</cite></li>
    </ol>
    <p>Here is a statement that references "&hellip;something written by
    Dewhurst" <sup><a class="ref" href="#Dewhurst">[Dewhurst]</a></sup></p>


    And do it all on the server - no javascript required.


    --
    Rob
    RobG, Aug 9, 2010
    #2

  3. Brett wrote:

    > Is there a way to convert a plain string into a regexp pattern that
    > matches the plain string?


    /**
     * Escape not allowed symbols in PatternCharacter
     * PatternCharacter ::
     *    SourceCharacter but not any of:
     * ^ $ \ . * + ? ( ) [ ] { } |
     */
    function escapeRegExp(str) {
        return str.replace(/[\^\$\\\.\*\+\?\(\)\[\]\{\}\|]/g, "\\$&");
    }

    escapeRegExp('[\\d]+'); //-> \[\\d\]\+
    Asen Bozhilov, Aug 9, 2010
    #3
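A minimal sketch of how the escaped string might then be used for the replacement the original question asks about. `replaceAllLiteral` and the sample strings are hypothetical names for illustration. The function form of the replacement is used on the assumption that the HTML should be inserted verbatim, since `$&`, `$1` and similar sequences are special in a replacement string.

```javascript
// escapeRegExp as posted above.
function escapeRegExp(str) {
    return str.replace(/[\^\$\\\.\*\+\?\(\)\[\]\{\}\|]/g, "\\$&");
}

// Hypothetical helper: replace every literal occurrence of `search` in `text`.
function replaceAllLiteral(text, search, replacement) {
    var pattern = new RegExp(escapeRegExp(search), 'g');
    // A function replacement keeps "$&", "$1", etc. in `replacement` literal.
    return text.replace(pattern, function () { return replacement; });
}

var sample = 'See [A](C++ book) and again [A](C++ book).';
var result = replaceAllLiteral(sample, '[A](C++ book)', '<a href="#A">A</a>');
console.log(result);
// See <a href="#A">A</a> and again <a href="#A">A</a>.
```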
  4. Asen Bozhilov wrote:
    > Brett wrote:
    > > Is there a way to convert a plain string into a regexp pattern that
    > > matches the plain string?

    >
    > /**
    >  * Escape not allowed symbols in PatternCharacter
    >  * PatternCharacter ::
    >  *    SourceCharacter but not any of:
    >  * ^ $ \ . * + ? ( ) [ ] { } |
    >  */
    > function escapeRegExp(str) {
    >     return str.replace(/[\^\$\\\.\*\+\?\(\)\[\]\{\}\|]/g, "\\$&");


    More readable RegExp is:

    /[$^\\.*+?()[\]{}|]/g

    I think it would not be a bad idea to add a FAQ entry on this topic.
    Asen Bozhilov, Aug 9, 2010
    #4
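A quick check, as a sketch, that the shorter character class escapes the same fourteen characters as the fully escaped one. The `probe` string is made up for the comparison. Inside a character class most of these characters need no backslash, and `^` is only special in first position.

```javascript
var verbose = /[\^\$\\\.\*\+\?\(\)\[\]\{\}\|]/g;
var compact = /[$^\\.*+?()[\]{}|]/g;

// The 14 syntax characters from the comment above, plus some ordinary text.
var probe = '^ $ \\ . * + ? ( ) [ ] { } | abc 123 -';

var escapedVerbose = probe.replace(verbose, '\\$&');
var escapedCompact = probe.replace(compact, '\\$&');

console.log(escapedVerbose === escapedCompact); // true
console.log(escapedCompact);
```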
  5. Ry Nohryb

    Ry Nohryb Guest

    On Aug 8, 5:46 pm, Brett <> wrote:
    > I'm working on a project where a paragraph of text may contain markup
    > such as:
    >
    > [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
    > Intermediate Programming")
    >
    > I want to replace any instance of the above markup with an HTML link.
    > E.g. the link text is "Dewhurst" and clicking it produces an alert
    > with the full citation.
    >
    > I've already written code to find each markupLink and convert it to
    > the desired HTML. The problem I have is putting it back into the
    > paragraph.
    >
    > Suppose I've converted
    > linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
    > Essential Intermediate Programming")'
    > into
    > linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++
    > Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</
    > a>"
    >
    > I want to do a multi-line replace, replacing linkMarkup with
    > linkHtml.
    >
    > txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
    > linkMarkup isn't a regexp pattern, it's just a string. Characters such
    > as the '++' in C++ need to be escaped.
    >
    > Is there a way to convert a plain string into a regexp pattern that
    > matches the plain string?


    var txt = 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common ' +
        'Knowledge: Essential Intermediate Programming") plus some more text ' +
        'plus again [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: ' +
        'Essential Intermediate Programming") plus even more text';

    var linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: ' +
        'Essential Intermediate Programming")';

    var linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++ " +
        "Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</a>";

    // Replace one occurrence per pass until none remain.
    while (txt.indexOf(linkMarkup) >= 0) txt = txt.replace(linkMarkup, linkHtml);

    --> "some text plus <a href="javascript:alert('Dewhurst, Stephen, C.
    "C++ Common Knowledge: Essential Intermediate Programming"');">Dewhurst</a>
    plus some more text plus again <a href="javascript:alert('Dewhurst,
    Stephen, C. "C++ Common Knowledge: Essential Intermediate
    Programming"');">Dewhurst</a> plus even more text"
    --
    Jorge.
    Ry Nohryb, Aug 9, 2010
    #5
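One caveat with the loop above: it rescans from the start on every pass and would not terminate if the replacement itself contained the search text. A possible alternative, sketched here with a hypothetical `replaceEvery` helper and made-up sample strings, is to split on the literal string and join with the replacement. It assumes `search` is not empty.

```javascript
// Hypothetical helper: literal replace-all without a regexp or a loop.
function replaceEvery(text, search, replacement) {
    return text.split(search).join(replacement);
}

var sampleText = 'one [x](C++) two [x](C++) three';
console.log(replaceEvery(sampleText, '[x](C++)', '<b>x</b>'));
// one <b>x</b> two <b>x</b> three

// Safe even when the replacement contains the search text,
// a case where a rescanning while loop would never terminate.
console.log(replaceEvery('a-b', '-', '--'));
// a--b
```

Unlike a string passed to `replace`, the joined text is inserted verbatim, so `$` sequences in the HTML need no special care.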
  6. SAM

    SAM Guest

    On 08/08/10 17:46, Brett wrote:
    > I'm working on a project where a paragraph of text may contain markup
    > such as:
    >
    > [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
    > Intermediate Programming")
    >
    > I want to replace any instance of the above markup with an HTML link.
    > E.g. the link text is "Dewhurst" and clicking it produces an alert
    > with the full citation.
    >
    > I've already written code to find each markupLink and convert it to
    > the desired HTML. The problem I have is putting it back into the
    > paragraph.
    >
    > Suppose I've converted
    > linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
    > Essential Intermediate Programming")'


    linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. \"C++ Common
    Knowledge:\nEssential Intermediate Programming\")';

    or maybe :

    linkMarkup = /\[Dewhurst]\(Dewhurst, Stephen, C. \"C\+\+ Common
    Knowledge:[\n\r]*Essential Intermediate Programming\"\)/;

    (both in one line, and linkHtml too)

    into :

    linkHtml = '<a href="javascript:alert(\'(Dewhurst, Stephen, C. \\"C++
    Common Knowledge: Essential Intermediate Programming\\")\')">Dewhurst</a>';


    > into
    > linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++
    > Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</
    > a>"
    >
    > I want to do a multi-line replace, replacing linkMarkup with
    > linkHtml.


    I think that is only possible with a "real" regexp (one that matches all
    characters between two tags or markers):

    linkMarkup = /\[Dewhurst][^_]*\[\/Dewhurst]/;

    > txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
    > linkMarkup isn't a regexp pattern, it's just a string. Characters such
    > as the '++' in C++ need to be escaped.


    txt.replace(/\[Dewhurst][^_]*\[\/Dewhurst]/g, linkHtml);


    I don't think the + is the problem.
    I think it's the line break that causes trouble,
    and perhaps also the " and ' and ( in the replacement string.


    > Is there a way to convert a plain string into a regexp pattern that
    > matches the plain string?


    The "plain" string must first be a "string" (as JS understands it).

    --
    sm
    SAM, Aug 9, 2010
    #6
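Whether the line break matters depends on what the strings contain at run time. As a sketch under the assumption that the paragraph has a real line break inside the citation while the search text has a plain space, the escaped pattern can be loosened so that any run of whitespace matches any other. `literalPatternLooseSpace` and the sample strings are hypothetical.

```javascript
function escapeRegExp(str) {
    return str.replace(/[$^\\.*+?()[\]{}|]/g, '\\$&');
}

// Hypothetical helper: escape the text, then let any run of whitespace
// in it match any run of whitespace (spaces, \n, \r\n) in the target.
function literalPatternLooseSpace(str) {
    return escapeRegExp(str).replace(/\s+/g, '\\s+');
}

var paragraph = 'text [D](C++ Common\r\nKnowledge) more';
var markup = '[D](C++ Common Knowledge)';

var loosePattern = new RegExp(literalPatternLooseSpace(markup), 'g');
var linked = paragraph.replace(loosePattern, function () { return '<a>D</a>'; });
console.log(linked);
// text <a>D</a> more
```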
  7. Ry Nohryb

    Ry Nohryb Guest

    On Aug 9, 3:42 pm, SAM <>
    wrote:
    > On 08/08/10 17:46, Brett wrote:
    >
    > > I'm working on a project where a paragraph of text may contain markup
    > > such as:

    >
    > > [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
    > > Intermediate Programming")

    >
    > > I want to replace any instance of the above markup with an HTML link.
    > > E.g. the link text is "Dewhurst" and clicking it produces an alert
    > > with the full citation.

    >
    > > I've already written code to find each markupLink and convert it to
    > > the desired HTML. The problem I have is putting it back into the
    > > paragraph.

    >
    > > Suppose I've converted
    > > linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
    > > Essential Intermediate Programming")'

    >
    > linkMarkup = '[Dewhurst](Dewhurst, Stephen, C. \"C++ Common
    > Knowledge:\nEssential Intermediate Programming\")';
    >
    > or maybe :
    >
    > linkMarkup = /\[Dewhurst]\(Dewhurst, Stephen, C. \"C\+\+ Common
    > Knowledge:[\n\r]*Essential Intermediate Programming\")/;
    >
    > (both in one line, and linkHtml too)
    >
    > into :
    >
    > linkHtml = '<a href="javascript:alert(\'(Dewhurst, Stephen, C. \\"C++
    > Common Knowledge: Essential Intermediate Programming\\")\')">Dewhurst</a>';
    >
    > > into
    > > linkHtml = "<a href=\"javascript:alert('Dewhurst, Stephen, C. \"C++
    > > Common Knowledge: Essential Intermediate Programming\"');\">Dewhurst</
    > > a>"

    >
    > > I want to do a multi-line replace, replacing linkMarkup with
    > > linkHtml.

    >
    > I think that is only possible with a "real" regexp (that will search all
    > characters between 2 tags (or marker) )
    >
    > linkMarkup = /\[Dewhurst][^_]*\[\/Dewhurst]/;
    >
    > > txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
    > > linkMarkup isn't a regexp pattern, it's just a string. Characters such
    > > as the '++' in C++ need to be escaped.

    >
    > txt.replace(/\[Dewhurst][^_]*\[\/Dewhurst]/g, linkHtml);
    >
    > I think that is not the + the problem
    > I think it's the line return that causes troubles
    > and, perhaps too, the " and ' and ( in replacing string
    >
    > > Is there a way to convert a plain string into a regexp pattern that
    > > matches the plain string?

    >
    > the "plain" string must be first a "string" (in JS understanding)


    But I wonder, why the hassle when you can do it by looping an ordinary
    replace ? Is it because regexps are sooo cool that one should use them
    amap even when/if they're not the right/more convenient tool for the
    task at hand ?
    :)
    --
    Jorge.
    Ry Nohryb, Aug 9, 2010
    #7
  8. SAM

    SAM Guest

    On 09/08/10 14:49, Ry Nohryb wrote:
    > txt= 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common
    > Knowledge: Essential Intermediate Programming") plus some more text
    > plus again [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
    > Essential Intermediate Programming") plus even more text';


    And what about (what I think OP wanted) :

    txt= 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common
    Knowledge: '+
    '\n\r' +
    'Essential Intermediate Programming") plus some more textplus again
    [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
    Intermediate Programming") plus even more text';

    ???

    --
    sm
    SAM, Aug 9, 2010
    #8
  9. SAM

    SAM Guest

    On 09/08/10 15:55, Ry Nohryb wrote:
    > But I wonder, why the hassle when you can do it by looping an ordinary
    > replace ?


    You're talking about a text editor?
    Yes, a text editor can search for multi-line text by copy/paste
    (and then replace it).

    > Is it because regexps are sooo cool that one should use them
    > amap even when/if they're not the right/more convenient tool for the
    > task at hand ?


    It's not RegExp's fault if JS breaks on a line break
    (a text editor's line break inside a JS string).

    Even in my text editor I use RegExp for multiple replacements;
    it's really too cool ;-)

    search :
    art: (\d+)
    replace all :
    article: \1 - ref: shop-\1

    In JS :
    texto.replace(/art: (\d+)/g,'article: $1 - ref: shop-$1');


    --
    sm
    SAM, Aug 9, 2010
    #9
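The `$1` example above can be run as follows; the `texto` value is made up for illustration. The second half is a sketch of a related point: the same `$` syntax is interpreted in the replacement even when the search value is a plain string, which could matter if the generated HTML ever contains a `$`.

```javascript
var texto = 'art: 12, art: 345';
var rewritten = texto.replace(/art: (\d+)/g, 'article: $1 - ref: shop-$1');
console.log(rewritten);
// article: 12 - ref: shop-12, article: 345 - ref: shop-345

// "$&" means "the matched text", even with a string search value.
var doubled = 'cost: X'.replace('X', '$&$&');
console.log(doubled); // cost: XX

// "$$" produces a literal dollar sign.
var literalDollar = 'cost: X'.replace('X', '$$&');
console.log(literalDollar); // cost: $&
```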
  10. Ry Nohryb

    Ry Nohryb Guest

    On Aug 9, 4:34 pm, SAM <>
    wrote:
    > On 09/08/10 15:55, Ry Nohryb wrote:
    >
    > > But I wonder, why the hassle when you can do it by looping an ordinary
    > > replace ?

    >
    > You're talking about a text-editor ?
    > Yes a text-editor by copy/past can search a multi-lines text
    > (and then replace it)
    >
    > > Is it because regexps are sooo cool that one should use them
    > > amap even when/if they're not the right/more convenient tool for the
    > > task at hand ?

    >
    > It's not the fault to RegExp if JS breaks on a line return
    > (text-editor's line return in a JS string)
    >
    > Even in my text-editor I use RegExp for multi-replacements,
    > it's really too cool ;-)
    >
    > search :
    >         art: (\d+)
    > replace all :
    >         article: \1 - ref: shop-\1
    >
    > In JS :
    >         texto.replace(/art: (\d+)/g,'article: $1 - ref: shop-$1');


    But in this case, the OP has a string that must be escaped if it's to
    be used as a regexp for the search, therefore, I'd say, well, then
    don't use it as a regexp, just loop using a regular search (not a //g
    regexp) until done. BTW that's because I'm guessing that when he says
    multiline he really means he wants to replace multiple instances, that
    is, a //g regexp, which is no more than ~ a simple loop.
    --
    Jorge.
    Ry Nohryb, Aug 9, 2010
    #10
  11. Ry Nohryb

    Ry Nohryb Guest

    On Aug 9, 4:30 pm, SAM <>
    wrote:
    > On 09/08/10 14:49, Ry Nohryb wrote:
    >
    > > txt= 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common
    > > Knowledge: Essential Intermediate Programming") plus some more text
    > > plus again [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge:
    > > Essential Intermediate Programming") plus even more text';

    >
    > And what about (what I think OP wanted) :
    >
    > txt= 'some text plus [Dewhurst](Dewhurst, Stephen, C. "C++ Common
    > Knowledge: '+
    > '\n\r' +
    > 'Essential Intermediate Programming") plus some more textplus again
    > [Dewhurst](Dewhurst, Stephen, C. "C++ Common Knowledge: Essential
    > Intermediate Programming") plus even more text';
    >
    > ???


    He has said: "I've already written code to find each markupLink and
    convert it to the desired HTML. The problem I have is putting it back
    into the paragraph.". I'm hoping the "find" includes line breaks...
    --
    Jorge.
    Ry Nohryb, Aug 9, 2010
    #11
  12. Brett <> writes:

    > I've already written code to find each markupLink and convert it to
    > the desired HTML. The problem I have is putting it back into the
    > paragraph.


    If you have found it, you have probably also found the start position
    in the original string. In that case, replacing the matched string at
    position pos with something else is easily done as:

    string = string.substring(0, pos) + something_else +
    string.substring(pos + match.length);

    If you are doing multiple replacements, you shouldn't add the rest of
    the string and then start splitting it apart again, but instead work
    iteratively to add replacements and in-between text until you have
    processed the entire string.

    > I want to do a multi-line replace, replacing linkMarkup with
    > linkHtml.
    >
    > txt.replace(new RegExp(linkMarkup,'m'), linkHtml) doesn't work because
    > linkMarkup isn't a regexp pattern, it's just a string. Characters such
    > as the '++' in C++ need to be escaped.


    You don't really want/need to use the multiline flag. All it does is
    change the behavior of "^" and "$", which you don't use anyway.
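A small sketch of what the flag does and does not change; the `lines` string is made up for illustration.

```javascript
var lines = 'first\nsecond';

// Without "m", ^ and $ anchor only at the start and end of the whole string.
console.log(/^second$/.test(lines));  // false
// With "m", they also anchor at line breaks.
console.log(/^second$/m.test(lines)); // true

// A pattern with no ^ or $ behaves the same either way,
// and "\n" in a pattern matches a line break regardless of the flag.
console.log(/first\nsecond/.test(lines));  // true
console.log(/first\nsecond/m.test(lines)); // true
```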

    > Is there a way to convert a plain string into a regexp pattern that
    > matches the plain string?


    Others have shown how to replace all special characters with escaped
    versions of themselves, but I wouldn't use RegExp for this.
    Even if you don't have the start position, you can still use
    string.indexOf(text) to find the text, and then use the string
    operations above.

    /L
    --
    Lasse Reichstein Holst Nielsen
    'Javascript frameworks is a disruptive technology'
    Lasse Reichstein Nielsen, Aug 10, 2010
    #12
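The iterative approach described in the post above might look like the following sketch. `replaceLiteral` and the sample strings are hypothetical. Each stretch of the input is copied once, alternating untouched text with the replacement, instead of rebuilding the whole string after every match.

```javascript
// Hypothetical helper: replace every literal occurrence of `search`,
// copying each stretch of the input exactly once.
function replaceLiteral(text, search, replacement) {
    if (search === '') return text; // avoid an endless loop on empty search
    var parts = [];
    var from = 0;
    var pos = text.indexOf(search, from);
    while (pos !== -1) {
        parts.push(text.substring(from, pos), replacement);
        from = pos + search.length;
        pos = text.indexOf(search, from);
    }
    parts.push(text.substring(from));
    return parts.join('');
}

console.log(replaceLiteral('a [x](C++) b [x](C++) c', '[x](C++)', '<b>x</b>'));
// a <b>x</b> b <b>x</b> c
```

Because the search resumes after each inserted replacement, this also terminates when the replacement contains the search text.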