Regular expression help

Discussion in 'Javascript' started by RobG, Mar 23, 2010.

  1. RobG

    RobG Guest

    I'm working with XML files that sometimes use a default namespace.
    Unfortunately, there doesn't seem to be an elegant way of dealing with
    them in XPath expressions. In some cases I need to modify part of the
    expression to include an arbitrary namespace prefix, e.g. change:

    /LandXML/Parcels/Parcel

    into something like:

    /xx:LandXML/xx:Parcels/xx:Parcel

    Sometimes the expression starts with // so I've been using the
    following regular expression:

    expr = expr.replace(/(\/+)/g,'$1xx:');


    which works fine in most cases. However, sometimes the expression
    includes an attribute value that has slashes. In that case, I don't
    want to modify the attribute value's slash. e.g. at the moment,

    /LandXML/Parcels/Parcel[@name="79a/SP199095"]

    is converted to:

    /xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/xx:SP199095"]


    Modifying the attribute value means that the result will be wrong. Is
    there a regular expression that will only modify slashes outside
    square brackets?
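
    To reproduce the behaviour described above, here is a minimal sketch.
    The function name addPrefixNaive is made up for illustration; the
    replace call is the one quoted earlier in this post.

```javascript
// Hypothetical helper name; the pattern is the one from the post.
function addPrefixNaive(expr) {
  // Append "xx:" after every run of one or more slashes.
  return expr.replace(/(\/+)/g, '$1xx:');
}

// A leading "//" is treated as one run and gets a single prefix.
console.log(addPrefixNaive('//LandXML/Parcels/Parcel'));

// The slash inside the quoted attribute value is rewritten as well,
// which is the unwanted case.
console.log(addPrefixNaive('/LandXML/Parcels/Parcel[@name="79a/SP199095"]'));
```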

    An alternative is to fix the namespace when building the expression,
    which is less elegant than conditionally modifying the expression in
    the evaluator function.


    --
    Rob
     
    RobG, Mar 23, 2010
    #1

  2. Ken Snyder

    Ken Snyder Guest

    On Mar 23, 5:32 pm, RobG <> wrote:
    > ...
    > ... Is
    > there a regular expression that will only modify slashes outside
    > square brackets?
    > ...


    I'm pretty sure a single regular expression can't do that. You could
    try splitting the expression on double-quoted strings and replacing
    slashes in every other part of the resulting array. Capturing the
    split characters is not supported across browsers, so you would need
    to use a function from a library like Prototype that normalizes it.

    The splitting expression might look something like this: /("[^"]+")/
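
    A minimal sketch of that idea, assuming an engine in which
    String.prototype.split keeps captured groups in the result (the
    ECMAScript-specified behaviour, so no library is used here). The
    function name and the prefix parameter are hypothetical.

```javascript
// Hypothetical helper: prefix slashes only outside double-quoted strings.
function addPrefixOutsideQuotes(expr, prefix) {
  // The capturing group keeps the quoted strings in the array, so even
  // indexes hold unquoted text and odd indexes hold quoted strings.
  var parts = expr.split(/("[^"]*")/);
  for (var i = 0; i < parts.length; i += 2) {
    // "$&" re-inserts the matched run of slashes before the prefix.
    parts[i] = parts[i].replace(/\/+/g, '$&' + prefix + ':');
  }
  return parts.join('');
}

console.log(addPrefixOutsideQuotes(
  '/LandXML/Parcels/Parcel[@name="79a/SP199095"]', 'xx'));
```

    Single-quoted attribute values are not handled by this sketch.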

    - Ken
     
    Ken Snyder, Mar 24, 2010
    #2

  3. In comp.lang.javascript message
    <66446dfb-e9cb-4d82-8146-377423e48aa8@k4g2000prh.googlegroups.com>,
    Tue, 23 Mar 2010 16:32:24, RobG <> posted:
    >
    >Modifying the attribute value means that the result will be wrong. Is
    >there a regular expression that will only modify slashes outside
    >square brackets?
    >
    >An alternative is to fix the namespace when building the expression,
    >which is less elegant than conditionally modifying the expression in
    >the evaluator function.


    If you in fact only need to modify / where not somewhere preceded by [,
    you could probably split on [, RegExp-alter [0], and rejoin.

    expr = '/LandXML/Parcels/Parcel[@name="79a/SP199095"]';
    var S = expr.split("[");
    S[0] = S[0].replace(/(\/+)/g,'$1xx:');
    expr = S.join("[");

    gave '/xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/SP199095"]'.

    Or, which might be more flexible, you could use

    expr = '/LandXML/Parcels/Parcel[@name="79a/SP199095"]';
    var S = "", X = 0, J, C;
    for (J = 0; J < expr.length; J++) {
      C = expr.charAt(J); S += C;
      if (C == "[") X++;
      else if (C == "]") X--;
      else if (C == "/" && X <= 0) S += "xx:";
    }
    expr = S;
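
    The difference between the two suggestions shows up when a step
    follows the predicate. The sketch below wraps both in functions with
    hypothetical names. The check on the following character in
    prefixByDepth is an addition that is not in the loop above; it is
    there so that a leading // receives a single prefix.

```javascript
// Hypothetical helper names; "xx" is the placeholder prefix from the thread.
function prefixBySplit(expr) {
  // Only the text before the first "[" is altered.
  var parts = expr.split('[');
  parts[0] = parts[0].replace(/(\/+)/g, '$1xx:');
  return parts.join('[');
}

function prefixByDepth(expr) {
  var out = '', depth = 0;
  for (var i = 0; i < expr.length; i++) {
    var c = expr.charAt(i);
    out += c;
    if (c === '[') depth++;
    else if (c === ']') depth--;
    // Outside brackets, add the prefix after the last slash of a run.
    else if (c === '/' && depth <= 0 && expr.charAt(i + 1) !== '/') out += 'xx:';
  }
  return out;
}

var sample = '//Parcels/Parcel[@name="79a/SP199095"]/Center';
console.log(prefixBySplit(sample)); // the step after the predicate is untouched
console.log(prefixByDepth(sample)); // the step after the predicate is prefixed
```

    Both versions count brackets only, so a "[" or "]" inside an attribute
    value would still confuse them.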

    --
    (c) John Stockton, nr London UK. ?@merlyn.demon.co.uk IE8 FF3 Op10 Sf4 Cr4
    news:comp.lang.javascript FAQ <URL:http://www.jibbering.com/faq/index.html>.
    <URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
    <URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
     
    Dr J R Stockton, Mar 24, 2010
    #3
  4. On Mar 23, 11:32pm, RobG wrote:

    > [... ] I need to modify part of the expression to include
    > a random namespace, e.g. change:
    >
    > /LandXML/Parcels/Parcel
    >
    > into something like:
    >
    > /xx:LandXML/xx:Parcels/xx:Parcel
    >
    > Sometimes the expression starts with // so I've been using the
    > following regular expression:
    >
    > expr = expr.replace(/(\/+)/g,'$1xx:');
    >
    > which works fine in most cases. However, sometimes the expression
    > includes an attribute value that has slashes. In that case, I don't
    > want to modify the attribute value's slash. e.g. at the moment,
    >
    > /LandXML/Parcels/Parcel[@name="79a/SP199095"]
    >
    > is converted to:
    >
    > /xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/xx:SP199095"]
    >
    > Modifying the attribute value means that the result will be wrong. Is
    > there a regular expression that will only modify slashes outside
    > square brackets?


    expr = expr.replace(/(?![^[]*\])\//g, '$&xx:');

    Can your attribute value contain square brackets or escaped
    quotation marks? --Antony
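
    The question about square brackets matters because of how the
    lookahead works: it rejects a slash when a "]" can be reached from it
    without passing a "[". A minimal sketch, with a hypothetical function
    name and with "+" added after the slash so that a run such as // is
    one match:

```javascript
// Hypothetical wrapper around the lookahead pattern.
function prefixByLookahead(expr) {
  // (?![^[]*\]) fails when a "]" follows with no "[" in between,
  // i.e. when the slash appears to sit inside a predicate.
  return expr.replace(/(?![^[]*\])\/+/g, '$&xx:');
}

// The slash inside the predicate is left alone.
console.log(prefixByLookahead('/LandXML/Parcels/Parcel[@name="79a/SP199095"]'));

// A "[" inside the attribute value hides the closing "]" from the
// lookahead, so the slash inside the value is rewritten by mistake.
console.log(prefixByLookahead('/A[@n="a/b[c"]'));
```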
     
    Antony Scriven, Mar 24, 2010
    #4
  5. On Mar 24, 3:51pm, (denisb) wrote:

    > Antony Scriven <> wrote:
    >
    > [...]
    >
    > little typo ('+' missing) :
    > expr = expr.replace(/(?![^[]*\])\/+/g, '$&xx:');
    > ....................................^


    Ah yes, I missed that part of Rob's specification,
    thanks. --Antony
     
    Antony Scriven, Mar 24, 2010
    #5
  6. RobG

    RobG Guest

    On Mar 25, 1:09 am, Antony Scriven <> wrote:
    > On Mar 23, 11:32pm, RobG wrote:
    >
    >  > [... ] I need to modify part of the expression to include
    >  > a random namespace, e.g. change:
    >  >
    >  > /LandXML/Parcels/Parcel
    >  >
    >  > into something like:
    >  >
    >  > /xx:LandXML/xx:Parcels/xx:Parcel
    >  >
    >  > Sometimes the expression starts with // so I've been using the
    >  > following regular expression:
    >  >
    >  >  expr = expr.replace(/(\/+)/g,'$1xx:');
    >  >
    >  > which works fine in most cases. However, sometimes the expression
    >  > includes an attribute value that has slashes. In that case, I don't
    >  > want to modify the attribute value's slash. e.g. at the moment,
    >  >
    >  > /LandXML/Parcels/Parcel[@name="79a/SP199095"]
    >  >
    >  > is converted to:
    >  >
    >  > /xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/xx:SP199095"]
    >  >
    >  > Modifying the attribute value means that the result will be wrong. Is
    >  > there a regular expression that will only modify slashes outside
    >  > square brackets?
    >
    > expr = expr.replace(/(?![^[]*\])\//g, '$&xx:');
    >
    > Can your attribute value contain square brackets or escaped
    > quotation marks? --Antony


    It is for a general XPath expression evaluator, so I don't want to
    have any restrictions on attribute values other than those specified
    by XML.

    --
    Rob
     
    RobG, Mar 24, 2010
    #6
  7. RobG

    RobG Guest

    On Mar 25, 12:30 am, Dr J R Stockton <>
    wrote:
    > In comp.lang.javascript message
    > <66446dfb-e9cb-4d82-8146-377423e48aa8@k4g2000prh.googlegroups.com>,
    > Tue, 23 Mar 2010 16:32:24, RobG <> posted:
    >
    >
    >
    > >Modifying the attribute value means that the result will be wrong. Is
    > >there a regular expression that will only modify slashes outside
    > >square brackets?
    >
    > >An alternative is to fix the namespace when building the expression,
    > >which is less elegant than conditionally modifying the expression in
    > >the evaluator function.
    >
    > If you in fact only need to modify / where not somewhere preceded by [,
    > you could probably split on [, RegExp-alter [0], and rejoin.
    >
    >         expr = '/LandXML/Parcels/Parcel[@name="79a/SP199095"]'
    >         S = expr.split("[")
    >         S[0] = S[0].replace(/(\/+)/g,'$1xx:');
    >         expr = S.join("[")
    >
    > gave '/xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/SP199095"]'.
    >
    > Or you could use, which might be more flexible,
    >
    >         expr = '/LandXML/Parcels/Parcel[@name="79a/SP199095"]'
    >         S = "" ; X = 0
    >         for (J=0 ; J<expr.length ; J++) { C = expr.charAt(J) ; S += C
    >           if        (C == "[") X++ ;
    >             else if (C == "]") X-- ;
    >             else if (C == "/" && X<=0) S += "xx:" }
    >         expr = S


    I'd thought of something along those lines, however I think I'll deal
    with it at the expression builder stage, then pass a variable to
    indicate whether a default namespace is being used or not.

    It doesn't matter for IE (as it doesn't know what namespaces are
    anyway), but Firefox is another matter.

    --
    Rob
     
    RobG, Mar 24, 2010
    #7
  8. On Mar 24, 10:38 pm, RobG wrote:

    > On Mar 25, 1:09 am, Antony Scriven <> wrote:
    >
    > > On Mar 23, 11:32pm, RobG wrote:
    >
    > > > [...]
    > > >
    > > > expr = expr.replace(/(\/+)/g,'$1xx:');
    > > >
    > > > [...]
    > > >
    > > > /LandXML/Parcels/Parcel[@name="79a/SP199095"]
    > > >
    > > > is converted to:
    > > >
    > > > /xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/xx:SP199095"]
    > > >
    > > > Modifying the attribute value means that the result
    > > > will be wrong. Is there a regular expression that
    > > > will only modify slashes outside square brackets?
    >
    > > expr = expr.replace(/(?![^[]*\])\//g, '$&xx:');
    >
    > > Can your attribute value contain square brackets or
    > > escaped quotation marks? --Antony
    >
    > It is for a general XPath expression evaluator, so
    > I don't want to have any restrictions on attribute values
    > other than those specified by XML.


    Now you're moving the goalposts. I've no idea what you're
    really trying to accomplish, but it now sounds as if you
    might be better off writing a proper parser rather than
    messing about with regexps. --Antony
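
    As an illustration of that direction, the following is a small
    scanner sketch rather than a proper parser. prefixSteps is a
    hypothetical name. It assumes XPath 1.0 string literals (delimited by
    " or ', with no escape sequences), copies them verbatim, and adds the
    prefix after a run of slashes only when a name follows.

```javascript
// Hypothetical helper; a scanner sketch, not a full XPath parser.
function prefixSteps(expr, prefix) {
  var out = '';
  var i = 0;
  while (i < expr.length) {
    var c = expr.charAt(i);
    if (c === '"' || c === "'") {
      // XPath 1.0 literals have no escapes: copy up to the matching quote.
      var end = expr.indexOf(c, i + 1);
      if (end === -1) end = expr.length - 1; // unterminated: copy the rest
      out += expr.slice(i, end + 1);
      i = end + 1;
    } else if (c === '/') {
      // Copy the whole run of slashes, then prefix a following name.
      while (expr.charAt(i) === '/') { out += '/'; i++; }
      if (/[A-Za-z_]/.test(expr.charAt(i))) out += prefix + ':';
    } else {
      out += c;
      i++;
    }
  }
  return out;
}

console.log(prefixSteps(
  '//LandXML/Parcels/Parcel[@name="79a/SP199095"]/@desc', 'xx'));
```

    This sketch does not prefix a relative first step, and it would
    wrongly prefix function calls such as /text() or axis names such as
    /child::, so a real evaluator would need more than this.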
     
    Antony Scriven, Mar 25, 2010
    #8
  9. Elegie

    Elegie Guest

    RobG wrote:

    Hello Rob,

    <snip>

    > However, sometimes the expression
    > includes an attribute value that has slashes. In that case, I don't
    > want to modify the attribute value's slash. e.g. at the moment,
    >
    > /LandXML/Parcels/Parcel[@name="79a/SP199095"]
    >
    > is converted to:
    >
    > /xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/xx:SP199095"]
    >
    >
    > Modifying the attribute value means that the result will be wrong. Is
    > there a regular expression that will only modify slashes outside
    > square brackets?


    The following should give you some path worth walking :)

    ---
    (s).replace(
      /\/([^[/]*(\[(\\.|[^\\\]]*)*\])?)/g,
      "/xx:$1"
    )
    ---
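
    A quick check of the expression above, with hypothetical wrapper
    names. The first function uses the pattern as posted. The second is a
    variant (not Elegie's) that captures the whole run of slashes, so a
    leading // gets a single prefix. Both still assume that no "]"
    appears inside a quoted value.

```javascript
// Pattern as posted: each slash plus the step (and optional predicate)
// that follows it is one match.
function prefixAsPosted(s) {
  return s.replace(/\/([^[/]*(\[(\\.|[^\\\]]*)*\])?)/g, '/xx:$1');
}

// Variant, an assumption of this sketch: capture the run of slashes in
// $1 and the step in $2, so "//" is kept together.
function prefixWithRuns(s) {
  return s.replace(/(\/+)([^[\/]*(?:\[(?:\\.|[^\\\]])*\])?)/g, '$1xx:$2');
}

var path = '/LandXML/Parcels/Parcel[@name="79a/SP199095"]';
console.log(prefixAsPosted(path));
console.log(prefixAsPosted('//LandXML'));   // prefix lands between the slashes
console.log(prefixWithRuns('/' + path));    // leading "//" gets one prefix
```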

    HTH,
    Elegie.
     
    Elegie, Mar 25, 2010
    #9
  10. On Mar 25, 8:49 pm, Elegie wrote:

    > > [...] In that case, I don't want to modify the
    > > attribute value's slash. e.g. at the moment,
    >
    > > /LandXML/Parcels/Parcel[@name="79a/SP199095"]
    >
    > > is converted to:
    >
    > > /xx:LandXML/xx:Parcels/xx:Parcel[@name="79a/xx:SP199095"]
    >
    > > Modifying the attribute value means that the result
    > > will be wrong. Is there a regular expression that will
    > > only modify slashes outside square brackets?
    >
    > The following should give you some path worth walking :)
    >
    > ---
    >    (s).replace(
    >      /\/([^[/]*(\[(\\.|[^\\\]]*)*\])?)/g,
    >      "/xx:$1"
    >    )


    Well, if you like brittle one-liners, try this.

    var s = '//LandXML/Parcels/Parcel[@name="]7]' +
    '9a/S\\"P19[9\\"]0/9[/5"]/Blah';

    // Add xx: after slashes, only if they are not within "...".
    // Handles \" within "...".
    s.replace(
      /(\/+)(("([^"]*\\.)*[^"]*"|[^\/])+)/g,
      '$1xx:$2'
    );

    There might be a neater regexp but I'm beginning to lose
    the will! And yes, this is now beginning to resemble line
    noise. --Antony
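
    For reference, a sketch that wraps the one-liner in a function
    (hypothetical name) and runs it on the test string above and on a
    path with a step after the predicate. The last call shows a
    limitation: single-quoted values are not recognised as strings.

```javascript
// Hypothetical wrapper around the pattern from this post.
function prefixOutsideDoubleQuotes(s) {
  // $1 is the run of slashes; $2 is everything up to the next slash
  // that is not inside a double-quoted string (\" is allowed inside).
  return s.replace(/(\/+)(("([^"]*\\.)*[^"]*"|[^\/])+)/g, '$1xx:$2');
}

var tricky = '//LandXML/Parcels/Parcel[@name="]7]' +
             '9a/S\\"P19[9\\"]0/9[/5"]/Blah';
console.log(prefixOutsideDoubleQuotes(tricky));

console.log(prefixOutsideDoubleQuotes(
  '/LandXML/Parcels/Parcel[@name="79a/SP199095"]/Center'));

// Single quotes are treated as ordinary characters, so this slash
// inside the value is rewritten.
console.log(prefixOutsideDoubleQuotes("/A[@n='a/b']"));
```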
     
    Antony Scriven, Mar 25, 2010
    #10