Matching substrings within a Regex

Discussion in 'Perl Misc' started by newjazzharmony@hotmail.com, May 16, 2006.

  1. Guest

    Hi,

    I need some help crafting a regular expression.
    Consider the following literal expressions:

    Literal String A:
    AccountNumber="

    Literal String B:
    StreetNumber="

    I want to find all instances where String A is followed by string B
    (with any number of miscellaneous characters in between).

    The catch is that I want to capture only instances where the left four
    digits of the AccountNumber equals the left four digits of the
    StreetNumber.

    Examples:

    I want to capture the following strings:

    AccountNumber="11324324" blablablablabla StreetNumber="11322345"
    AccountNumber="234534" blabla StreetNumber="2345999554"

    I DO NOT want to capture the following string:

    AccountNumber="345635" blabla StreetNumber="6789689"


    Thanks!
    Jonathan
     
    , May 16, 2006
    #1

  2. wrote:
    > Hi,
    >
    > I need some help crafting a regular expression.
    > Consider the following literal expressions:
    >
    > Literal String A:
    > AccountNumber="
    >
    > Literal String B:
    > StreetNumber="
    >
    > I want to find all instances where String A is followed by string B
    > (with any number of miscellaneous characters in between).
    >
    > The catch is that I want to capture only instances where the left four
    > digits of the AccountNumber equals the left four digits of the
    > StreetNumber.
    >
    > Examples:
    >
    > I want to capture the following strings:
    >
    > AccountNumber="11324324" blablablablabla StreetNumber="11322345"
    > AccountNumber="234534" blabla StreetNumber="2345999554"
    >
    > I DO NOT want to capture the following string:
    >
    > AccountNumber="345635" blabla StreetNumber="6789689"


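    use strict;
    use warnings;

    # (\d{4}) captures the first four digits of AccountNumber;
    # \1 requires those same four digits right after StreetNumber=".
    while ( <DATA> ) {
        /AccountNumber="(\d{4}).*StreetNumber="\1/ and print "matched.\n";
    }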

    __DATA__
    AccountNumber="11324324" blablablablabla StreetNumber="1322345"
     
    it_says_BALLS_on_your forehead, May 16, 2006
    #2

  3. David Squire Guest

    wrote:
    > Hi,
    >
    > I need some help crafting a regular expression.
    > Consider the following literal expressions:
    >
    > Literal String A:
    > AccountNumber="
    >
    > Literal String B:
    > StreetNumber="
    >
    > I want to find all instances where String A is followed by string B
    > (with any number of miscellaneous characters in between).
    >
    > The catch is that I want to capture only instances where the left four
    > digits of the AccountNumber equals the left four digits of the
    > StreetNumber.


    What have you tried so far?

    >
    > Examples:
    >
    > I want to capture the following strings:
    >
    > AccountNumber="11324324" blablablablabla StreetNumber="11322345"
    > AccountNumber="234534" blabla StreetNumber="2345999554"
    >
    > I DO NOT want to capture the following string:
    >
    > AccountNumber="345635" blabla StreetNumber="6789689"


    You need to learn about backreferences in regexes. See perldoc perlre.

    For example: /([A-Z]).\1/ matches AaA and B2B, but not AaB or B2A.
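
    A minimal sketch of that example as a runnable Perl script. The variable
    names ($re, $s, $verdict) are illustrative and not part of the original
    post; only the pattern itself comes from the line above.

```perl
use strict;
use warnings;

# \1 must match exactly the text that the first (...) group captured,
# so the character after the middle "." has to repeat the first letter.
my $re = qr/([A-Z]).\1/;

for my $s (qw(AaA B2B AaB B2A)) {
    my $verdict = $s =~ $re ? "matches" : "does not match";
    print "$s $verdict\n";
}
```

    The same idea carries over to the account/street problem: capture the
    four digits once, then refer back to them with \1.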

    DS
     
    David Squire, May 16, 2006
    #3
  4. it_says_BALLS_on_your forehead wrote:
    > wrote:
    > > Hi,
    > >
    > > I need some help crafting a regular expression.
    > > Consider the following literal expressions:
    > >
    > > Literal String A:
    > > AccountNumber="
    > >
    > > Literal String B:
    > > StreetNumber="
    > >
    > > I want to find all instances where String A is followed by string B
    > > (with any number of miscellaneous characters in between).
    > >
    > > The catch is that I want to capture only instances where the left four
    > > digits of the AccountNumber equals the left four digits of the
    > > StreetNumber.
    > >
    > > Examples:
    > >
    > > I want to capture the following strings:
    > >
    > > AccountNumber="11324324" blablablablabla StreetNumber="11322345"
    > > AccountNumber="234534" blabla StreetNumber="2345999554"
    > >
    > > I DO NOT want to capture the following string:
    > >
    > > AccountNumber="345635" blabla StreetNumber="6789689"

    >
    > use strict; use warnings;
    >
    > while ( <DATA> ) {
    > /AccountNumber="(\d{4}).*StreetNumber="\1/ and print "matched.\n";
    > }
    >
    > __DATA__
    > AccountNumber="11324324" blablablablabla StreetNumber="1322345"



    Sorry, this illustrates the OP's request more fully:

    use strict; use warnings;

    while ( <DATA> ) {
        # \1 must repeat the four digits captured after AccountNumber="
        if ( /AccountNumber="(\d{4}).*StreetNumber="\1/ ) {
            print "matched.\n";
        }
        else {
            print "did not match.\n";
        }
    }

    __DATA__
    AccountNumber="11324324" blablablablabla StreetNumber="1322345"
    AccountNumber="11324324" blablablablabla StreetNumber="11322345"

    __OUTPUT__
    did not match.
    matched.
     
    it_says_BALLS_on_your forehead, May 16, 2006
    #4
  5. Guest

    Thanks for the replies.
    I should have specified that I am not actually using PERL...I just
    figured this would be the best place to get expert Regex advice :)

    I was hoping there would be a way to do this without any programmatic
    structures (e.g., the while loop in the example).
    Is there a way to codify all these rules (including the AcctNum to
    StreetNum match) in the regex string itself?

    If possible, I would like to do this in a simple text editor that
    supports regular expressions.

    Thanks,
    Jonathan
     
    , May 16, 2006
    #5
  7. "" <> wrote in
    news::

    > Thanks for the replies.
    > I should have specified that I am not actually using PERL...I just
    > figured this would be the best place to get expert Regex advice :)


    **PLONK**

    Sinan
     
    A. Sinan Unur, May 16, 2006
    #7
  8. DJ Stunks Guest

    A. Sinan Unur wrote:
    > "" <> wrote in
    > news::
    >
    > > Thanks for the replies.
    > > I should have specified that I am not actually using PERL...I just
    > > figured this would be the best place to get expert Regex advice :)

    >
    > **PLONK**


    Agreed.

    Is today Internet Jackass Day? I must have missed a memo because they
    seem to be coming out of the woodwork...

    -jp
     
    DJ Stunks, May 16, 2006
    #8
  9. Guest

    >Is today Internet Jackass Day? I must have missed a memo because they
    >seem to be coming out of the woodwork...

    I'll gladly take the internet jackass label if you can answer my
    question :)
     
    , May 16, 2006
    #9
  10. David Squire Guest

    wrote:
    >> Is today Internet Jackass Day? I must have missed a memo because they
    >> seem to be coming out of the woodwork...

    > I'll gladly take the internet jackass label if you can answer my
    > question :)
    >


    You don't know what *plonk* means do you? See
    http://en.wikipedia.org/wiki/Plonk

    ... and if you intend to continue to post here in the hope that some are
    still seeing you, read the posting guidelines for this group first and
    follow them - starting with properly attributing quoted context.

    DS
     
    David Squire, May 16, 2006
    #10
  11. Guest

    >You don't know what *plonk* means do you?

    Plonk, as a disparaging term for cheap wine, especially cheap red wine,
    is now widely known in the UK and also to a lesser extent in the USA.
    It's so fixed a part of British English that many people are
    surprised to hear that it's originally Australian.
     
    , May 16, 2006
    #11
  12. David Squire Guest

    wrote:
    >> You don't know what *plonk* means do you?

    >
    > Plonk, as a disparaging term for cheap wine, especially cheap red wine,
    > is now widely known in the UK and also to a lesser extent in the USA.
    > It's so fixed a part of British English that many people are
    > surprised to hear that it's originally Australian.
    >


    OK. That was your last chance from me.

    *plonk*
     
    David Squire, May 16, 2006
    #12
  13. <> wrote:

    > Thanks for the replies.



    Whose replies?

    Please quote some context in followups like everybody else does.


    > I should have specified that I am not actually using PERL...



    Making off-topic postings is a quick way to wear out your welcome.


    > I just
    > figured this would be the best place to get expert Regex advice :)



    And you have received the expert Regex advice that you sought.


    > I was hoping there would be a way to do this without any programmatic
    > structures (e.g., the while loop in the example).



    The while loop is only for reading the data.

    It is not germane to the regex that you asked for.
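
    To illustrate, here is a hedged sketch with no loop at all. The pattern
    is the one already given in this thread; $re, $line, and $result are
    hypothetical names chosen for this example, and the input is one of the
    sample strings from the original question.

```perl
use strict;
use warnings;

# The whole rule lives in the pattern: capture four digits after
# AccountNumber=", then require the same four after StreetNumber=".
my $re = qr/AccountNumber="(\d{4}).*StreetNumber="\1/;

# A single input string; nothing here iterates over data.
my $line = 'AccountNumber="234534" blabla StreetNumber="2345999554"';

my $result = $line =~ $re ? "matched" : "did not match";
print "$result\n";
```

    Whether a given text editor accepts this exact pattern depends on its
    regex flavour, in particular on whether it supports \d, {4}, and \1.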


    > Is there a way to codify all these rules (including the AcctNum to
    > StreetNum match) in the regex string itself?



    Yes. And you have already been given that.

    So what is the problem again?


    > If possible, I would like to do this in a simple text editor that
    > supports regular expressions.



    Then you already have everything that you need.

    Use the regex that you've been given.

    If Perl regexes don't work in your "simple text editor" then
    your question is off-topic here in a Perl newsgroup.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, May 16, 2006
    #13