RegEx Help, Please? (match after n)

Discussion in 'Perl Misc' started by Smarta55 Chris, Jun 27, 2005.

  1. Can someone help me find the last n folders in a path,
    or the last n OUs in a fully qualified Netware name... for an app object,
    etc?

    Example:
    \\server\volume\share\folder\folder\folder
    I want to grab just the last 2 folders (including or excluding the \s), but
    sometimes there are 3 folders after the share, like in the example, and
    sometimes there are more.

    Same for an app object in eDirectory
    .appobject.dept.biz.site.county.state.tree
    How do I grab just the .site.county.state.tree OUs?
    Sometimes there are 5 OUs after the app object, and sometimes more.
    I want, I think, to "start at the end of the string, and grab everything
    after the 4th dot from the end," but don't have a clue how to code this.

    I'm not a perl newbie, but I'm no expert....and I AM a RegEx idiot!

    Here's some of what I've tried:
    $string = ".appobject.dept.biz.site.county.state.tree";
    $string =~ /\..+\..+\..+/$; # i.e. match a literal dot followed by anything mult
                                # times, followed by another literal dot, followed,
                                # etc... starting at the end of the string
    $string =~ /(\.\W)*/; # \w does NOT match AlphaNum like it's supposed to,
                          # \W DOES, but it's NOT supposed to,
                          # unless . is an AlphaNum character, but I'm not sure

    And I can't remember the 4,612 other patterns I've tried, or the
    accompanying 4,612 hairs I've
    yanked from my scalp in the process. :-(

    Thank you,
    Chris
     
    Smarta55 Chris, Jun 27, 2005
    #1

  2. Smarta55 Chris wrote:
    > Can someone help me find the last n folders in a path,
    > or the last n OUs in a fully qualified Netware name... for an app object,
    > etc?
    >
    > Example:
    > \\server\volume\share\folder\folder\folder
    > I want to grab just the last 2 folders (including or excluding the \s), but
    > sometimes there
    > are 3 folders after the share, like in the example, and sometimes there are
    > more.
    >
    > Same for an app object in eDirectory
    > .appobject.dept.biz.site.county.state.tree
    > How to I grab just the .site.county.state.tree OUs?
    > Sometimes there are 5 OUs after the app object, and sometimes more.
    > I want, I think, to "start at the end of the string, and grab everything
    > after the 4th dot from the end," but don't have a clue how to code this.
    >
    > I'm not a perl newbie, but I'm no expert....and I AM a RegEx idiot!


    Then don't overuse them. How about this for your first example:

    my @folders = ( split /\\/, $path )[-2..-1];

    Leaving the second example as an exercise for you. ;-)

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Jun 27, 2005
    #2

  3. Thank you. I used:
    my @ous = (split /\.,/,$string)[-2..-1];
    and got back 'folder folder'. Changing -2 to -3 brought back
    'folder folder folder'.
    I've used split before, but I don't understand the [-2..-1] part.
    I ruled out using split because I'll never know how long the string
    will be, or how many OUs will follow the object in its FQN.
    Also, it's not returning what I need....obviously my thought/need wasn't
    translated well to my text..

    When I match against ".appobject.dept.biz.site.county.state.tree", and
    need to return the last 4 OUs, what I need back is ".site.county.state.tree".
    The leading dot before site isn't important (but is preferred), but the
    3 dots between site, county, state, and tree are needed.
    I plan to append a new object, and additional OU(s), before what's returned.

    Thank you for your help,
    Chris


    "Gunnar Hjalmarsson" <> wrote in message
    news:...
    >
    > Then don't overuse them. How about this for your first example:
    >
    > my @folders = ( split /\\/, $path )[-2..-1];
    >
    > Leaving the second example as an exercise for you. ;-)
    >
    > --
    > Gunnar Hjalmarsson
    > Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Smarta55 Chris, Jun 27, 2005
    #3
  4. "Smarta55 Chris" <smarta55ATcomcastDOTnet> wrote in
    news::

    > Can someone help me find the last n folders in a path,
    > or the last n OUs


    You are missing an 'I' there ... Seriously, though, I don't know what
    you are referring to.

    > Example:
    > \\server\volume\share\folder\folder\folder
    > I want to grab just the last 2 folders (including or excluding the
    > \s), but sometimes there are 3 folders after the share, like in the
    > example, and sometimes there are more.


    OK.

    > Same for an app object in eDirectory
    > .appobject.dept.biz.site.county.state.tree
    > How to I grab just the .site.county.state.tree OUs?
    > Sometimes there are 5 OUs after the app object, and sometimes more.
    > I want, I think, to "start at the end of the string, and grab
    > everything after the 4th dot from the end," but don't have a clue how
    > to code this.


    perldoc perlfunc
    perldoc -f reverse
    perldoc -f split

    In addition,

    perldoc -f rindex
    perldoc -f substr

    might also be of interest, in case you want to write a different
    algorithm.
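
As a rough sketch of what the rindex/substr route could look like (this is not code from the thread, and the sub name tail_after_nth_delim is made up for illustration): search backwards from the end of the string with rindex until the n-th delimiter is found, then return everything from that position with substr.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the tail of $s that
# starts at the $n-th occurrence of $d counted from the end, delimiter
# included. Returns nothing if $s holds fewer than $n delimiters.
sub tail_after_nth_delim {
    my ($s, $d, $n) = @_;
    my $pos = length $s;                  # begin the search at the end
    for (1 .. $n) {
        return if $pos <= 0;              # nothing left to search
        $pos = rindex $s, $d, $pos - 1;   # previous delimiter, moving left
        return if $pos < 0;               # fewer than $n delimiters
    }
    return substr $s, $pos;
}

print tail_after_nth_delim(
    '.appobject.dept.biz.site.county.state.tree', '.', 4), "\n";
```

Because the search never looks at anything left of the n-th delimiter, the total number of components does not matter.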

    > I'm not a perl newbie, but I'm no expert....and I AM a RegEx idiot!


    Probably not, but you need to look at this calmly.

    > Here's some of what I've tried:
    > $string = ".appobject.dept.biz.site.county.state.tree";
    > $string =~ /\..+\..+\..+/$; # i.e. match a literal dot followed my
    > anything mult times, followed by another literal dot, followed, etc...
    > # starting at the end of the
    > string
    > $string =~ /(\.\W)*/; # \w does NOT match AlphaNum like it's
    > supposed to, \W DOES, but it's NOT supposed to.
    > # unless . is an AlphaNum character,
    > but I'm
    > not sure


    This looks like Def Poetry to me.

    > And I can't remember the 4,612 other patterns I've tried, or the
    > accompanying 4,612 hairs I've yanked from my scalp in the process.
    > :-(


    Here are two examples to get you started:

    First:

    #!/usr/bin/perl

    use strict;
    use warnings;

    sub last_n_components {
        my ($s, $d, $n) = @_;
        join($d, reverse((reverse split /\Q$d\E/, $s) [0 .. $n - 1]))
    }

    print +last_n_components(
        q{.appobject.dept.biz.site.county.state.tree},
        '.', 2
    )."\n";

    print +last_n_components(
        q{\\\\server\volume\share\folder\folder\folder},
        '\\', 2
    )."\n";

    __END__

    Now, if you are going to deal with paths, your life will be much easier
    if you do it portably from the start:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use File::Spec::Functions qw'catdir splitdir';

    sub last_n_path_components {
        my ($s, $n) = @_;
        catdir reverse((reverse splitdir $s)[0 .. $n - 1]);
    }

    print +last_n_path_components(
        q{\\\\server\volume\share\folder\folder\folder}, 2)."\n";

    __END__


    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Jun 27, 2005
    #4
  5. "Smarta55 Chris" <smarta55ATcomcastDOTnet> wrote in
    news::

    [ Top-posting fixed. Please do not do that. ]

    > "Gunnar Hjalmarsson" <> wrote in message
    > news:...
    >>
    >> Then don't overuse them. How about this for your first example:
    >>
    >> my @folders = ( split /\\/, $path )[-2..-1];
    >>
    >> Leaving the second example as an exercise for you. ;-)


    > my @ous = (split /\.,/,$string)[-2..-1]; and got back 'folder
    > folder'.
    > changing
    > -2 to -3 brought back 'folder folder folder'


    You mentioned something about not being a newbie. You do need to read
    the first few chapters of 'Learning Perl'.

    @ous is an array. You can join the elements of that array to get whatever
    you want.

    > I've used split before, but I don't understand the [-2..-1] part.


    That is an array slice. It selects all the elements from the second to
    last to the last one from the array returned by split. It is simpler and
    more efficient than the double reverse I used.

    > I ruled out using split because I'll never know how long the string
    > will be, or how many OUs will follow the object in its FQN.


    WTF? Is this an acronym contest?

    Anyway, by indexing from the end of the array, you do not need to know
    how many components in total there are, because, as you said, you are
    only interested in the last n.
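
To make that concrete, here is a small sketch (the sub name tail4 and the two sample names are invented for the illustration) showing the same negative slice applied to names of different lengths:

```perl
use strict;
use warnings;

# Hypothetical helper: the last four dot-separated components of a name,
# rejoined with dots. The slice counts from the end, so the number of
# leading components is irrelevant.
sub tail4 {
    my ($name) = @_;
    return join '.', (split /\./, $name)[-4 .. -1];
}

# Both calls print site.county.state.tree
print tail4('.app.site.county.state.tree'), "\n";
print tail4('.app.dept.biz.division.site.county.state.tree'), "\n";
```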

    > Also, it's not returning what I need ...


    Yes it is. You just do not know Perl very well.

    > When I match against ".appobject.dept.biz.site.county.state.tree",
    > and need to return the last 4 OUs, what I need back is
    > ".site.county.state.tree" The leading dot before
    > site isn't important (but is preferred), but the 3 dots between site,
    > county, state, and tree are needed.


    Then join the elements of the returned array using a dot.

    That is what

    perldoc -f join

    is for.
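
A minimal sketch of the slice-plus-join combination, assuming the goal described earlier in the thread (keep the last 4 OUs, then put a new object and OU in front). The names .newapp and .newbiz are placeholders, not anything from the original post:

```perl
use strict;
use warnings;

my $fqn = '.appobject.dept.biz.site.county.state.tree';

# Keep the last four OUs, restore the dots between them, and add the
# leading dot back.
my $suffix = '.' . join '.', (split /\./, $fqn)[-4 .. -1];

# Placeholder object and OU names, prepended to the suffix.
my $new_fqn = '.newapp.newbiz' . $suffix;

print "$suffix\n";     # .site.county.state.tree
print "$new_fqn\n";    # .newapp.newbiz.site.county.state.tree
```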

    Sinan
    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Jun 27, 2005
    #5
  6. "A. Sinan Unur" <> wrote in
    news:Xns9681E664C2F08asu1cornelledu@127.0.0.1:


    > sub last_n_components {
    > my ($s, $d, $n) = @_;
    > join($d, reverse((reverse split /\Q$d\E/, $s) [0 .. $n - 1]))
    > }


    Ahem ... This is what happens if you try to watch TV and post at the
    same time.

    Sorry about that.

    This should be better:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use File::Spec::Functions qw'catdir splitdir';

    sub last_n_components {
        my ($s, $d, $n) = @_;
        join $d, (split /\Q$d\E/, $s)[-$n .. -1];
    }

    sub last_n_path_components {
        my ($s, $n) = @_;
        catdir( (splitdir $s)[-$n .. -1] );
    }

    print +last_n_components(
        q{.appobject.dept.biz.site.county.state.tree},
        '.', 2
    )."\n";

    print +last_n_path_components(
        q{\\\\server\volume\share\folder\folder\folder}, 2)."\n";

    __END__
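
One caveat worth a sketch: the functions exported by File::Spec::Functions follow the conventions of the platform the script runs on, so on a Unix-like system splitdir returns a backslash-separated path as a single component. If the input is always a Windows or NetWare style path, one option is to call File::Spec::Win32 directly. The sub name below is hypothetical and the folder names are changed only to make the output readable:

```perl
use strict;
use warnings;
use File::Spec::Win32;

# Hypothetical variant of last_n_path_components that always applies
# Windows path rules, whatever platform the script runs on.
sub last_n_win32_components {
    my ($s, $n) = @_;
    my @dirs = File::Spec::Win32->splitdir($s);
    return File::Spec::Win32->catdir(@dirs[-$n .. -1]);
}

print last_n_win32_components(
    q{\\\\server\volume\share\folder1\folder2\folder3}, 2), "\n";
```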



    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Jun 27, 2005
    #6
  7. Thank you....for teaching me that I am, after all, a perl newbie.
    I'm more confused than before. I'll check the perldocs you
    mention, then assign this script to someone else at work. :-(

    There's no simple, single-line $ao =~ /some pattern or other/; that'll work?
    I just want to append an app object's name to a biz name, then to the last 4
    OUs of the match, and create it in eDirectory with a 3rd party utility. :-(

    Thanks for your help!
    Chris

    "A. Sinan Unur" <> wrote in message
    news:Xns9681E664C2F08asu1cornelledu@127.0.0.1...
    > "Smarta55 Chris" <smarta55ATcomcastDOTnet> wrote in
    > news::
    >
    > > Can someone help me find the last n folders in a path,
    > > or the last n OUs

    >
    > You are missing an 'I' there ... Seriously, though, I don't know what
    > you are referring to.
    >
    > > Example:
    > > \\server\volume\share\folder\folder\folder
    > > I want to grab just the last 2 folders (including or excluding the
    > > \s), but sometimes there are 3 folders after the share, like in the
    > > example, and sometimes there are more.

    >
    > OK.
    >
    > > Same for an app object in eDirectory
    > > .appobject.dept.biz.site.county.state.tree
    > > How to I grab just the .site.county.state.tree OUs?
    > > Sometimes there are 5 OUs after the app object, and sometimes more.
    > > I want, I think, to "start at the end of the string, and grab
    > > everything after the 4th dot from the end," but don't have a clue how
    > > to code this.

    >
    > perldoc perlfunc
    > perldoc -f reverse
    > perldoc -f split
    >
    > In addition,
    >
    > perldoc -f rindex
    > perldoc -f substr
    >
    > might also be of interest, in case you want to write a different
    > algorithm.
    >
    > > I'm not a perl newbie, but I'm no expert....and I AM a RegEx idiot!

    >
    > Probably not, but you need to look at this calmly.
    >
    > > Here's some of what I've tried:
    > > $string = ".appobject.dept.biz.site.county.state.tree";
    > > $string =~ /\..+\..+\..+/$; # i.e. match a literal dot followed my
    > > anything mult times, followed by another literal dot, followed, etc...
    > > # starting at the end of the
    > > string
    > > $string =~ /(\.\W)*/; # \w does NOT match AlphaNum like it's
    > > supposed to, \W DOES, but it's NOT supposed to.
    > > # unless . is an AlphaNum character,
    > > but I'm
    > > not sure

    >
    > This looks like Def Poetry to me.
    >
    > > And I can't remember the 4,612 other patterns I've tried, or the
    > > accompanying 4,612 hairs I've yanked from my scalp in the process.
    > > :-(

    >
    > Here are two examples to get you started:
    >
    > First:
    >
    > #!/usr/bin/perl
    >
    > use strict;
    > use warnings;
    >
    > sub last_n_components {
    > my ($s, $d, $n) = @_;
    > join($d, reverse((reverse split /\Q$d\E/, $s) [0 .. $n - 1]))
    > }
    >
    > print +last_n_components(
    > q{.appobject.dept.biz.site.county.state.tree},
    > '.', 2
    > )."\n";
    >
    > print +last_n_components(
    > q{\\\\server\volume\share\folder\folder\folder},
    > '\\', 2
    > )."\n";
    >
    > __END__
    >
    > Now, if you are going to deal with paths, your life will be much easier
    > if you do it portably from the start:
    >
    > #!/usr/bin/perl
    >
    > use strict;
    > use warnings;
    >
    > use File::Spec::Functions qw'catdir splitdir';
    >
    > sub last_n_path_components {
    > my ($s, $n) = @_;
    > catdir reverse((reverse splitdir $s)[0 .. $n - 1]);
    > }
    >
    > print +last_n_path_components(
    > q{\\\\server\volume\share\folder\folder\folder}, 2)."\n";
    >
    > __END__
    >
    >
    > --
    > A. Sinan Unur <>
    > (reverse each component and remove .invalid for email address)
    >
    > comp.lang.perl.misc guidelines on the WWW:
    > http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    Smarta55 Chris, Jun 27, 2005
    #7
  8. "Smarta55 Chris" <smarta55ATcomcastDOTnet> wrote in
    news::

    > Thank you....


    You are welcome. However, please note that top-posting and full-quoting
    are generally not useful in facilitating a productive exchange.

    Please do read the posting guidelines for this group. They contain
    valuable information on how you can help yourself, and help others help
    you.

    > for teaching me that I am, after all, a perl newbie.


    >> "Smarta55 Chris" <smarta55ATcomcastDOTnet> wrote in
    >> news::
    >>
    >> > I'm not a perl newbie,


    > I'm more confused than before. I'll check the perldocs you
    > mention, then assign this script to someone else at work. :-(


    Well, to be honest, my double reverse was very convoluted, and I have
    only myself (no, actually, HBO) to blame.

    > There's no simple, single-line $ao =~ /some pattern or other/;


    There is no reason to bring out the big guns for this kind of thing.
    Besides, I would probably mess that up even worse than I messed up with
    the double reverse.

    Sinan
    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Jun 27, 2005
    #8

  9. > You are welcome. However, please note that top-posting and full-quoting
    > are generally not useful in facilitating a productive exchange.
    >
    > Please do read the posting guidelines for this group. They contain
    > valuable information on how you can help yourself, and help others help
    > you.


    Now I'm a newsgroups newbie, too! :)

    Thanks
     
    Smarta55 Chris, Jun 27, 2005
    #9
  10. Smarta55 Chris wrote:
    > Thank you. I used:
    > my @ous = (split /\.,/,$string)[-2..-1]; and got back 'folder folder'.

    ----------------------^
    ??

    > changing
    > -2 to -3 brought back 'folder folder folder'
    > I've used split before, but I don't understand the [-2..-1] part.


    split() returns a list, and the above picks a slice of the list.

    -1 represents the last, and -2 represents the second last element of a
    list, so -2..-1 are 'all' (two) elements between the second last and the
    last. See "perldoc perldata".

    Sinan made use of the reverse() function instead in his solutions.

    > I ruled out using split because I'll never know how long the string
    > will be, or how many OUs will follow the object in its FQN.
    > Also, it's not returning what I need....obviously my thought/need wasn't
    > translated well to my text..
    >
    > When I match against ".appobject.dept.biz.site.county.state.tree", and
    > need to return the last 4 OUs, what I need back is ".site.county.state.tree"
    > The leading dot before
    > site isn't important (but is preferred), but the 3 dots between site,
    > county, state, and tree are needed.


    Well, you could use join(). Or maybe just:

    my $n = 4;
    print $& if $string =~ /(?:\.[^.]+){$n}$/;
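
For readers who find the pattern dense, here is the same idea spelled out with the /x modifier and a capture group in place of $&. This is only a sketch; the variable name $tail is an addition. In older Perl releases, using $& anywhere in a program slowed down all regex matches, which is one reason to prefer a capture.

```perl
use strict;
use warnings;

my $string = '.appobject.dept.biz.site.county.state.tree';
my $n      = 4;

# Same pattern, spaced out under /x, with the match captured explicitly.
my ($tail) = $string =~ /
    (                        # capture the whole tail
      (?: \. [^.]+ ){$n}     # a dot plus a run of non-dots, n times
    )
    $                        # anchored at the end of the string
/x;

print "$tail\n" if defined $tail;    # .site.county.state.tree
```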

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Jun 27, 2005
    #10
  11. "Smarta55 Chris" <smarta55ATcomcastDOTnet> wrote in
    news::

    >> You are welcome. However, please note that top-posting and
    >> full-quoting are generally not useful in facilitating a productive
    >> exchange.
    >>
    >> Please do read the posting guidelines for this group. They contain
    >> valuable information on how you can help yourself, and help others
    >> help you.

    >
    > Now I'm a newsgroups newbie, too! :)


    Well, we all started somewhere :)

    Now, the next thing to take care of is to make sure you properly
    attribute the quoted portions, so that people can keep track of who said
    what.

    I see you are using Outlook Express on Windows: May I recommend XNews as
    a proper newsreader?

    <URL:http://xnews.3dnews.net/>

    Sinan

    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Jun 27, 2005
    #11
  12. A. Sinan Unur wrote:
    > "Smarta55 Chris" wrote:
    >> I've used split before, but I don't understand the [-2..-1] part.

    >
    > That is an array slice. It selects all the elements from the second to
    > last to the last one from the array returned by split.


    s/array/list/g;

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Jun 27, 2005
    #12
  13. Smarta55 Chris wrote:

    >>> Can someone help me find the last n folders in a path,
    >>> or the last n OUs in a fully qualified Netware name... for an app object,
    >>> etc?
    >>>
    >>>Example:
    >>>\\server\volume\share\folder\folder\folder
    >>>I want to grab just the last 2 folders (including or excluding the
    >>>\s), but sometimes there are 3 folders after the share, like in the
    >>>example, and sometimes there are more.


    <snip dialog in between>

    > There's no simple, single-line $ao =~ /some pattern or other/; that'll work?


    #!perl -w

    my $n = 4; # number of folders to match
    my $regexp = qr/((?:\\[^\\]+){$n})$/;

    while (<STDIN>) {
        chomp;
        (my $folders) = m/$regexp/; # matches only if 4 or more folders
        print "$folders\n" if defined $folders; # skip lines that did not match

        # or
        # /((?:\\[^\\]+){MIN,MAX})$/
        # match *between* MIN and MAX number of folders

        # or
        # /((?:\\[^\\]+){MIN,})$/
        # match *at least* MIN number of folders

        # etc...

    }
    __END__
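
A short sketch of how the three quantifier forms behave on one sample path. The path and the variable names are made up for the illustration:

```perl
use strict;
use warnings;

my $path = q{\\\\server\volume\share\a\b\c};

# {2}   : exactly the last two components
# {2,3} : the last three if available, otherwise the last two
# {9,}  : no match here, since the path has fewer than nine components
my ($exact)   = $path =~ /((?:\\[^\\]+){2})$/;
my ($between) = $path =~ /((?:\\[^\\]+){2,3})$/;
my ($atleast) = $path =~ /((?:\\[^\\]+){9,})$/;

print "exact:    $exact\n";      # \b\c
print "between:  $between\n";    # \a\b\c
print "at least: ", (defined $atleast ? $atleast : '(no match)'), "\n";
```

Since the pattern is anchored at the end and the regex engine takes the leftmost match, the ranged form returns as many trailing components as the upper bound allows.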

    Dave
     
    Dave A., Jun 27, 2005
    #13
  14. Dave A. wrote:

    > Smarta55 Chris wrote:


    >>There's no simple, single-line $ao =~ /some pattern or other/; that'll work?


    I want to stress that I would never use a regular expression in a
    situation such as this, where a simple *split* is clearly the right tool
    for the job. The code below was intended as an example only.

    Why use error-prone regular expressions when one can write, e.g.,
    @folders = split( m![\\/]!, $path )?
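
Building on that split, a sketch of a complete helper (the name last_n_folders is hypothetical). It assumes empty fields, such as the ones produced by the leading double backslash of a UNC path, should be discarded, and that a path with fewer than n folders should simply return what it has:

```perl
use strict;
use warnings;

# Hypothetical helper: last $n folders of a path that may use either
# backslashes or forward slashes as separators.
sub last_n_folders {
    my ($path, $n) = @_;
    my @folders = grep { length } split m![\\/]!, $path;   # drop empty fields
    return @folders >= $n ? @folders[-$n .. -1] : @folders;
}

print join('\\', last_n_folders(q{\\\\server\volume\share\f1\f2\f3}, 2)), "\n";
```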

    Dave

    > #!perl -w
    >
    > my $n = 4; # number of folders to match
    > my $regexp = qr/((?:\\[^\\]+){$n})$/;
    >
    > while (<STDIN>) {
    > chomp;
    > (my $folders) = m/$regexp/; # matches only if 4 or more folders
    > print "$folders\n";
    >
    > # or
    > # /((?:\\[^\\]+){MIN,MAX})$/
    > # match *between* MIN and MAX number of folders
    >
    > # or
    > # /((?:\\[^\\]+){MIN,})$/
    > # match *at least* MIN number of folders
    >
    > # etc...
    >
    > }
    > __END__
     
    Dave A., Jun 27, 2005
    #14