regular expression / gsub question

Discussion in 'Ruby' started by Mmcolli00 Mom, Mar 13, 2009.

  1. Hi
    Do you know how to use a regular expression to get only the script name from
    each filename below? I have long filenames and I want to pull out only the
    "scriptname" segment. I have been using a regular expression with
    gsub for this.

    filename

    userfilename_scriptname_030109.txt
    userfilename3_scriptname1_031109.txt
    userfilename_scriptname0_031209.txt


    The gsub didn't work because the _ on both sides causes me to delete the
    whole filename. What would you recommend?

    stripfirstpart = filename.gsub(/*_/,"")
    stripsecondpart = filename.gsub(/_.*/,"")
    --
    Posted via http://www.ruby-forum.com/.
    Mmcolli00 Mom, Mar 13, 2009
    #1

  2. Mmcolli00 Mom wrote:
    > Hi
    > Do you know how to use a regular expression to get only the script name from
    > each filename below? I have long filenames and I want to pull out only the
    > "scriptname" segment. I have been using a regular expression with
    > gsub for this.
    >
    > filename
    >
    > userfilename_scriptname_030109.txt
    > userfilename3_scriptname1_031109.txt
    > userfilename_scriptname0_031209.txt
    >
    >
    > The gsub didn't work because the _ on both sides causes me to delete the
    > whole filename. What would you recommend?
    >
    > stripfirstpart = filename.gsub(/*_/,"")
    > stripsecondpart = filename.gsub(/_.*/,"")


    I like using #[] for this because it lets you think in terms of what you
    want to keep rather than what you want to remove.

    filename[/_(.*?)_/, 1]

    The ? is there in case there are more underscores later in the filename.
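
A minimal sketch of this approach, run against the three sample filenames from the question. The names `filenames` and `extract` are made up for illustration and are not part of the original post.

```ruby
# Sample filenames taken from the question above.
filenames = %w[
  userfilename_scriptname_030109.txt
  userfilename3_scriptname1_031109.txt
  userfilename_scriptname0_031209.txt
]

# String#[] with a regexp and a capture index returns the text of that
# capture group, or nil when the pattern does not match.
extract = ->(filename) { filename[/_(.*?)_/, 1] }

filenames.each { |name| puts extract.call(name) }
# prints:
# scriptname
# scriptname1
# scriptname0
```

Because the result is nil when nothing matches, a caller would probably want to check for nil before using the value.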

    --
    vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
    Joel VanderWerf, Mar 13, 2009
    #2

  3. Mmcolli00 Mom, Mar 13, 2009
    #3
  4. On 13.03.2009 20:46, Joel VanderWerf wrote:
    > Mmcolli00 Mom wrote:
    >> Hi
    >> Do you know how to use a regular expression to get only the script name from
    >> each filename below? I have long filenames and I want to pull out only the
    >> "scriptname" segment. I have been using a regular expression with
    >> gsub for this.
    >>
    >> filename
    >>
    >> userfilename_scriptname_030109.txt
    >> userfilename3_scriptname1_031109.txt
    >> userfilename_scriptname0_031209.txt
    >>
    >>
    >> The gsub didn't work because the _ on both sides causes me to delete the
    >> whole filename. What would you recommend?
    >>
    >> stripfirstpart = filename.gsub(/*_/,"")
    >> stripsecondpart = filename.gsub(/_.*/,"")

    >
    > I like using #[] for this because it lets you think in terms of what you
    > want to keep rather than what you want to remove.
    >
    > filename[/_(.*?)_/, 1]
    >
    > The ? is there in case there are more underscores later in the filename.


    AFAIK it is more robust and also more efficient to do

    filename[/_([^_]+)_/, 1]

    or even

    filename[/_(scriptname\d*)_/, 1]

    or even

    filename[/\Auserfilename\d*_(scriptname\d*)_\d+\.txt\z/, 1]

    In other words, define explicitly and precisely what you want to
    match rather than relying on the (non)greediness of repetition operators.
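
As a rough illustration of the trade-off, the sketch below compares the negated character class with the fully anchored pattern. The second filename, `notes_draft_final.txt`, is a hypothetical input that does not follow the layout from the question.

```ruby
expected = "userfilename3_scriptname1_031109.txt"
unexpected = "notes_draft_final.txt" # hypothetical name, not from the thread

loose  = /_([^_]+)_/
strict = /\Auserfilename\d*_(scriptname\d*)_\d+\.txt\z/

# Both patterns agree on a filename that has the expected layout.
p expected[loose, 1]    # prints "scriptname1"
p expected[strict, 1]   # prints "scriptname1"

# The loose pattern still captures something from an unrelated filename,
# while the anchored pattern rejects it and returns nil.
p unexpected[loose, 1]  # prints "draft"
p unexpected[strict, 1] # prints nil
```

Whether silently returning "draft" or returning nil is preferable depends on how much the filenames can be trusted.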

    Kind regards

    robert
    Robert Klemme, Mar 14, 2009
    #4
  5. You can also use filename.split(/_/)[1], which is probably not as
    efficient, but it gives you all three parts of the string.
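
A short sketch of the split approach, assuming every filename has exactly two underscores as in the samples. The variable names are illustrative only.

```ruby
filename = "userfilename3_scriptname1_031109.txt"

# Splitting on underscores yields three parts for the sample filenames.
user_part, script_part, date_part = filename.split(/_/)

p user_part    # prints "userfilename3"
p script_part  # prints "scriptname1"
p date_part    # prints "031109.txt"

# File.basename with a suffix argument drops the extension from the last part.
p File.basename(date_part, ".txt") # prints "031109"
```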
    --
    Posted via http://www.ruby-forum.com/.
    Milan Dobrota, Mar 14, 2009
    #5
  6. Milan Dobrota wrote:
    > You can also do a filename.split(/_/)[1]
    >
    > which is probably not that
    > efficient but you can get all three parts of the string.


    Do you mean "not that efficient" in the sense that ruby programs are
    ponderously slow and that isn't?
    --
    Posted via http://www.ruby-forum.com/.
    7stud --, Mar 14, 2009
    #6
  7. 7stud -- wrote:
    > Milan Dobrota wrote:
    >> You can also do a filename.split(/_/)[1]
    >>
    >> which is probably not that
    >> efficient but you can get all three parts of the string.

    >
    > Do you mean "not that efficient" in the sense that ruby programs are
    > ponderously slow and that isn't?

    I still believe that
    filename[/\Auserfilename\d*_(scriptname\d*)_\d+\.txt\z/, 1]
    is the most efficient way of doing that. I just provide this as another
    option.
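
The efficiency claims in this thread are not backed by measurements. One way to check them would be a small benchmark like the sketch below, which uses Ruby's standard Benchmark module. The iteration count and labels are arbitrary choices for this sketch, and results will vary by Ruby version and machine.

```ruby
require "benchmark"

filename = "userfilename3_scriptname1_031109.txt"
iterations = 200_000 # arbitrary count chosen for this sketch

# Each candidate is a lambda so the benchmark loop treats them uniformly.
candidates = {
  "lazy .*?"      => -> { filename[/_(.*?)_/, 1] },
  "negated class" => -> { filename[/_([^_]+)_/, 1] },
  "anchored"      => -> { filename[/\Auserfilename\d*_(scriptname\d*)_\d+\.txt\z/, 1] },
  "split"         => -> { filename.split(/_/)[1] }
}

# Prints one timing row per candidate.
Benchmark.bm(15) do |bm|
  candidates.each do |label, code|
    bm.report(label) { iterations.times { code.call } }
  end
end
```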
    --
    Posted via http://www.ruby-forum.com/.
    Milan Dobrota, Mar 14, 2009
    #7