extracting numbers from a string

Discussion in 'Ruby' started by Matt Jones, Jun 12, 2007.

  1. Matt Jones

    Matt Jones Guest

    I have filenames from various digital cameras: DSC_1234.jpg,
    CRW1234.jpg, etc. What I really want is the numeric portion of that
    filename. How would I extract just that portion?

    I expect it to involve the regex /\d+/, but I'm unclear how to extract a
    portion of a string matching a regex.

    Thank you
    Matt Jones, Jun 12, 2007

  2. Dan Zwell

    Dan Zwell Guest

    This may be the simplest (and arguably the most ruby-esque):
    str = "DSC_1234.jpg"
    num = str.scan(/\d+/)[0]

    Other ways to do it:
    num = str.match(/\d+/)[0]

    num = (/\d+/).match(str)[0]

    num = nil
    str.scan(/\d+/) {|match| num ||= match}   # with a block, scan returns str itself, so assign inside

    num = str =~ /(\d+)/ ? $1 : nil

    That is,
    num = if str =~ /(\d+)/ then $1 else nil end

    if str =~ /\d+/
      num = $~[0]
    end
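
    One caveat on the variants above: when the string contains no digits, they do not all fail the same way. The sketch below is only an illustration (the filename "holiday.jpg" is made up); it shows which forms quietly give nil and which raise.

```ruby
# Hypothetical input with no digits at all, for illustration only.
no_digits = "holiday.jpg"

# scan returns an empty array, so [0] is nil.
from_scan = no_digits.scan(/\d+/)[0]

# match returns nil on failure, so calling [0] on the result raises NoMethodError.
from_match = begin
  no_digits.match(/\d+/)[0]
rescue NoMethodError
  :raised
end

# The =~ form with a ternary falls through to nil.
from_ternary = no_digits =~ /(\d+)/ ? $1 : nil

p from_scan     # nil
p from_match    # :raised
p from_ternary  # nil
```

    If a missing number is a normal case for your files, the forms that return nil are probably easier to live with.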

    Some proponents of Ruby have said that Perl's "There is more than one
    way to do it" is a curse. But the same is true of Ruby. However, it
    seems to me that most people learn reasonable idioms and common sense

    Dan Zwell, Jun 12, 2007

  3. a = "DSC_1234.jpg"
    b = a.gsub(/[^[:digit:]]/, '')
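
    Worth noting, as a hedged aside: gsub with a negated class deletes every non-digit, so a name with more than one run of digits comes back with the runs joined. A small sketch, using the "100_5142.jpg" style of name mentioned elsewhere in this thread:

```ruby
# Removing all non-digits keeps every digit in the string.
single_run = "DSC_1234.jpg".gsub(/[^[:digit:]]/, '')
two_runs   = "100_5142.jpg".gsub(/[^[:digit:]]/, '')

# String#[] with a regex returns only the first run of digits.
first_run  = "100_5142.jpg"[/\d+/]

p single_run  # "1234"
p two_runs    # "1005142"
p first_run   # "100"
```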
    Michael W. Ryder, Jun 12, 2007
  4. come

    come Guest

    If you just want to extract one number from a string, you could write
    something like this:

    if a = "DSC_1234.jpg"

    then a[/\d+/] will give you the first run of consecutive digits
    ("1234" here).

    If you want to be more precise, you could use parentheses to extract
    the exact portion you want, like this:

    a[/DSC_(\d+)\.jpg/,1] (equivalent to a.match(/DSC_(\d+)\.jpg/)[1])

    or even : a[/\ADSC_(\d+)\.jpg\Z/,1]
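
    To cover both camera prefixes from the original question with one anchored pattern, something like the following might do. This is only a sketch: the method name photo_number and the assumption that the prefix consists of letters and underscores are mine, not from the thread.

```ruby
# Hypothetical helper: returns the digits of a name shaped like
# <letters/underscores><digits>.jpg, or nil if the name has another shape.
def photo_number(name)
  # \A and \z anchor the whole string; the trailing i ignores case (.JPG).
  name[/\A[A-Za-z_]*(\d+)\.jpg\z/i, 1]
end

p photo_number("DSC_1234.jpg")  # "1234"
p photo_number("CRW1234.jpg")   # "1234"
p photo_number("notes.txt")     # nil
```

    Note that \Z also accepts a trailing newline before the end of the string, while \z does not; which one you want depends on where the names come from.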
    come, Jun 12, 2007
  5. Bas van Gils

    Bas van Gils Guest

    Some solutions have been posted already, but here's mine:

    irb(main):001:0> s="DSC_1234.jpg"
    => "DSC_1234.jpg"
    irb(main):002:0> s.sub(/\D+(\d+).*/,'\1')
    => "1234"

    Basically, the regexp looks for:

    - one or more non-digits
    - one or more digits => because this is between parentheses you can refer to
    it with \1 later on
    - something more

    The digits (safely stored in \1) are all you want to keep... this assumes you
    are only interested in the first sequence of numbers.
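
    A hedged note on where this sub approach behaves differently: the pattern needs at least one non-digit before the digits, and sub leaves untouched whatever precedes the match. The sketch below tries it on two other shapes of name ("1234.jpg" is a made-up example; "100_5142.jpg" is the style mentioned elsewhere in the thread).

```ruby
pattern = /\D+(\d+).*/

# Prefix, digits, suffix: the whole string matches and only the digits remain.
typical = "DSC_1234.jpg".sub(pattern, '\1')

# Digits before the first non-digit are outside the match, so they survive.
leading_digits = "100_5142.jpg".sub(pattern, '\1')

# No non-digit precedes any digits, so nothing matches and the string is unchanged.
no_prefix = "1234.jpg".sub(pattern, '\1')

p typical         # "1234"
p leading_digits  # "1005142"
p no_prefix       # "1234.jpg"
```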



    Bas van Gils, http://www.van-gils.org
    [[[ Thank you for not distributing my E-mail address ]]]

    Quod est inferius est sicut quod est superius, et quod est superius est sicut
    quod est inferius, ad perpetranda miracula rei unius.
    Bas van Gils, Jun 12, 2007
  6. Or even simpler

    irb(main):001:0> "DSC_1234.jpg"[/\d+/]
    => "1234"
    irb(main):002:0> Integer("DSC_1234.jpg"[/\d+/])
    => 1234
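
    One thing to watch with Integer(): it honours radix prefixes, so digits with a leading zero are read as octal. The sketch below uses a made-up filename with a leading zero and assumes a Ruby version where Integer() accepts a base as its second argument.

```ruby
digits = "DSC_0123.jpg"[/\d+/]   # "0123" (hypothetical filename)

as_octal   = Integer(digits)       # the leading 0 is taken as an octal prefix
as_decimal = Integer(digits, 10)   # force base 10
via_to_i   = digits.to_i           # String#to_i defaults to base 10

# 8 and 9 are not octal digits, so this form raises instead of converting.
invalid_octal = begin
  Integer("0089")
rescue ArgumentError
  :raised
end

p as_octal       # 83
p as_decimal     # 123
p via_to_i       # 123
p invalid_octal  # :raised
```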

    Kind regards

    Robert Klemme, Jun 12, 2007
  7. Last November (2006), there was a series of postings to the Columbus
    Ruby Brigade list beginning with:

    This was the pattern that I used when responding to Bill's code
    because many of *my* pictures had names like "100_5142.jpg",
    "100_5143.jpg", etc.

    NUMBERED_FILE_PATTERN = %r{^(.*\D)?(\d+)(.+)$}

    It became a constant since I used it in three places.
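
    For anyone curious what the three groups capture, here is a small sketch. The helper split_numbered is hypothetical and is not taken from the Columbus Ruby Brigade postings; it only shows the pattern applied to the names mentioned in this thread.

```ruby
NUMBERED_FILE_PATTERN = %r{^(.*\D)?(\d+)(.+)$}

# Hypothetical helper: split a filename into [prefix, number, suffix],
# or return nil when the name does not match the pattern.
def split_numbered(name)
  m = NUMBERED_FILE_PATTERN.match(name)
  m && m.captures
end

p split_numbered("100_5142.jpg")  # ["100_", "5142", ".jpg"]
p split_numbered("DSC_1234.jpg")  # ["DSC_", "1234", ".jpg"]
p split_numbered("CRW1234.jpg")   # ["CRW", "1234", ".jpg"]
p split_numbered("readme")        # nil
```

    Because the first group is greedy, the number captured is the last run of digits before the suffix, which is why "100_5142.jpg" yields "5142" rather than "100".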

    Rob Biedenharn http://agileconsultingllc.com
    Rob Biedenharn, Jun 12, 2007
  8. Matt Jones

    Matt Jones Guest

    A big thanks to everybody and all the creative solutions!
    Matt Jones, Jun 16, 2007