Regex for "not matching" an unneeded prefix substring?

Discussion in 'Ruby' started by Jet Koten, Feb 26, 2010.

  1. Jet Koten

    Jet Koten Guest

    Hi all,

    I'm new to Ruby and even newer to regex. I'm trying to write my first
    [useful] Ruby program and need a way to cut out an unneeded prefix
    substring and retain the substring that comes after it.

    Here are the actual details from my code:

    result.each do |item|
    price = item.search(".price").text.match(/\d+[.]\d+/)
    condition = item.search(".condition").text.match(/Used - ([^,]+)/)
    rating = item.search(".rating a").text.to_i
    seller = item.search(".seller b").text
    puts "#{price} - #{condition} - #{rating} - #{seller}"
    end

    The one from condition [in the code above] is the one that is giving me
    a challenge. The string that is sent to condition will always be exactly
    one of the following and nothing else at all:

    "Used - Like New"
    "Used - Very Good"
    "Used - Good"
    "Used - Acceptable"

    I'm trying to get them to display as the following in the puts at the
    end of my code:

    "Like New"
    "Very Good"
    "Good"
    "Acceptable"

    The regex that I've got there in the condition line works in Rubular,
    but not in my code. I'm running 1.8.7 if that matters...

    One last thing that I don't understand is that in Rubular my regex
    for price shows the match in the "Match result:" line, but the regex for
    condition shows the whole string as a match in the "Match result:" line
    but shows the correctly matching substring in the "Match captures:"
    line.

    I'm grateful for this great resource (the list/forum) and would be very
    happy to hear from anyone who can help me sort this out!

    Thanks in advance,
    J
    --
    Posted via http://www.ruby-forum.com/.
    Jet Koten, Feb 26, 2010
    #1

  2. On 02/26/2010 22:20, Jet Koten wrote:
    > The one from condition [in the code above] is the one that is giving me
    > a challenge. The string that is sent to condition will always be exactly
    > one of the following and nothing else at all:
    >
    > "Used - Like New"
    > "Used - Very Good"
    > "Used - Good"
    > "Used - Acceptable"
    >
    > I'm trying to get them to display as the following in the puts at the
    > end of my code:
    >
    > "Like New"
    > "Very Good"
    > "Good"
    > "Acceptable"
    >


    If you just want to get rid of the word "Used", you could use something
    like this:

    text = "Used - Like New"
    text[7, text.length]
    => "Like New"

    Regards
    Alexander Jesner, Feb 26, 2010
    #2

  3. 2010/2/26 Jet Koten <>:
    > Hi all,
    >
    > I'm new to Ruby and even newer to regex. I'm trying to write my first
    > [useful] Ruby program and need a way to cut out an unneeded prefix
    > substring and retain the substring that comes after it.
    >
    > Here are the actual details from my code:
    >
    > result.each do |item|
    > price = item.search(".price").text.match(/\d+[.]\d+/)
    > condition = item.search(".condition").text.match(/Used - ([^,]+)/)
    > rating = item.search(".rating a").text.to_i
    > seller = item.search(".seller b").text
    > puts "#{price} - #{condition} - #{rating} - #{seller}"
    > end
    >
    > The one from condition [in the code above] is the one that is giving me
    > a challenge. The string that is sent to condition will always be exactly
    > one of the following and nothing else at all:
    >
    > "Used - Like New"
    > "Used - Very Good"
    > "Used - Good"
    > "Used - Acceptable"
    >
    > I'm trying to get them to display as the following in the puts at the
    > end of my code:
    >
    > "Like New"
    > "Very Good"
    > "Good"
    > "Acceptable"
    >
    > The regex that I've got there in the condition line works in Rubular,
    > but not in my code. I'm running 1.8.7 if that matters...


    I am not sure which regexp you are referring to specifically.
    However, you can do this

    irb(main):001:0> s = "Used - Like New"
    => "Used - Like New"
    irb(main):002:0> s[/\AUsed\s+-\s+(.*)\z/, 1]
    => "Like New"
    irb(main):003:0> s[7..-1]
    => "Like New"

    String#[] with regular expression is a very powerful tool - especially
    when used with grouping as in this case.
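
    A short sketch of the difference between the whole match and a captured
    group, which is probably why the original condition line printed the full
    string: String#match returns a MatchData object, and interpolating it
    calls to_s, which yields the entire matched text rather than the first
    group. The variable names below are illustrative only.

```ruby
s = "Used - Like New"

# String#match returns a MatchData object (or nil when nothing matches).
m = s.match(/Used - ([^,]+)/)

whole   = m.to_s   # the entire matched text; this is what "#{m}" interpolates
capture = m[1]     # the first parenthesised group only

# String#[] with a regexp and a group index returns the group directly,
# and returns nil instead of raising when there is no match.
direct  = s[/Used - ([^,]+)/, 1]
missing = "New"[/Used - ([^,]+)/, 1]

puts whole     # Used - Like New
puts capture   # Like New
puts direct    # Like New
p missing      # nil
```

    The price pattern has no group, so its whole match is already the wanted
    text; the condition pattern needs the group index to drop the prefix.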

    > One last thing that I don't understand is that in Rubular my regex
    > for price shows the match in the "Match result:" line, but the regex for
    > condition shows the whole string as a match in the "Match result:" line
    > but shows the correctly matching substring in the "Match captures:"
    > line.


    I am having difficulty following you here since I don't know what
    "item" is in your case. It would probably be easier if you provided a
    simple test case that demonstrates your point. Using IRB often helps too.

    > I'm grateful for this great resource (the list/forum) and would be very
    > happy to hear from anyone who can help me sort this out!


    We'll try to help but please provide a bit more information.

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, Feb 26, 2010
    #3
  4. Jet Koten

    Jet Koten Guest

    Alexander Jesner wrote:
    > On 02/26/2010 22:20, Jet Koten wrote:
    >> end of my code:
    >>
    >> "Like New"
    >> "Very Good"
    >> "Good"
    >> "Acceptable"
    >>

    >
    > If you just want to get rid of the word "Used", you could use something
    > like this:
    >
    > text = "Used - Like New"
    > text[7, text.length]
    > => "Like New"
    >
    > Regards


    It works! :) I had to change 7 to 8 to get rid of an extra space, but
    that did it in a far less complex way than using regex! Thanks.
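
    A minimal sketch of why the offset shifted from 7 to 8, assuming the
    scraped text carries one leading space and a trailing newline (the exact
    whitespace is a guess, not taken from the real page). A fixed offset
    depends on that whitespace; stripping first and removing an anchored
    prefix does not.

```ruby
# Assumed input: scraped text with one leading space and a trailing newline.
raw = " Used - Like New\n"

# A fixed offset only works for exactly this amount of leading whitespace.
by_offset = raw.chomp[8, raw.length]

# Stripping and then removing the anchored prefix works either way.
by_prefix = raw.strip.sub(/\AUsed - /, "")

# The same offset applied to a clean string cuts into the wanted text.
clean        = "Used - Good"
clean_offset = clean[8, clean.length]
clean_prefix = clean.strip.sub(/\AUsed - /, "")

puts by_offset      # Like New
puts by_prefix      # Like New
puts clean_offset   # ood
puts clean_prefix   # Good
```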
    Jet Koten, Feb 26, 2010
    #4
  5. Jet Koten

    Jet Koten Guest

    Jet Koten wrote:
    > Alexander Jesner wrote:
    >> On 02/26/2010 22:20, Jet Koten wrote:
    >>> end of my code:
    >>>
    >>> "Like New"
    >>> "Very Good"
    >>> "Good"
    >>> "Acceptable"
    >>>

    >>
    >> If you just want to get rid of the word "Used", you could use something
    >> like this:
    >>
    >> text = "Used - Like New"
    >> text[7, text.length]
    >> => "Like New"
    >>
    >> Regards

    >
    > It works!


    Hmmm, well, actually, it kind of works. I did this:

    result.each do |item|
    price = item.search(".price").text.match(/\d+[.]\d+/)
    condition = item.search(".condition").text
    rating = item.search(".rating a").text.to_i
    seller = item.search(".seller b").text
    puts "#{price} - #{condition.chomp[8, condition.length]} - #{rating} - #{seller}"
    end

    and then realized I actually need to be able to just put #{condition} by
    itself in the puts and not use #{condition.chomp[8, condition.length]}

    but, I tried and found that I don't know how to adjust the code in the
    block above. Can someone help again?
    Jet Koten, Feb 26, 2010
    #5
  6. On 02/26/2010 23:06, Jet Koten wrote:
    > and then realized I actually need to be able to just put #{condition} by
    > itself in the puts and not use #{condition.chomp[8, condition.length]}
    >


    Insert

    condition = condition.chomp[8, condition.length]

    after

    condition = item.search(".condition").text


    and you can use #{condition} in the string.
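
    The two-step assignment can be sketched as follows. The array of strings
    is a hypothetical stand-in for what item.search(".condition").text would
    return, and the leading space in each string is an assumption.

```ruby
# Stand-in for the scraped text of each item; the real code would get this
# from item.search(".condition").text.
condition_texts = [" Used - Like New\n", " Used - Good\n"]

cleaned = condition_texts.map do |text|
  condition = text
  # Reassign before the puts so the output string needs no slicing later.
  condition = condition.chomp[8, condition.length]
  condition
end

cleaned.each { |condition| puts "condition: #{condition}" }
```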

    Regards
    Alexander Jesner, Feb 26, 2010
    #6
  7. Jet Koten

    Jet Koten Guest

    Robert Klemme wrote:
    > 2010/2/26 Jet Koten <>:
    >> condition = item.search(".condition").text.match(/Used - ([^,]+)/)
    >> "Used - Very Good"
    >>
    >> The regex that I've got there in the condition line works in Rubular,
    >> but not in my code. I'm running 1.8.7 if that matters...

    >
    > I am not sure which regexp you are referring to specifically.
    > However, you can do this
    >
    > irb(main):001:0> s = "Used - Like New"
    > => "Used - Like New"
    > irb(main):002:0> s[/\AUsed\s+-\s+(.*)\z/, 1]
    > => "Like New"
    > irb(main):003:0> s[7..-1]
    > => "Like New"
    >
    > String#[] with regular expression is a very powerful tool - especially
    > when used with grouping as in this case.
    >
    >> One last thing that I don't understand is that in Rubular my regex
    >> for price shows the match in the "Match result:" line, but the regex for
    >> condition shows the whole string as a match in the "Match result:" line
    >> but shows the correctly matching substring in the "Match captures:"
    >> line.

    >
    > I am having difficulty following you here since I don't know what
    > "item" is in your case. It would probably be easier if you provided a
    > simple test case that demonstrates your point. Using IRB often helps too.
    >
    >> I'm grateful for this great resource (the list/forum) and would be very
    >> happy to hear from anyone who can help me sort this out!

    >
    > We'll try to help but please provide a bit more information.
    >
    > Kind regards
    >
    > robert


    Hi Robert,

    Thanks a lot. I've discovered that there are many ways of achieving this
    goal, whether it's through regex, ranges, or even split (as a friend
    offline just advised me of).
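
    For reference, the split variant might look like this; a sketch assuming
    the separator is always exactly " - ":

```ruby
s = "Used - Like New"

# Split on the first " - " only; the limit of 2 keeps any later " - " intact.
prefix, condition = s.split(" - ", 2)

puts prefix      # Used
puts condition   # Like New
```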

    I've gotten it working for now, but I'll likely be back eventually when
    the next question arises. :)
    Jet Koten, Feb 26, 2010
    #7
  8. Jet Koten wrote:
    > Hi all,
    >
    > I'm new to Ruby and even newer to regex. I'm trying to write my first
    > [useful] Ruby program and need a way to cut out an unneeded prefix
    > substring and retain the substring that comes after it.
    >
    > Here are the actual details from my code:
    >
    > result.each do |item|
    > price = item.search(".price").text.match(/\d+[.]\d+/)
    > condition = item.search(".condition").text.match(/Used - ([^,]+)/)
    > rating = item.search(".rating a").text.to_i
    > seller = item.search(".seller b").text
    > puts "#{price} - #{condition} - #{rating} - #{seller}"
    > end
    >
    > The one from condition [in the code above] is the one that is giving me
    > a challenge. The string that is sent to condition will always be exactly
    > one of the following and nothing else at all:
    >
    > "Used - Like New"
    > "Used - Very Good"
    > "Used - Good"
    > "Used - Acceptable"
    > (...)


    This is another option, avoiding regular expressions. It's kind of old
    school, but it's fast, flexible, and handles garbage.

    sanitize_condition = Hash.new("Unknown")
    sanitize_condition["Used - Like New"] = "Like New"
    sanitize_condition["Used - Very Good"] = "Very Good"
    sanitize_condition["Used - Good"] = "Good"
    sanitize_condition["Used - Acceptable"] = "Acceptable"
    sanitize_condition["Used - Broken"] = "Kaput"

    demo_conditions = ["Used - Like New", "", nil, "Used - Broken", "Used - Acceptable", "garble"]
    demo_conditions.each{|cond| puts sanitize_condition[cond] }
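
    One possible variation on the lookup table, sketched under the assumption
    that scraped strings may carry stray whitespace: normalise the key before
    the lookup so a leading space or trailing newline does not fall through
    to "Unknown". The constant and method names are hypothetical.

```ruby
# The four conditions from the question, written as a hash literal.
CONDITIONS = {
  "Used - Like New"   => "Like New",
  "Used - Very Good"  => "Very Good",
  "Used - Good"       => "Good",
  "Used - Acceptable" => "Acceptable"
}

def sanitize_condition(raw)
  # to_s turns nil into "", strip drops surrounding whitespace and newlines.
  # fetch with a second argument returns that default for unknown keys.
  CONDITIONS.fetch(raw.to_s.strip, "Unknown")
end

results = [" Used - Good\n", nil, "garble"].map { |raw| sanitize_condition(raw) }
p results   # ["Good", "Unknown", "Unknown"]
```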

    hth,

    Siep
    Siep Korteling, Feb 26, 2010
    #8
  9. Jet Koten

    Jet Koten Guest

    Siep Korteling wrote:
    > Jet Koten wrote:
    >> Hi all,
    >>
    >> I'm new to Ruby and even newer to regex. I'm trying to write my first
    >> [useful] Ruby program and need a way to cut out an unneeded prefix
    >> substring and retain the substring that comes after it.
    >>
    >> Here are the actual details from my code:
    >>
    >> result.each do |item|
    >> price = item.search(".price").text.match(/\d+[.]\d+/)
    >> condition = item.search(".condition").text.match(/Used - ([^,]+)/)
    >> rating = item.search(".rating a").text.to_i
    >> seller = item.search(".seller b").text
    >> puts "#{price} - #{condition} - #{rating} - #{seller}"
    >> end
    >>
    >> The one from condition [in the code above] is the one that is giving me
    >> a challenge. The string that is sent to condition will always be exactly
    >> one of the following and nothing else at all:
    >>
    >> "Used - Like New"
    >> "Used - Very Good"
    >> "Used - Good"
    >> "Used - Acceptable"
    >> (...)

    >
    > This is another option, avoiding regular expressions. It's kind of old
    > school, but it's fast, flexible, and handles garbage.
    >
    > sanitize_condition = Hash.new("Unknown")
    > sanitize_condition["Used - Like New"] = "Like New"
    > sanitize_condition["Used - Very Good"] = "Very Good"
    > sanitize_condition["Used - Good"] = "Good"
    > sanitize_condition["Used - Acceptable"] = "Acceptable"
    > sanitize_condition["Used - Broken"] = "Kaput"
    >
    > demo_conditions = ["Used - Like New", "", nil, "Used - Broken", "Used - Acceptable", "garble"]
    > demo_conditions.each{|cond| puts sanitize_condition[cond] }
    >
    > hth,
    >
    > Siep


    Hi Siep,

    Thanks! My offline friend suggested that I refactor everything into a
    hash, because the condition info is just one criterion of many that I
    am pulling into my app...

    but it is making my head spin because I am so new to Ruby, so I'm going
    to take a break and then look at it again and also look at the
    documentation for hash and see what I can come up with.
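
    A rough sketch of what one hash per listing could look like. The sample
    rows, seller names, and key names are all made up for illustration; in
    the real program the strings would come from the item.search(...) calls.

```ruby
# Hypothetical, pre-extracted text for two listings.
rows = [
  ["12.50", "Used - Like New", "98", "bookbarn"],
  ["7.25",  "Used - Good",     "91", "pagesmith"]
]

# Build one hash per listing, cleaning each field as it goes in.
listings = rows.map do |price, condition, rating, seller|
  {
    :price     => price.to_f,
    :condition => condition.sub(/\AUsed - /, ""),
    :rating    => rating.to_i,
    :seller    => seller
  }
end

listings.each do |l|
  puts "#{l[:price]} - #{l[:condition]} - #{l[:rating]} - #{l[:seller]}"
end
```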

    My friend also suggested that I write pseudocode for all my desired
    functionality and that that could help a lot. I have a prioritized list
    for now, but it is making my head hurt to try and do so much that I
    don't know how to do! :)

    I can't say enough how helpful the list/forum is, and that I'm very
    grateful for everyone using their free time to help me along.
    Jet Koten, Feb 26, 2010
    #9
  10. On 02/27/2010 12:23 AM, Jet Koten wrote:
    > Thanks a lot. I've discovered that there are many ways of achieving this
    > goal, whether it's through regex, ranges, or even split (as a friend
    > offline just advised me of).


    That's often the case with Ruby - and many of those ways are also elegant.

    > I've gotten it working for now, but I'll likely be back eventually when
    > the next question arises. :)


    "I'll be back." - oooh... ;-)

    Cheers

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, Feb 27, 2010
    #10
