Method to groom a string to floating point representation

Discussion in 'Ruby' started by Alex DeCaria, Apr 13, 2010.

  1. Alex DeCaria

    Alex DeCaria Guest

    I have a program that asks the user to enter a string that
    represents a floating point number. Every time a new character is typed
    I want a method that checks to make sure the string makes sense as a
    floating point number, and if not, deletes any bad characters. For
    instance, if the user enters '4.5e+6.7' I want the method to delete the
    extra decimal point and return '4.5e+67'. Or, if the user enters
    something like '4.5+e7' it deletes the misplaced plus sign and returns
    '4.5e7'. In short, I want the method to only allow correct
    representations of floating point numbers, but I want it to remain as a
    string. Anything other than a digit or +, -, ., e or E should be
    deleted.

    I wrote a method that works like I want (attached), but it is long and
    cumbersome. I'm wondering if anyone has a shorter, better way to do
    this.

    --Alex

    Attachments:
    http://www.ruby-forum.com/attachment/4653/clean_string_lite.rb

    --
    Posted via http://www.ruby-forum.com/.
     
    Alex DeCaria, Apr 13, 2010
    #1

  2. Josh Cheek

    Josh Cheek Guest

    [Note: parts of this message were removed to make it a legal post.]

    It would probably be easier if you provided a set of tests we could check
    our function against, where we could be confident our function was correct
    once it passed all the tests.
     
    Josh Cheek, Apr 13, 2010
    #2

  3. Alex DeCaria

    Alex DeCaria Guest

    Josh Cheek wrote:
    > It would probably be easier if you provided a set of tests we could
    > check
    > our function against, where we could be confident our function was
    > correct
    > once it passed all the tests.


    Here are some examples of what it should do:

    Delete any characters other than digits, +, -, e, E, or .:
    '-24.5fge4x5' => '-24.5e45'

    Delete any extra decimals:
    '2.4.5' => '2.45'
    '2..45' => '2.45'

    Delete any decimals in an exponent:
    '245e7.6' => '2.45e76'

    Delete any extra or misplaced + or – signs:
    '+45-68+e+45-' => '4568e+45'

    Delete any extra or misplaced ‘e’ or ‘E’ characters (first occurrence of
    'e' or 'E' has precedence unless it doesn't make sense):
    '4.67e6e-7' => '4.67e67'
    '+e4.67e-7' => '+4.67e-7'

    The motivation for this is for a GUI input textbox, so that if the user
    enters a bad string it automatically corrects it to a valid
    floating-point representation in string form before converting to a
    floating-point for calculations. I toyed with just doing
    str = str.to_f.to_s
    and letting Ruby figure out the floating point representation, but I'd
    like more control over how the string is converted to floating point
    representation. For example, I want
    '2..45e9' => '2.45e9', whereas '2..45e9'.to_f.to_s => '2.0'

    --Alex


     
    Alex DeCaria, Apr 13, 2010
    #3
  4. Josh Cheek

    Josh Cheek Guest

    On Tue, Apr 13, 2010 at 6:20 AM, Alex DeCaria wrote:


    > Delete any decimals in an exponent:
    > '245e7.6' => '2.45e76'
    >


    Where did the dot in between 2 and 4 come from? Am I interpreting the String
    or just cleaning it?


    > Delete any extra or misplaced + or – signs:
    > '+45-68+e+45-' => '4568e+45'
    >
    > Delete any extra or misplaced ‘e’ or ‘E’ characters (first occurance of
    > '+e4.67e-7' => '+4.67e-7'
    >
    >
    >
    >

    Why does the plus in front of 45 in the first one go away, but the plus in
    front of the e in the second one stays?

    -----

    This is what I have so far, please check and correct any tests that should
    be different

    def clean_string( str , options = Hash.new )
      str =~ /\A([-+]?)([^eE.]*\.?)([^eE]*)((?:[eE][+-]?)?)([^Z]*)\Z/
      posneg , prepre , postpre , e , post = $1 , $2 , $3 , $4 , $5
      posneg + prepre + postpre.gsub(/[^0-9]/,'') + e + post.gsub(/[^0-9]/,'')
    end

    require 'test/unit'
    class TestCleanString < Test::Unit::TestCase
      def test_delete_chars
        assert_equal '-24.5e45' , clean_string('-24.5fge4x5')
      end
      def test_delete_extra_decimal
        assert_equal '2.45' , clean_string('2.4.5')
        assert_equal '2.45' , clean_string('2..45')
        assert_equal '2.45' , clean_string('2...45')
      end
      def test_delete_extra_decimal_in_exponent
        # you said this should be '2.45e76' , but where did first dot come from?
        assert_equal '245e76' , clean_string('245e7.6')
      end
      def test_delete_extra_or_misplaced_pos_and_neg_signs
        assert_equal '4568e+45' , clean_string('+45-68+e+45-')
      end
      def test_delete_extra_or_misplaced_e_or_E
        assert_equal '4.67e67' , clean_string('4.67e6e-7')
        assert_equal '+4.67e-7' , clean_string('+e4.67e-7')
      end
    end
     
    Josh Cheek, Apr 14, 2010
    #4
  5. Jean-Julien Fleck

    Hello,

    2010/4/14 Josh Cheek:
    > On Tue, Apr 13, 2010 at 6:20 AM, Alex DeCaria wrote:
    >
    >> Delete any decimals in an exponent:
    >> '245e7.6' => '2.45e76'
    >>
    >
    > Where did the dot in between 2 and 4 come from? Am I interpreting the String
    > or just cleaning it?


    As Josh said, here you are interpreting the string rather than
    cleaning it. 245e76 is a valid float, just not in the usual 2.45e78
    form.

    BTW, I would rather not do any cleaning under the hood: let the user
    correct his input himself. For example, give the input to Float() and
    if an error is raised (which Float() does, as opposed to to_f, which never
    raises an error), rescue it by giving feedback to the user (where you
    could use your method to propose an alternative if you want), but do
    not continue without letting the user know he has made a mistake and
    giving him the ability to change his mind.
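
    The difference between Float() and to_f described above can be sketched
    roughly as follows. The helper name parse_float and the wording of the
    message are assumptions made for illustration, not code from this thread.

```ruby
# Hypothetical helper: returns [value, nil] on success,
# or [nil, message] when the input is not a complete float.
def parse_float(input)
  [Float(input), nil]
rescue ArgumentError, TypeError
  [nil, "#{input.inspect} is not a valid floating-point number"]
end

# String#to_f never raises; it silently stops at the first bad character.
lenient = '2..45e9'.to_f          # => 2.0

# Float() is strict, so the caller gets a chance to tell the user.
value, message = parse_float('2..45e9')
puts message if value.nil?
```

    A GUI could show the returned message next to the textbox instead of
    silently rewriting what the user typed.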

    Cheers,

    --
    JJ Fleck
    PCSI1 Lycée Kléber
     
    Jean-Julien Fleck, Apr 14, 2010
    #5
  6. Alex DeCaria

    Alex DeCaria Guest

    >
    >> Delete any decimals in an exponent:
    >> '245e7.6' => '2.45e76'
    >>

    >
    > Where did the dot in between 2 and 4 come from? Am I interpreting the
    > String
    > or just cleaning it?


    This was a typo on my part. It should have read:
    '245e7.6' => '245e76'

    >
    >
    >> Delete any extra or misplaced + or – signs:
    >> '+45-68+e+45-' => '4568e+45'
    >>
    >> Delete any extra or misplaced ‘e’ or ‘E’ characters (first occurance of
    >> '+e4.67e-7' => '+4.67e-7'
    >>
    >>

    > Why does the plus in front of 45 in the first one go away, but the plus
    > in
    > front of the e in the second one stays?


    Again, a typo on my part. It should have been:
    '+45-68+e+45-' => '+4568e+45'


    >
    > This is what I have so far, please check and correct any tests that
    > should
    > be different
    >


    Thanks! I'll check the code you gave me and see how it does.

    --Alex
     
    Alex DeCaria, Apr 14, 2010
    #6
  7. Alex DeCaria

    Alex DeCaria Guest

    >
    > BTW, I would rather not do any cleaning under the hood: let the user
    > correct its input himself. For example, give the input to Float() and
    > if an error is raised (which Float does as opposed to to_f which never
    > raise an error), rescue it by giving feedback to the user (where you
    > could use your method to propose an alternative if you want) but do
    > not continue without letting the user know he has made a mistake and
    > giving him the ability to change his mind.
    >
    > Cheers,


    I didn't realize the difference between Float() and .to_f. Thanks for
    the suggestion.

    The user is still aware if they entered an incorrect string, since they
    are entering it into a GUI textbox, and the string cleaning is done
    after each character is entered. Thus, if they try to enter a misplaced
    + sign or another bad character, they won't see it appear in the
    textbox, which should cause them to notice it.

    --Alex
     
    Alex DeCaria, Apr 14, 2010
    #7
  8. Alex DeCaria

    Alex DeCaria Guest

    Josh Cheek wrote:
    > This is what I have so far, please check and correct any tests that
    > should
    > be different


    Josh,

    Your code works great! I knew there had to be a more elegant way to do
    this rather than my brute force method.

    The only test it didn't seem to work on was eliminating extra + or -
    signs, such as '+45-2+8' => '+4528', but now that I see what you are
    doing I can probably figure out how to do that. I definitely need to
    learn more about regular expressions!

    Thanks for your time and effort.

    --Alex
     
    Alex DeCaria, Apr 14, 2010
    #8
  9. Jean-Julien Fleck

    Hello Alex,

    > The user is still aware if they entered an incorrect string, since they
    > are entering it into a GUI textbox, and the string cleaning is done
    > after each character is entered. Thus, if they try to enter a misplaced
    > + sign or another bad character, they won't see it appear in the
    > textbox, which should cause them to notice it.


    Well, then you can't use the Float() trick, because 1.0e3 is valid
    but 1.0e is not.
    There will then be a lot of strings your user won't be able to type,
    even though they would be valid once finished.

    Cheers,

    --
    JJ Fleck
    PCSI1 Lycée Kléber
     
    Jean-Julien Fleck, Apr 14, 2010
    #9
  10. Alex DeCaria

    Alex DeCaria Guest

    Jean-Julien Fleck wrote:
    > Hello Alex,
    >
    >> The user is still aware if they entered an incorrect string, since they
    >> are entering it into a GUI textbox, and the string cleaning is done
    >> after each character is entered. Thus, if they try to enter a misplaced
    >> + sign or another bad character, they won't see it appear in the
    >> textbox, which should cause them to notice it.

    >
    > Well, then you can't use the Float() trick because 1.0e3 is a valid
    > but 1.0e is not.
    > Then there will be a lot of strings your user won't be able to type
    > even if they are valid in the end.
    >
    > Cheers,


    Yes, there has to be some additional logic to allow a trailing 'e' with
    the assumption that the user will next enter a valid character
    afterward. That's what makes it a little complicated (and fun) to
    figure out. The goal is, as the user is entering data, to not allow
    them to enter anything that is obviously not going to work as a floating
    point representation.
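
    One way to sketch the "allow a trailing 'e'" logic is to test whether the
    text is still a possible prefix of a float, rather than a complete one.
    The constant and method names below are hypothetical, and the pattern
    assumes the mantissa must start with a digit (so '.5' is rejected); it is
    an illustration, not the attached clean_string_lite.rb.

```ruby
# Sketch only: PARTIAL_FLOAT and the method names are hypothetical.
# Assumption: the mantissa must start with a digit, so ".5" is not accepted.
PARTIAL_FLOAT = /\A[-+]?(?:\d+(?:\.\d*)?(?:[eE][-+]?\d*)?)?\z/

# True if +text+ could still be extended into a valid float.
def float_prefix?(text)
  PARTIAL_FLOAT.match?(text)
end

# True only if +text+ is already a complete float according to Float().
def complete_float?(text)
  Float(text)
  true
rescue ArgumentError
  false
end

# Simulate typing one character at a time, dropping rejected keystrokes.
def type_into_textbox(keystrokes)
  keystrokes.each_char.reduce('') do |text, char|
    float_prefix?(text + char) ? text + char : text
  end
end

puts type_into_textbox('4.5e+6.7')
puts type_into_textbox('4.5+e7')
```

    With this split, '1.0e' is accepted while typing (float_prefix?) but is
    not treated as finished input (complete_float?) until a digit follows.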

    --Alex
     
    Alex DeCaria, Apr 14, 2010
    #10
  11. Jean-Julien Fleck

    Hello Alex,

    > Yes, there has to be some additional logic to allow a trailing 'e' with
    > the assumption that the user will next enter a valid character
    > afterward. That's what makes it a little complicated (and fun) to
    > figure out. The goal is, as the user is entering data, to not allow
    > them to enter anything that is obviously not going to work as a floating
    > point representation.


    Sure, fun it is :o)
    But that's exactly the kind of software that could drive me mad (as a
    user). You assume that your user is making a typo, but what if he is
    not? What if he truly believes what he is writing is a perfectly
    correct float? He will retry again and again until he
    decides that the whole software is just a fraud :o) So IMHO, it is more
    efficient to let your user know what kind of error he is (possibly
    repeatedly) making and propose an alternative rather than erase what
    he believes could be right.

    Cheers,

    --
    JJ Fleck
    PCSI1 Lycée Kléber
     
    Jean-Julien Fleck, Apr 14, 2010
    #11
  12. Josh Cheek

    Josh Cheek Guest

    It wasn't done, because I wanted clarification on the tests first.

    Anyway, this one passes all tests.

    def clean_string(str)
      str =~ /\A([-+]?)([eE]?)([^eE.]*\.?)([^eE]*)((?:[eE][+-]?)?)([^Z]*)\Z/
      posneg , misplaced_e , before_dec , after_dec , e , exponent =
        $1 , $2 , $3 , $4 , $5 , $6
      posneg + before_dec.gsub(/[^0-9.]/,'') + after_dec.gsub(/[^0-9]/,'') + e +
        exponent.gsub(/[^0-9]/,'')
    end

    require 'test/unit'
    class TestCleanString < Test::Unit::TestCase
      def test_delete_chars
        assert_equal '-24.5e45' , clean_string('-24.5fge4x5')
      end
      def test_delete_extra_decimal
        assert_equal '2.45' , clean_string('2.4.5')
        assert_equal '2.45' , clean_string('2..45')
        assert_equal '2.45' , clean_string('2...45')
      end
      def test_delete_extra_decimal_in_exponent
        assert_equal '245e76' , clean_string('245e7.6')
      end
      def test_delete_extra_or_misplaced_pos_and_neg_signs
        assert_equal '+4568e+45' , clean_string('+45-68+e+45-')
      end
      def test_delete_extra_or_misplaced_e_or_E
        assert_equal '4.67e67' , clean_string('4.67e6e-7')
        assert_equal '+4.67e-7' , clean_string('+e4.67e-7')
      end
    end
     
    Josh Cheek, Apr 14, 2010
    #12
  13. Alex DeCaria

    Alex DeCaria Guest

    Josh Cheek wrote:
    > On Wed, Apr 14, 2010 at 8:33 AM, Alex DeCaria wrote:

    >
    >> The only test it didn't seem to work on was eliminating extra + or -
    >>

    > It wasn't done, because I wanted clarification on the tests first.
    >
    > Anyway, this one passes all tests.
    >

    Thanks again, Josh! May I use your code in my (non-commercial,
    educational-use-only) app?

    --Alex
     
    Alex DeCaria, Apr 14, 2010
    #13
  14. Alex DeCaria

    Alex DeCaria Guest

    Jean-Julien Fleck wrote:

    >
    > Sure, fun it is :o)
    > But that's exactly the kind of software that could drive me mad (as a
    > user). You assume that your user is making a typo, but what if he is
    > not? What if he truly believes what he is writing is a perfectly
    > correct float? He will retry again and again until he
    > decides that the whole software is just a fraud :o) So IMHO, it is more
    > efficient to let your user know what kind of error he is (possibly
    > repeatedly) making and propose an alternative rather than erase what
    > he believes could be right.
    >
    > Cheers,


    I can't argue with the point you are making. I will continue to use the
    automatic string grooming, but will probably include a message to the
    user letting them know why what they are typing isn't showing up in the
    textbox.

    --Alex
     
    Alex DeCaria, Apr 14, 2010
    #14
  15. Josh Cheek

    Josh Cheek Guest

    On Wed, Apr 14, 2010 at 10:06 AM, Alex DeCaria wrote:

    > Thanks again, Josh! May I use your code in my (non-commercial,
    > educational-use-only) app?

    Sure, go ahead and throw the wtfpl on there, if you feel more comfortable
    with that. http://sam.zoy.org/wtfpl/

    And I guarantee that it does nothing other than pass the set of tests it was
    posted with, on my machine, with the settings that were used at the time of
    testing. So no warranty of any kind.

    Have fun :p
     
    Josh Cheek, Apr 14, 2010
    #15
  16. Josh Cheek

    Josh Cheek Guest

    Found a bug in the version I posted earlier: the [^Z] in the last capture
    group excludes a literal Z, so it does not mean "everything up to the end
    of the string" as I intended. Writing [^\Z] does not help either, because
    inside a character class \Z is just a literal Z, so swap it out for .*
    instead. (I usually try to match based on the next thing I want to hit,
    which in this case is the end of the string.)



    Here is another version. It does the same thing, but I think it's prettier.
    I swapped out the plusses for << because they're much quicker when you don't
    need a new object.

    def digits_only(str)
      str.gsub(/[^0-9]/, '')
    end

    def clean_string(str)
      str =~ /\A([-+]?)([eE]?)([^eE.]*)(\.?)([^eE]*)((?:[eE][+-]?)?)(.*)\Z/
      $1 << digits_only($3) << $4 << digits_only($5) << $6 << digits_only($7)
    end




    And here is the same thing, but it assigns the captures to variables first.
    It's uglier, but if you have to sort through it later, it can be nice to
    know what the regex is supposed to be capturing.

    def digits_only(str)
      str.gsub(/[^0-9]/, '')
    end

    def clean_string(str)
      str =~ /\A([-+]?)([eE]?)([^eE.]*)(\.?)([^eE]*)((?:[eE][+-]?)?)(.*)\Z/
      posneg, misplaced_e, before_dec, dec, after_dec, e, exponent =
        $1, $2, digits_only($3), $4, digits_only($5), $6, digits_only($7)
      posneg << before_dec << dec << after_dec << e << exponent
    end
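
    As a rough cross-check of the approach in this thread, here is a
    self-contained sketch that uses .* for the exponent tail and runs the
    inputs mentioned by Alex, including the extra-sign case '+45-2+8' and an
    input containing a literal Z. The names FLOAT_PARTS and groom_float are
    hypothetical, and MatchData#captures is used in place of $1..$7; treat it
    as an illustration rather than the code from the posts above.

```ruby
# Sketch only: FLOAT_PARTS and groom_float are hypothetical names.
# Groups: sign, integer part, first dot, fraction, exponent marker, exponent.
# A misplaced leading e/E is matched but not captured, so it is dropped.
FLOAT_PARTS = /\A([-+]?)[eE]?([^eE.]*)(\.?)([^eE]*)((?:[eE][-+]?)?)(.*)\z/m

# Remove every character that is not a decimal digit.
def digits_only(str)
  str.gsub(/[^0-9]/, '')
end

# Groom +str+ into float-like text; the result stays a String.
def groom_float(str)
  sign, int_part, dot, frac, exp_marker, exponent =
    FLOAT_PARTS.match(str).captures
  sign + digits_only(int_part) + dot + digits_only(frac) +
    exp_marker + digits_only(exponent)
end

['+45-2+8', '4.5e+6.7', '4.5+e7', '2..45e9', '1e5Z'].each do |input|
  puts "#{input.inspect} => #{groom_float(input).inspect}"
end
```

    Because the final group is .* (with the m flag so that it also spans
    newlines), the pattern matches any string, so the method never hits a nil
    match the way the [^Z] version can.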
     
    Josh Cheek, Apr 14, 2010
    #16