yet another simple command-line option parser

Discussion in 'Ruby' started by Eric Mahurin, Sep 13, 2005.

  1. Eric Mahurin

    Eric Mahurin Guest

    I just put in a good example for:

    http://rcrchive.net/rcr/show/317

    It is a simple option parser that has option defaults and
    converts the options to the right type:

    # these klass.from_s methods are the meat of the RCR
    def Float.from_s(s);s.to_f;end
    def Integer.from_s(s);s.to_i;end
    def Symbol.from_s(s);s.to_sym;end
    def String.from_s(s);s.to_s;end
    def Regexp.from_s(s,*other);new(s,*other);end
    require 'time.rb'
    def Time.from_s(s,*other);Time.parse(s,*other);end

    def argv_options(options)
      i = 0
      while arg = ARGV[i]
        if arg[0]==?-
          arg.slice!(0)
          ARGV.slice!(i)
          break if arg=="-" # -- terminates options
          opt = arg.to_sym
          default = options[opt]
          if default
            klass = default.class
            # would need a big case statement w/o RCR 317
            options[opt] = klass.from_s(ARGV.slice!(i))
          elsif default.nil?
            raise("unknown option -#{opt}")
          else # default==false
            options[opt] = true
          end
        else
          i += 1
        end
      end
      options
    end

    ARGV.replace(%w(
    -n 4
    -multiplier 3.14
    -q
    -title foobar
    -pattern fo+
    -time 5:55PM
    -method downcase
    a b c
    ))
    # option => default (or false for a flag)
    argv_options(
      :n => 1,
      :multiplier => 1.0,
      :q => false,
      :title => "hello world!",
      :pattern => /.*/,
      :time => Time.new,
      :method => :to_s
    )
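
For readers who want to try the idea without touching the real ARGV, here is a minimal self-contained sketch. The name `parse_options`, the explicit `argv` parameter, and the returned pair are assumptions made for illustration; they are not part of the RCR. The `from_s` definitions are the lenient ones from the post, trimmed to the classes used here (the `-time` option is left out so the output does not depend on the clock).

```ruby
# Class-level from_s converters, as proposed in the post (lenient versions).
def Float.from_s(s);   s.to_f;   end
def Integer.from_s(s); s.to_i;   end
def Symbol.from_s(s);  s.to_sym; end
def String.from_s(s);  s.to_s;   end
def Regexp.from_s(s, *other); new(s, *other); end

# Hypothetical variant of argv_options: takes an argument array and returns
# [options, remaining_args] instead of mutating ARGV.
def parse_options(argv, defaults)
  options = defaults.dup
  rest = []
  args = argv.dup
  until args.empty?
    arg = args.shift
    if arg == "--"                 # explicit end of options
      rest.concat(args)
      break
    elsif arg.start_with?("-")
      opt = arg[1..-1].to_sym
      raise ArgumentError, "unknown option -#{opt}" unless options.key?(opt)
      default = defaults[opt]
      if default == false          # a flag: no value follows
        options[opt] = true
      else                         # typed option: convert via the default's class
        options[opt] = default.class.from_s(args.shift)
      end
    else
      rest << arg                  # positional argument
    end
  end
  [options, rest]
end

options, rest = parse_options(
  %w(-n 4 -multiplier 3.14 -q -title foobar -pattern fo+ -method downcase a b c),
  { :n => 1, :multiplier => 1.0, :q => false, :title => "hello world!",
    :pattern => /.*/, :method => :to_s }
)
p options
p rest
```

The type of each default still selects the converter, which is the point of the original example; only the plumbing around ARGV differs.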


    If you have an opinion about the usefulness of this RCR, go
    vote and give a comment. I didn't have an example before to
    show it in action.


    Eric Mahurin, Sep 13, 2005
    #1

  2. Jim Freeze

    Jim Freeze Guest

    That's pretty interesting Eric, to grab the type off the default.
    I think I'll add that to CommandLine::OptionParser.

    However, I'm still not sure if I like the #from_s form,
    but I can see the utility of it. For the common cases,
    I can use a simple case statement:

    case default
    when Float then Float(arg)
    when Fixnum then Integer(arg)
    end

    But, as you can see with even these simple cases, there
    are big issues and big questions to answer.
    1. Fixnum does not match Integer
    2. Do we use to_i or Integer(#) - Integer raises and to_i does not
    3. Do we use Float or to_f - Float raises and to_f does not
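
The difference behind points 2 and 3 can be seen in a short sketch. The helper name `strict` is an assumption introduced only to keep the rescue logic in one place.

```ruby
# Lenient conversions: never raise, silently stop at the first bad character.
p "42abc".to_i
p "abc".to_i
p "3.14xyz".to_f

# Hypothetical helper: run a block and report an ArgumentError instead of
# letting it propagate.
def strict(label)
  yield
rescue ArgumentError => e
  "#{label} raised #{e.class}"
end

# Strict conversions: raise ArgumentError unless the whole string parses.
puts strict("Integer") { Integer("42abc") }
puts strict("Float")   { Float("3.14xyz") }
p Integer("42")
p Float("3.14")
```

One more wrinkle when choosing between the two: Integer() honors radix prefixes, so Integer("010") is 8 while "010".to_i is 10.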

    Then, there are the tougher cases like

    require 'parsedate'
    case default
    when Time then Time.gm(*ParseDate.parsedate(arg))
    when Fixnum then arg.to_i # what if this yields a Bignum?
    end

    where we have to use a helper class and a helper method (not new) to
    get the object we want, or where the conversion method returns
    something other than what we requested, like a Bignum instead of
    a Fixnum.

    Sadly, the #from_s RCR doesn't seem to address any of these issues.


    > ARGV.replace(%w(
    > -n 4
    > -multiplier 3.14
    > -q
    > -title foobar
    > -pattern fo+
    > -time 5:55PM
    > -method downcase
    > a b c
    > ))
    > # option => default (or false for a flag)
    > argv_options(
    > :n => 1,
    > :multiplier => 1.0,
    > :q => false,
    > :title => "hello world!",
    > :pattern => /.*/,
    > :time => Time.new,
    > :method => :to_s
    > )


    --
    Jim Freeze
    Jim Freeze, Sep 13, 2005
    #2

  3. Eric Mahurin

    Eric Mahurin Guest

    Thanks for the input Jim. Comments below. I'm also putting
    this in the RCR comments.

    --- Jim Freeze <> wrote:
    > That's pretty interesting Eric, to grab the type off the
    > default.
    > I think I'll add that to CommandLine::OptionParser.
    >
    > However, I'm still not sure if I like the #from_s form,
    > but I can see the utility of it. For the common cases,
    > I can use a simple case statement:
    >
    > case default
    > when Float then Float(arg)
    > when Fixnum then Integer(arg)
    > end
    >
    > But, as you can see with even these simple cases, there
    > are big issues and big questions to answer.
    > 1. Fixnum does not match Integer


    No problem. Fixnum inherits from Integer. Do you care whether
    Fixnum.from_s returns a Fixnum or a Bignum? It could return
    either just like many of the other Fixnum instance methods.

    > 2. Do we use to_i or Integer(#) - Integer raises and to_i
    > does not
    > 3. Do we use Float or to_f - Float raises and to_f does not


    Good point. This RCR should specify this. I would think it
    best if an exception is raised when the full string doesn't parse
    as the target type. I'll change the implementation to use the
    methods that raise exceptions.
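
A minimal sketch of what such strict variants might look like. These are an assumption about the intended change, not the text of the RCR; forcing base 10 in Integer() is an extra choice made here so that a leading zero is not read as octal.

```ruby
# Strict from_s variants: Kernel#Integer and Kernel#Float raise ArgumentError
# when the whole string does not parse.
def Integer.from_s(s); Integer(s, 10); end   # base 10 forced (sketch choice)
def Float.from_s(s);   Float(s);       end

p Integer.from_s("4")
p Float.from_s("3.14")

begin
  Integer.from_s("4 apples")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```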

    > Then, there are the tougher cases like
    >
    > require 'parsedate'
    > case default
    > when Time then Time.gm(*ParseDate.parsedate(arg))


    I haven't dealt with dates and times to know what all the
    options are. I threw this in at the last minute. If you don't
    like the klass.from_s method, you could define your own derived
    class (or override that klass.from_s):

    require 'parsedate'
    class MyTime < Time
    def self.from_s(s)
    gm(*ParseDate.parsedate(s))
    end
    end

    and then make the default be a MyTime instead of a Time.
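
Note for readers on current Rubies: the parsedate library was removed from the standard library in Ruby 1.9, so the snippet above will not load there. A rough equivalent can be sketched with Date._parse from the date library, which returns a hash of the components it recognised. Requiring a year and defaulting the other fields are assumptions of this sketch, not behavior described in the thread.

```ruby
require 'date'

# Hypothetical modern counterpart of the MyTime example above.
class MyTime < Time
  def self.from_s(s)
    parts = Date._parse(s)   # e.g. {:year=>..., :mon=>..., :mday=>..., :hour=>...}
    raise ArgumentError, "no year in #{s.inspect}" unless parts[:year]
    # Build a UTC time; missing fields fall back to the start of the period.
    gm(parts[:year], parts[:mon] || 1, parts[:mday] || 1,
       parts[:hour] || 0, parts[:min] || 0, parts[:sec] || 0)
  end
end

t = MyTime.from_s("2005-09-13 17:55")
p t
```

A time-only string such as "5:55PM" is rejected by this sketch because it carries no date.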

    > when Fixnum arg.to_i # what if yield bignum


    answered above

    > end
    >
    > where we have to use a help class and helper method (not new)
    > to get the object we want. Or if the conversion method
    > returns something other than what we requested, like
    > Bignum instead of Fixnum.
    >
    > Sadly, the #from_s RCR doesn't seem to address any of these
    > issues.
    >
    > > ARGV.replace(%w(
    > > -n 4
    > > -multiplier 3.14
    > > -q
    > > -title foobar
    > > -pattern fo+
    > > -time 5:55PM
    > > -method downcase
    > > a b c
    > > ))
    > > # option => default (or false for a flag)
    > > argv_options(
    > > :n => 1,
    > > :multiplier => 1.0,
    > > :q => false,
    > > :title => "hello world!",
    > > :pattern => /.*/,
    > > :time => Time.new,
    > > :method => :to_s
    > > )
    >
    > --
    > Jim Freeze




    Eric Mahurin, Sep 13, 2005
    #3
  4. Jim Freeze

    Jim Freeze Guest

    On 9/13/05, Eric Mahurin <> wrote:
    > Thanks for the input Jim. Comments below. I'm also putting
    > this in the RCR comments.
    >
    > No problem. Fixnum inherits from Integer. Do you care whether
    > Fixnum.from_s returns a Fixnum or a Bignum? It could return
    > either just like many of the other Fixnum instance methods.


    I was just thinking it would be strange for me to write

    Fixnum.from_s(...)

    and get back a Bignum.

    The natural thing to do is to have it return a Bignum, but that
    is not what was requested. The symmetric thing to do (see below)
    is to have it raise an exception, but that would have little use and
    be quite annoying.

    > > 2. Do we use to_i or Integer(#) - Integer raises and to_i
    > > does not
    > > 3. Do we use Float or to_f - Float raises and to_f does not

    >
    > Good point. This RCR should to specify this. I would think it
    > best if an exception occur if the full string doesn't parse the
    > the target type. I'll change the implementation to use the
    > methods that raise exceptions.


    In the two cases above I think an exception should be raised.

    > require 'parsedate'
    > class MyTime < Time
    > def self.from_s(s)
    > gm(*ParseDate.parsedate(s))
    > end
    > end
    >
    > and then make the default be a MyTime instead of a Time.

    Yes, that would work, and be a pain. My thought is that if #from_s was
    integrated into Ruby, we wouldn't have this problem. The parsing
    functionality would be built into the Time class.

    But even so, there will always be cases like those above, but with
    different classes. Is there a way to handle them in an aesthetically
    pleasing way?

    --
    Jim Freeze
    Jim Freeze, Sep 13, 2005
    #4
  5. Eric Mahurin

    Eric Mahurin Guest

    --- Jim Freeze <> wrote:

    > On 9/13/05, Eric Mahurin <> wrote:
    > > Thanks for the input Jim. Comments below. I'm also putting
    > > this in the RCR comments.
    > >
    > > No problem. Fixnum inherits from Integer. Do you care whether
    > > Fixnum.from_s returns a Fixnum or a Bignum? It could return
    > > either just like many of the other Fixnum instance methods.
    >
    > I was just thinking it would be strange for me to write
    >
    > Fixnum.from_s(...)
    >
    > and get back a BigNum.
    >
    > The natural thing to do is to have it return a BigNum, but that
    > is not what was requested. The symmetric thing to do (see below)
    > is to have it raise an exception, but that would have little use
    > and be quite annoying.


    From my perspective, the only reason for the distinction
    between Fixnum and Bignum is the efficiency (runtime and
    memory) of Fixnum - a very good benefit. Otherwise, I think
    you should consider them the same. Many of the methods in
    Fixnum and Bignum can return either a Fixnum or Bignum.
    (Fixnum/Bignum/Integer).from_s would be no exception. I just
    think of Fixnum and Bignum being the same type.
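
Later Ruby versions ended up treating them this way: as of Ruby 2.4, Fixnum and Bignum are unified into Integer. A small sketch, assuming Ruby 2.4 or newer:

```ruby
# Both values report the same class regardless of magnitude (Ruby >= 2.4).
small = Integer("4")
big   = Integer("123456789012345678901234567890")

p small.class
p big.class
p small.class == big.class
```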

    > > > 2. Do we use to_i or Integer(#) - Integer raises and to_i
    > > > does not
    > > > 3. Do we use Float or to_f - Float raises and to_f does not
    > >
    > > Good point. This RCR should to specify this. I would think it
    > > best if an exception occur if the full string doesn't parse the
    > > the target type. I'll change the implementation to use the
    > > methods that raise exceptions.
    >
    > In the two cases above I think an exception should be raised.
    >
    > > require 'parsedate'
    > > class MyTime < Time
    > > def self.from_s(s)
    > > gm(*ParseDate.parsedate(s))
    > > end
    > > end
    > >
    > > and then make the default be a MyTime instead of a Time.
    >
    > Yes, that would work, and be a pain. My thought is that if
    > #from_s was integrated into Ruby, we wouldn't have this problem.
    > The parsing functionality would be built into the Time class.


    Maybe I just picked the wrong functionality for Time.from_s.

    > But even so, there will always be cases like those above, but
    > with different
    > classes. Is there a way to handle them in an aesthetically
    > pleasing way?


    You'll also find just as many holes when trying to use obj.to_s
    to convert an object to a string. Sometimes it isn't going to
    do it like you want. You could argue that we don't need any
    convention for #to_* methods based on that. I think some
    convention for specifying default ways to go to and from
    another specific class is a good thing. Preferably those
    methods could take optional arguments to modify the behavior.
    Or just have additional methods.

    If you have a better idea I'm listening. I just proposed about
    the simplest way to convert from a specific type to an arbitrary
    type (#to_* provides arbitrary type to specific type). There
    have been other solutions for going from arbitrary to
    arbitrary, but I have yet to see an application for that.


    Eric Mahurin, Sep 13, 2005
    #5
  6. Jim Freeze

    Jim Freeze Guest

    In one way what you are suggesting is just a serialization
    method; it just so happens that you are suggesting a string be used
    to store the serialized data.

    obj.to_s.from_s == obj #=> true

    Which is essentially a serialization of obj, and is OO.
    Right now, Marshal is functional:

    Marshal.load(Marshal.dump(obj)) == obj #=> true

    One could argue that we have this, but with YAML:

    YAML.load(obj.to_yaml) == obj #=> true

    So, why don't we add a #from_yaml or #yaml_load to every
    object? I don't think I want this done automatically, but it
    would be nice to have it available so I can extend a class or
    object as needed.
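
The two existing round trips mentioned above can be checked with a short sketch. The sample hash is an assumption chosen to contain only plain strings, numbers and arrays, which both libraries restore without extra configuration.

```ruby
require 'yaml'

obj = { "n" => 4, "multiplier" => 3.14, "title" => "foobar", "args" => %w(a b c) }

# Binary round trip through Marshal (module functions, not methods on obj).
marshal_copy = Marshal.load(Marshal.dump(obj))
p marshal_copy == obj

# Text round trip through YAML (to_yaml is added to objects by the yaml library).
yaml_copy = YAML.load(obj.to_yaml)
p yaml_copy == obj
```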

    I think that every time we visit this subject, it goes back to
    similar arguments as to why every object doesn't have puts,
    such as "fred".puts. Matz has made statements on this
    which I agree with. Personally I think that puts("fred") is more natural.

    But, I'm warming up to #from_s more and more, but still not sure yet.

    On 9/13/05, Eric Mahurin <> wrote:

    > From my perspective, the only reason for the distinction
    > between Fixnum and Bignum is the efficiency (runtime and
    > memory) of Fixnum - a very good benefit. Otherwise, I think
    > you should consider them the same. Many of the methods in


    Agreed

    --
    Jim Freeze
    Jim Freeze, Sep 13, 2005
    #6
  7. Eric Mahurin

    Eric Mahurin Guest

    --- Jim Freeze <> wrote:

    > In one way what you are suggesting is just a serialization
    > method, it just so happens that you are suggesting a string
    > be used
    > to store the serialized data.
    >
    > obj.to_s.from_s == obj #=> true


    With my proposal you would write the above as:

    obj.class.from_s(obj.to_s) == obj #=> true

    But, the intent is not to do something like Marshal does. I'm
    only proposing klass.from_s methods where it makes sense, not
    for handling all classes like Marshal does. Only if you would
    want to parse an object of a certain type from a human
    readable/writable string would you want to make a from_s class
    method for that type.
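
A small sketch of how far that round trip goes, using the lenient from_s definitions from the first post. The helper name `round_trips?` is an assumption for illustration. It suggests why this is a parsing convention rather than a serialization format: simple values come back equal, while a Regexp does not, because Regexp#to_s wraps the source in an option group.

```ruby
def Integer.from_s(s); s.to_i;   end
def Float.from_s(s);   s.to_f;   end
def Symbol.from_s(s);  s.to_sym; end
def String.from_s(s);  s.to_s;   end
def Regexp.from_s(s, *other); new(s, *other); end

# Hypothetical helper: does obj survive to_s followed by klass.from_s?
def round_trips?(obj)
  obj.class.from_s(obj.to_s) == obj
end

[4, 3.14, :downcase, "foobar", /fo+/].each do |obj|
  puts "#{obj.inspect}: #{round_trips?(obj) ? 'round trip holds' : 'round trip does not hold'}"
end
```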

    On top of that, Marshal really doesn't help with type
    conversion. It only converts from a type to a machine readable
    byte stream and back. It's kind of like changing the storage
    from memory to file like mmap does?? YAML seems similar too.

    > Which is essentially a serialization of obj, and is OO.
    > Right now, Marshal is functional
    >
    > Marshal.load(Marshal.dump(obj)) == obj #=> true
    >
    > One could argue that we have this, but with YAML.
    > YAML.load(obj.to_yaml) == obj #=> true
    >=20
    > So, why don't we add a #from_yaml or #yaml_load to every
    > object? I don't think I want this done automatically, but it
    > would be nice to have it available so I can extend a class or
    > object as needed.


    I'm not sure what that would buy you.

    > I think that every time we visit this subject, it goes back
    > to
    > similar arguments as to why every object doesn't have puts,
    > such as "fred".puts. Matz has made statements on this
    > which I agree with. Personally I think that puts("fred") is
    > more natural.


    It probably ends up like that because in C++ objects can write
    themselves out to a stream and read themselves in from a
    stream. I like what is there now too. I haven't used C++ in a
    while, so I can't say a whole lot about how it is managed
    (where the methods live).

    > But, I'm warming up to #from_s more and more, but still not
    > sure yet.


    Good. I'd like to know what you finally decide on for your
    command-line parser.

    > On 9/13/05, Eric Mahurin <> wrote:
    >
    > > From my perspective, the only reason for the distinction
    > > between Fixnum and Bignum is the efficiency (runtime and
    > > memory) of Fixnum - a very good benefit. Otherwise, I think
    > > you should consider them the same. Many of the methods in
    >
    > Agreed
    >
    > --
    > Jim Freeze



    Eric Mahurin, Sep 13, 2005
    #7
  8. > [snip]
    >
    > I think that every time we visit this subject, it goes back to
    > similar arguments as to why every object doesn't have puts,
    > such as "fred".puts. Matz has made statements on this
    > which I agree with. Personally I think that puts("fred") is more natural.
    >


    We don't have puts, but we have display

    irb(main):005:0> 12.display
    12=> nil
    irb(main):006:0> [1,2,3].display
    123=> nil
    irb(main):007:0> "a string".display
    a string=> nil

    regards,

    Brian

    > [snip]


    --
    http://ruby.brian-schroeder.de/

    Stringed instrument chords: http://chordlist.brian-schroeder.de/
    Brian Schröder, Sep 14, 2005
    #8
  9. On Sep 13, 2005, at 3:50 PM, Eric Mahurin wrote:
    > But, the intent is not to do something like Marshal does. I'm
    > only proposiing klass.from_s methods where it makes sense, not
    > for handling all classes like Marshal does. Only if you would
    > want to parse an object of a certain type from a human
    > readable/writable string would you want to make a from_s class
    > method for that type.


    My problem with the RCR (which I added to the comments) goes like this:

    #from_s is a good design pattern for your own classes. Do it.

    The RCR is to change the core language classes. Why?
    It's not for convenience, because the same methods already exist,
    just in different forms.

    So I presume it's for consistency. The problem with this is that it
    still won't be consistent, because some classes do not have an
    unambiguous string->instance conversion path. String#split exists
    because there are lots of real-world cases with different delimiters
    for arrays. Array.from_s would need to
      * account for this (and thus have a different interface, with
        optional #split type param)
      * not account for it (requiring #split for all "non-standard"
        string cases)
      * not exist (as your RCR seems to suggest)
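
The first option can be sketched as follows. The helper `array_from_s` and its whitespace default are hypothetical; the RCR does not propose an Array.from_s. The sketch mainly shows that the caller still has to know the delimiter.

```ruby
# Hypothetical string-to-array converter with an optional #split-style
# delimiter parameter; no single default fits every input.
def array_from_s(s, delimiter = /\s+/)
  s.split(delimiter)
end

p array_from_s("a b c")        # whitespace-separated input, default delimiter
p array_from_s("a,b,c", ",")   # comma-separated input, delimiter supplied
p array_from_s("a,b,c")        # same input with the default: not split at all
```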

    If this is about object serialization and later deserialization,
    Marshal and YAML and friends exist.

    If this is about object deserialization only, then the real world
    jumps in and says:
    * Too many ambiguous cases to make this clear for more than a few
    core classes

    * So what's the point?
    Gavin Kistner, Sep 14, 2005
    #9
  10. Eric Mahurin

    Eric Mahurin Guest

    --- Gavin Kistner <> wrote:

    > On Sep 13, 2005, at 3:50 PM, Eric Mahurin wrote:
    > > But, the intent is not to do something like Marshal does. I'm
    > > only proposiing klass.from_s methods where it makes sense, not
    > > for handling all classes like Marshal does. Only if you would
    > > want to parse an object of a certain type from a human
    > > readable/writable string would you want to make a from_s class
    > > method for that type.
    >
    > My problem with the RCR (which I added to the comments) goes
    > like this:
    >
    > #from_s is a good design pattern for your own classes. Do it.
    >
    > The RCR is to change the core language classes. Why?
    > It's not for convenience, because the same methods already exist,
    > just in different forms.
    >
    > So I presume it's for consistency. The problem with this is that it
    > still won't be consistent, because some classes do not have an
    > unambiguous string->instance conversion path. String#split exists
    > because there are lots of real-world cases with different delimiters
    > for arrays. Array.from_s would need to
    > * account for this (and thus have a different interface, with
    > optional #split type param)
    > * not account for it (requiring #split for all "non-standard"
    > string cases)
    > * not exist (as your RCR seems to suggest)
    >
    > If this is about object serialization and later deserialization,
    > Marshal and YAML and friends exist.
    >
    > If this is about object deserialization only, then the real world
    > jumps in and says:
    > * Too many ambiguous cases to make this clear for more than a few
    > core classes
    >
    > * So what's the point?


    It is not for consistency. It is for converting a string to a
    somewhat arbitrary class. It doesn't need to be in every
    class. Just like #to_i is in some classes and not others.
    I've never suggested putting it in Array. I'm also not
    suggesting it be used for object deserialization.

    To get an idea of the usefulness, see the example I gave. How
    would you implement this simple option parser API (what started
    this thread)?

    ARGV.replace(%w(
    -n 4
    -multiplier 3.14
    -q
    -title foobar
    -pattern fo+
    -time 5:55PM
    -method downcase
    a b c
    ))
    options = argv_options(
    :n => 1,
    :multiplier => 1.0,
    :q => false,
    :title => "hello world!",
    :pattern => /.*/,
    :time => Time.new,
    :method => :to_s,
    :default => 1.23
    )

    #=> {:default=>1.23, :time=>Wed Sep 14 17:55:00 Central
    Daylight Time 2005, :n=>4, :multiplier=>3.14, :title=>"foobar",
    :q=>true, :method=>:downcase, :pattern=>/fo+/}

    I don't think you'll find a cleaner/more flexible solution than
    the one I gave.


    Eric Mahurin, Sep 14, 2005
    #10
  11. Eric Mahurin

    Eric Mahurin Guest

    --- Brian Schröder <> wrote:

    > > [snip]
    > >
    > > I think that every time we visit this subject, it goes back to
    > > similar arguments as to why every object doesn't have puts,
    > > such as "fred".puts. Matz has made statements on this
    > > which I agree with. Personally I think that puts("fred") is
    > > more natural.
    >
    > We don't have puts, but we have display
    >
    > irb(main):005:0> 12.display
    > 12=> nil
    > irb(main):006:0> [1,2,3].display
    > 123=> nil
    > irb(main):007:0> "a string".display
    > a string=> nil


    I forgot about that one. This would be great to override if
    the object was a big datastructure and you could easily write
    it out a chunk at a time rather than convert it to one massive
    string first. But, I haven't seen anybody use this or override
    it.
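
A minimal sketch of that kind of override. The class `ChunkedList` is hypothetical; it overrides Object#display so that items are written to the IO one at a time instead of being joined into one large string first.

```ruby
require 'stringio'

# Hypothetical container that streams its items when displayed.
class ChunkedList
  def initialize(items)
    @items = items
  end

  # Same calling convention as Object#display: an optional output port.
  def display(io = $stdout)
    @items.each { |item| io.write(item.to_s) }  # one small write per item
    nil
  end
end

out = StringIO.new
ChunkedList.new([1, 2, 3]).display(out)
p out.string
```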

    The complement to this obj.display(io) method would be a
    klass.read(io) method, but it gets even harder to do than the
    klass.from_s(s) of my proposal because you don't know how much
    to read from the io (what would String.read(io) do? - maybe
    stop at a newline?). At some point you just have to go to
    traditional parsing. But, maybe it is still useful. I don't
    know.
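
To make the framing problem concrete, here is a sketch of one possible klass.read(io). Everything in it is hypothetical, including the decision to treat each line as one value, which is exactly the kind of choice that has no obvious general answer.

```ruby
require 'stringio'

def Integer.from_s(s); Integer(s); end

# Hypothetical Integer.read: assumes one value per line and delegates the
# actual conversion to from_s.
def Integer.read(io)
  line = io.gets
  raise EOFError, "no more input" if line.nil?
  from_s(line.chomp)
end

io = StringIO.new("4\n15\n")
first  = Integer.read(io)
second = Integer.read(io)
p first
p second
```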



    Eric Mahurin, Sep 14, 2005
    #11