yet another simple command-line option parser

Eric Mahurin

I just put in a good example for:

http://rcrchive.net/rcr/show/317

It is a simple option parser that has option defaults and
converts the options to the right type:

# these klass.from_s methods are the meat of the RCR
def Float.from_s(s);s.to_f;end
def Integer.from_s(s);s.to_i;end
def Symbol.from_s(s);s.to_sym;end
def String.from_s(s);s.to_s;end
def Regexp.from_s(s,*other);new(s,*other);end
require 'time.rb'
def Time.from_s(s,*other);Time.parse(s,*other);end

def argv_options(options)
  i = 0
  while arg = ARGV[i]
    if arg[0] == ?-
      arg.slice!(0)
      ARGV.slice!(i)
      break if arg == "-" # -- terminates options
      opt = arg.to_sym
      default = options[opt]
      if default
        klass = default.class
        # would need a big case statement w/o RCR 317
        options[opt] = klass.from_s(ARGV.slice!(i))
      elsif default.nil?
        raise("unknown option -#{opt}")
      else # default == false
        options[opt] = true
      end
    else
      i += 1
    end
  end
  options
end

ARGV.replace(%w(
-n 4
-multiplier 3.14
-q
-title foobar
-pattern fo+
-time 5:55PM
-method downcase
a b c
))
# option => default (or false for a flag)
argv_options(
  :n => 1,
  :multiplier => 1.0,
  :q => false,
  :title => "hello world!",
  :pattern => /.*/,
  :time => Time.new,
  :method => :to_s
)


If you have an opinion about the usefulness of this RCR, go
vote and give a comment. I didn't have an example before to
show it in action.


Jim Freeze

That's pretty interesting Eric, to grab the type off the default.
I think I'll add that to CommandLine::OptionParser.

However, I'm still not sure if I like the #from_s form,
but I can see the utility of it. For the common cases,
I can use a simple case statement:

case default
when Float then Float(arg)
when Fixnum then Integer(arg)
end

But, as you can see with even these simple cases, there
are big issues and big questions to answer.
1. Fixnum does not match Integer
2. Do we use to_i or Integer(#) - Integer raises and to_i does not
3. Do we use Float or to_f - Float raises and to_f does not
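
The difference between the raising and non-raising conversions in points 2 and 3 can be seen in a minimal sketch. The helper name convert_like is hypothetical and not part of CommandLine::OptionParser; matching on Integer rather than Fixnum is one way around point 1, since Integer is the common superclass.

```ruby
# Hypothetical helper: convert arg to the type of the default value.
def convert_like(default, arg)
  case default
  when Float   then Float(arg)    # strict: raises ArgumentError on junk
  when Integer then Integer(arg)  # Integer === matches Fixnum and Bignum alike
  else arg
  end
end

lenient = "4x".to_i               # to_i silently stops at the first non-digit
strict_error =
  begin
    convert_like(1, "4x")
    nil
  rescue ArgumentError => e
    e.class
  end

puts lenient                      # prints 4
puts strict_error                 # prints ArgumentError
puts convert_like(1.0, "3.14")    # prints 3.14
```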

Then, there are the tougher cases like

require 'parsedate'
case default
when Time then Time.gm(*ParseDate.parsedate(arg))
when Fixnum then arg.to_i # what if this yields a Bignum?
end

where we have to use a helper class and helper method (not new) to get
the object we want, or where the conversion method
returns something other than what we requested, like
Bignum instead of Fixnum.

Sadly, the #from_s RCR doesn't seem to address any of these issues.

--
Jim Freeze

Eric Mahurin

Thanks for the input Jim. Comments below. I'm also putting
this in the RCR comments.

--- Jim Freeze said:
> That's pretty interesting Eric, to grab the type off the default.
> I think I'll add that to CommandLine::OptionParser.
>
> However, I'm still not sure if I like the #from_s form,
> but I can see the utility of it. For the common cases,
> I can use a simple case statement:
>
> case default
> when Float then Float(arg)
> when Fixnum then Integer(arg)
> end
>
> But, as you can see with even these simple cases, there
> are big issues and big questions to answer.
> 1. Fixnum does not match Integer

No problem. Fixnum inherits from Integer. Do you care whether
Fixnum.from_s returns a Fixnum or a Bignum? It could return
either just like many of the other Fixnum instance methods.

> 2. Do we use to_i or Integer(#) - Integer raises and to_i does not
> 3. Do we use Float or to_f - Float raises and to_f does not

Good point. This RCR should specify this. I would think it
best if an exception is raised when the full string doesn't parse as
the target type. I'll change the implementation to use the
methods that raise exceptions.
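
A sketch of what the stricter versions might look like, assuming the Kernel conversion functions Integer() and Float() are the "methods that raise exceptions" meant here. This is a guess at the revision, not the code that was posted to the RCR.

```ruby
# Strict variants of the earlier one-liners: the whole string must parse.
def Float.from_s(s)
  Float(s)
end

def Integer.from_s(s)
  Integer(s)
end

p Integer.from_s("4")      # 4
p Float.from_s("3.14")     # 3.14

begin
  Integer.from_s("4 apples")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

One side effect worth knowing: Integer() also honors radix prefixes, so "0x1A" parses as 26 and "010" as 8, which to_i would read as 0 and 10.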

> Then, there are the tougher cases like
>
> require 'parsedate'
> case default
> when Time then Time.gm(*ParseDate.parsedate(arg))

I haven't dealt with dates and times to know what all the
options are. I threw this in at the last minute. If you don't
like the klass.from_s method, you could define your own derived
class (or override that klass.from_s):

require 'parsedate'
class MyTime < Time
  def self.from_s(s)
    gm(*ParseDate.parsedate(s))
  end
end

and then make the default be a MyTime instead of a Time.

> when Fixnum arg.to_i # what if yield bignum

Answered above.

> end
>
> where we have to use a help class and helper method (not new) to get
> the object we want. Or if the conversion method
> returns something other than what we requested, like
> Bignum instead of Fixnum.
>
> Sadly, the #from_s RCR doesn't seem to address any of these issues.
>
> --
> Jim Freeze



Jim Freeze

> Thanks for the input Jim. Comments below. I'm also putting
> this in the RCR comments.
>
> No problem. Fixnum inherits from Integer. Do you care whether
> Fixnum.from_s returns a Fixnum or a Bignum? It could return
> either just like many of the other Fixnum instance methods.

I was just thinking it would be strange for me to write

Fixnum.from_s(...)

and get back a Bignum.

The natural thing to do is to have it return a Bignum, but that
is not what was requested. The symmetric thing to do (see below)
is to have it raise an exception, but that would have little use and
be quite annoying.

> Good point. This RCR should specify this. I would think it
> best if an exception is raised when the full string doesn't parse as
> the target type. I'll change the implementation to use the
> methods that raise exceptions.

In the two cases above I think an exception should be raised.

> require 'parsedate'
> class MyTime < Time
>   def self.from_s(s)
>     gm(*ParseDate.parsedate(s))
>   end
> end
>
> and then make the default be a MyTime instead of a Time.

Yes, that would work, and be a pain. My thought is that if #from_s was
integrated into Ruby, we wouldn't have this problem. The parsing functionality
would be built into the Time class.

But even so, there will always be cases like those above, but with different
classes. Is there a way to handle them in an aesthetically pleasing way?

--
Jim Freeze

Eric Mahurin

--- Jim Freeze said:
> I was just thinking it would be strange for me to write
>
> Fixnum.from_s(...)
>
> and get back a Bignum.
>
> The natural thing to do is to have it return a Bignum, but that
> is not what was requested. The symmetric thing to do (see below)
> is to have it raise an exception, but that would have little use and
> be quite annoying.

From my perspective, the only reason for the distinction
between Fixnum and Bignum is the efficiency (runtime and
memory) of Fixnum - a very good benefit. Otherwise, I think
you should consider them the same. Many of the methods in
Fixnum and Bignum can return either a Fixnum or Bignum.
(Fixnum/Bignum/Integer).from_s would be no exception. I just
think of Fixnum and Bignum as being the same type.
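
For readers on a current Ruby, this view later became the language's own: as of Ruby 2.4, Fixnum and Bignum were unified into Integer. A short sketch to check it (the variable names are illustrative only):

```ruby
small = 4
big   = 2**100   # far beyond the machine-word range

# On Ruby 2.4 and later both lines print Integer; on the 1.8-era Ruby
# of this thread they would print Fixnum and Bignum respectively.
puts small.class
puts big.class

# Arithmetic crosses the size boundary transparently in either case.
puts((big / big).class == small.class)
```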

> In the two cases above I think an exception should be raised.
>
> Yes, that would work, and be a pain. My thought is that if #from_s was
> integrated into Ruby, we wouldn't have this problem. The parsing functionality
> would be built into the Time class.

Maybe I just picked the wrong functionality for Time.from_s.

> But even so, there will always be cases like those above, but with different
> classes. Is there a way to handle them in an aesthetically pleasing way?

You'll also find just as many holes when trying to use obj.to_s
to convert an object to a string. Sometimes it isn't going to
do it like you want. You could argue that we don't need any
convention for #to_* methods based on that. I think some
convention for specifying default ways to go to and from
another specific class is a good thing. Preferably those
methods could take optional arguments to modify the behavior.
Or just have additional methods.

If you have a better idea I'm listening. I just proposed about
the simplest way to convert from a specific type to an arbitrary
type (#to_* provides arbitrary type to specific type). There
have been other solutions for going from arbitrary to
arbitrary, but I have yet to see an application for that.


Jim Freeze

In one way what you are suggesting is just a serialization
method, it just so happens that you are suggesting a string be used
to store the serialized data.

obj.to_s.from_s == obj #=> true

Which is essentially a serialization of obj, and is OO.
Right now, Marshal is functional

Marshal.load(Marshal.dump(obj)) == obj #=> true

One could argue that we have this, but with YAML.
YAML.load(obj.to_yaml) == obj #=> true
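
The two existing round trips can be checked on a plain data structure. A minimal sketch: the sample hash is made up, and it sticks to string keys and simple values so that it also loads on newer YAML versions, where YAML.load only accepts basic types by default.

```ruby
require 'yaml'

obj = { "n" => 4, "multiplier" => 3.14, "tags" => ["a", "b", "c"] }

# Binary round trip through Marshal.
puts Marshal.load(Marshal.dump(obj)) == obj   # prints true

# Text round trip through YAML.
puts YAML.load(obj.to_yaml) == obj            # prints true

# By contrast, to_s alone loses the type: both of these stringify to "4".
puts 4.to_s == "4".to_s                       # prints true
```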

So, why don't we add a #from_yaml or #yaml_load to every
object? I don't think I want this done automatically, but it
would be nice to have it available so I can extend a class or
object as needed.

I think that every time we visit this subject, it goes back to
similar arguments as to why every object doesn't have puts,
such as "fred".puts. Matz has made statements on this
which I agree with. Personally I think that puts("fred") is more natural.

But, I'm warming up to #from_s more and more, but still not sure yet.

> From my perspective, the only reason for the distinction
> between Fixnum and Bignum is the efficiency (runtime and
> memory) of Fixnum - a very good benefit. Otherwise, I think
> you should consider them the same. Many of the methods in

Agreed

--
Jim Freeze

Eric Mahurin

--- Jim Freeze said:
> In one way what you are suggesting is just a serialization
> method, it just so happens that you are suggesting a string be used
> to store the serialized data.
>
> obj.to_s.from_s == obj #=> true

With my proposal you would write the above as:

obj.class.from_s(obj.to_s) == obj #=> true
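
As a runnable check of that identity, here is a small sketch that reuses three of the one-liners from the first post; the sample values are arbitrary.

```ruby
# One-liners as posted at the start of the thread.
def Float.from_s(s);   s.to_f;   end
def Integer.from_s(s); s.to_i;   end
def Symbol.from_s(s);  s.to_sym; end

[4, 3.14, :downcase].each do |obj|
  copy = obj.class.from_s(obj.to_s)
  puts "#{obj.inspect} -> #{obj.to_s.inspect} -> #{copy.inspect} (#{copy == obj})"
end
```

The identity is not guaranteed for every class. For example, /fo+/.to_s is "(?-mix:fo+)", so a Regexp built from that string matches the same text but does not compare equal to the original.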

But, the intent is not to do something like Marshal does. I'm
only proposing klass.from_s methods where it makes sense, not
for handling all classes like Marshal does. Only if you would
want to parse an object of a certain type from a human
readable/writable string would you want to make a from_s class
method for that type.

On top of that, Marshal really doesn't help with type
conversion. It only converts from a type to a machine readable
byte stream and back. It's kind of like changing the storage
from memory to file like mmap does?? YAML seems similar too.

> Which is essentially a serialization of obj, and is OO.
> Right now, Marshal is functional
>
> Marshal.load(Marshal.dump(obj)) == obj #=> true
>
> One could argue that we have this, but with YAML.
> YAML.load(obj.to_yaml) == obj #=> true
>
> So, why don't we add a #from_yaml or #yaml_load to every
> object? I don't think I want this done automatically, but it
> would be nice to have it available so I can extend a class or
> object as needed.

I'm not sure what that would buy you.

> I think that every time we visit this subject, it goes back to
> similar arguments as to why every object doesn't have puts,
> such as "fred".puts. Matz has made statements on this
> which I agree with. Personally I think that puts("fred") is more natural.

It probably ends up like that because in C++ objects can write
themselves out to a stream and read themselves in from a
stream. I like what is there now too. I haven't used C++ in a
while, so I can't say a whole lot about how it is managed
(where the methods live).

> But, I'm warming up to #from_s more and more, but still not sure yet.

Good. I'd like to know what you finally decide on for your
command-line parser.

> Agreed
>
> --
> Jim Freeze


Brian Schröder

[snip]

> I think that every time we visit this subject, it goes back to
> similar arguments as to why every object doesn't have puts,
> such as "fred".puts. Matz has made statements on this
> which I agree with. Personally I think that puts("fred") is more natural.

We don't have puts, but we have display

irb(main):005:0> 12.display
12=> nil
irb(main):006:0> [1,2,3].display
123=> nil
irb(main):007:0> "a string".display
a string=> nil

regards,

Brian

--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/

Gavin Kistner

> But, the intent is not to do something like Marshal does. I'm
> only proposing klass.from_s methods where it makes sense, not
> for handling all classes like Marshal does. Only if you would
> want to parse an object of a certain type from a human
> readable/writable string would you want to make a from_s class
> method for that type.

My problem with the RCR (which I added to the comments) goes like this:

#from_s is a good design pattern for your own classes. Do it.

The RCR is to change the core language classes. Why?
It's not for convenience, because the same methods already exist,
just in different forms.

So I presume it's for consistency. The problem with this is that it
still won't be consistent, because some classes do not have an
unambiguous string->instance conversion path. String#split exists
because there are lots of real-world cases with different delimiters
for arrays. Array.from_s would need to
* account for this (and thus have a different interface, with
optional #split type param)
* not account for it (requiring #split for all "non-standard"
string cases)
* not exist (as your RCR seems to suggest)
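
The first option in the list above might look like the following sketch. Array.from_s is hypothetical here (neither Ruby nor the RCR defines it), and the default delimiter is an arbitrary choice, which is exactly the ambiguity being described.

```ruby
# Hypothetical: neither Ruby nor the RCR defines Array.from_s.
def Array.from_s(s, delimiter = ",")
  s.split(delimiter)
end

p Array.from_s("a,b,c")         # ["a", "b", "c"]
p Array.from_s("a b c", " ")    # ["a", "b", "c"]
p Array.from_s("a b c")         # ["a b c"]  (wrong default for this input)
```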

If this is about object serialization and later deserialization,
Marshal and YAML and friends exist.

If this is about object deserialization only, then the real world
jumps in and says:
* Too many ambiguous cases to make this clear for more than a few
core classes

* So what's the point?
 
Eric Mahurin

--- Gavin Kistner said:
> My problem with the RCR (which I added to the comments) goes like this:
>
> #from_s is a good design pattern for your own classes. Do it.
>
> The RCR is to change the core language classes. Why?
> It's not for convenience, because the same methods already exist,
> just in different forms.
>
> So I presume it's for consistency. The problem with this is that it
> still won't be consistent, because some classes do not have an
> unambiguous string->instance conversion path. String#split exists
> because there are lots of real-world cases with different delimiters
> for arrays. Array.from_s would need to
> * account for this (and thus have a different interface, with
>   optional #split type param)
> * not account for it (requiring #split for all "non-standard"
>   string cases)
> * not exist (as your RCR seems to suggest)
>
> If this is about object serialization and later deserialization,
> Marshal and YAML and friends exist.
>
> If this is about object deserialization only, then the real world
> jumps in and says:
> * Too many ambiguous cases to make this clear for more than a few
>   core classes
>
> * So what's the point?

It is not for consistency. It is for converting a string to a
somewhat arbitrary class. It doesn't need to be in every
class. Just like #to_i is in some classes and not others.
I've never suggested putting it in Array. I'm also not
suggesting it be used for object deserialization.

To get an idea of the usefulness, see the example I gave. How
would you implement this simple option parser API (what started
this thread)?

ARGV.replace(%w(
-n 4
-multiplier 3.14
-q
-title foobar
-pattern fo+
-time 5:55PM
-method downcase
a b c
))
options = argv_options(
  :n => 1,
  :multiplier => 1.0,
  :q => false,
  :title => "hello world!",
  :pattern => /.*/,
  :time => Time.new,
  :method => :to_s,
  :default => 1.23
)

#=> {:default=>1.23, :time=>Wed Sep 14 17:55:00 Central Daylight Time 2005,
#    :n=>4, :multiplier=>3.14, :title=>"foobar",
#    :q=>true, :method=>:downcase, :pattern=>/fo+/}

I don't think you'll find a cleaner/more flexible solution than
the one I gave.


Eric Mahurin

--- Brian Schröder said:
> [snip]
>
> > I think that every time we visit this subject, it goes back to
> > similar arguments as to why every object doesn't have puts,
> > such as "fred".puts. Matz has made statements on this
> > which I agree with. Personally I think that puts("fred") is more natural.
>
> We don't have puts, but we have display
>
> irb(main):005:0> 12.display
> 12=> nil
> irb(main):006:0> [1,2,3].display
> 123=> nil
> irb(main):007:0> "a string".display
> a string=> nil

I forgot about that one. This would be great to override if
the object was a big datastructure and you could easily write
it out a chunk at a time rather than convert it to one massive
string first. But, I haven't seen anybody use this or override
it.
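
A rough sketch of such an override, assuming a made-up container class (BigTable and its row format are invented for illustration). It keeps the signature of Object#display, which takes an optional output port defaulting to $stdout.

```ruby
require 'stringio'

# Hypothetical container that writes itself out one record at a time
# instead of building one large string first.
class BigTable
  def initialize(rows)
    @rows = rows
  end

  # Same signature as Object#display: the port defaults to $stdout.
  def display(port = $stdout)
    @rows.each { |row| port.write(row.join(",") + "\n") }
    nil
  end
end

io = StringIO.new
BigTable.new([[1, 2], [3, 4]]).display(io)
print io.string
```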

The complement to this obj.display(io) method would be a
klass.read(io) method, but it gets even harder to do than the
klass.from_s(s) of my proposal because you don't know how much
to read from the io (what would String.read(io) do? - maybe
stop at a newline?). At some point you just have to go to
traditional parsing. But, maybe it is still useful. I don't
know.


