Duck Typing Hash-Like Objects

Gary Wright

Is it *really* a problem that strings and integers produce values that
your method would make use of? Say someone wants to encode those input
parameters into a string - as long as [] works, they can. Why is this
a problem?

As a general response, I'd say that there is a strong semantic
difference between objects that store key/value pairs (Hash,
OrderedHash, RBTree, etc.) and objects that aren't general collections
but do have some sort of indexing features (Integers, Strings). Arrays
are sort of in the middle.

You certainly can consider integers as ordered collections of bits and
strings as collections of bytes (and sometimes that would be just
great) but at other times I think it is reasonable to consider integers
and strings as distinctly different than a hash or a tree.

I suppose this is a general problem of agreeing on some sort of
taxonomy for data structures and being able to dynamically query for
various features. Smalltalk pushes this a lot farther than Ruby:
Dictionary, IdentityDictionary, Bag, Set, SortedCollection,
ArrayedCollection, ByteArray, String, Text, Array, RunArray,
LinkedList, OrderedCollection.

Usually I would avoid introspection on an object but it seems like when
you are designing DSLs and/or dealing with object construction it is
sometimes necessary to be a little more nosey.

I'm going to stick with testing for #fetch when I want to know if an
object is a collection with keys. I think this is marginally better
than using (Hash === obj). I can always test for #at if I want to pick
out Arrays.
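
A minimal sketch of that test, assuming plain core Ruby with no
extensions loaded. The helper name collection_kind and the symbols it
returns are made up for illustration; only the respond_to? checks come
from the post above.

```ruby
# Hypothetical helper: sort an argument into a rough category using only
# respond_to?, never its class.
def collection_kind(obj)
  return :other unless obj.respond_to?(:fetch)  # Integer and String stop here
  obj.respond_to?(:at) ? :indexed : :keyed      # Array has #at, Hash does not
end

p collection_kind({ 1 => 2 })  # :keyed
p collection_kind([1, 2])      # :indexed
p collection_kind("abc")       # :other
p collection_kind(3)           # :other
```

Note that Array also responds to #fetch, which is why the #at test is
needed to tell the two apart.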

Gary Wright
 
Leslie Viljoen

Is it *really* a problem that strings and integers produce values that
your method would make use of? Say someone wants to encode those input
parameters into a string - as long as [] works, they can. Why is this
a problem?

As a general response, I'd say that there is a strong semantic
difference between objects that store key/value pairs (Hash,
OrderedHash,
RBTree, etc.) and objects that aren't general collections but do
have some sort of indexing features (Integers, Strings). Arrays are
sort of in the middle.

But why does the difference matter? Are you going to be using the
parameters to your function in a way that would make Arrays or Strings
throw exceptions? Then you would need to create some sort of test for
that usage.

Otherwise, why restrict the user of your method this way?
 
Gary Wright

Otherwise, why restrict the user of your method this way?

The entire context is that of object construction and how best to
'parse' an argument list. I'm trying to focus on this very
specific context (which is closely associated with DSL designs)
and not the instance method situation where you've already got
a constructed object.

I'll go back to my simple example:

Array.new 3
Array.new [1,2]

Do you really think that Array.new(3) should result in
[1, 1], by viewing the number 3 as an array of bits? Or
alternatively should Array.new([1,2]) fail because it
doesn't make sense to allocate [1,2] entries?

Array.new must 'parse' its argument list and act
accordingly. The interface to Array could be:

Array.new( 3 )
Array.new_from_arrayish_obj( [1,2] )

so as to avoid the overloading problem but I don't
particularly think this is an improvement. Array.new
already uses a form of duck typing:

h = {1 => 2, 3 => 4}
def h.to_ary; to_a; end

a = Array.new h
p a # [[1, 2], [3, 4]]

Array.new doesn't even check to see if something is an instance
of Array but instead tries #to_ary. If that fails then Array tries
#to_int, and if that doesn't work it raises an exception.
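
The same conversion-protocol idea can be sketched for a user-defined
constructor. The class name Slots and its reader are hypothetical, and
this only imitates the order of checks described above; it is not how
Array itself is implemented.

```ruby
# Hypothetical class that 'parses' its single argument the way the post
# describes: try the array protocol first, then the integer protocol.
class Slots
  attr_reader :items

  def initialize(arg)
    if arg.respond_to?(:to_ary)
      @items = arg.to_ary.dup          # array-like: copy the elements
    elsif arg.respond_to?(:to_int)
      @items = Array.new(arg.to_int)   # integer-like: allocate that many slots
    else
      raise TypeError, "can't convert #{arg.class} into Integer"
    end
  end
end

p Slots.new(3).items       # [nil, nil, nil]
p Slots.new([1, 2]).items  # [1, 2]
```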

My original question was how best to do this with hash-like objects.
David Black suggested #to_hash, which makes a lot of sense but it
might result in slurping up a huge data structure into an in-memory
hash when all you really want to do is look things up by keys.

Based on this thread discussion, it doesn't seem there is an
'obvious' answer for all (or most) situations.
There are a variety of tradeoffs between using #to_hash, trying #[] or
#fetch, testing for #has_key?, or even simply iterating with #each. All
of those seem marginally better than using #kind_of?(Hash).
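
One of those tradeoffs, sketched under assumptions: a constructor that
only requires #fetch can keep a reference to its source and look keys
up on demand, instead of copying everything through #to_hash. Both
class names here (SquareStore and Settings) are hypothetical.

```ruby
# Hypothetical store that computes values on demand; converting a large
# store of this kind to a Hash would force every value into memory.
class SquareStore
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def fetch(key)
    @lookups += 1
    key * key
  end
end

class Settings
  def initialize(source)
    unless source.respond_to?(:fetch)
      raise ArgumentError, "source must respond to #fetch"
    end
    @source = source   # keep a reference; nothing is copied
  end

  def [](key)
    @source.fetch(key)
  end
end

p Settings.new({ :a => 1 })[:a]   # 1
store = SquareStore.new
p Settings.new(store)[12]         # 144
p store.lookups                   # 1
```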

Gary Wright
 
Rick DeNatale

A few years ago there were some interesting attempts to
come up with a systematic way to determine an object's type, in the
sense of its full profile and interface, at any given point in its
life. The idea was to be able to get some kind of rich response from
the object, well beyond what respond_to? and is_a? provide, in order
to determine whether you'd gotten hold of the type of object you
needed. I seem to recall it turned out to be very challenging,
perhaps impossible, to come up with a complete system for this. I'm
not sure if anyone is still working on it. But it's an interesting
area.

All of this is personal perspective of course, but my view of duck
typing is that it's really a question of the type of a variable rather
than on objects. In other words in my view variables have types which
are generated by their usage. Types in this view are like job
requirements.

So rather than talking about say an array type, to me duck typing is
talking about the type of a variable which is used in a certain way,
which might be somewhat idiosyncratic to the user. Some common such
types do exist, like a queue, a stack, a generalized collection (with
various requirements as to access, ordering etc.) given one of these,
or a more idiosyncratic type, several objects might work as the value
of the variable in question.

As an example, let's say I'm looking for something to use to drive a
nail. The obvious 'type' of thing for this job is a hammer, but a
heavy wrench, or a rock can also serve since the usage really just
requires a mass which can be conveniently accelerated so as to impart
inertia to that nail. If I don't have a hammer to hand, I can press
one of these other objects into service.

In this view of duck typing choosing an object is akin to hiring
someone, you make an initial assessment of whether the potential
employee has the requirements, and if s/he passes that sniff test, you
hire him/her and test that assessment over time.

These kinds of types also can require much more than a simple list of
provided interfaces, they also typically rely to one degree or another
on the semantics of those interfaces, often including how the object's
observed behavior is affected by the SEQUENCE of calls. These types
of types are much harder to statically check. And they cause bugs in
either a statically or dynamically typed system. In my experience,
the kind of stupid bugs which are flushed out by a static type system
are a small percentage of the bugs which are caused by these semantic
mismatches.

I know that this viewpoint causes conniption fits in folks who believe
in the religion of static type checking, and I've long ago given up
trying to proselytize those with strong convictions. All I can say is
that I've found such a view of typing combined with a language such as
Ruby which supports it has provided a powerful approach to building
software.

Both static and dynamic typing have benefits and problems; speaking
solely for myself, I just prefer both the benefits and the problems of
dynamically typed systems over those of statically typed ones.
 
dblack

Hi --

All of this is personal perspective of course, but my view of duck
typing is that it's really a question of the type of a variable rather
than on objects. In other words in my view variables have types which
are generated by their usage. Types in this view are like job
requirements.

I'm not sure how that detaches it from the object, though, since
variables are going to contain references to objects and messages are
sent to objects rather than variables. Do you mean in terms of
documentation?
So rather than talking about say an array type, to me duck typing is
talking about the type of a variable which is used in a certain way,
which might be somewhat idiosyncratic to the user. Some common such
types do exist, like a queue, a stack, a generalized collection (with
various requirements as to access, ordering etc.) given one of these,
or a more idiosyncratic type, several objects might work as the value
of the variable in question.

I don't think duck typing has ever been about a checklist of common
data structures or types (queue, stack, etc.), but more about just
sending messages to objects without a lot of pre-message checking as
to their class and ancestry and so forth. To me, the ultimate duck
typing idiom is:

def m(x)
  x << "string"
end

There's no checking, no probing the object -- you just ask it to do
something, and deal with what happens. In the end, that's *always*
how things are when writing Ruby. I've always seen "duck typing" as
just a kind of embrace of the conditions under which every single line
of Ruby code is written anyway.
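
To make "deal with what happens" concrete, here is a small sketch that
calls the same m with a few core objects. The choice of arguments is
mine, not from the thread.

```ruby
require 'stringio'

def m(x)
  x << "string"
end

p m([])                  # ["string"]
p m(String.new("a "))    # "a string"

io = StringIO.new
m(io)
p io.string              # "string"

# An object without #<< simply fails when the message is sent.
begin
  m(Object.new)
rescue NoMethodError => e
  puts "no luck: #{e.class}"
end
```
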
As an example, let's say I'm looking for something to use to drive a
nail. The obvious 'type' of thing for this job is a hammer, but a
heavy wrench, or a rock can also serve since the usage really just
requires a mass which can be conveniently accelerated so as to impart
inertia to that nail. If I don't have a hammer to hand, I can press
one of these other objects into service.

All these nouns sound more like classes than types. Actually I don't
think the type of a Ruby object can ever be named. It's circular: the
type of an object is... the type of objects that are of this object's
type :) As soon as it gets noun-like it starts getting class-bound
again.

A duck-typing approach to hammering would be something like:

def hammer(h, n)
  h.pound(n)
end

Everything's so in the moment. I know I have a somewhat Utopian view
of Ruby types sometimes :) But I really do find the implications
rather enthralling.
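
A sketch of that hammer method in use, assuming some made-up classes
(Nail, Hammer, Rock) that share nothing except a #pound method:

```ruby
# Hypothetical classes for illustration; they have no common ancestor
# beyond Object and are related only by responding to #pound.
Nail = Struct.new(:depth)

class Hammer
  def pound(nail)
    nail.depth += 2
  end
end

class Rock
  def pound(nail)
    nail.depth += 1
  end
end

def hammer(h, n)
  h.pound(n)
end

nail = Nail.new(0)
hammer(Hammer.new, nail)
hammer(Rock.new, nail)
p nail.depth   # 3
```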

Noun-like-ness, by the way, has always seemed to me to be the
Achilles' heel of the duck-typing analogy -- specifically, the part
where it says: "... then it *is* a duck." That has a way of leading
people back, quite literally, to is_a? -- as if the idea was that Duck
was a class. (I know that's not what you're saying; I'm just reminded
of this thought.) I've seen it rewritten (I can't remember whether it
was by me or Florian Gross; one of us, I think) as:

If it walks like a duck and quacks like a duck, then it walks like
a duck and quacks like a duck.

:)
In this view of duck typing choosing an object is akin to hiring
someone, you make an initial assessment of whether the potential
employee has the requirements, and if s/he passes that sniff test, you
hire him/her and test that assessment over time.

I think it's that very initial assessment that Gary was talking about:
is there something -- not static typing, but some kind of dynamic,
run-time mechanism -- that can provide pre-method-call semantics of an
object?
These kinds of types also can require much more than a simple list of
provided interfaces, they also typically rely to one degree or another
on the semantics of those interfaces, often including how the object's
observed behavior is affected by the SEQUENCE of calls. These types
of types are much harder to statically check. And they cause bugs in
either a statically or dynamically typed system. In my experience,
the kind of stupid bugs which are flushed out by a static type system
are a small percentage of the bugs which are caused by these semantic
mismatches.

The question, though, is whether it's possible to come up with a
semantically rich way of reflecting on the type of a Ruby object.
It's not a static-vs.-dynamic thing; static checking wouldn't enter
into it. It's more a question of having something that would do what
people think class-checking does before they realize that it doesn't,
if you see what I mean :) It may well not be possible.


David

--
Q. What is THE Ruby book for Rails developers?
A. RUBY FOR RAILS by David A. Black (http://www.manning.com/black)
(See what readers are saying! http://www.rubypal.com/r4rrevs.pdf)
Q. Where can I get Ruby/Rails on-site training, consulting, coaching?
A. Ruby Power and Light, LLC (http://www.rubypal.com)
 
Gary Wright

The question, though, is whether it's possible to come up with a
semantically rich way of reflecting on the type of a Ruby object.
It's not a static-vs.-dynamic thing; static checking wouldn't enter
into it. It's more a question of having something that would do what
people think class-checking does before they realize that it doesn't,
if you see what I mean :) It may well not be possible.

I think that a consistent and well-known naming protocol gets you
pretty far down the road. David, consider your example:

def m(x)
  x << "string"
end

The only reason that this is useful at all is that there are some
generally agreed upon semantics for methods named '<<'. It may be
that for some objects Mystery#append does the same thing but that
isn't as useful within the Ruby ecosystem as the exact same method
named '<<' instead.

The well-known but not-actually-enforced rules for '<<' aren't universal
of course. Sometimes '<<' means 'shift' rather than 'append'. But
that alternate view is useful within its own subset of the Ruby
ecosystem. The fact that the two views are mutually exclusive is OK
because a human is generally making sure that in any particular context
only one view or the other is being assumed.
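
The two readings of '<<' can be seen side by side in core Ruby. The
method name add is hypothetical; it just sends '<<' and leaves the
interpretation to the receiver.

```ruby
# The same message, interpreted by the receiver.
def add(x, y)
  x << y
end

p add([1, 2], 3)               # [1, 2, 3]  (append)
p add(String.new("ab"), "c")   # "abc"      (append)
p add(1, 3)                    # 8          (shift left by 3 bits)
```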

I guess my point, which certainly isn't original, is that a consistent
naming methodology can often be a substitute or proxy for semantics.
Instead of declaring semantics via some sort of machine readable formal
language we cheat and intuit semantics from the patterns of syntax
created by the naming methodology.

Gary Wright
 
Rick DeNatale

Hi --

On Sat, 3 Mar 2007, Rick DeNatale wrote:


I'm not sure how that detaches it from the object, though, since
variables are going to contain references to objects and messages are
sent to objects rather than variables. Do you mean in terms of
documentation?

First of all let me reiterate that this is my personal perspective,
based on 20+ years of using a variety of dynamic OO languages.

I'm trying to say that the code which uses objects in general sees
them through variables. This isn't always evident, and it's muddied
because so often the same code which uses a variable initializes it,
but in the case of parameters, or the results of expressions based on
parameters that code has less control over what the objects referenced
by variables really are.

One way to think about what I'm saying is to think of the programmer
like a playwright who writes a script in terms of dramatis personae or
roles, without necessarily knowing or caring which actors will
ultimately play those roles.

Many years ago I wrote a paper called "Types from the Client's
Viewpoint" which talked about this. A PDF of this is available on my
blog website at
http://talklikeaduck.denhaven2.com/files/TypesFromTheClientsViewpoint.PDF

I've been struggling for many years about how to describe this without
getting wrapped up in what is really a paradigm shift which confounds
discussion since terms like type have subtly different meanings. That
paper is an early one, but it is one such attempt. Another
glimmer of what I'm trying to say and its evolution can be seen in my
mini-memoir
http://talklikeaduck.denhaven2.com/articles/2006/07/29/about-me

Because of this paradigm shift problem, I sometimes use the term role
instead of type when talking about variables, in keeping with the
playwright metaphor.

One of the key aspects of the paradigm shift is that in a dynamically
typed system a strong wall of encapsulation can be erected between the
user of an object and the implementation of that object.
Traditionally types in programming languages are really names or
calculi for describing how a string of bits are to be interpreted by a
program using those bits. In a language like Ruby or Smalltalk, all
the bit interpretation gets done by the object itself, so the user of
the object doesn't or shouldn't care about the implementation of the
object. This is why those of us in the dynamic duck typing camp are
so averse to class checking, since the class of an object is an
implementation matter.
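
As one illustration of why a class check can be the wrong question,
consider ENV, which is not a Hash but answers #[] much like one. The
two method names and the DUCK_DEMO variable below are hypothetical.

```ruby
# Class-checking version: rejects anything that is not literally a Hash.
def lookup_strict(store, key)
  raise TypeError, "Hash required" unless store.kind_of?(Hash)
  store[key]
end

# Duck-typed version: just sends the message.
def lookup_duck(store, key)
  store[key]
end

ENV["DUCK_DEMO"] = "quack"          # hypothetical variable, set for the demo
p lookup_duck(ENV, "DUCK_DEMO")     # "quack"

begin
  lookup_strict(ENV, "DUCK_DEMO")
rescue TypeError => e
  puts "rejected: #{e.message}"     # rejected: Hash required
end
```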

This encapsulation wall at the interface between object and user is
what I consider the hallmark of object centered software design. This
is what Alan Kay seemed to have in mind when he coined the term
"Object Oriented Programming." Unfortunately, Peter Wegner came along
later and coopted the term to require classes and inheritance, both of
which might be useful implementation techniques but are non-essential
to my mind.

And the fact that classes and inheritance are implementation
techniques is another thing which makes those familiar with C++ like
languages have difficulties in getting duck typing. Since
traditionally types are a way of describing an implementation rather
than more abstract usage requirements, classes, inheritance, and the
concept of a strong type all get wrapped so tightly together that they
can't separate them.
I don't think duck typing has ever been about a checklist of common
data structures or types (queue, stack, etc.), but more about just
sending messages to objects without a lot of pre-message checking as
to their class and ancestry and so forth.

I wasn't saying that all of the roles fall into one of these
checklists, but that these and variations of them are common
roles, just like there are common roles like ingenue, jilted lover,
thug etc. Other roles are more inventive and complex, like the
ex-astronaut rancher building a missile in his barn, or an autistic
savant called rain man.
To me, the ultimate duck
typing idiom is:

def m(x)
  x << "string"
end

There's no checking, no probing the object -- you just ask it to do
something, and deal with what happens. In the end, that's *always*
how things are when writing Ruby. I've always seen "duck typing" as
just a kind of embrace of the conditions under which every single line
of Ruby code is written anyway.

Yes, and this is completely compatible with what I'm saying. Think
again about the play, the playwright writes the play, actors are cast,
and the production of the play is developed through rehearsal
(testing), debugging (re-writes) and sometimes re-casting. Later
productions can use different casts, possibly with additional
adaptation.
All these nouns sound more like classes than types. Actually I don't
think the type of a Ruby object can ever be named. It's circular: the
type of an object is... the type of objects that are of this object's
type :) As soon as it gets noun-like it starts getting class-bound
again.

Which nouns? hammer, wrench, rock, yes you can think of these as
classes (but more on this in a bit). But mass?

Now you are right that it's often hard to name a role, but never say
never. The name might stand for a complex type like Romeo, Othello,
or Tony Soprano, or it might be a noun phrase like "mass which can be
conveniently accelerated...".

The point though is that naming it isn't enough. Again in an analogy
with writing a play, the playwright's conception of the role doesn't
usually spring full-blown when the character's name gets written down,
but evolves as the play is written and the character interacts with
other characters in the play, and then further evolves as the director
begins to interpret the play.

Imagine Shakespeare writing Romeo and Juliet in C++, he'd have to
pre-declare all the players, and come up with strong types closely
bound to the implementation of the actors, before he wrote a line of
dialog.

Ay there's the rub!
 
Dean Wampler

...
If it walks like a duck and quacks like a duck, then it walks like
a duck and quacks like a duck.

:)

"... behaves like a duck", rather than "... is a duck", perhaps?
 
dblack

Hi --

Now you are right that it's often hard to name a role, but never say
never. The name might stand for a complex type like Romeo, Othello,
or Tony Soprano, or it might be a noun phrase like "mass which can be
conveniently accelerated...".

It's not that there aren't things-that-can-be-named, but that I don't
use the word "type" to refer to any of those things in Ruby. You can
say things like "This object is Enumerable", or "This object
implements what lots of people think is a hash-like interface". But
those aren't the object's type, though they may tell you part of the
picture.

In fact, I find the word "type" pretty useless in connection with
Ruby. It's possible to come up with some salient characteristic of
how Ruby objects are engineered, and call that "type", but I've never
found the word really useful. My characterization of it as circular
(an object is of type "the type of objects of this type") is really
another way of saying that it's not a very useful term. It doesn't,
to borrow from one of your analogies, hit any of the available nails
on the head :)

So I think we should really talk about interesting ideas like roles
and behaviors and encapsulation, and forget "type".
The point though is that naming it isn't enough. Again in an
analogy with writing a play, the playwright's conception of the role
doesn't usually spring full-blown when the character's name gets
written down, but evolves as the play is written and the character
interacts with other characters in the play, and then further
evolves as the director begins to interpret the play.

Imagine Shakespeare writing Romeo and Juliet in C++, he'd have to
pre-declare all the players, and come up with strong types closely
bound to the implementation of the actors, before he wrote a line of
dialog.

I don't think this is what's at stake in any of this discussion,
though. We're all hip to the dynamicness of Ruby :) No one's
talking about static typing. But I'd be happy to experiment with a
moratorium on the word "type".


David

 
Rick DeNatale

I don't think this is what's at stake in any of this discussion,
though. We're all hip to the dynamicness of Ruby :) No one's
talking about static typing. But I'd be happy to experiment with a
moratorium on the word "type".

Well many of us are in fact hip to it, but the fact that there are
attempts to formalize or mechanize 'duck typing' indicates that not all
who swim in the waters of this group are. And since ruby-talk gets
reflected to comp.lang.ruby, and many of the postings therefore get
cross-posted to various comp.lang.* groups, there are at least some
static typing advocates who read some of what transpires here.

As for avoiding the term type to describe how objects are used in a
dynamic language, I'm all for it.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

IPMS/USA Region 12 Coordinator
http://ipmsr12.denhaven2.com/

Visit the Project Mercury Wiki Site
http://www.mercuryspacecraft.com/
 
Robert Dober

Can serve as a duck.
Can be served as a duck!

But I read something more into Dean's joke, maybe duck typing shall
not be confused with introspection too much.

I will make it walk and talk and eat it as if it were a duck and let
us not worry about digestion right now.

Robert
 
Ed Howland

Otherwise, why restrict the user of your method this way?

The entire context is that of object construction and how best to
'parse' an argument list. I'm trying to focus on this very
specific context (which is closely associated with DSL designs)
and not the instance method situation where you've already got
a constructed object.

I'll go back to my simple example:

Array.new 3
Array.new [1,2]

Do you really think that Array.new(3) should result in
[1, 1], by viewing the number 3 as an array of bits? Or
alternatively should Array.new([1,2]) fail because it
doesn't make sense to allocate [1,2] entries?

Array.new must 'parse' its argument list and act
accordingly. The interface to Array could be:

Array.new( 3 )
Array.new_from_arrayish_obj( [1,2] )

so as to avoid the overloading problem but I don't
particularly think this is an improvement. Array#new
already uses a form of duck typing: [...]
Based on this thread discussion, it doesn't seem there is an
'obvious' answer for all (or most) situations.
There are a variety of tradeoffs between using #to_hash, trying #[] or
#fetch, testing for #has_key?, or even simply iterating with #each. All
of those seem marginally better than using #kind_of?(Hash).

Gary Wright

This may be OT, (or merely off-base,) but somewhere else on this list
I read about using mutators (not sure if that is the formal name or
not) on #new. This keeps the complexity out of initialize having to
parse each argument for its type.

E.g.

my_obj = MyClass.new # no arg ctor
my_obj = MyClass.new('John', 'Smith') # canon - takes strings
my_obj = MyClass.new.from_hash({:last_name => 'Smith', :first_name => 'John'})
my_obj = MyClass.new.from_arr(['John', 'Smith'])

class MyClass
  def from_hash(hash)
    @first = hash[:first_name]
    @last = hash[:last_name]
    self # return the object so the call can be chained onto new
  end
end

Anyway, I just think this is a reasonable way to extend the
initializer for other argument types. I'd use this when the % of times
I need to construct an object from another type is less than say 33
(JAROT).
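
A fuller sketch of this approach, filling in the pieces the snippet
leaves out. The default arguments to initialize, the attr_reader, and
the body of from_arr are assumptions added so that all four calls run.

```ruby
class MyClass
  attr_reader :first, :last

  # Assumed signature: both names optional so the no-arg form works.
  def initialize(first = nil, last = nil)
    @first, @last = first, last
  end

  def from_hash(hash)
    @first = hash[:first_name]
    @last = hash[:last_name]
    self
  end

  # Assumed behavior: the array holds first name, then last name.
  def from_arr(arr)
    @first, @last = arr
    self
  end
end

a = MyClass.new('John', 'Smith')
b = MyClass.new.from_hash({ :last_name => 'Smith', :first_name => 'John' })
c = MyClass.new.from_arr(['John', 'Smith'])
p([a, b, c].map { |o| [o.first, o.last] }.uniq)   # [["John", "Smith"]]
```

The parentheses around the hash literal matter: without them Ruby
parses the braces as a block passed to from_hash.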

Ed
 
Robert Dober

Otherwise, why restrict the user of your method this way?

The entire context is that of object construction and how best to
'parse' an argument list. I'm trying to focus on this very
specific context (which is closely associated with DSL designs)
and not the instance method situation where you've already got
a constructed object.

I'll go back to my simple example:

Array.new 3
Array.new [1,2]

Do you really think that Array.new(3) should result in
[1, 1], by viewing the number 3 as an array of bits? Or
alternatively should Array.new([1,2]) fail because it
doesn't make sense to allocate [1,2] entries?

Array.new must 'parse' its argument list and act
accordingly. The interface to Array could be:

Array.new( 3 )
Array.new_from_arrayish_obj( [1,2] )

so as to avoid the overloading problem but I don't
particularly think this is an improvement. Array#new
already uses a form of duck typing: [...]
Based on this thread discussion, it doesn't seem there is an
'obvious' answer for all (or most) situations.
There are a variety of tradeoffs between using #to_hash, trying #[] or
#fetch, testing for #has_key?, or even simply iterating with #each. All
of those seem marginally better than using #kind_of?(Hash).

Gary Wright

This may be OT, (or merely off-base,) but somewhere else on this list
I read about using mutators (not sure if that is the formal name or
not) on #new. This keeps the complexity out of initialize having to
parse each argument for its type.

E.g.

my_obj = MyClass.new # no arg ctor
my_obj = MyClass.new('John', 'Smith') # canon - takes strings
my_obj = MyClass.new.from_hash({:last_name => 'Smith', :first_name => 'John'})
my_obj = MyClass.new.from_arr(['John', 'Smith'])

No, that is not OT at all; it is one of the strengths of Ruby, IMHO.
Look e.g. at REXML::Document.new and many others.

Those interface generalizers shall be used on well defined entry
points only, I feel.

Cheers
Robert
 
dblack

Hi --

Well many of us are in fact hip to it, but the fact that there are
attempts to formalize or mechanize 'duck typing' indicates that not all
who swim in the waters of this group are. And since ruby-talk gets
reflected to comp.lang.ruby, and many of the postings therefore get
cross-posted to various comp.lang.* groups, there are at least some
static typing advocates who read some of what transpires here.

On the other hand, if a thread about types *isn't*, for once, a
"debate" about static "vs." dynamic typing, just enjoy it! :)


David

 
Gary Wright

This may be OT, (or merely off-base,) but somewhere else on this list
I read about using mutators (not sure if that is the formal name or
not) on #new. This keeps the complexity out of initialize having to
parse each argument for its type.

E.g.

my_obj = MyClass.new # no arg ctor
my_obj = MyClass.new('John', 'Smith') # canon - takes strings
my_obj = MyClass.new.from_hash({:last_name => 'Smith', :first_name => 'John'})
my_obj = MyClass.new.from_arr(['John', 'Smith'])

This certainly simplifies MyClass#initialize but Ruby is flexible enough
that you can create your own methods to allocate and initialize without
going through MyClass#initialize:

class MyClass
  class << self
    def new_from_hash(h)
      o = allocate
      o.send(:initialize_from_hash, h)
      o
    end
  end
  attr_accessor :pairs
  def initialize_from_hash(h)
    @pairs = h.to_a
  end
  private :initialize_from_hash
end

mc = MyClass.new_from_hash({1 => 2, 3 => 4})
p mc.pairs # [[1, 2], [3, 4]]


Gary Wright
 