empty classes as c structs?

  • Thread starter Christopher J. Bottaro

Steven Bethard

Alex said:
I think this ``view'' or however you call it should be a classmethod
too, for the same reason -- let someone handily subclass Bunch and still
get this creational pattern w/o extra work. Maybe a good factoring
could be something like:

class Bunch(object):

    def __init__(self, *a, **k):
        self.__dict__.update(*a, **k)

    def getDict(self):
        return self.__dict__

    def setDict(self, adict):
        self.__dict__ = adict

    theDict = property(getDict, setDict, None,
                       "direct access to the instance dictionary"
                       )

    @classmethod
    def wrapDict(cls, adict, *a, **k):
        result = cls.__new__(cls, *a, **k)
        result.setDict(adict)
        cls.__init__(result, *a, **k)
        return result

I'm thinking of use cases where a subclass of Bunch might override
setDict (to do something else in addition to Bunch.setDict, e.g.
maintain some auxiliary data structure for example) -- structuring
wrapDict as a classmethod in a ``Template Method'' DP might minimize the
amount of work, and the intrusiveness, needed for the purpose. (I don't
have a real-life use case for such a subclass, but it seems to cost but
little to provide for it as a possibility anyway).

Seems pretty reasonable -- the only thing I worry about is that
classmethods and other attributes (e.g. properties) that are accessible
from instances can lead to subtle bugs when a user accidentally
initializes a Bunch object with the attributes of the same name, e.g.:

b = Bunch(getDict=1)

where

b.getDict()

now fails with something like "TypeError: 'int' object is not callable".
(For another discussion about this problem, see [1]).
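For anyone who wants to reproduce the failure, here is a minimal Python 3 sketch. The Bunch below is a cut-down stand-in for the class quoted above (an assumption, not the PEP implementation). It shows the instance attribute shadowing the method, and that a call through the class still works.

```python
class Bunch:
    """Cut-down stand-in for the Bunch under discussion (an assumption)."""

    def __init__(self, *a, **k):
        self.__dict__.update(*a, **k)

    def getDict(self):
        return self.__dict__


b = Bunch(getDict=1)

# The instance attribute wins over the plain method defined on the class.
try:
    b.getDict()
    error = None
except TypeError as exc:
    error = str(exc)

# Looking the method up on the class bypasses the instance dictionary.
via_class = Bunch.getDict(b)
```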

I don't know what the right solution is here... I wonder if I should
write a classmethod-style descriptor that disallows the calling of a
function from an instance? Or maybe I should just document that the
classmethods should only be called from the class? Hmm...

How do you feel about getDict and setDict also being classmethods?


Steve

[1] http://mail.python.org/pipermail/python-dev/2005-January/051328.html
 

Alex Martelli

Steven Bethard said:
Seems pretty reasonable -- the only thing I worry about is that
classmethods and other attributes (e.g. properties) that are accessible
from instances can lead to subtle bugs when a user accidentally
initializes a Bunch object with the attributes of the same name, e.g.:

b = Bunch(getDict=1)

where

b.getDict()

now fails with something like "TypeError: 'int' object is not callable".

Well, that's the problem with confusing items and attributes in the
first place, of course -- which IS Bunch's purpose;-)

(For another discussion about this problem, see [1]).

I don't know what the right solution is here... I wonder if I should
write a classmethod-style descriptor that disallows the calling of a
function from an instance? Or maybe I should just document that the
classmethods should only be called from the class? Hmm...

Another approach is to add a few "reserved words" to the ones Python
itself would reject in the initialization. Just as you cannot do:

b = Bunch(continue=23)

you may choose to forbid using getDict=42 - if you do that you probably
want to forbid any magicname too, since e.g.

b = Bunch(__dict__=99)

can't work ``right'' no matter what, while setting e.g. __deepcopy__
similarly might confuse any copy.deepcopy(b), etc, etc.
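A rough Python 3 sketch of this "reserved words" idea follows. The class and the helper name _check_name are hypothetical, and the three rules (magic names, names already defined on the class, Python keywords) are just one reading of the suggestion above.

```python
import keyword


class Bunch:
    """Hypothetical Bunch that refuses names it cannot store safely."""

    def __init__(self, **kwds):
        for name in kwds:
            type(self)._check_name(name)
        self.__dict__.update(kwds)

    @classmethod
    def _check_name(cls, name):
        # Magic names such as __dict__ or __deepcopy__ are rejected outright.
        if name.startswith("__") and name.endswith("__"):
            raise TypeError("magic name not allowed: %r" % name)
        # Names that would shadow something defined on the class are rejected.
        if hasattr(cls, name):
            raise TypeError("name is reserved by %s: %r" % (cls.__name__, name))
        # Keywords can only arrive via **{...}; reject them for consistency.
        if keyword.iskeyword(name):
            raise TypeError("Python keyword not allowed: %r" % name)

    def getDict(self):
        return self.__dict__
```

With this in place Bunch(getDict=42) and Bunch(__dict__=99) both raise TypeError, while ordinary names are stored as before.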
How do you feel about getDict and setDict also being classmethods?

Uh? I want to get or set the dict of a specific instance -- not those
of the whole class. How would you design them as classmethods...?


Alex
 

Steven Bethard

Alex said:
Another approach is to add a few "reserved words" to the ones Python
itself would reject in the initialization. Just as you cannot do:

b = Bunch(continue=23)

you may choose to forbid using getDict=42 - if you do that you probably
want to forbid any magicname too, since e.g.

b = Bunch(__dict__=99)

can't work ``right'' no matter what, while setting e.g. __deepcopy__
similarly might confuse any copy.deepcopy(b), etc, etc.

Yeah, I thought about this. I certainly don't have any problem with
disallowing magic names. For other names, I'm less certain...

Uh? I want to get or set the dict of a specific instance -- not those
of the whole class. How would you design them as classmethods...?

The classmethod would have to be called with an instance, e.g.:

class Bunch(object):
    @classmethod
    def getDict(cls, self):
        return self.__dict__

    @classmethod
    def setDict(cls, self, dct):
        self.__dict__ = dct

Of course, these could be staticmethods instead since they don't need
the class, but the point is that you should be calling them like:

adict = Bunch.getDict(bunch)

or

Bunch.setDict(bunch, adict)

The complications with attribute hiding are one of the main reasons I've
tried to minimize the number of methods associated with Bunch...

Steve
 

Alex Martelli

Brian van den Broek said:
(I'm just a hobbyist, so if this suggestion clashes with some well
established use of 'Bag' in CS terminology, well, never mind.)

Yep: a Bag is a more common and neater name for a "multiset" -- a
set-like container which holds each item ``a certain number of times''
(you can alternatively see it as a mapping from items to counts of
number of times the item is held).


Alex
 

Fernando Perez

Steven said:
The complications with attribute hiding are one of the main reasons I've
tried to minimize the number of methods associated with Bunch...

In order for bunches to be fully useful in general, open contexts, I think that
number of methods should be exactly zero (at least without leading
underscores). Otherwise, as soon as someone loads a config file into a bunch
and they assign randomname_which_you'd_used, the whole thing breaks.

I see two ways to implement additional (needed) functionality into bunches:

1. The class/staticmethod approach. This has clean syntax, though it may make
inheritance issues tricky, I'm not too sure.

2. To break with pythonic convention a bit, and make the public API of Bunch
consist of methods which all start with a leading _. You can even (via
__setattr__ or metaclass tricks) block assignment to these, and state up front
that Bunches are meant to hold public data only in attributes starting with a
letter. I think that would be a reasonable compromise, allowing you to do:

b = Bunch()
b.update = 'yes'
b._update(somedict)
b.copy = 'no'
c = b._copy()

# BUT:
c._update = 'no' ## an exception is raised

etc.
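A minimal Python 3 sketch of option 2, for concreteness. The method names _update and _copy come from the example above; everything else (routing all assignments through __setattr__, raising AttributeError) is an assumption about how it might be done.

```python
class Bunch:
    """Hypothetical Bunch whose whole public API starts with an underscore."""

    def __init__(self, **kwds):
        for name, value in kwds.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        # Data attributes must start with a letter; underscore names are API.
        if name.startswith("_"):
            raise AttributeError("cannot assign to reserved name %r" % name)
        object.__setattr__(self, name, value)

    def _update(self, other=(), **kwds):
        # Route every item through __setattr__ so the name check applies.
        for name, value in dict(other, **kwds).items():
            setattr(self, name, value)

    def _copy(self):
        return type(self)(**self.__dict__)


b = Bunch()
b.update = "yes"          # plain data, no clash with _update
b._update({"x": 1})
b.copy = "no"
c = b._copy()
```

Assigning c._update = 'no' now raises AttributeError, as in the example.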

It's not very pretty, and it does break with pythonic convention. But a Bunch
class which simultaneously provides certain non-trivial functionality
(otherwise the usual 'class Bunch:pass' idiom would be enough), while allowing
users to store arbitrarily named attributes in it, is inevitably going to need
to play namespace tricks somewhere.

FWIW, I personally could live with #2 as an acceptable compromise.

Cheers,

f
 

Michael Spencer

Alex said:
Hmm... interesting. This isn't the main intended use of
Bunch/Struct/whatever, but it does seem like a useful thing to have...
I wonder if it would be worth having, say, a staticmethod of Bunch that
produced such a view, e.g.:

class Bunch(object):
    ...
    @staticmethod
    def view(data):
        result = Bunch()
        result.__dict__ = data
        return result

Then you could write your code as something like:

gbls = Bunch.view(globals())

I'm probably gonna need more feedback from people though to know
if this is a commonly desired use case...


Reasonably so, is my guess. Witness the dict.fromkeys classmethod -- it
gives you, on dict creation, the same kind of nice syntax sugar that
wrapping a dict in a bunch gives you for further getting and setting
(and with similar constraints: all keys must be identifiers and not
happen to clash with reserved words).

I think this ``view'' or however you call it should be a classmethod
too, for the same reason -- let someone handily subclass Bunch and still
get this creational pattern w/o extra work. Maybe a good factoring
could be something like:

class Bunch(object):

    def __init__(self, *a, **k):
        self.__dict__.update(*a, **k)

    def getDict(self):
        return self.__dict__

    def setDict(self, adict):
        self.__dict__ = adict

    theDict = property(getDict, setDict, None,
                       "direct access to the instance dictionary"
                       )

    @classmethod
    def wrapDict(cls, adict, *a, **k):
        result = cls.__new__(cls, *a, **k)
        result.setDict(adict)
        cls.__init__(result, *a, **k)
        return result

I'm thinking of use cases where a subclass of Bunch might override
setDict (to do something else in addition to Bunch.setDict, e.g.
maintain some auxiliary data structure for example) -- structuring
wrapDict as a classmethod in a ``Template Method'' DP might minimize the
amount of work, and the intrusiveness, needed for the purpose. (I don't
have a real-life use case for such a subclass, but it seems to cost but
little to provide for it as a possibility anyway).

[[given the way property works, one would need extra indirectness in
getDict and setDict -- structuring THEM as Template Methods, too -- to
fully support such a subclass; but that's a well-known general issue
with property, and the cost of the extra indirection -- mostly in terms
of complication -- should probably not be borne here, it seems to me]]
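The property issue in the bracketed note can be seen in a small Python 3 sketch. The names direct, indirect and TrackingBunch are hypothetical, chosen only to contrast a property bound to the base-class functions with one that looks the methods up at call time.

```python
class Bunch:
    def __init__(self, *a, **k):
        self.__dict__.update(*a, **k)

    def getDict(self):
        return self.__dict__

    def setDict(self, adict):
        self.__dict__ = adict

    # Bound to Bunch.setDict once, when the class body runs.
    direct = property(getDict, setDict)

    # Looks setDict up on the instance at call time, so overrides are honoured.
    indirect = property(lambda self: self.getDict(),
                        lambda self, adict: self.setDict(adict))


class TrackingBunch(Bunch):
    """Hypothetical subclass that records every dict it is given."""
    seen = []

    def setDict(self, adict):
        TrackingBunch.seen.append(adict)
        Bunch.setDict(self, adict)


b = TrackingBunch()
b.direct = {"a": 1}      # goes straight to Bunch.setDict, override skipped
b.indirect = {"b": 2}    # reaches TrackingBunch.setDict
```

Only the second assignment is recorded in TrackingBunch.seen, which is the extra indirection the note refers to.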


Alex

Steven et al

I like the idea of making the 'bunch' concept a little more standard.
I also like the suggestion Nick Coghlan cited (not sure who suggested the term
in this context) of calling this 'namespace' in part because it can lead to
easily-explained semantics.

ISTM that 'bunch' or 'namespace' is in effect the complement of vars i.e., while
vars(object) => object.__dict__, namespace(somedict) gives an object whose
__dict__ is somedict.

Looked at this way, namespace (or bunch) is a minimal implementation of an
object that implements the hasattr(object, '__dict__') protocol. The effect of the
class is to make operations on __dict__ simpler. namespace instances can be
compared with any other object that has a __dict__. This differs from the PEP
reference implementation which compares only with other bunch instances. In
practice, comparisons with module and class may be useful.

The class interface implements the protocol and little else.

For 'bunch' applications, namespace can be initialized or updated with keyword
args (just like a dict).

For dict-wrapping applications, unlike the PEP implementation, this sets
wrappeddict.__dict__ = bigdict
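The two uses might look like this in Python 3. This is a stripped-down stand-in written for illustration, not the implementation posted below; the parameter name somedict and the variable point are assumptions.

```python
class namespace:
    """Minimal stand-in: wraps a dict by reference, or starts from keywords."""

    def __init__(self, somedict=None, **kwds):
        if somedict is not None:
            self.__dict__ = somedict   # shares the dict, no copy
        self.__dict__.update(kwds)


# 'bunch' style: build from keyword arguments.
point = namespace(x=1, y=2)

# Dict-wrapping style: attribute writes show up in the wrapped dict.
bigdict = {"a": 1}
wrappeddict = namespace(bigdict)
wrappeddict.b = 2
```

Here vars(wrappeddict) is bigdict itself, which is the "complement of vars" relationship described above.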

I think that this interface allows for either use case, without introducing
'fromdict' classmethod.

Some dict-operations e.g., __copy__ might carry over to the namespace class

Michael


An implementation follows:


# An alternative implementation of Steven Bethard's PEP XXX 'bunch' with
# slightly different interface and semantics:

class namespace(object):
    """
    namespace(somedict) => object (with object.__dict__ = somedict)
    NB, complement of vars: vars(object) => object.__dict__

    namespace objects provide attribute access to their __dict__

    In general, operations on namespace equate to the operations
    on namespace.__dict__

    """

    def __init__(self, E = None, **F):
        """__init__([namespace|dict], **kwds) -> None"""
        if isinstance(E, dict):
            self.__dict__ = E
        elif hasattr(E, "__dict__"):
            self.__dict__ = E.__dict__
        self.__dict__.update(**F)

    # define only magic methods to limit pollution
    def __update__(self, E = None, **F):
        """update([namespace|dict], **kwds) -> None
        equivalent to self.__dict__.update
        with the addition of namespace as an acceptable operand"""
        if hasattr(E, "keys"):
            self.__dict__.update(E)
        elif hasattr(E, "__dict__"):
            self.__dict__.update(E.__dict__)
        self.__dict__.update(**F)

    def __repr__(self):
        return "namespace(%s)" % repr(self.__dict__)

    # Possible additional methods: (All are conveniences for dict operations
    # An operation on namespace translates operation on namespace.__dict__
    # So A op B => A.__dict__ op B.__dict__)

    def __copy__(self):
        return namespace(self.__dict__.copy())

    def __eq__(self, other):
        return self.__dict__ == other.__dict__

    def __contains__(self, key):
        return self.__dict__.__contains__(key)

    def __iter__(self):
        return iter(self.__dict__)
    # etc...
 

Steven Bethard

Michael said:
ISTM that 'bunch' or 'namespace' is in effect the complement of vars
i.e., while vars(object) => object.__dict__, namespace(somedict) gives
an object whose __dict__ is somedict.

Yeah, I kinda liked this application too, and I think the symmetry would
be nice.

Looked at this way, namespace (or bunch) is a minimal implementation of
an object that implements the hasattr(object, '__dict__') protocol. The
effect of the class is to make operations on __dict__ simpler.
namespace instances can be compared with any other object that has a
__dict__. This differs from the PEP reference implementation which
compares only with other bunch instances.

Yeah, I wanted to support this, but I couldn't decide how to arbitrate
things in update -- if a dict has a __dict__ attribute, do I update the
Namespace object with the dict or the __dict__? That is, what should I
do in the following case:

py> class xdict(dict):
...     pass
...
py> d = xdict(a=1, b=2)
py> d.x = 1
py> d
{'a': 1, 'b': 2}
py> d.__dict__
{'x': 1}
py> Namespace(d)

The dict d has both the items of a dict and the attributes of a
__dict__. Which one gets assigned to the __dict__ of the Namespace? Do
I do:

self.__dict__ = d

or do I do:

self.__dict__ = d.__dict__

It was because these seem like two separate cases that I wanted two
different functions for them (__init__ and, say, dictview)...

Steve
 

Nick Coghlan

Steven said:
It was because these seem like two separate cases that I wanted two
different functions for them (__init__ and, say, dictview)...

The other issue is that a namespace *is* a mutable object, so the default
behaviour should be to make a copy (yeah, I know, I'm contradicting myself - I
only just thought of this issue). So an alternate constructor is definitely the
way to go.
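The difference between the two behaviours, as a short Python 3 sketch. The class is a hypothetical reduction of the idea: a copying constructor plus a view classmethod, with the name view borrowed from the code below.

```python
class Namespace:
    """Hypothetical sketch: the constructor copies, view() shares."""

    def __init__(self, orig=None, **kwds):
        if orig is not None:
            self.__dict__.update(orig)   # copy the items
        self.__dict__.update(kwds)

    @classmethod
    def view(cls, orig):
        new = cls()
        new.__dict__ = orig              # share the very same dict
        return new


settings = {"debug": False}

copied = Namespace(settings)
copied.debug = True          # does not touch settings

shared = Namespace.view(settings)
shared.verbose = True        # writes through to settings
```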

I think Michael's implementation also fell into a trap whereby 'E' couldn't be
used as an attribute name. The version below tries to avoid this (using
magic-style naming for the other args in the methods which accept keyword
dictionaries).
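The trap is easy to reproduce: with a parameter called E, a keyword argument E=... binds to that parameter instead of becoming an attribute. Python 3.8 and later have positional-only parameters, which did not exist when this was written; the sketch below uses them as one possible alternative to magic-style names. Both class names are hypothetical.

```python
class Trapped:
    def __init__(self, E=None, **F):
        if E is not None:
            self.__dict__.update(E)
        self.__dict__.update(F)


class Free:
    # The "/" marks self and orig as positional-only (Python 3.8+),
    # so the same names remain available as keywords in **kwds.
    def __init__(self, orig=None, /, **kwds):
        if orig is not None:
            self.__dict__.update(orig)
        self.__dict__.update(kwds)


trapped = Trapped(E={"a": 1})          # E is taken as the mapping argument
free = Free(orig={"a": 1}, self=2)     # both land in **kwds as attributes
```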

To limit the special casing in update, I've switched to only using __dict__ for
the specific case of instances of namespace (otherwise the semantics are too
hard to explain). This is to allow easy copying of an existing namespace - for
anything else, invoking vars() is easy enough.

And I was reading Carlos's page on MetaTemplate, so I threw in an extra class
"record" which inherits from namespace and allows complex data structures to be
defined via class syntax (see the record.__init__ docstring for details). That
bit's entirely optional, but I thought it was neat.

Finally, I've just used normal names for the functions. I think the issue of
function shadowing is best handled by recommending that all of the functions be
called using the class explicitly - this works just as well for instance methods
as it does for class or static methods.

Cheers,
Nick.

+++++++++++++++++++++++++++++++++++++++++++++

from types import ClassType

class namespace(object):
    """
    namespace([namespace|dict]) => object

    namespace objects provide attribute access to their __dict__
    Complement of vars: vars(object) => object.__dict__

    Non-magic methods should generally be invoked via the class to
    avoid inadvertent shadowing by instance attributes

    Using attribute names that look like magic attributes is not
    prohibited but can lead to surprising behaviour.

    In general, operations on namespace equate to the operations
    on namespace.__dict__
    """

    def __init__(__self__, __orig__ = None, **__kwds__):
        """__init__([namespace|dict], **kwds) -> None"""
        type(__self__).update(__self__, __orig__, **__kwds__)

    @classmethod
    def view(cls, orig):
        """namespace.view(dict) -> namespace

        Creates a namespace that is a view of the original
        dictionary. Allows modification of an existing
        dictionary via namespace syntax"""
        new = cls()
        new.__dict__ = orig
        return new

    def __repr__(self):
        return "%s(%s)" % (self.__class__.__name__, repr(self.__dict__))

    # Recommend calling non-magic methods via class form to
    # avoid problems with inadvertent attribute shadowing
    def _checked_update(self, other):
        try:
            self.__dict__.update(other)
        except (TypeError):
            raise TypeError("Namespace update requires mapping "
                            "keyed with valid Python identifiers")

    def update(__self__, __other__ = None, **__kwds__):
        """type(self).update(self, [namespace|dict], **kwds) -> None
        equivalent to self.__dict__.update"""
        # Handle direct updates
        if __other__ is not None:
            if isinstance(__other__, namespace):
                type(__self__)._checked_update(__self__, __other__.__dict__)
            else:
                type(__self__)._checked_update(__self__, __other__)
        # Handle keyword updates
        if __kwds__ is not None:
            type(__self__)._checked_update(__self__, __kwds__)


class record(namespace):
    def __init__(self, definition=None):
        """record([definition]) -> record

        Constructs a namespace based on the given class definition
        Nested classes are created as sub-records
        Fields starting with an underscore are ignored
        If definition is not given, uses current class
        This is handy with subclasses
        Using subclasses this way has the advantage that the
        created records are also instances of the subclass.

        For example:
          Py> from ns import namespace, record
          Py> class Record:
          ...     a = 1
          ...     b = ""
          ...     class SubRecord:
          ...         c = 3
          ...
          Py> x = record(Record)
          Py> x
          record({'a': 1, 'b': '', 'SubRecord': record({'c': 3})})
          Py> class Record2(record):
          ...     a = 1
          ...     b = ""
          ...     class SubRecord2(record):
          ...         c = 3
          ...
          Py> x = Record2()
          Py> x
          Record2({'a': 1, 'b': '', 'SubRecord2': SubRecord2({'c': 3})})
        """
        cls = type(self)
        if definition is None:
            definition = cls
        cls.update_from_definition(self, definition)

    def update_from_definition(self, definition):
        """type(self).update_from_definition(self, definition) -> None

        Updates the namespace based on the given class definition
        Nested classes are created as sub-records
        Fields starting with an underscore are ignored"""
        try:
            for field, default in definition.__dict__.iteritems():
                if field.startswith("_"):
                    continue
                if (isinstance(default, (type, ClassType))
                        and issubclass(default, record)):
                    # It's a record, so make it an instance of itself
                    self.__dict__[field] = default()
                else:
                    try:
                        # If we can make a record of it, do so
                        self.__dict__[field] = record(default)
                    except TypeError:
                        # Throw it in a standard field
                        self.__dict__[field] = default
        except AttributeError:
            raise TypeError("Namespace definition must have __dict__ attribute")
 

Brian van den Broek

Alex Martelli said unto the world upon 2005-02-06 18:06:
Yep: a Bag is a more common and neater name for a "multiset" -- a
set-like container which holds each item ``a certain number of times''
(you can alternatively see it as a mapping from items to counts of
number of times the item is held).


Alex

Thanks for that info. I've yet to check the link to the Bag cookbook
recipe that Carlos Ribeiro posted for me
<http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/259174>, but
now I definitely will! (When I first learned of Python's set type, I
was disappointed there wasn't also a multi-set type.)

Thanks and best,

Brian vdB
 

Steven Bethard

Nick said:
Finally, I've just used normal names for the functions. I think the
issue of function shadowing is best handled by recommending that all of
the functions be called using the class explicitly - this works just as
well for instance methods as it does for class or static methods.

I wonder if it would be worth adding a descriptor that gives a warning
for usage from instances, e.g.:

py> import new
py> import warnings
py> class InstanceWarner(object):
...     def __init__(self, func):
...         self.func = func
...     def __get__(self, obj, type=None):
...         if obj is None:
...             return self.func
...         else:
...             warnings.warn('methods of this type should not be '
...                           'invoked from instances')
...             return new.instancemethod(self.func, obj, type)
...
py> class Bunch(object):
...     @InstanceWarner
...     def update(self):
...         print 'updating', self
...
py> Bunch.update(Bunch())
updating <__main__.Bunch object at 0x01152830>
py> Bunch().update()
__main__:8: UserWarning: methods of this type should not be invoked from
instances
updating <__main__.Bunch object at 0x011527D0>
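The session above is Python 2; the new module it imports was removed in Python 3. A rough Python 3 equivalent, assuming types.MethodType as the replacement for new.instancemethod and giving update a small body so there is something to check:

```python
import types
import warnings


class InstanceWarner:
    """Descriptor that warns when the wrapped function is reached via an instance."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class: hand back the plain function.
            return self.func
        warnings.warn("methods of this type should not be invoked from instances",
                      stacklevel=2)
        # types.MethodType stands in for the removed new.instancemethod.
        return types.MethodType(self.func, obj)


class Bunch:
    @InstanceWarner
    def update(self, **kwds):
        self.__dict__.update(kwds)
```

Because InstanceWarner defines no __set__, an instance attribute named update would still shadow it; the descriptor only helps while the name is not shadowed.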

Steve
 

Michael Spencer

I see this, but I think it weakens the case for a single implementation, given
that each implementation is essentially one method.

The other issue is that a namespace *is* a mutable object, so the
default behaviour should be to make a copy

I don't follow this argument. Why does mutability demand copy? Given that
somedict here is either a throwaway (in the classic bunch application ) or a
dict that must be updated (the wrap-dict case), copying doesn't make much sense
to me.

OTOH, dict.__init__(dict) copies. hmmmm....

I think Michael's implementation also fell into a trap whereby 'E'
couldn't be used as an attribute name. The version below tries to avoid
this (using magic-style naming for the other args in the methods which
accept keyword dictionaries).

You're right - I hadn't considered that. In case it wasn't obvious, I was
matching the argspec of dict. Your solution avoids the problem.

To limit the special casing in update, I've switched to only using
__dict__ for the specific case of instances of namespace

That seems a pity to me.

(otherwise the
semantics are too hard to explain). This is to allow easy copying of an
existing namespace -

Can't this be spelled namespace(somenamespace).__copy__()?

> for anything else, invoking vars() is easy enough.

If there is potential for confusion, I'd be tempted to disallow namespaces as an
argument to update/__update__

We could use __add__, instead for combining namespaces

And I was reading Carlos's page on MetaTemplate, so I threw in an extra
class "record" which inherits from namespace and allows complex data
structures to be defined via class syntax (see the record.__init__
docstring for details). That bit's entirely optional, but I thought it
was neat.

Good idea. The implementation ought to be tested against several plausible
specializations.

Finally, I've just used normal names for the functions. I think the
issue of function shadowing is best handled by recommending that all of
the functions be called using the class explicitly - this works just as
well for instance methods as it does for class or static methods.

I don't like the sound of that. The whole point here - whether as Steven's nice
straightforward bunch, as originally conceived, or the other cases you and I and
others have been 'cluttering' the discussion with ;-) is convenience, and
readability. If there are hoops to jump through to use it, then the benefit is
quickly reduced to zero.

Regards

Michael
 

Steven Bethard

Michael said:
I see this, but I think it weakens the case for a single implementation,
given that each implementation is essentially one method.

Do you mean there should be a separate Namespace and Bunch class? Or do
you mean that an implementation with only a single method is less useful?

If the former, then you either have to repeat the methods __repr__,
__eq__ and update for both Namespace and Bunch, or one of Namespace and
Bunch can't be __repr__'d, __eq__'d or updated.

If the latter (setting aside the fact that the implementation provides 4
methods, not 1), I would argue that even if an implementation is only
one method, if enough users are currently writing their own version,
adding such an implementation to the stdlib is still a net benefit.

You're right - I hadn't considered that. In case it wasn't obvious, I
was matching the argspec of dict. Your solution avoids the problem.

Another way to avoid the problem is to use *args, like the current Bunch
implementation does:

def update(*args, **kwargs):
    """bunch.update([bunch|dict|seq,] **kwargs) -> None

    Updates the Bunch object's attributes from (if
    provided) either another Bunch object's attributes, a
    dictionary, or a sequence of (name, value) pairs, then from
    the name=value pairs in the keyword argument list.
    """
    if not 1 <= len(args) <= 2:
        raise TypeError('expected 1 or 2 arguments, got %i' %
                        len(args))
    self = args[0]
    if len(args) == 2:
        other = args[1]
        if isinstance(other, Bunch):
            other = other.__dict__
        try:
            self.__dict__.update(other)
        except (TypeError, ValueError):
            raise TypeError('cannot update Bunch with %s' %
                            type(other).__name__)
    self.__dict__.update(kwargs)

This even allows you to use the keywords __self__ and __orig__ if you're
sick enough to want to. It's slightly more work, but I prefer it
because it's more general.
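To see the generality in action, here is a trimmed Python 3 version of the same update placed inside a throwaway Bunch class (the error wrapping is left out for brevity). The keyword names in the call are picked only to show that none of them is reserved.

```python
class Bunch:
    def update(*args, **kwargs):
        # self is unpacked by hand, so no parameter name is reserved.
        if not 1 <= len(args) <= 2:
            raise TypeError("expected 1 or 2 arguments, got %i" % len(args))
        self = args[0]
        if len(args) == 2:
            other = args[1]
            if isinstance(other, Bunch):
                other = other.__dict__
            self.__dict__.update(other)
        self.__dict__.update(kwargs)


b = Bunch()
Bunch.update(b, {"a": 1}, self=2, other=3, __orig__=4)
```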
That seems a pity to me.

Is it that much worse to require the following code:

Namespace.update(namespace, obj.__dict__)

or:

namespace.update(obj.__dict__)

if you really want to update a Namespace object with the attributes of a
non-Namespace object?

For that matter, do you have a use-case for where this would be useful?
I definitely see the view-of-a-dict example, but I don't see the
view-of-an-object example since an object already has dotted-attribute
style access...

Can't this be spelled namespace(somenamespace).__copy__()?

I'd prefer to be consistent with dict, list, set, deque, etc. all of
which use their constructor for copying.

We could use __add__, instead for combining namespaces

I don't think this is a good idea. For the same reasons that dicts
don't have an __add__ (how should attributes with different values be
combined?), I don't think Bunch/Namespace should have an __add__.

Steve
 

Carlos Ribeiro

I don't think this is a good idea. For the same reasons that dicts
don't have an __add__ (how should attributes with different values be
combined?), I don't think Bunch/Namespace should have an __add__.

For entirely unrelated reasons I did it for a bunch-like class of
mine, and called it 'merge'. For this particular application it was a
better name than update and append, but that's IMHO.

--
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
 

Steven Bethard

Carlos said:
For entirely unrelated reasons I did it for a bunch-like class of
mine, and called it 'merge'. For this particular application it was a
better name than update and append, but that's IMHO.

Did 'merge' have the same semantics as the 'update' being discussed?
That is, did it modify the first 'bunch'? Or did it create a new
'bunch'? To me, 'merge' sounds more like the second...

Steve
 

Carlos Ribeiro

Did 'merge' have the same semantics as the 'update' being discussed?
That is, did it modify the first 'bunch'? Or did it create a new
'bunch'? To me, 'merge' sounds more like the second...

In my particular example it was more like the second, but it doesn't
apply exactly because the goal was a little bit different; I
implemented it to merge two configuration dictionaries, one being the
'base' (with default values) and the other one with values to
override the base values. Anyway, it was just a suggestion; and while
I don't think that merge really implies one behavior over the other,
having it as a constructor does make sense...

 

Michael Spencer

Steven said:
Do you mean there should be a separate Namespace and Bunch class? Or do
you mean that an implementation with only a single method is less useful?

The former.

If the former, then you either have to repeat the methods __repr__,
__eq__ and update for both Namespace and Bunch, or one of Namespace and
Bunch can't be __repr__'d, __eq__'d or updated.

I see no problem in repeating the methods, or inheriting the implementation.
However, if namespace and bunch are actually different concepts (one with
reference semantics, the other with copy), then __repr__ at least would need to
be specialized, to highlight the difference.

So, on balance, if copy semantics are important to bunch uses, and references
for namespace (though Nick changed his mind on this, and I don't yet know why) I
think they would be better as two small implementations. I remain unsure about
why you need or want copying, aside from matching the behavior of the builtins.

If the latter (setting aside the fact that the implementation provides 4
methods, not 1), I would argue that even if an implementation is only
one method, if enough users are currently writing their own version,
adding such an implementation to the stdlib is still a net benefit.

Yes, I agree with this: I was not picking on the class size ;-)

....
Another way to avoid the problem is to use *args, like the current Bunch
implementation does:

def update(*args, **kwargs):
"""bunch.update([bunch|dict|seq,] **kwargs) -> None

Sure - nice trick to avoid shadowing self too
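
A minimal sketch of the *args trick, assuming a stripped-down Bunch (this is not the full implementation from the PEP draft, and the bodies below are my own guess at the behaviour). Because no parameter is named `self`, a keyword argument called `self` cannot collide with the instance:

```python
class Bunch(object):
    def __init__(*args, **kwargs):
        # The instance arrives as args[0]; the rest are positional sources.
        self, args = args[0], args[1:]
        for other in args:
            self.__dict__.update(other)
        self.__dict__.update(kwargs)

    def update(*args, **kwargs):
        self, args = args[0], args[1:]
        for other in args:
            # Accept another Bunch, a dict, or a sequence of pairs.
            if isinstance(other, Bunch):
                other = other.__dict__
            self.__dict__.update(other)
        self.__dict__.update(kwargs)


b = Bunch(self=1, other=2)      # 'self' is just another attribute name here
b.update({"x": 3}, self=4)
```

With an ordinary `def update(self, other, **kwargs)` signature, the `self=4` call would raise a TypeError for multiple values of `self`.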

....
Is it that much worse to require the following code:

Namespace.update(namespace, obj.__dict__)
or:
namespace.update(obj.__dict__)

if you really want to update a Namespace object with the attributes of a
non-Namespace object?

No problem at all - just a question of what the class is optimized for, and
making the interface as convenient as possible, given the use case. I agree
that for straight attribute access to a dictionary, your update interface is
clearly superior.
For that matter, do you have a use-case for where this would be useful?
I definitely see the view-of-a-dict example, but I don't see the
view-of-an-object example since an object already has dotted-attribute
style access...


Yes, I have various cases in mind relating to argument-passing, dispatching,
interface-checking and class composition. Here the class becomes useful if it
grows some namespace-specific semantics.

For example, I could write something like:
namespace(obj1) >= namespace(obj2) to mean obj1 has at least the attributes of obj2

implemented like:

def __ge__(self, other):
    for attrname in other.__dict__.keys():
        if attrname not in self.__dict__:
            return False
    return True

I realize that interfaces may be addressed formally by a current PEP, but, even
if they are, this "cheap and cheerful" approach appeals to me for duck-typing.
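
For anyone who wants to try the comparison, here is a self-contained sketch. The `Namespace` wrapper below is hypothetical (it simply shares the wrapped object's `__dict__`), and `Duck` and `Named` are made-up example classes. Only instance attributes are compared, since `vars()` does not include methods defined on the class.

```python
class Namespace(object):
    """Hypothetical view of an object's instance attributes."""

    def __init__(self, obj):
        # Reference semantics: share the wrapped object's attribute dict.
        self.__dict__ = vars(obj)

    def __ge__(self, other):
        # True when every attribute name of other is also present on self.
        return all(name in self.__dict__ for name in other.__dict__)


class Duck(object):
    def __init__(self):
        self.name = "duck"
        self.wings = 2


class Named(object):
    def __init__(self):
        self.name = "anything"


duck_covers_named = Namespace(Duck()) >= Namespace(Named())
named_covers_duck = Namespace(Named()) >= Namespace(Duck())
```

Here `duck_covers_named` is True and `named_covers_duck` is False, because `Named` instances lack `wings`.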


However, as I think more about this, I realize that I am stretching your concept
past its breaking point, and that whatever the merits of this approach, it's not
helping you with bunch. Thanks for knocking the ideas around with me.

Cheers

Michael
 
S

Steven Bethard

Michael said:
I see no problem in repeating the methods, or inheriting the
implementation. However, if namespace and bunch are actually different
concepts (one with reference semantics, the other with copy), then
__repr__ at least would need to be specialized, to highlight the
difference.

Yeah, I could certainly see them being separate... Of course, someone
else will have to write the PEP for Namespace then. ;)

def __ge__(self, other):
    for attrname in other.__dict__.keys():
        if attrname not in self.__dict__:
            return False
    return True

I realize that interfaces may be addressed formally by a current PEP,
but, even if they are, this "cheap and cheerful" approach appeals to me
for duck-typing.

However, as I think more about this, I realize that I am stretching your
concept past its breaking point, and that whatever the merits of this
approach, it's not helping you with bunch. Thanks for knocking the
ideas around with me.

My pleasure. It's good to talk some use-cases, and make sure I cover as
much as is feasible in the PEP. I think __ge__ is probably too far out
from the original intentions, but I'll make sure to write
Bunch/Namespace to be as inheritance-friendly as possible so that adding
such behavior by inheriting from Bunch/Namespace should be simple.

Thanks for all your comments!

Steve
 
N

Nick Coghlan

Steven said:
I wonder if it would be worth adding a descriptor that gives a warning
for usage from instances, e.g.:

Thinking about it some more, I realised that a class method approach means that
'type(self).method(self,...)' still works as a way to spell the call in a
polymorphism friendly way.

And if we're going to have to spell the call that way *anyway*. . .

So maybe it does make sense to simply say that all non-magic Bunch/namespace
operations are implemented as class methods (and most magic methods are
effectively treated as class methods when it comes to looking them up).
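
A rough sketch of why the `type(self).method(self, ...)` spelling helps. It assumes a simplified Bunch with an ordinary `update` method called through the class, which is my reading of the approach rather than the final design; `LoggingBunch` is a hypothetical subclass.

```python
class Bunch(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def update(self, other):
        self.__dict__.update(vars(other))


class LoggingBunch(Bunch):
    """Hypothetical subclass that counts update calls."""
    calls = 0

    def update(self, other):
        LoggingBunch.calls += 1
        Bunch.update(self, other)


b = LoggingBunch(update=1)       # instance attribute shadows the method

shadowed = False
try:
    b.update(Bunch(x=2))         # finds the int stored on the instance
except TypeError:
    shadowed = True

# Lookup on the class skips the instance dict and still honours the
# subclass override.
type(b).update(b, Bunch(x=2))
```

Spelling the call as `Bunch.update(b, ...)` would also dodge the shadowing, but it would bypass the `LoggingBunch` override.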

Cheers,
Nick.
 
N

Nick Coghlan

Michael said:
I don't follow this argument. Why does mutability demand copy? Given
that somedict here is either a throwaway (in the classic bunch
application ) or a dict that must be updated (the wrap-dict case),
copying doesn't make much sense to me.

OTOH, dict.__init__(dict) copies. hmmmm....

As you noticed, it's a precedent from the other builtins and objects in the
standard library. The mutable ones (dict, list, set, etc) all make a copy of
whatever you pass in.
Can't this be spelled namespace(somenamespace).__copy__()?

Again, as Steven mentioned, this is based on precedent from other objects in the
interpreter. To get your own copy of a mutable type, you invoke the constructor
with the original as the sole argument.
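
The builtin precedent looks like this in practice. Note the copies are shallow: the container is new, the contained objects are shared.

```python
original = {"a": [1, 2]}
numbers = [1, 2, 3]
tags = {"x", "y"}

copy_of_dict = dict(original)   # new dict holding the same values
copy_of_list = list(numbers)
copy_of_set = set(tags)

copy_of_dict["b"] = 0           # does not touch original
copy_of_list.append(4)          # does not touch numbers
```
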
If there is potential for confusion, I'd be tempted to disallow
namespaces as an argument to update/__update__

Limiting confusion is why I decided to restrict the special-case of querying the
__dict__ to instances of namespace. Again, as Steven pointed out, the semantics
get screwed up when the object supplied is usable as a dictionary, but also has
a __dict__ attribute.

For a namespace, the special case means that namespace(ns_obj) and
namespace(vars(ns_obj)) have the same result. Just don't go creating a namespace
subclass which provides a direct mapping interface to anything other than its
own __dict__ and expect to affect namespaces created using the normal
constructor. I figure that limitation is obscure enough that we can live with it :)

For an arbitrary object, you can poke around in its __dict__ by doing:
namespace.view(vars(obj))
We could use __add__, instead for combining namespaces

Update already lets us combine namespaces. To create a new object that merges
two namespaces do:
namespace.update(namespace(ns_1), ns_2)
Good idea. The implementation ought to be tested against several
plausible specializations.

One thing I'm going to drop is the ability to use an arbitrary class as a
subrecord. I realised it screws up storage of classes and class instances which
have a __dict__ attribute.

Instead, I'll change it so that the optional argument allows you to set some of
the attributes.
I don't like the sound of that. The whole point here - whether as
Steven's nice straightforward bunch, as originally conceived, or the
other cases you and I and others have been 'cluttering' the discussion
with ;-) is convenience, and readability. If there are hoops to jump
through to use it, then the benefit is quickly reduced to zero.

Yes, once I realised that the polymorphism friendly 'type(self).update(self,
other)' works with a classmethod, I realised it made sense to go with Steven's
classmethod approach.

I'll also revert to his *args based solution to the keyword argument problem, too.

Time to go play cricket, so I won't actually be posting any of the above changes
tonight :)

Cheers,
Nick.
 
A

Alex Martelli

Nick Coghlan said:
Update already lets us combine namespaces. To create a new object that merges
two namespaces do:
namespace.update(namespace(ns_1), ns_2)

One thing I'd like to see in namespaces is _chaining_ -- keeping each
namespace separate but having lookups proceed along the chain. (The
best semantics for _bindings_ as opposed to lookups isn't clear though).
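
For readers on a later Python: the standard library's `collections.ChainMap` (added in 3.3, well after this thread) gives chained lookup over plain mappings. The sketch below only illustrates the idea; it is not necessarily what Alex had in mind, and the names `local_ns` and `global_ns` are made up.

```python
from collections import ChainMap

local_ns = {"x": 1}
global_ns = {"x": 0, "y": 2}

chain = ChainMap(local_ns, global_ns)   # each mapping stays separate

x = chain["x"]    # found in local_ns first
y = chain["y"]    # falls through to global_ns

# ChainMap's answer to the binding question: writes go to the first
# mapping only.
chain["z"] = 3
```

That is one possible rule for bindings; a namespace type could just as well rebind a name in whichever link already defines it.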


Alex
 
