Why is there no natural syntax for accessing attributes with names not being valid identifiers?

  • Thread starter Piotr Dobrogost

Piotr Dobrogost

Hi!

I find global getattr() function awkward when reading code.
What is the reason there's no "natural" syntax allowing to access attributes with names not being valid Python identifiers in a similar way to other attributes?
Something along the line of my_object.'valid-attribute-name-but-not-valid-identifier'?


Regards,
Piotr Dobrogost
 
Ethan Furman

I find global getattr() function awkward when reading code.
What is the reason there's no "natural" syntax allowing to
access attributes with names not being valid Python
identifiers in a similar way to other attributes?

Something along the line of my_object.'valid-attribute-name-but-not-valid-identifier'?

When would you have attribute names that are not valid identifiers?
 
Ned Batchelder

Hi!

I find global getattr() function awkward when reading code.
What is the reason there's no "natural" syntax allowing to access attributes with names not being valid Python identifiers in a similar way to other attributes?
Something along the line of my_object.'valid-attribute-name-but-not-valid-identifier'?


Regards,
Piotr Dobrogost

I don't know the real reason, but I imagine it would be that it would be
very rarely used. It would need to be an attribute that isn't a valid
identifier, and you know the attribute at the time you write the code.

I can see scenarios for needing attributes that aren't identifiers, but
in many of them you also need to access them through a variable rather
than literally.

--Ned.
 
Dave Angel

I find global getattr() function awkward when reading code.

Me too.
What is the reason there's no "natural" syntax allowing to access
attributes with names not being valid Python identifiers in a similar
way to other attributes?

There is. Just use a dictionary.
 
random832

Hi!

I find global getattr() function awkward when reading code.
What is the reason there's no "natural" syntax allowing to access
attributes with names not being valid Python identifiers in a similar way
to other attributes?
Something along the line of
my_object.'valid-attribute-name-but-not-valid-identifier'?

The getattr function is meant for when your attribute name is in a
variable. Being able to use strings that aren't valid identifiers is a
side effect. Why are you designing classes with attributes that aren't
valid identifiers?
 
Piotr Dobrogost

The getattr function is meant for when your attribute name is in a
variable. Being able to use strings that aren't valid identifiers is a
side effect.

Why do you say it's a side effect? Could you elaborate? I see nothing odd in passing a literal (a string literal in this case) as the value of a function's argument.

Why are you designing classes with attributes that aren't
valid identifiers?

Attribute access syntax being very concise is very often preferred to dict's interface. That's why various containers expose their elements as attributes. In my case I'm using an in-house web form library which provides a FieldSet class holding objects of type Field or other FieldSets. This nesting leads to names of the form 'outer_fieldset-inner_fieldset-third_field', which are not valid identifiers due to the minus sign.
 
Tim Chase

Why do you say it's a side effect?

I think random832 is saying that the designed purpose of setattr()
was to dynamically set attributes by name, so they could later be
accessed the traditional way; not designed from the ground-up to
support non-identifier names. But because of the getattr/setattr
machinery (dict key/value pairs), it doesn't prevent you from having
non-identifiers as names as long as you use only the getattr/setattr
method of accessing them.
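
A minimal sketch of that machinery, assuming a plain class whose instances store attributes in their __dict__ (the class Form is hypothetical and only stands in for something like the OP's FieldSet):

```python
class Form:
    """Hypothetical stand-in for an object with dynamically named attributes."""


form = Form()
name = "outer_fieldset-inner_fieldset-third_field"

# setattr() accepts any string, whether or not it is a valid identifier.
setattr(form, name, "some value")

# The attribute can only be read back through getattr() or the instance dict.
value = getattr(form, name)
in_instance_dict = name in vars(form)
is_identifier = name.isidentifier()

print(value, in_instance_dict, is_identifier)
```

Writing form.outer_fieldset-inner_fieldset-third_field instead would be parsed as a subtraction, which is why dotted access is not available for such a name.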

I see non-traditional-identifiers most frequently in test code where
the globals() dictionary gets injected with various objects for
testing purposes, driven by a table with descriptors. Something like
(untested)

tests = [
    dict(desc="Test 1", input=10, expected=42),
    dict(desc="Test 2", input=314, expected=159),
]
for test in tests:
    # The description contains a space, so the name is not an identifier.
    test_name = "test_" + test["desc"]
    globals()[test_name] = generate_test_function(
        test["input"], test["expected"])
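
A self-contained variant of the same idea might look like the sketch below. The helper generate_test_function is not defined in the post, so the version here is purely hypothetical, and a plain dict named namespace stands in for globals() to keep the example from touching module state:

```python
def generate_test_function(value, expected):
    # Hypothetical helper: the real one would exercise the code under test.
    def test():
        return (value, expected)
    return test


tests = [
    dict(desc="Test 1", input=10, expected=42),
    dict(desc="Test 2", input=314, expected=159),
]

namespace = {}  # stand-in for globals()
for test in tests:
    test_name = "test_" + test["desc"]
    namespace[test_name] = generate_test_function(
        test["input"], test["expected"])

names = sorted(namespace)
print(names)
print([n.isidentifier() for n in names])
```

The generated names contain a space, so they can be looked up by key but never written as a bare name in source code.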

-tkc
 
Tim Roberts

Piotr Dobrogost said:
Attribute access syntax being very concise is very often preferred
to dict's interface.

It is not "very concise". It is slightly more concise.

x = obj.value1
x = dct['value1']

You have saved 3 keystrokes. That is not a significant enough savings to
create new syntax. Remember the Python philosophy that there ought to be
one way to do it.
 
rusi

Piotr said:
Attribute access syntax being very concise is very often preferred
to dict's interface.

It is not "very concise". It is slightly more concise.

x = obj.value1
x = dct['value1']

You have saved 3 keystrokes. That is not a significant enough savings to
create new syntax. Remember the Python philosophy that there ought to be
one way to do it.

It's a more fundamental problem than that:
It emerges from the OP's second post that he wants '-' in the attributes.
Is that all?

Where does this syntax-enlargement stop? Spaces? Newlines?
 
Ian Kelly

It's a more fundamental problem than that:
It emerges from the OP's second post that he wants '-' in the attributes.
Is that all?

Where does this syntax-enlargement stop? Spaces? Newlines?

At non-strings.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: attribute name must be string, not 'int'
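
The statement that produced this traceback is missing from the post. A sketch of one way to see the same boundary, assuming a hypothetical empty class Thing: spaces and newlines are accepted by setattr(), while a non-string name raises TypeError.

```python
class Thing:
    """Hypothetical empty class used only for illustration."""


thing = Thing()

# Non-string attribute names are rejected by both getattr() and setattr().
errors = []
for call in (lambda: getattr(thing, 42), lambda: setattr(thing, 42, "x")):
    try:
        call()
    except TypeError as exc:
        errors.append(type(exc).__name__)

# Any string is accepted, including one with spaces and a newline.
odd_name = "has spaces\nand a newline"
setattr(thing, odd_name, "ok")
stored = getattr(thing, odd_name)

print(errors, stored)
```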
 
Ethan Furman

Piotr Dobrogost said:
Attribute access syntax being very concise is very often preferred
to dict's interface.

It is not "very concise". It is slightly more concise.

x = obj.value1
x = dct['value1']

You have saved 3 keystrokes. That is not a significant enough savings to
create new syntax. Remember the Python philosophy that there ought to be
one way to do it.

That should be "one obvious way".

On my keyboard, at least, those are an important three keystrokes! ;)

To be clear, I am

-1

on the new syntax.
 
rusi

At non-strings.

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: attribute name must be string, not 'int'

Not sure what's your point.

OP wants attribute identifiers like outer_fieldset-inner_fieldset-third_field.
Say I have a python expression:
obj.outer_fieldset-inner_fieldset-third_field

It can (in the proposed extension) be parsed as above, or as:
obj.outer_fieldset - inner_fieldset-third_field
the first hyphen being minus and the second being part of the identifier.

How do we decide which '-' are valid identifier components -- hyphens
and which minus-signs?
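
How the current grammar resolves that expression can be checked with the standard ast module. This is only a sketch of today's behaviour, not of any proposed extension:

```python
import ast

source = "obj.outer_fieldset-inner_fieldset-third_field"
expr = ast.parse(source, mode="eval").body

# The whole expression is a subtraction whose right operand is a plain name.
outer_is_subtraction = isinstance(expr, ast.BinOp) and isinstance(expr.op, ast.Sub)
right_name = expr.right.id

# Its left operand is another subtraction; only the innermost left operand
# is an attribute access, and the attribute name stops at the first '-'.
inner = expr.left
attribute_name = inner.left.attr
middle_name = inner.right.id

print(outer_is_subtraction, attribute_name, middle_name, right_name)
```

In other words, Python currently reads it as (obj.outer_fieldset - inner_fieldset) - third_field.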

So to state my point differently:
The grammar of python is well-defined
It has a 'sub-grammar' of strings that is completely* free-for-all ie just
about anything can be put into a string literal.
The border between the orderly and the wild world are the quote-marks.
Remove that border and you get complete grammatical chaos.
[Maybe I should have qualified my reference to 'spaces'.
Algol-68 allowed spaces in identifiers (for readability!!)
The result was chaos]

I used the spaces case to indicate the limit of chaos. Other characters (that
already have uses) are just as problematic.

* Oh well there are some restrictions like quotes need to be escaped, no
newlines etc etc -- minor enough to be ignored.
 
Antoon Pardon

On 04-12-13 11:09, rusi wrote:
At non-strings.

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: attribute name must be string, not 'int'

Not sure what's your point.

OP wants attribute identifiers like outer_fieldset-inner_fieldset-third_field.
Say I have a python expression:
obj.outer_fieldset-inner_fieldset-third_field

It can (in the proposed extension) be parsed as above, or as:
obj.outer_fieldset - inner_fieldset-third_field
the first hyphen being minus and the second being part of the identifier.

How do we decide which '-' are valid identifier components -- hyphens
and which minus-signs?

So to state my point differently:
The grammar of python is well-defined
It has a 'sub-grammar' of strings that is completely* free-for-all ie just
about anything can be put into a string literal.
The border between the orderly and the wild world are the quote-marks.
Remove that border and you get complete grammatical chaos.
[Maybe I should have qualified my reference to 'spaces'.
Algol-68 allowed spaces in identifiers (for readability!!)
The result was chaos]

I used the spaces case to indicate the limit of chaos. Other characters (that
already have uses) are just as problematic.

I don't agree with the latter. As it is now python can make the
distinction between

from A import B and fromAimportB.

I see no a priori reason why this should be limited to letters. A
language designer might choose to allow a bigger set of characters
in identifiers like '-', '+' and others. In that case a-b would be
an identifier and a - b would be the operation. Just as in python
fromAimportB is an identifier and from A import B is an import
statement.
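
The distinction Python makes today can be observed with the standard tokenize module. The helper token_strings below is a hypothetical name introduced for this sketch:

```python
import io
import tokenize


def token_strings(source):
    """Return the NAME and OP tokens the tokenizer produces for source."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.NAME, tokenize.OP)
    ]


glued = token_strings("fromAimportB")       # one identifier
spaced = token_strings("from A import B")   # four separate names
minus_tight = token_strings("a-b")          # '-' splits the names
minus_spaced = token_strings("a - b")       # same tokens as a-b

print(glued, spaced, minus_tight, minus_spaced)
```

With the current rules, whitespace separates words but is not needed around '-', so a-b and a - b tokenize identically. Under the alternative described above they would differ.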
 
Tim Chase

I don't think so. What the OP asked for was:

my_object.'valid-attribute-name-but-not-valid-identifier'

Or describing it another way: A literal string instead of a token.
This is conceivable, at least, but I don't think it gives any
advantage over a dictionary.

In both cases (attribute-access-as-dict-functionality and
attribute-access-as-avoiding-setattr), forcing a literal actually
diminishes Python's power. I like the ability to do

a[key.strip().lower()] = some_value
setattr(thing, key.strip().lower(), some_value)

which can't be done (?) with mere literal notation. What would they
look like?

a.(key.strip().lower()) = some_value

(note that "key.strip().lower()" is not actually a "literal" that
ast.literal_eval would accept). That's pretty ugly, IMHO :)

-tkc
 
rusi

I don't think so. What the OP asked for was:

my_object.'valid-attribute-name-but-not-valid-identifier'

Or describing it another way: A literal string instead of a token.

This is just pushing the issue one remove away.
Firstly a literal string is very much a token -- lexically.
Now consider the syntax as defined by the grammar.

Let Ident = Set of strings* that are valid python identifiers --
something like [a-zA-Z_][a-zA-Z0-9_]*

Let Exp = Set of strings* that are python expressions

* Note that I am using string from the language implementers pov not language
user ie the python identifier var is the implementers string "var" whereas
the python string literal "var" is the implementer's string "\"var\""

Now clearly Ident is a proper subset of Exp.

Now what is the proposal?
You want to extend the syntactically allowable a.b set.
If the b's can be any arbitrary expression we can have
var.fld(1,2) with the grammatical ambiguity that this can be
(var.fld)(1,2) -- the usual interpretation
Or
var.(fld(1,2)) -- the new interpretation -- ie a computed field name.

OTOH if you say superset of Ident but subset of Exp, then we have to determine
what this new limbo set is to be. ie what is the grammatical category of
'what-follows-a-dot' ??

Some other-language notes:
1. In C there is one case somewhat like this:
#include "string"
the "string" cannot be an arbitrary expression as the rest of C. But then this
is not really C but the C preprocessor

2. In lisp the Ident set is way more permissive than in most languages --
allowing operators etc that would be delimiters in most languages.
If one wants to go even beyond that and include say spaces and parenthesis --
almost the only delimiters that lisp has -- one must write |ident with spaces|
ie for identifiers the bars behave somewhat like strings' quote marks.
Because the semantics of identifiers and strings are different -- the lexical
structures need to reflect that difference -- so you cannot replace the bars
by quotes.
 
Jussi Piitulainen

rusi said:
Not sure what's your point.

OP wants attribute identifiers like
outer_fieldset-inner_fieldset-third_field.
Say I have a python expression:
obj.outer_fieldset-inner_fieldset-third_field

It can (in the proposed extension) be parsed as above, or as:
obj.outer_fieldset - inner_fieldset-third_field
the first hyphen being minus and the second being part of the
identifier.

How do we decide which '-' are valid identifier components --
hyphens and which minus-signs?

I think the OP might be after the JavaScript mechanism where an
attribute name can be any string, the indexing brackets are always
available, and the dot notation is available when the attribute name
looks like a simple identifier. That could be made to work. (I'm not
saying should, or should not. Just that it seems technically simple.)

Hm. Can't specific classes be made to behave this way even now by
implementing suitable underscored methods?
 
Chris Angelico

Hm. Can't specific classes be made to behave this way even now by
implementing suitable underscored methods?

Yup. Definitely possible. I don't think it'd be a good idea, though,
not without somehow changing every dict method into a stand-alone
function.
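
A rough sketch of what such a class could look like, assuming a dict subclass (AttrDict is a hypothetical name, not something from the thread or the standard library). It also shows the clash with existing dict methods:

```python
class AttrDict(dict):
    """Hypothetical dict whose keys are also reachable with dot notation."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


fields = AttrDict()
fields["outer_fieldset-inner_fieldset"] = 1   # brackets work for any string
fields.plain = 2                              # dot works for identifiers

# A key that collides with a dict method is hidden behind that method.
fields["keys"] = "shadowed"
dot_keys_is_method = callable(fields.keys)

print(fields["plain"], fields.plain, dot_keys_is_method)
```

Every method name on dict (keys, items, get and so on) is unavailable as a dotted key, which is the cost of mixing the two interfaces on one object.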

ChrisA
 
rusi

On 04-12-13 11:09, rusi wrote:

I don't agree with the latter. As it is now python can make the
distinction between

from A import B and fromAimportB.

I see no a priori reason why this should be limited to letters. A
language designer might choose to allow a bigger set of characters
in identifiers like '-', '+' and others. In that case a-b would be
an identifier and a - b would be the operation. Just as in python
fromAimportB is an identifier and from A import B is an import
statement.

I'm not sure what you are saying.
Sure a language designer can design a language differently from python.
I mentioned lisp. Cobol is another behaving exactly as you describe.

My point is that when you do (something like) that, you will need to change the
lexical and grammatical structure of the language. And this will make
for rather far-reaching changes ALL OVER the language not just in what-follows-dot.

IOW: I don't agree that we have a disagreement :)
 
Antoon Pardon

On 04-12-13 13:01, rusi wrote:
I'm not sure what you are saying.
Sure a language designer can design a language differently from python.
I mentioned lisp. Cobol is another behaving exactly as you describe.

My point is that when you do (something like) that, you will need to change the
lexical and grammatical structure of the language. And this will make
for rather far-reaching changes ALL OVER the language not just in what-follows-dot.

No, you don't need to change the lexical and grammatical structure of
the language. Changing the characters allowed in identifiers is not a
change in lexical structure. The only difference in lexical structuring
would be that '-', '>=' and other similar symbols would have to be
treated like keywords such as 'from' and 'as', instead of being
recognizable just by being present.

And the grammatical structure of the language wouldn't change at all.
Sure a-b would now be an identifier and not an operation but that is
of no concern for the parser.

People would have to be careful to insert spaces around operators
and that might make the language somewhat error prone but that doesn't
mean the syntactical structure is different.
 
