RFC: Assignment as expression (pre-PEP)

Discussion in 'Python' started by TimeHorse@gmail.com, Apr 5, 2007.

  1. Guest

    I would like to gauge interest in the following proposal:

    Problem:

    Assignment statements cannot be used as expressions.

    Performing a list of mutually exclusive checks that require data
    processing can cause excessive tabification. For example, consider
    the following Python snippet...

    temp = my_re1.match(exp)
    if temp:
        # do something with temp
    else:
        temp = my_re2.match(exp)
        if temp:
            # do something with temp
        else:
            temp = my_re3.match(exp)
            if temp:
                # do something with temp
            else:
                temp = my_re4.match(exp)

                # etc.

    Even with 2-space tabification, after about 20 matches, the
    indentation will take up half an 80-column terminal screen.

    Details:

    Python distinguishes between an assignment statement and an equality
    expression. This forces disambiguation of assignment and
    comparison, so that a statement like:

    if x = 3:

    will raise an exception (a SyntaxError) rather than allowing the
    programmer to accidentally overwrite x. Likewise,

    x == 3

    will either return True or False, or raise a NameError exception.
    This can alert the author to a coding mistake if x = 3
    (assignment) was meant, since assignment, being a statement, returns
    nothing (though it may raise an exception depending on the underlying
    assignment function called).

    Because this forced disambiguation is a guiding virtue of the Python
    language, it would NOT be wise to change these semantics of the
    language.

    Proposal:

    Add a new assignment-expression operator to disambiguate it completely
    from existing operators.

    Although any number of glyphs could be used for such a new operator, I
    here propose using Pascal/GNU make-like assignment. Specifically,

    let:

    x = 3

    Be a statement that returns nothing;

    let:

    x == 3

    Be an expression that, when x is a valid, in-scope name, returns True
    or False;

    let:

    x := 3

    Be an expression that first assigns the value (3) to x, then returns
    x.

    Thus...

    if x = 3:
        # Raises an exception
        pass

    if x == 3:
        # Executes IFF x has a value equivalent to 3
        pass

    if x := 3:
        # Executes based on the value of x after assignment;
        # since x will be 3, which is non-zero and thus true, this is
        # always executed
        pass
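
    For comparison, an assignment expression spelled := does exist in
    Python 3.8 and later (PEP 572). The following is only a minimal
    sketch of how such an operator flattens the nested matching from the
    Problem section. The classify function and its three patterns are
    hypothetical, chosen purely for illustration:

```python
import re

def classify(text):
    # Each branch binds the match object and tests it in one step,
    # so the chain stays flat instead of nesting under else:
    if m := re.match(r"(\d{4})$", text):
        return ("year", m.group(1))
    elif m := re.match(r"S(\d\d)E(\d\d)$", text):
        return ("episode", m.group(1), m.group(2))
    elif m := re.match(r"by\[(.*)\]$", text):
        return ("capper", m.group(1))
    return ("unknown", text)
```

    Each elif reuses the name m, so no extra indentation level is
    needed per check.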

    Additional:

    Since Python allows in-place operator assignment (e.g. +=, *=, etc.),
    allow for these forms as well by prefixing each diglyph with a colon
    (:), forming a triglyph.

    E.g.

    if x :+= 3:
        # Executes IFF, after adding 3 to x, x represents a non-zero number.
        pass

    Also note that although the colon is used to denote the
    beginning of a programme block, it should be easily distinguished from
    the usage of : to denote a diglyph or triglyph assignment expression
    as well as the trinary conditional expression. This is because,
    firstly, the statement(s) following a colon (:) in existing Python
    should never begin with an assignment operator. I.e.,

    if x: = y

    is currently not valid Python. Any attempt at interpreting the
    meaning of such an expression in the current implementation of Python
    is likely to fail. Secondly, the diglyph and triglyph expressions do
    not contain spaces, further disambiguating them from existing Python.

    Alternative proposals for diglyph and triglyph representations for
    assignment expressions are welcome.

    Implementation details:

    When the Python interpreter's parser encounters a diglyph or triglyph
    beginning with a colon (:) and ending with an equals sign (=), perform
    the assignment specified by glyph[1:] and then return the value of the
    variable(s) on the left-hand side of the expression. The assignment
    function called would be based on standard Python lookup rules for the
    corresponding glyph[1:] operation (the glyph without the leading
    colon).

    Opposition:

    Adding any new operator to python could be considered code bloat.

    Using a colon in this way could still be ambiguous.

    Adding the ability to read triglyph operators in the python
    interpreter parser would require too big a code revision.

    Usage is too obscure.

    Using an assignment expression would lead to multiple conceptual
    instructions for a single python statement (e.g. first an assignment,
    then an if based on the assignment would mean two operations for a
    single if statement.)

    Comments:

    [Please comment]

    Jeffrey.
    , Apr 5, 2007
    #1

  2. On Thu, 2007-04-05 at 12:51 -0700, wrote:
    > I would like to gauge interest in the following proposal:
    >
    > Problem:
    >
    > Assignment statements cannot be used as expressions.
    >
    > Performing a list of mutually exclusive checks that require data
    > processing can cause excessive tabification. For example, consider
    > the following python snipet...
    >
    > temp = my_re1.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re2.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re3.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re4.match(exp)
    >
    > # etc.
    >
    > Even with 2-space tabification, after about 20 matches, the
    > indentation will take up half an 80-column terminal screen.


    If that's your only justification for this proposal, that's almost
    certainly not enough to convince anybody of its necessity. Your code
    example should be rewritten as a loop:

    match_actions = [(my_re1, action1),
                     (my_re2, action2),
                     ...]

    for my_re, action in match_actions:
        if my_re.match(exp):
            action(exp)
            break
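
    A self-contained sketch of the same table-driven loop, for anyone
    who wants to run it. The two patterns, the handler functions and the
    dispatch wrapper are assumptions made up for illustration; they are
    not from the original code:

```python
import re

def handle_year(text):
    # Hypothetical action for a four-digit field
    return "year: " + text

def handle_word(text):
    # Hypothetical action for a purely alphabetic field
    return "word: " + text

# Ordered table of (compiled regex, action) pairs; the first match wins
match_actions = [
    (re.compile(r"\d{4}$"), handle_year),
    (re.compile(r"[A-Za-z]+$"), handle_word),
]

def dispatch(exp):
    for my_re, action in match_actions:
        if my_re.match(exp):
            return action(exp)
    return None  # no pattern matched
```

    Adding another check means adding one row to the table, not another
    level of indentation.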

    Hope this helps,

    Carsten
    Carsten Haese, Apr 5, 2007
    #2

  3. Duncan Booth Guest

    wrote:

    > Performing a list of mutually exclusive checks that require data
    > processing can cause excessive tabification. For example, consider
    > the following python snipet...
    >
    > temp = my_re1.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re2.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re3.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re4.match(exp)
    >


    Can you come up with a real example where this happens and which cannot be
    easily rewritten to provide better, clearer code without the indentation?

    I'll admit to having occasionally had code not entirely dissimilar to this
    when first written, but I don't believe it has ever survived more than a
    few minutes before being refactored into a cleaner form. I would claim that
    it is a good thing that Python makes it obvious that code like this should
    be refactored.
    Duncan Booth, Apr 5, 2007
    #3
  4. Duncan Booth Guest

    Carsten Haese <> wrote:

    > If that's your only justification for this proposal, that's almost
    > certainly not enough to convince anybody of its necessity. Your code
    > example should be rewritten as a loop:
    >
    > match_actions = [(my_re1, action1),
    > (my_re2, action2),
    > ...]
    >
    > for my_re, action in match_actions:
    > if my_re.match(exp):
    > action(exp)
    > break
    >


    Depending on what his 'do something with temp' actually was, it may or may
    not be easy to rewrite it as a for loop. However, even if a for loop isn't
    an obvious replacement other solutions may be appropriate such as combining
    the regular expressions to a single regex with named groups and/or using
    the command pattern.

    Even if assignment were an expression, that would address only one
    problem with the sample code. It still leaves his code with excessive
    repetition and probably with an excessively long function that calls
    out to be refactored as a group of smaller methods.
    Duncan Booth, Apr 5, 2007
    #4
  5. Guest

    On Apr 5, 4:22 pm, Duncan Booth <> wrote:
    > Can you come up with a real example where this happens and which cannot be
    > easily rewritten to provide better, clearer code without the indentation?
    >
    > I'll admit to having occasionally had code not entirely dissimilar to this
    > when first written, but I don't believe it has ever survived more than a
    > few minutes before being refactored into a cleaner form. I would claim that
    > it is a good thing that Python makes it obvious that code like this should
    > be refactored.


    I am trying to write a parser for a text string. Specifically, I am
    trying to take a filename that contains meta-data about the content of
    the A/V file (mpg, mp3, etc.).

    I first split the filename into fields separated by spaces and dots.

    Then I have a series of regular expression matches. I like
    Carsten's 'event-based' parser approach, though the event table gets a
    bit unwieldy as it grows. Also, I would prefer to have the 'action'
    result in a variable assignment specific to the test. E.g.

    def parseName(name):
        fields = sd.split(name)
        fields, ext = fields[:-1], fields[-1]
        year = ''
        capper = ''
        series = None
        episodeNum = None
        programme = ''
        episodeName = ''
        past_title = False
        for f in fields:
            if year_re.match(f):
                year = f
                past_title = True
            else:
                my_match = capper_re.match(f)
                if my_match:
                    capper = capper_re.match(f).group(1)
                    if capper == 'JJ' or capper == 'JeffreyJacobs':
                        capper = 'Jeffrey C. Jacobs'
                    past_title = True
                else:
                    my_match = epnum_re.match(f)
                    if my_match:
                        series, episodeNum = my_match.group('series',
                                                            'episode')
                        past_title = True
                    else:
                        # If I think of other parse elements, they go here.
                        # Otherwise, name is part of a title; check for
                        # capitalization
                        if (f[0] >= 'a' and f[0] <= 'z'
                                and f not in do_not_capitalize):
                            f = f.capitalize()
                        if past_title:
                            if episodeName: episodeName += ' '
                            episodeName += f
                        else:
                            if programme: programme += ' '
                            programme += f

        return programme, series, episodeName, episodeNum, year, capper, ext

    Now, the problem with this code is that it assumes only 2 pieces of
    free-form meta-data in the name (i.e. Programme Name and Episode
    Name). Also, although this is not directly adaptable to Carsten's
    approach, you COULD rewrite it using a dictionary in place of
    local variable names, so that each event in the lookup table has 3
    properties: compiled_re, action_method, dictionary_string.
    But even then, the epnum match requires two assignments. That would
    need something like: if dictionary_string is a list, bind each of the
    values returned by action_method to the corresponding ith dictionary
    element named in dictionary_string, which seems a bit convoluted. And
    the fall-through case is state-dependent, since the 'unrecognized
    field' should be shuffled into a different variable depending on
    state. Still, if there is a better approach I am certainly up for
    it. I love event-based parsers so I have no problem with that
    approach in general.
    , Apr 5, 2007
    #5
  6. Neil Hodgson Guest

    :

    > else:
    > my_match = capper_re.match(f):
    > if my_match:
    > capper = capper_re.match(f).group(1)
    > if capper == 'JJ' or capper == 'JeffreyJacobs':
    > capper = 'Jeffrey C. Jacobs'
    > past_title = True


    The assignment to my_match here is not used, so the test can be "if
    capper_re.match(f)" which can then merge up into the previous else as an
    elif dropping one level of indentation.

    Neil
    Neil Hodgson, Apr 5, 2007
    #6
  7. Guest

    On Apr 5, 6:01 pm, Neil Hodgson <> wrote:
    > :
    >
    > > else:
    > > my_match = capper_re.match(f):
    > > if my_match:
    > > capper = capper_re.match(f).group(1)
    > > if capper == 'JJ' or capper == 'JeffreyJacobs':
    > > capper = 'Jeffrey C. Jacobs'
    > > past_title = True

    >
    > The assignment to my_match here is not used, so the test can be "if
    > capper_re.match(f)" which can then merge up into the previous else as an
    > elif dropping one level of indentation.
    >
    > Neil


    That was a typo. I meant to reuse my_match in the line "capper =
    my_match.group(1)" rather than the line above just so I would not have
    to evaluate the regular expression twice. Sorry for the confusion.

    Jeffrey.
    , Apr 5, 2007
    #7
  8. wrote:
    > On Apr 5, 4:22 pm, Duncan Booth <> wrote:
    >> Can you come up with a real example where this happens and which cannot be
    >> easily rewritten to provide better, clearer code without the indentation?
    >>
    >> I'll admit to having occasionally had code not entirely dissimilar to this
    >> when first written, but I don't believe it has ever survived more than a
    >> few minutes before being refactored into a cleaner form. I would claim that
    >> it is a good thing that Python makes it obvious that code like this should
    >> be refactored.

    >
    > I am trying to write a parser for a text string. Specifically, I am
    > trying to take a filename that contains meta-data about the content of
    > the A/V file (mpg, mp3, etc.).
    >
    > I first split the filename into fields separated by spaces and dots.
    >
    > Then I have a series of regular expression matches. I like
    > Cartesian's 'event-based' parser approach though the even table gets a
    > bit unwieldy as it grows. Also, I would prefer to have the 'action'
    > result in a variable assignment specific to the test. E.g.
    >
    > def parseName(name):
    > fields = sd.split(name)
    > fields, ext = fields[:-1], fields[-1]
    > year = ''
    > capper = ''
    > series = None
    > episodeNum = None
    > programme = ''
    > episodeName = ''
    > past_title = false
    > for f in fields:
    > if year_re.match(f):
    > year = f
    > past_title = True
    > else:
    > my_match = capper_re.match(f):
    > if my_match:
    > capper = capper_re.match(f).group(1)
    > if capper == 'JJ' or capper == 'JeffreyJacobs':
    > capper = 'Jeffrey C. Jacobs'
    > past_title = True
    > else:
    > my_match = epnum_re.match(f):
    > if my_match:
    > series, episodeNum = my_match.group('series',
    > 'episode')
    > past_title = True
    > else:
    > # If I think of other parse elements, they go
    > here.
    > # Otherwise, name is part of a title; check for
    > capitalization
    > if f[0] >= 'a' and f[0] <= 'z' and f not in
    > do_not_capitalize:
    > f = f.capitalize()
    > if past_title:
    > if episodeName: episodeName += ' '
    > episodeName += f
    > else:
    > if programme: programme += ' '
    > programme += f
    >
    > return programme, series, episodeName, episodeNum, year, capper,
    > ext


    Why can't you combine your regular expressions into a single expression,
    e.g. something like::

    >>> import re
    >>> exp = r'''
    ... (?P<year>\d{4})
    ... |
    ... by\[(?P<capper>.*)\]
    ... |
    ... S(?P<series>\d\d)E(?P<episode>\d\d)
    ... '''
    >>> matcher = re.compile(exp, re.VERBOSE)
    >>> matcher.match('1990').groupdict()
    {'series': None, 'capper': None, 'episode': None, 'year': '1990'}
    >>> matcher.match('by[Jovev]').groupdict()
    {'series': None, 'capper': 'Jovev', 'episode': None, 'year': None}
    >>> matcher.match('S01E12').groupdict()
    {'series': '01', 'capper': None, 'episode': '12', 'year': None}

    Then your code above would look something like::

    for f in fields:
        match = matcher.match(f)
        if match is not None:
            year = match.group('year')
            capper = match.group('capper')
            if capper == 'JJ' or capper == 'JeffreyJacobs':
                capper = 'Jeffrey C. Jacobs'
            series = match.group('series')
            episodeNum = match.group('episode')
            past_title = True
        else:
            if 'a' <= f[0] <= 'z' and f not in do_not_capitalize:
                f = f.capitalize()
            if past_title:
                if episodeName:
                    episodeName += ' '
                episodeName += f
            else:
                if programme:
                    programme += ' '
                programme += f

    STeVe
    Steven Bethard, Apr 5, 2007
    #8
  9. Steven Bethard wrote:
    > wrote:
    >> On Apr 5, 4:22 pm, Duncan Booth <> wrote:
    >>> Can you come up with a real example where this happens and which
    >>> cannot be
    >>> easily rewritten to provide better, clearer code without the
    >>> indentation?
    >>>
    >>> I'll admit to having occasionally had code not entirely dissimilar to
    >>> this
    >>> when first written, but I don't believe it has ever survived more than a
    >>> few minutes before being refactored into a cleaner form. I would
    >>> claim that
    >>> it is a good thing that Python makes it obvious that code like this
    >>> should
    >>> be refactored.

    >>
    >> I am trying to write a parser for a text string. Specifically, I am
    >> trying to take a filename that contains meta-data about the content of
    >> the A/V file (mpg, mp3, etc.).
    >>
    >> I first split the filename into fields separated by spaces and dots.
    >>
    >> Then I have a series of regular expression matches. I like
    >> Cartesian's 'event-based' parser approach though the even table gets a
    >> bit unwieldy as it grows. Also, I would prefer to have the 'action'
    >> result in a variable assignment specific to the test. E.g.
    >>
    >> def parseName(name):
    >> fields = sd.split(name)
    >> fields, ext = fields[:-1], fields[-1]
    >> year = ''
    >> capper = ''
    >> series = None
    >> episodeNum = None
    >> programme = ''
    >> episodeName = ''
    >> past_title = false
    >> for f in fields:
    >> if year_re.match(f):
    >> year = f
    >> past_title = True
    >> else:
    >> my_match = capper_re.match(f):
    >> if my_match:
    >> capper = capper_re.match(f).group(1)
    >> if capper == 'JJ' or capper == 'JeffreyJacobs':
    >> capper = 'Jeffrey C. Jacobs'
    >> past_title = True
    >> else:
    >> my_match = epnum_re.match(f):
    >> if my_match:
    >> series, episodeNum = my_match.group('series',
    >> 'episode')
    >> past_title = True
    >> else:
    >> # If I think of other parse elements, they go
    >> here.
    >> # Otherwise, name is part of a title; check for
    >> capitalization
    >> if f[0] >= 'a' and f[0] <= 'z' and f not in
    >> do_not_capitalize:
    >> f = f.capitalize()
    >> if past_title:
    >> if episodeName: episodeName += ' '
    >> episodeName += f
    >> else:
    >> if programme: programme += ' '
    >> programme += f
    >>
    >> return programme, series, episodeName, episodeNum, year, capper,
    >> ext

    >
    > Why can't you combine your regular expressions into a single expression,
    > e.g. something like::
    >
    > >>> exp = r'''

    > ... (?P<year>\d{4})
    > ... |
    > ... by\[(?P<capper>.*)\]
    > ... |
    > ... S(?P<series>\d\d)E(?P<episode>\d\d)
    > ... '''
    > >>> matcher = re.compile(exp, re.VERBOSE)
    > >>> matcher.match('1990').groupdict()

    > {'series': None, 'capper': None, 'episode': None, 'year': '1990'}
    > >>> matcher.match('by[Jovev]').groupdict()

    > {'series': None, 'capper': 'Jovev', 'episode': None, 'year': None}
    > >>> matcher.match('S01E12').groupdict()

    > {'series': '01', 'capper': None, 'episode': '12', 'year': None}
    >
    > Then your code above would look something like::
    >
    > for f in fields:
    > match = matcher.match(f)
    > if match is not None:
    > year = match.group('year')
    > capper = match.group('capper')
    > if capper == 'JJ' or capper == 'JeffreyJacobs':
    > capper = 'Jeffrey C. Jacobs'
    > series = match.group('series')
    > episodeNum = match.group('episode')
    > past_title = True


    I guess you need to be a little more careful here not to overwrite your
    old values, e.g. something like::

    year = match.group('year') or year
    capper = match.group('capper') or capper
    ...
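
    A runnable sketch of that idea, using the combined pattern from the
    earlier message. The collect function and its return tuple are
    assumptions for illustration only:

```python
import re

matcher = re.compile(r'''
    (?P<year>\d{4})
    |
    by\[(?P<capper>.*)\]
    |
    S(?P<series>\d\d)E(?P<episode>\d\d)
''', re.VERBOSE)

def collect(fields):
    year = capper = series = episode = None
    for f in fields:
        match = matcher.match(f)
        if match is not None:
            # Groups from the alternatives that did not match are None,
            # so "or" keeps whatever value was found in an earlier field.
            year = match.group('year') or year
            capper = match.group('capper') or capper
            series = match.group('series') or series
            episode = match.group('episode') or episode
    return year, capper, series, episode
```

    One caveat of the "or" idiom: an empty-string match (e.g. from
    'by[]') is falsy too, so it would also fall back to the old value.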

    STeVe
    Steven Bethard, Apr 6, 2007
    #9
  10. En Thu, 05 Apr 2007 18:08:46 -0300,
    <> escribió:

    > I am trying to write a parser for a text string. Specifically, I am
    > trying to take a filename that contains meta-data about the content of
    > the A/V file (mpg, mp3, etc.).
    >
    > I first split the filename into fields separated by spaces and dots.
    >
    > Then I have a series of regular expression matches. I like
    > Cartesian's 'event-based' parser approach though the even table gets a
    > bit unwieldy as it grows. Also, I would prefer to have the 'action'
    > result in a variable assignment specific to the test. E.g.
    >
    > def parseName(name):
    > fields = sd.split(name)
    > fields, ext = fields[:-1], fields[-1]
    > year = ''
    > capper = ''
    > series = None
    > episodeNum = None
    > programme = ''
    > episodeName = ''
    > past_title = false
    > for f in fields:
    > if year_re.match(f):
    > year = f
    > past_title = True
    > else:
    > my_match = capper_re.match(f):
    > if my_match:
    > capper = capper_re.match(f).group(1)
    > if capper == 'JJ' or capper == 'JeffreyJacobs':
    > capper = 'Jeffrey C. Jacobs'
    > past_title = True
    > else:
    > my_match = epnum_re.match(f):
    > if my_match:
    > series, episodeNum = my_match.group('series',
    > 'episode')
    > past_title = True
    > else:
    > # If I think of other parse elements, they go
    > here.
    > # Otherwise, name is part of a title; check for
    > capitalization
    > if f[0] >= 'a' and f[0] <= 'z' and f not in
    > do_not_capitalize:
    > f = f.capitalize()
    > if past_title:
    > if episodeName: episodeName += ' '
    > episodeName += f
    > else:
    > if programme: programme += ' '
    > programme += f
    >
    > return programme, series, episodeName, episodeNum, year, capper,
    > ext
    >
    > Now, the problem with this code is that it assumes only 2 pieces of
    > free-form meta-data in the name (i.e. Programme Name and Episode
    > Name). Also, although this is not directly adaptable to Cartesian's
    > approach, you COULD rewrite it using a dictionary in the place of
    > local variable names so that the event lookup could consist of 3
    > properties per event: compiled_re, action_method, dictionary_string.
    > But even with that, in the case of the epnum match, two assignments
    > are required so perhaps a convoluted scheme such that if
    > dictionary_string is a list, for each of the values returned by
    > action_method, bind the result to the corresponding ith dictionary
    > element named in dictionary_string, which seems a bit convoluted. And
    > the fall-through case is state-dependent since the 'unrecognized
    > field' should be shuffled into a different variable dependent on
    > state. Still, if there is a better approach I am certainly up for
    > it. I love event-based parsers so I have no problem with that
    > approach in general.


    Maybe it's worth using a class instance. Define methods to handle each
    matching regex, and keep state in the instance.

    class NameParser:

        def handle_year(self, field, match):
            self.year = field
            self.past_title = True

        def handle_capper(self, field, match):
            capper = match.group(1)
            if capper == 'JJ' or capper == 'JeffreyJacobs':
                capper = 'Jeffrey C. Jacobs'
            self.capper = capper
            self.past_title = True

        def parse(self, name):
            fields = sd.split(name)
            for field in fields:
                for regex,handler in self.handlers:
                    match = regex.match(field)
                    if match:
                        handler(self, field, match)
                        break

    You have to build the handlers list, containing (regex, handler) items;
    the "unknown" case might be a match-all expression at the end.
    Well, after playing a bit with decorators I got this:

    class NameParser:

        year = ''
        capper = ''
        series = None
        episodeNum = None
        programme = ''
        episodeName = ''
        past_title = False
        handlers = []

        def __init__(self, name):
            self.name = name
            self.parse()

        def handle_this(regex, handlers=handlers):
            # A decorator; associates the function to the regex
            # (Not intended to be used as a normal method!
            # not even a static method!)
            def register(function, regex=regex):
                handlers.append((re.compile(regex), function))
                return function
            return register

        @handle_this(r"\(?\d+\)?")
        def handle_year(self, field, match):
            self.year = field
            self.past_title = True

        @handle_this(r"(expression)")
        def handle_capper(self, field, match):
            capper = match.group(1)
            if capper == 'JJ' or capper == 'JeffreyJacobs':
                capper = 'Jeffrey C. Jacobs'
            self.capper = capper
            self.past_title = True

        @handle_this(r".*")
        def handle_unknown(self, field, match):
            if (field[0] >= 'a' and field[0] <= 'z'
                    and field not in do_not_capitalize):
                field = field.capitalize()
            if self.past_title:
                if self.episodeName: self.episodeName += ' '
                self.episodeName += field
            else:
                if self.programme: self.programme += ' '
                self.programme += field

        def parse(self):
            fields = sd.split(self.name)
            for field in fields:
                for regex,handler in self.handlers:
                    match = regex.match(field)
                    if match:
                        handler(self, field, match)
                        break
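
    The registration trick can be tried in isolation. Below is a
    stripped-down sketch; the Dispatcher class, its two patterns and the
    tuples it returns are hypothetical and stand in for the real
    handlers:

```python
import re

class Dispatcher:
    handlers = []

    # Runs while the class body is being executed. The default argument
    # captures the handlers list, because names in a class body are not
    # visible from inside nested functions.
    def handle_this(regex, handlers=handlers):
        def register(function):
            handlers.append((re.compile(regex), function))
            return function
        return register

    @handle_this(r"\d+$")
    def handle_number(self, field, match):
        return ("number", int(field))

    @handle_this(r".*")
    def handle_unknown(self, field, match):
        return ("unknown", field)

    def dispatch(self, field):
        # The list holds plain functions, so self is passed explicitly.
        for regex, handler in self.handlers:
            match = regex.match(field)
            if match:
                return handler(self, field, match)
```

    Handlers are tried in definition order, which is why the match-all
    pattern has to come last.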


    --
    Gabriel Genellina
    Gabriel Genellina, Apr 6, 2007
    #10
  11. a écrit :
    > I would like to gauge interest in the following proposal:
    >
    > Problem:
    >
    > Assignment statements cannot be used as expressions.


    This is by design.

    > Performing a list of mutually exclusive checks that require data
    > processing can cause excessive tabification. For example, consider
    > the following python snipet...
    >
    > temp = my_re1.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re2.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re3.match(exp)
    > if temp:
    > # do something with temp
    > else:
    > temp = my_re4.match(exp)


    OMG.

    actions = [
        (my_re1, do_something_with_temp1),
        (my_re2, do_something_with_temp2),
        (my_re3, do_something_with_temp3),
        (my_re4, do_something_with_temp4),
    ]

    for my_re, do_something_with in actions:
        temp = my_re.match(exp)
        if temp:
            do_something_with(temp)
            break

    Having full-blown anonymous functions or Ruby/Smalltalk-like code blocks
    would be much more interesting IMHO.
    Bruno Desthuilliers, Apr 6, 2007
    #11
  12. Duncan Booth a écrit :
    > wrote:
    >
    >
    >>Performing a list of mutually exclusive checks that require data
    >>processing can cause excessive tabification. For example, consider
    >>the following python snipet...
    >>
    >>temp = my_re1.match(exp)
    >>if temp:
    >> # do something with temp
    >>else:
    >> temp = my_re2.match(exp)
    >> if temp:
    >> # do something with temp
    >> else:
    >> temp = my_re3.match(exp)
    >> if temp:
    >> # do something with temp
    >> else:
    >> temp = my_re4.match(exp)
    >>

    >
    >
    > Can you come up with a real example where this happens and which cannot be
    > easily rewritten to provide better, clearer code without the indentation?
    >
    > I'll admit to having occasionally had code not entirely dissimilar to this
    > when first written, but I don't believe it has ever survived more than a
    > few minutes before being refactored into a cleaner form. I would claim that
    > it is a good thing that Python makes it obvious that code like this should
    > be refactored.


    +2 QOTW
    Bruno Desthuilliers, Apr 6, 2007
    #12
  13. Duncan Booth Guest

    "Gabriel Genellina" <> wrote:

    > You have to build the handlers list, containing (regex, handler) items;
    > the "unknown" case might be a match-all expression at the end.
    > Well, after playing a bit with decorators I got this:

    <snip>

    That's a nice class, and more maintainable with the separate handler
    methods than a long function. Here's a completely untested variation. I
    hope the intent is clear:

    def handle_this(regex, handlers=handlers):
        # A decorator; associates the function to the regex
        # (Not intended to be used as a normal method!
        # not even a static method!)
        def register(function, regex=regex):
            handlers.append((function.__name__, regex))
            return function
        return register

    ... insert handlers here ...

    def parse(self):
        regex = re.compile(
            '|'.join(['(?P<%s>%s)' % pair for pair in self.handlers]))
        fields = str.split(self.name)
        for field in fields:
            match = regex.match(field)
            if match:
                # getattr returns a bound method, so self is not passed again
                handler = getattr(self,
                                  match.lastgroup,
                                  self.handle_unknown)
                handler(field, match)

    The handler functions themselves would have to be constrained to also use
    only named groups, but you gain by only having a single regex.match call on
    each field which could (if there are a lot of handlers) be significant.

    The calculation of regex could also of course be pulled out of parse to
    somewhere it only happens once for the class instead of once per instance.
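
    A small standalone sketch of the single-regex idea, with the
    combined pattern built once at module level. The handler names, the
    three patterns and the classify function are assumptions for
    illustration, not the real handlers:

```python
import re

# (group name, pattern) pairs; the patterns contain no groups of their own
handlers = [
    ("year", r"\d{4}"),
    ("episode", r"S\d\dE\d\d"),
    ("capper", r"by\[.*\]"),
]

# One alternation of named groups, compiled a single time
combined = re.compile("|".join("(?P<%s>%s)" % pair for pair in handlers))

def classify(field):
    match = combined.match(field)
    # lastgroup is the name of the alternative that matched
    return match.lastgroup if match else "unknown"
```

    The returned name could then be looked up with getattr to find the
    handler method, as in parse above.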
    Duncan Booth, Apr 6, 2007
    #13
  14. Paul McGuire Guest

    On Apr 5, 4:08 pm, "" <>
    wrote:
    > I love event-based parsers so I have no problem with that
    > approach in general.


    You might find a pyparsing version of this to be to your liking. It
    is possible in the parser events (or "parse actions" as pyparsing
    calls them) to perform operations such as capitalization, string
    replacement, or string-to-integer conversion. To assign names to
    specific fields, one defines results names using setResultsName. A
    grammar for your file name might look something like (I'm just
    guessing from your code):

    from pyparsing import *

    def parseName2(name):
        """Parse filenames of the form:

        programmeTitle.year.series.episodeNum.episodeName.capper.ext
        """
        # replaceWith is a parse-action helper, so attach it with setParseAction
        capper = oneOf("JJ JeffreyJacobs") \
            .setParseAction( replaceWith("Jeffrey C. Jacobs") ) \
            .setResultsName("capper")
        ext = Word(alphanums).setResultsName("ext")
        year = Word(nums,exact=4).setResultsName("year")
        capitalizeAll = lambda tokens : map(str.capitalize, tokens)
        title = Combine( OneOrMore( ~year + Word(alphas) )
                         .setParseAction( capitalizeAll ), joinString=" " ) \
            .setResultsName("programme")
        seriesAndEpnum = Combine( OneOrMore( ~Literal("-") + Word(alphas) )
                                  .setParseAction( capitalizeAll ),
                                  joinString=" ").setResultsName("series") + \
            Word(nums).setResultsName("episodeNum")
        epname = Combine( OneOrMore( ~capper + Word(alphas) )
                          .setParseAction( capitalizeAll ), joinString=" " ) \
            .setResultsName("episodeName")
        fileName = title + "." + year + "." + seriesAndEpnum + "." + \
            epname + "." + capper + "." + ext
        parts = fileName.parseString(name)
        return parts.programme, parts.series, parts.episodeName, \
            parts.episodeNum, parts.year, parts.capper, parts.ext

    In this example, the parse actions are capitalizeAll (easily
    implemented with a simple lambda), and replaceWith (which is included
    with pyparsing).

    -- Paul
    Paul McGuire, Apr 8, 2007
    #14
  15. Paul McGuire Guest

    On Apr 7, 9:55 pm, "Paul McGuire" <> wrote:
    > seriesAndEpnum = Combine( OneOrMore( ~Literal("-") +
    > Word(alphas) ).setParseAction( capitalizeAll ),
    > joinString=" ").setResultsName("series") + \
    > Word(nums).setResultsName("episodeNum")


    should be:

    seriesAndEpnum = Combine( OneOrMore( Word(alphas) )
                              .setParseAction( capitalizeAll ),
                              joinString=" ").setResultsName("series") + \
        "-" + Word(nums).setResultsName("episodeNum")

    (This example is hypothetical based on the limited info in your posted
    code, the purpose of this element was to try to emulate your case
    where two "variables" are defined in a single expression.)

    -- Paul
    Paul McGuire, Apr 8, 2007
    #15
  16. Dustan Guest

    On Apr 5, 2:51 pm, wrote:
    > I would like to gauge interest in the following proposal:


    If you really, really, really want to do something like this, just
    create a wrapper class:

    >>> class Wrapper(object):
    ...     def __init__(self, obj):
    ...         self.obj = obj
    ...     def getit(self):
    ...         return self.obj
    ...     def setit(self, obj):
    ...         self.obj = obj
    ...         return obj
    ...
    >>> import random
    >>> x = Wrapper(0)
    >>> if x.setit(random.randrange(2)):
    ...     print 'yes!'
    ... else:
    ...     print 'hmmm...'
    ...
    hmmm...
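
    Applied to the regex chain from the original post, the same kind of
    wrapper lets the nested if/else blocks collapse into a flat if/elif
    chain. This is only a minimal sketch: the three patterns and the
    classify function are made up for illustration and stand in for
    my_re1, my_re2 and so on.

```python
import re


class Wrapper(object):
    """Holds a value so that storing it can happen inside an expression."""

    def __init__(self, obj=None):
        self.obj = obj

    def setit(self, obj):
        # Store the value and hand it back so the caller can test it.
        self.obj = obj
        return obj


# Hypothetical patterns, standing in for my_re1 .. my_re3 in the original post.
my_re1 = re.compile(r"(\d+)$")
my_re2 = re.compile(r"([a-z]+)$")
my_re3 = re.compile(r"(\w+)=(\w+)$")


def classify(exp):
    """Try each pattern in turn; the chain stays flat instead of nesting."""
    temp = Wrapper()
    if temp.setit(my_re1.match(exp)):
        return ("number", temp.obj.group(1))
    elif temp.setit(my_re2.match(exp)):
        return ("word", temp.obj.group(1))
    elif temp.setit(my_re3.match(exp)):
        return ("pair", temp.obj.groups())
    return ("unknown", None)
```

    Each branch reads the match back through temp.obj, so the indentation
    depth no longer grows with the number of patterns.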
    Dustan, Apr 8, 2007
    #16
  17. Alex Martelli Guest

    Dustan wrote:

    > >>> class Wrapper(object):
    > ...     def __init__(self, obj):
    > ...         self.obj = obj
    > ...     def getit(self):
    > ...         return self.obj
    > ...     def setit(self, obj):
    > ...         self.obj = obj
    > ...         return obj


    Yeah, that's substantially the same approach I posted as a Python
    Cookbook recipe almost six years ago, see
    <http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66061> .

    My specific use case for that recipe was when using Python to code a
    "reference algorithm" found in a book, so that deep restructuring was
    unwanted -- a similar but opposite case is using Python to explore
    prototype algorithms that would later be recoded e.g. in C (here, too,
    you don't really want to refactor the Python code to use dictionaries
    "properly", so assign-and-test is handy).


    Alex
    Alex Martelli, Apr 8, 2007
    #17
  18. Dustan Guest

    On Apr 8, 10:56 am, (Alex Martelli) wrote:
    > Dustan <> wrote:
    > > >>> class Wrapper(object):
    > > ...     def __init__(self, obj):
    > > ...         self.obj = obj
    > > ...     def getit(self):
    > > ...         return self.obj
    > > ...     def setit(self, obj):
    > > ...         self.obj = obj
    > > ...         return obj
    >
    > Yeah, that's substantially the same approach I posted as a Python
    > Cookbook recipe almost six years ago, see
    > <http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66061> .


    Indeed, I did discover that in my copy of the Python Cookbook some
    time ago. Perhaps I should have noted that.

    > My specific use case for that recipe was when using Python to code a
    > "reference algorithm" found in a book, so that deep restructuring was
    > unwanted -- a similar but opposite case is using Python to explore
    > prototype algorithms that would later be recoded e.g. in C (here, too,
    > you don't really want to refactor the Python code to use dictionaries
    > "properly", so assign-and-test is handy).
    >
    > Alex
    Dustan, Apr 8, 2007
    #18
  19. Adam Atlas Guest

    Hasn't this been discussed many many times before? I think Guido has
    been favourable to the idea of allowing :=, but that was a long time
    ago, and I don't think anything ever came of it.

    Personally, if anything, I'd like to see more use of the 'as' keyword
    as in Python 2.5's new 'with' statement. Assignment is basically what
    it adds to the statement, so if anything we should reuse it in other
    statements for consistency.

    if my_re1.match(exp) as temp:
        # do something with temp
    elif my_re2.match(exp) as temp:
        # do something with temp
    elif my_re3.match(exp) as temp:
        # do something with temp
    elif my_re4.match(exp) as temp:
        # do something with temp

    As others have mentioned, your particular instance is probably
    evidence that you need to restructure your code a little bit, but I do
    agree that "x = y; if x: ..." is a common enough idiom that it
    warrants a shortcut. And reusing "as", I think, is nice and readable,
    and it's an advantage that it doesn't require adding any new keywords
    or symbols.
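
    For comparison, the restructuring mentioned above can be done with no
    new syntax at all by looping over a table of patterns. This is a rough
    sketch, not code from the original post: RULES, dispatch and the
    handlers are hypothetical names chosen for illustration.

```python
import re

# Hypothetical (pattern, handler) pairs standing in for my_re1 .. my_re4;
# neither the patterns nor the handlers come from the original post.
RULES = [
    (re.compile(r"(\d+)$"), lambda m: ("number", int(m.group(1)))),
    (re.compile(r"([a-z]+)$"), lambda m: ("word", m.group(1))),
    (re.compile(r"(\w+)=(\w+)$"), lambda m: ("pair", m.groups())),
]


def dispatch(exp):
    """Return the result of the first handler whose pattern matches exp."""
    for pattern, handler in RULES:
        temp = pattern.match(exp)
        if temp:
            return handler(temp)
    return None  # no pattern matched
```

    The loop keeps the "x = y; if x: ..." idiom in one place, at the cost
    of moving each "do something with temp" body into its own function.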
    Adam Atlas, Apr 10, 2007
    #19
  20. Alex Martelli Guest

    Adam Atlas wrote:

    > Hasn't this been discussed many many times before? I think Guido has
    > been favourable to the idea of allowing :=, but that was a long time
    > ago, and I don't think anything ever came of it.
    >
    > Personally, if anything, I'd like to see more use of the 'as' keyword
    > as in Python 2.5's new 'with' statement. Assignment is basically what
    > it adds to the statement, so if anything we should reuse it in other
    > statements for consistency.
    >
    > if my_re1.match(exp) as temp:
    >     # do something with temp
    > elif my_re2.match(exp) as temp:
    >     # do something with temp
    > elif my_re3.match(exp) as temp:
    >     # do something with temp
    > elif my_re4.match(exp) as temp:
    >     # do something with temp
    >
    > As others have mentioned, your particular instance is probably
    > evidence that you need to restructure your code a little bit, but I do
    > agree that "x = y; if x: ..." is a common enough idiom that it
    > warrants a shortcut. And reusing "as", I think, is nice and readable,
    > and it's an advantage that it doesn't require adding any new keywords
    > or symbols.


    Actually, I agree with you. Unfortunately, I doubt python-dev will, but
    the chance is good enough that it's probably worth proposing there
    (ideally together with a patch to implement it, just to avoid any
    [otherwise likely] whines about this being difficult to implement:).


    Alex
    Alex Martelli, Apr 10, 2007
    #20