Multiple assignments simplification

Discussion in 'Python' started by bearophileHUGS@lycos.com, Oct 13, 2005.

  1. Guest

    The current version of ShedSkin (http://shedskin.sourceforge.net/
    experimental Python to C++ compiler) is rather alpha, and the
    development work is focused on debugging and implementing some more
    basic Python functionality. But hopefully in future versions more work
    will be spent to make the resulting C++ programs as fast as possible.

    One of the many possible details that may be improved, is the compiling
    to C++ of the Python "parallel" assignments. There are complex
    situations like this one:

    | a = 1
    | b = 2
    | c = 3
    | def fun(a):
    | global b
    | return c, b ** (b+a)
    | (a, b), c = fun(a), a
    | print a, b, c # prints 3 8 1


    Probably it isn't necessary to find an optimal solution for complex
    situations like this one, ShedSkin (SS) uses a basic and simple
    algorithm to translate all the complex cases.

    But maybe for simpler and more regular situations it can be useful to
    find better/optimal solution, like with a swap:

    a = 1
    b = 2
    a, b = b, a

    At the moment SS translates it as something like:

    int __0, __1, a, b;
    int __main() {
    a = 1;
    b = 2;
    __0 = b;
    __1 = a;
    a = __0;
    b = __1;

    SS just copies all the variables before the assignment.
    If such swap is inside a very long loop, then maybe a simplification
    can speed up a program a little (I don't know if C++ compilers can do
    such optimizations).

    This is another example of such "regular" situations:

    a, b, c, d, e, f = range(6)
    a, b, c, d, e = b, d, e, f, c
    print a, b, c, d, e, f
    At the moment SS translates its central part just as:

    __1 = b;
    __2 = d;
    __3 = e;
    __4 = f;
    __5 = c;
    a = __1;
    b = __2;
    c = __3;
    d = __4;
    e = __5;

    The two sides of the assignment aren't just permutations, because some
    variables can be different (like f), and some variables can be present
    two or more times (duplication), some other can be absent.
    A code like this can be faster (and hopefully still correct):

    a = b
    aux_1 = c
    c = e
    e = aux_1
    b = d
    d = f

    That assignment line of code can be represented as:
    [0, 1, 2, 3, 4], [1, 3, 4, 5, 2]
    (Numbers represent variables. The first list is always sorted,
    equivalent to range(n) ).

    Do you know some algorithm (or you can give some suggestions) to
    minimize the number of simple assignments needed for a "regular"
    situation like that?

    Note that in most situations the number of variables is quite small
    (but it can be big).
    (Also note that in the "regular" situation I've ignored the problem of
    mixed variable types, this is something SS has to take care too).

    Bye and thank you,
    bearophile
     
    , Oct 13, 2005
    #1
    1. Advertising

  2. <> wrote:

    > [snipped]
    >
    > Do you know some algorithm (or you can give some suggestions) to
    > minimize the number of simple assignments needed for a "regular"
    > situation like that?


    You can formulate the task as a graph-theoretic problem by representing the set of assignments as a
    digraph G(V,E), where:
    - V = set(LHS) | set(RHS): the vertex set V is the union of all left and right hand side
    expressions.
    - E = set((v1,v2) for "v1 = v2" in assignments): there is an edge from v1 to v2 for every assignment
    v1=v2.

    Now, every edge v1->v2 where in-degree(v1)==0 corresponds to a safe assignment: v1 is not assigned
    to any RHS, so it can be overwritten. After the assignment, both v1 and the edge (v1,v2) can be
    removed, decrementing the in-degree of v2. This happens iteratively as long as there are nodes with
    zero in-degree.

    At this point, all remaining nodes have in-degree >=1 and they form one or more strongly connected
    components. Since no RHS occurs more than once*, the out-degree of every vertex is less than or
    equal to 1. Therefore, for every component Ci,
    |Vi| >= sum(out-degree(v) for v in Vi) == |Ei|.

    Since for a strongly connected component |Vi| <= |Ei|, the previous relationship is actually
    equality |Vi| == |Ei|. Thus each component is a simple cycle v[1]->v[2]->...v[n]->v[1]. You can
    break the cycle by introducing an auxiliary variable x in an arbitrary edge, say v[n]->v[1]. Then
    the following assignments can take place: x = v[1]; v[1] = v[2]; v[2] = v[3]; ...; v[n-1] = v[n];
    v[n] = x

    So overall, you need just one auxiliary variable for each strongly component of G.

    HTH,
    George


    * More than one assignments with the same RHS [v=a, v=b] are useless since only the last has an
    effect. In any case, the useless assignments can be filtered out in the beginning.
     
    George Sakkis, Oct 13, 2005
    #2
    1. Advertising

  3. wrote:

    > I don't know if C++ compilers can do such optimizations.


    working on a Python to C/C++ translator without knowing what kind
    of optimizations a C/C++ compiler can do for you sounds like a great
    way to waste your time...

    (I would be rather bit surprised if any contemporary C or C++ compiler
    didn't generate optimal machine code for source code that contains swaps
    like the one you posted. it won't look at just the swap statement, however;
    the interesting thing is where the values came from, and what you're doing
    with the values later on. minimizing the number of assignment statements in
    the swap translation won't change a thing...)

    </F>
     
    Fredrik Lundh, Oct 13, 2005
    #3
  4. Guest

    Thank you George Sakkis for your fast and accurate answer. In my life I
    am encountering lot of graph-based solutions to my problems. I'll try
    to implement your solution as soon as possible.


    Fredrik Lundh>working on a Python to C/C++ translator without knowing
    what kind of optimizations a C/C++ compiler can do for you sounds like
    a great way to waste your time...<

    I am not the author of ShedSkin (Mark Dufour), I'm ignorant, he is much
    more expert than me about C++. So the possibile waste of time is just
    mine.

    Some experimental timings have shown me that different C++ compilers
    have different optimization capabilities, and such experiments show me
    that sometimes they (g++ of MinGW) aren't capable of doing some
    "obvious" optimizations (I haven't tested the assignments yet).


    Fredrik Lundh>the interesting thing is where the values came from, and
    what you're doing with the values later on. minimizing the number of
    assignment statements in the swap translation won't change a thing...)<

    I see. (It's just a detail, as I've said...)
    If you Fredrik have some free time and you are interested, you can
    probably help Mark Dufour improve that compiler. You can just email the
    author or post things in the Sourceforge site, etc.

    Bear hugs,
    bearophile
     
    , Oct 13, 2005
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. John
    Replies:
    5
    Views:
    387
  2. Beemer Biker
    Replies:
    0
    Views:
    2,601
    Beemer Biker
    Jan 26, 2009
  3. Stef Mientki

    can't find the right simplification

    Stef Mientki, Apr 24, 2009, in forum: Python
    Replies:
    2
    Views:
    228
    Lie Ryan
    Apr 24, 2009
  4. Beginner
    Replies:
    2
    Views:
    444
    André Freitas
    Jul 11, 2009
  5. golgor
    Replies:
    4
    Views:
    899
    golgor
    Dec 5, 2009
Loading...

Share This Page