Multiple assignments simplification

B

bearophileHUGS

The current version of ShedSkin (http://shedskin.sourceforge.net/
experimental Python to C++ compiler) is rather alpha, and the
development work is focused on debugging and implementing some more
basic Python functionality. But hopefully in future versions more work
will be spent to make the resulting C++ programs as fast as possible.

One of the many possible details that may be improved, is the compiling
to C++ of the Python "parallel" assignments. There are complex
situations like this one:

| a = 1
| b = 2
| c = 3
| def fun(a):
| global b
| return c, b ** (b+a)
| (a, b), c = fun(a), a
| print a, b, c # prints 3 8 1


Probably it isn't necessary to find an optimal solution for complex
situations like this one, ShedSkin (SS) uses a basic and simple
algorithm to translate all the complex cases.

But maybe for simpler and more regular situations it can be useful to
find better/optimal solution, like with a swap:

a = 1
b = 2
a, b = b, a

At the moment SS translates it as something like:

int __0, __1, a, b;
int __main() {
a = 1;
b = 2;
__0 = b;
__1 = a;
a = __0;
b = __1;

SS just copies all the variables before the assignment.
If such swap is inside a very long loop, then maybe a simplification
can speed up a program a little (I don't know if C++ compilers can do
such optimizations).

This is another example of such "regular" situations:

a, b, c, d, e, f = range(6)
a, b, c, d, e = b, d, e, f, c
print a, b, c, d, e, f
At the moment SS translates its central part just as:

__1 = b;
__2 = d;
__3 = e;
__4 = f;
__5 = c;
a = __1;
b = __2;
c = __3;
d = __4;
e = __5;

The two sides of the assignment aren't just permutations, because some
variables can be different (like f), and some variables can be present
two or more times (duplication), some other can be absent.
A code like this can be faster (and hopefully still correct):

a = b
aux_1 = c
c = e
e = aux_1
b = d
d = f

That assignment line of code can be represented as:
[0, 1, 2, 3, 4], [1, 3, 4, 5, 2]
(Numbers represent variables. The first list is always sorted,
equivalent to range(n) ).

Do you know some algorithm (or you can give some suggestions) to
minimize the number of simple assignments needed for a "regular"
situation like that?

Note that in most situations the number of variables is quite small
(but it can be big).
(Also note that in the "regular" situation I've ignored the problem of
mixed variable types, this is something SS has to take care too).

Bye and thank you,
bearophile
 
G

George Sakkis

[snipped]

Do you know some algorithm (or you can give some suggestions) to
minimize the number of simple assignments needed for a "regular"
situation like that?

You can formulate the task as a graph-theoretic problem by representing the set of assignments as a
digraph G(V,E), where:
- V = set(LHS) | set(RHS): the vertex set V is the union of all left and right hand side
expressions.
- E = set((v1,v2) for "v1 = v2" in assignments): there is an edge from v1 to v2 for every assignment
v1=v2.

Now, every edge v1->v2 where in-degree(v1)==0 corresponds to a safe assignment: v1 is not assigned
to any RHS, so it can be overwritten. After the assignment, both v1 and the edge (v1,v2) can be
removed, decrementing the in-degree of v2. This happens iteratively as long as there are nodes with
zero in-degree.

At this point, all remaining nodes have in-degree >=1 and they form one or more strongly connected
components. Since no RHS occurs more than once*, the out-degree of every vertex is less than or
equal to 1. Therefore, for every component Ci,
|Vi| >= sum(out-degree(v) for v in Vi) == |Ei|.

Since for a strongly connected component |Vi| <= |Ei|, the previous relationship is actually
equality |Vi| == |Ei|. Thus each component is a simple cycle v[1]->v[2]->...v[n]->v[1]. You can
break the cycle by introducing an auxiliary variable x in an arbitrary edge, say v[n]->v[1]. Then
the following assignments can take place: x = v[1]; v[1] = v[2]; v[2] = v[3]; ...; v[n-1] = v[n];
v[n] = x

So overall, you need just one auxiliary variable for each strongly component of G.

HTH,
George


* More than one assignments with the same RHS [v=a, v=b] are useless since only the last has an
effect. In any case, the useless assignments can be filtered out in the beginning.
 
F

Fredrik Lundh

I don't know if C++ compilers can do such optimizations.

working on a Python to C/C++ translator without knowing what kind
of optimizations a C/C++ compiler can do for you sounds like a great
way to waste your time...

(I would be rather bit surprised if any contemporary C or C++ compiler
didn't generate optimal machine code for source code that contains swaps
like the one you posted. it won't look at just the swap statement, however;
the interesting thing is where the values came from, and what you're doing
with the values later on. minimizing the number of assignment statements in
the swap translation won't change a thing...)

</F>
 
B

bearophileHUGS

Thank you George Sakkis for your fast and accurate answer. In my life I
am encountering lot of graph-based solutions to my problems. I'll try
to implement your solution as soon as possible.


Fredrik Lundh>working on a Python to C/C++ translator without knowing
what kind of optimizations a C/C++ compiler can do for you sounds like
a great way to waste your time...<

I am not the author of ShedSkin (Mark Dufour), I'm ignorant, he is much
more expert than me about C++. So the possibile waste of time is just
mine.

Some experimental timings have shown me that different C++ compilers
have different optimization capabilities, and such experiments show me
that sometimes they (g++ of MinGW) aren't capable of doing some
"obvious" optimizations (I haven't tested the assignments yet).


Fredrik Lundh>the interesting thing is where the values came from, and
what you're doing with the values later on. minimizing the number of
assignment statements in the swap translation won't change a thing...)<

I see. (It's just a detail, as I've said...)
If you Fredrik have some free time and you are interested, you can
probably help Mark Dufour improve that compiler. You can just email the
author or post things in the Sourceforge site, etc.

Bear hugs,
bearophile
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,576
Members
45,054
Latest member
LucyCarper

Latest Threads

Top