replacing __dict__ with an OrderedDict


Ulrich Eckhardt

On 09.01.2012 13:10, Lie Ryan wrote:
I was just suggesting that what the OP thinks he wants is quite
likely not what he actually wants.

Rest assured that the OP has a rather good idea of what he wants and
why, the latter being something you don't know, because he never
bothered to explain it and you never asked. Please don't think he's an
idiot just because he wants something that doesn't make sense to you.

*le sigh*

Uli
 

Roy Smith

Ian Kelly said:
Randomizing the order is not a bad idea, but you also need to be able
to run the tests in a consistent order, from a specific random seed.
In the real world, test conflicts and dependencies do happen, and if
we observe a failure, make a change, rerun the tests and observe
success, we need to be able to be sure that we actually fixed the bug,
and that it didn't pass only because it was run in a different order.

I've seen this argument play out multiple times on this group.
Executive summary:

OP: "I want to do X"

Peanut Gallery: "You're not supposed to do that"

Here's my commentary on that.

The classic unittest philosophy says that tests should be independent of
each other, which means they should be able to be run in arbitrary
order. Some people advocate that the test framework should
intentionally randomize the order, to flush out inter-test dependencies
that the author didn't realize existed (or intend).
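
A minimal sketch of such a randomizing loader (the class name is made
up; only the standard unittest.TestLoader hook is used):

import random
import unittest

class RandomizingLoader(unittest.TestLoader):
    # Return the test method names of each TestCase in a random order,
    # to flush out hidden inter-test dependencies.
    def getTestCaseNames(self, testCaseClass):
        names = list(super(RandomizingLoader,
                           self).getTestCaseNames(testCaseClass))
        random.shuffle(names)
        return names

if __name__ == "__main__":
    unittest.main(testLoader=RandomizingLoader())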

Test independence is a good thing. Many people don't understand this
when writing tests, and inadvertently write tests that depend on each
other. I've worked with those systems. They're a bear to debug.
You've got some test suite that runs for 15 minutes and dies at case 37
of 43. If you try to run case 37 by itself, it won't run, and you can't
figure out what state cases 1-36 were supposed to leave the system in to
make 37 work. You could sink days or weeks into debugging this kind of
crap. BTDT.

That being said, the unittest module, while designed to support the "all
tests must be independent" philosophy, is a useful piece of software for
lots of things. It provides an easy-to-use framework for writing tests,
lots of convenient assertions, reporting, test discovery, etc., etc.

If somebody (i.e. the classic "consenting adult" of the Python world)
wants to take advantage of that to write a test suite where the tests
*do* depend on each other, and have to be run in a certain order,
there's nothing wrong with that. As long as they understand the
consequences of their actions, don't try to preach unittest religion to
them. They're in the best position to know if what they're trying to do
is the best thing for their particular situation.
 

Neil Cerutti

If somebody (i.e. the classic "consenting adult" of the Python
world) wants to take advantage of that to write a test suite
where the tests *do* depend on each other, and have to be run
in a certain order, there's nothing wrong with that. As long
as they understand the consequences of their actions, don't try
to preach unittest religion to them. They're in the best
position to know if what they're trying to do is the best thing
for their particular situation.

If a question springs from an idea that is usually a bad
practice, then it should be challenged. The possible broken nose
of a questioner is a small price to pay for the education of the
peanut gallery.

If a questioner does not wish to defend what they are doing, he
or she has that right, of course.
 

Ulrich Eckhardt

On 09.01.2012 15:35, Roy Smith wrote:
The classic unittest philosophy says that tests should be independent of
each other, which means they should be able to be run in arbitrary
order. Some people advocate that the test framework should
intentionally randomize the order, to flush out inter-test dependencies
that the author didn't realize existed (or intend).

While I agree with the idea, I'd like to add that independence is an
illusion. You already have possible dependencies if you run tests in the
same process/OS-installation/computer/parallel universe. If you now
happen to influence one test with another and the next run randomizes
the tests differently, you will never see the fault again. Without this
reproducibility, you don't gain anything but the bad stomach feeling
that something is wrong.

Test independence is a good thing. Many people don't understand this
when writing tests, and inadvertently write tests that depend on each
other. I've worked with those systems. They're a bear to debug.
You've got some test suite that runs for 15 minutes and dies at case 37
of 43. If you try to run case 37 by itself, it won't run, and you can't
figure out what state cases 1-36 were supposed to leave the system in to
make 37 work. You could sink days or weeks into debugging this kind of
crap. BTDT.

I'm sorry to hear that; debugging other people's feces is never a task
to wish for. That said, there are two kinds of dependencies, and at
least in this discussion the difference between them hasn't been
mentioned yet, even though it is important.


Your unfortunate case is where test X creates persistent state that must
be present in order for test X+1 to produce meaningful results. This
kind of dependency obviously blows, as it means you can't debug test X+1
separately. I'd call this an operational dependency.

This kind of dependency is IMHO a bug in the tests themselves. The unit
testing framework could help you find those bugs by allowing random
order of execution for the test cases.


There is another kind of dependency, which I'd call a logical
dependency. This occurs when e.g. test X tests for the presence of an
API and test Y tests the API's behaviour. In other words, Y has no
chance to succeed if X already failed. Unfortunately, there is no way
to express this relation; there is no "@unittest.depends(test_X)" to
decorate test_Y with (not yet!). Each test would be the root of a tree
of tests that it depends on. If a dependency already fails, you can
either skip the tree or at least mark the following test failures as
implicit failures, so that you can easily distinguish them from the
root cause.
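
Purely as an illustration of what such a decorator could look like
(nothing like it exists in unittest, and every name below is made up),
one sketch records failed test names and skips their dependants:

import functools
import unittest

_failed = set()    # names of tests that have failed so far

def depends(*prerequisites):
    # Made-up decorator: skip the test if any prerequisite test failed.
    # Only meaningful if the prerequisites actually run first, e.g. with
    # a loader that keeps the tests in definition order.
    def decorator(test_method):
        @functools.wraps(test_method)
        def wrapper(self, *args, **kwargs):
            broken = [p for p in prerequisites if p in _failed]
            if broken:
                raise unittest.SkipTest(
                    "prerequisite failed: " + ", ".join(broken))
            try:
                return test_method(self, *args, **kwargs)
            except unittest.SkipTest:
                raise
            except Exception:
                _failed.add(test_method.__name__)
                raise
        return wrapper
    return decorator

class APITests(unittest.TestCase):
    @depends()                        # root of the dependency tree
    def test_1_api_present(self):
        self.assertTrue(hasattr(dict, "items"))

    @depends("test_1_api_present")
    def test_2_api_behaviour(self):
        self.assertEqual(sorted({"a": 1}.items()), [("a", 1)])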

This kind of dependency is quite normal although not directly supported
by the unittest module. As a workaround, being able to define an order
allows you to move the dependencies further to the top, so they are
tested first.


To sum it up: in order to catch operational dependencies you need a
random order, while in order to clearly express logical dependencies in
the output you want a fixed order. Neither is the one true way, though
my gut feeling is that the fixed order is overall more useful.

As long as they understand the consequences of their actions, don't
try to preach unittest religion to them. They're in the best
position to know if what they're trying to do is the best thing for
their particular situation.

Amen!

Uli
 

Ian Kelly

There is another kind of dependency, which I'd call a logical
dependency. This occurs when e.g. test X tests for the presence of an
API and test Y tests the API's behaviour. In other words, Y has no
chance to succeed if X already failed. Unfortunately, there is no way
to express this relation; there is no "@unittest.depends(test_X)" to
decorate test_Y with (not yet!). Each test would be the root of a tree
of tests that it depends on. If a dependency already fails, you can
either skip the tree or at least mark the following test failures as
implicit failures, so that you can easily distinguish them from the
root cause.

I can see where that could be useful. On the other hand, if such a
decorator were included in unittest, I can already envision people
abusing it to explicitly enshrine their operational dependencies,
maybe even taking it as encouragement to write their tests in that
fashion.
 

Roy Smith

Some people advocate that the test framework should
intentionally randomize the order, to flush out inter-test dependencies
that the author didn't realize existed (or intend).

If you now
happen to influence one test with another and the next run randomizes
the tests differently, you will never see the fault again. Without this
reproducibility, you don't gain anything but the bad stomach feeling
that something is wrong.

The standard solution to that is to print out the PRNG initialization
state and provide a way in your test harness to re-initialize it to that
state. I've done things like that in test scenarios where it is
difficult or impossible to cover the problem space deterministically.
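
For instance, a harness built around that idea might look roughly like
this (TEST_SEED is a made-up environment variable, not a unittest
feature):

import os
import random

# Reuse a previously printed seed if one is given, otherwise pick a
# fresh one and print it so a failing randomized run can be replayed.
seed = os.environ.get("TEST_SEED")
seed = int(seed) if seed else random.randrange(1, 2**31)
print("test order randomized with TEST_SEED=%d" % seed)

rng = random.Random(seed)
names = ["test_a", "test_b", "test_c", "test_d"]
rng.shuffle(names)     # the same seed reproduces the same order
print(names)
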
Your unfortunate case is where test X creates persistent state that must
be present in order for test X+1 to produce meaningful results. This
kind of dependency obviously blows, as it means you can't debug test X+1
separately. I'd call this an operational dependency.

This kind of dependency is IMHO a bug in the tests themselves.

For the most part, I'm inclined to agree. However, there are scenarios
where having each test build the required state from scratch is
prohibitively expensive. Imagine if you worked at NASA and wanted to run
test_booster_ignition(), test_booster_cutoff(),
test_second_stage_ignition(), and test_self_destruct(). I suppose you
could run them in random order, but you'd use up a lot of rockets that
way.

Somewhat more seriously, let's say you wanted to do test queries against
a database with 100 million records in it. You could rebuild the
database from scratch for each test, but doing so might take hours per
test. Sometimes, real life is just *so* inconvenient.
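
When the fixture is merely expensive rather than irreversibly consumed,
one common compromise is to build it once per class with setUpClass; a
small sketch with an in-memory SQLite stand-in for the big database:

import sqlite3
import unittest

class QueryTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Build the expensive state once for the whole class instead of
        # once per test; tests must then treat it as read-only.
        cls.db = sqlite3.connect(":memory:")
        cls.db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY)")
        cls.db.executemany("INSERT INTO records VALUES (?)",
                           ((i,) for i in range(1000)))
        cls.db.commit()

    @classmethod
    def tearDownClass(cls):
        cls.db.close()

    def test_count(self):
        (count,) = self.db.execute(
            "SELECT COUNT(*) FROM records").fetchone()
        self.assertEqual(count, 1000)
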
There is another dependency and that I'd call a logical dependency. This
occurs when e.g. test X tests for an API presence and test Y tests the
API behaviour. In other words, Y has no chance to succeed if X already
failed.

Sure. I run into that all the time. A trivial example would be the
project I'm working on now. I've come to realize that a long unbroken
string of E's means, "Dummy, you forgot to bring the application server
up before you ran the tests". It would be nicer if the test suite could
run a single test which proves it can create a TCP connection and, if
that fails, just stop.
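
One way to get that behaviour with plain unittest is a module-level
fixture that probes the server once and skips everything if it is down
(a sketch; the address is made up):

import socket
import unittest

APP_SERVER = ("localhost", 8080)    # made-up address for this sketch

def setUpModule():
    # If the server is unreachable, the whole module is reported as
    # skipped instead of producing a wall of E's.
    try:
        socket.create_connection(APP_SERVER, timeout=2).close()
    except socket.error:
        raise unittest.SkipTest("application server %s:%s not reachable"
                                % APP_SERVER)

class ServerTests(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)    # real tests would talk to the server

if __name__ == "__main__":
    unittest.main()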
 

Terry Reedy

The standard solution to that is to print out the PRNG initialization
state and provide a way in your test harness to re-initialize it to that
state. I've done things like that in test scenarios where it is
difficult or impossible to cover the problem space deterministically.


For the most part, I'm inclined to agree. However, there are scenarios
where having each test build the required state from scratch is
prohibitively expensive. Imagine if you worked at NASA and wanted to run
test_booster_ignition(), test_booster_cutoff(),
test_second_stage_ignition(), and test_self_destruct(). I suppose you
could run them in random order, but you'd use up a lot of rockets that
way.

Somewhat more seriously, let's say you wanted to do test queries against
a database with 100 million records in it. You could rebuild the
database from scratch for each test, but doing so might take hours per
test. Sometimes, real life is just *so* inconvenient.


Sure. I run into that all the time. A trivial example would be the
project I'm working on now. I've come to realize that a long unbroken
string of E's means, "Dummy, you forgot to bring the application server
up before you ran the tests". It would be nicer if the test suite could
run a single test which proves it can create a TCP connection and, if
that fails, just stop.

Many test cases in the Python test suite have multiple asserts. I
believe both resource sharing and sequential dependencies are reasons. I
consider 'one assert (test) per testcase' to be on a par with 'one class
per file'.
 

Lie Ryan

Somewhat more seriously, let's say you wanted to do test queries against
a database with 100 million records in it. You could rebuild the
database from scratch for each test, but doing so might take hours per
test. Sometimes, real life is just *so* inconvenient.

Any serious database has a rollback feature that can be used to quickly
revert the database state in the setUp/cleanUp phase.
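
A sketch of that pattern with the sqlite3 module from the standard
library (the real database would of course be much bigger):

import sqlite3
import unittest

class RollbackTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Stand-in for the real database: one committed baseline row.
        cls.db = sqlite3.connect(":memory:")
        cls.db.execute("CREATE TABLE items (name TEXT)")
        cls.db.execute("INSERT INTO items VALUES ('baseline')")
        cls.db.commit()

    def tearDown(self):
        # Throw away whatever the test wrote, so every test starts from
        # the same committed state regardless of the order they run in.
        self.db.rollback()

    def test_insert_is_isolated(self):
        self.db.execute("INSERT INTO items VALUES ('scratch')")
        (count,) = self.db.execute(
            "SELECT COUNT(*) FROM items").fetchone()
        self.assertEqual(count, 2)

    def test_sees_only_baseline(self):
        (count,) = self.db.execute(
            "SELECT COUNT(*) FROM items").fetchone()
        self.assertEqual(count, 1)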
 

Lie Ryan

On 09.01.2012 13:10, Lie Ryan wrote:

Rest assured that the OP has a rather good idea of what he wants and
why, the latter being something you don't know, because he never
bothered to explain it and you never asked. Please don't think he's an
idiot just because he wants something that doesn't make sense to you.

The OP explained the "why" clearly in his first post: he wanted to see
his test results ordered in a certain way to make debugging easier. To
quote the OP:

"""
.... I just want to take the first test that fails and analyse that
instead of guessing the point to start debugging from the N failed tests.
"""

and then he goes on to conclude that he needs to reorder the tests
themselves and to replace __dict__ with an OrderedDict. While it is
possible to replace __dict__ with an OrderedDict and it is possible to
reorder the tests, those are not his original problem, and the optimal
solution to his original problem differs from the optimal solution to
what he thinks he will need.

I have said this before and I'm saying it again: the problem is a test
result display issue, not a test ordering issue.
 

Lie Ryan

There is another kind of dependency, which I'd call a logical
dependency. This occurs when e.g. test X tests for the presence of an
API and test Y tests the API's behaviour. In other words, Y has no
chance to succeed if X already failed. Unfortunately, there is no way
to express this relation; there is no "@unittest.depends(test_X)" to
decorate test_Y with (not yet!).

The skipIf decorator exists precisely for this purpose. Generally,
testing the availability of resources (like the existence of an API)
should be done outside of the testing code. In other words, test_X
should never be a test in the first place; it should be part of setting
up the tests. The tests themselves should be completely independent of
each other.
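
A minimal sketch of that approach (the imported module and its
attributes are made up): do the availability check once, and let the
decorator annotate the dependent tests:

import unittest

try:
    from some_library import new_api    # made-up module for this sketch
except ImportError:
    new_api = None

class BehaviourTests(unittest.TestCase):
    @unittest.skipUnless(new_api is not None, "new_api is not available")
    def test_roundtrip(self):
        # The availability check lives in the decorator above, not in a
        # separate test_X that other tests silently depend on.
        self.assertEqual(new_api.roundtrip("x"), "x")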
 

Roy Smith

Lie Ryan said:
Any serious database has a rollback feature that can be used to quickly
revert the database state in the setUp/cleanUp phase.

I guess MongoDB is not a serious database?
 

Lie Ryan

I guess MongoDB is not a serious database?

I guess there are always those oddball cases, but if you choose MongoDB
then you already know the consequence that it can't be unit-tested as
easily. And in any case, it is generally a bad idea to unit-test against
a database that contains 100 million items; that's for performance
testing. So your point is?
 

Ulrich Eckhardt

On 10.01.2012 13:31, Lie Ryan wrote:
While it is possible to replace __dict__ with an OrderedDict and it is
possible to reorder the tests, those are not his original problem, and
the optimal solution to his original problem differs from the optimal
solution to what he thinks he will need.

Oh, and you know what is optimal in his environment? You really think
you know better what to do based on the little information provided?

I have said this before and I'm saying it again: the problem is a test
result display issue, not a test ordering issue.

There are good reasons for actually running the tests in a defined
order, and those would be sacrificed by merely reordering the results
afterwards.


If you'd just step off your high horse you might actually learn
something instead of just pissing people off.

Uli
 

Roy Smith

I guess MongoDB is not a serious database?

I guess there are always those oddball cases, but if you choose MongoDB
then you already know the consequence that it can't be unit-tested as
easily. And in any case, it is generally a bad idea to unit-test against
a database that contains 100 million items; that's for performance
testing. So your point is?

My point is that in the real world, what is practical and efficient and
sound business is not always what is theoretically correct.
 

Ulrich Eckhardt

On 06.01.2012 12:44, Peter Otten wrote:
[running unit tests in the order of their definition]
import unittest

class Loader(unittest.TestLoader):
    def getTestCaseNames(self, testCaseClass):
        """Return a sequence of method names found within testCaseClass,
        sorted by co_firstlineno.
        """
        def first_lineno(name):
            # Python 2: getattr() returns an unbound method; its im_func
            # attribute holds the underlying function and code object.
            method = getattr(testCaseClass, name)
            return method.im_func.__code__.co_firstlineno

        function_names = super(Loader, self).getTestCaseNames(testCaseClass)
        function_names.sort(key=first_lineno)
        return function_names

After using this a bit, it works great. So far I have found only a
single problem: with decorated functions it doesn't get at the actual
line number of the real code, so sorting by that number doesn't work.
An example of this is "@unittest.expectedFailure".

I can easily ignore this, though; I just wanted to give this feedback.
Thanks again!

Uli
 
