Problem with apsw and garbage collection

Nikolaus Rath

While the first execute("VACUUM") call succeeds, the second does not,
but instead raises an apsw.BusyError (meaning that sqlite thinks it
cannot get an exclusive lock on the database).

I suspect that the reason for that is that the cursor object that is
created in the function is not destroyed when the function is left with
raise (rather than return), which in turn prevents sqlite from obtaining
the lock.

However, if I replace the VACUUM command with something else (e.g. CREATE
TABLE), the program runs fine. I think this casts some doubt on the
above explanation, since AFAIK sqlite always locks the entire file and
should therefore run into the same problem as before.


Can someone explain what exactly is happening here?


Best,


-Nikolaus
 
Nikolaus Rath

Nikolaus Rath said:
Hi,

Please consider this example:
[....]

I think I managed to narrow down the problem a bit. It seems that when
a function returns normally, its local variables are immediately
destroyed. However, if the function is left due to an exception, the
local variables remain alive:

---------snip---------
#!/usr/bin/env python
import gc

class testclass(object):
    def __init__(self):
        print "Initializing"

    def __del__(self):
        print "Destructing"

def dostuff(fail):
    obj = testclass()

    if fail:
        raise TypeError

print "Calling dostuff"
dostuff(fail=False)
print "dostuff returned"

try:
    print "Calling dostuff"
    dostuff(fail=True)
except TypeError:
    pass

gc.collect()
print "dostuff returned"
---------snip---------


Prints out:


---------snip---------
Calling dostuff
Initializing
Destructing
dostuff returned
Calling dostuff
Initializing
dostuff returned
Destructing
---------snip---------


Is there a way to have the obj variable (that is created in dostuff())
destroyed earlier than at the end of the program? As you can see, I
already tried to explicitly call the garbage collector, but this does
not help.


Best,


-Nikolaus
 
MRAB

Nikolaus said:
Nikolaus Rath said:
Hi,

Please consider this example:
[....]

I think I managed to narrow down the problem a bit. It seems that when
a function returns normally, its local variables are immediately
destroyed. However, if the function is left due to an exception, the
local variables remain alive:

[....]


Is there a way to have the obj variable (that is created in dostuff())
destroyed earlier than at the end of the program? As you can see, I
already tried to explicitly call the garbage collector, but this does
not help.
Are the objects retained because there's a reference to the stack
frame(s) in the traceback?
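
That guess can be checked directly. A minimal sketch (Python 3 syntax; Tracked and dostuff are stand-ins for the original example) inspects the in-flight traceback from inside the handler:

```python
import sys

class Tracked:
    def __del__(self):
        print("Destructing")

def dostuff():
    obj = Tracked()
    raise TypeError

try:
    dostuff()
except TypeError:
    # The traceback being handled references dostuff's frame,
    # and that frame still holds obj among its locals:
    frame = sys.exc_info()[2].tb_next.tb_frame
    print('obj' in frame.f_locals)
    del frame  # drop our own reference to the frame again
# Python 3 clears the exception when the except block exits, so obj
# can be collected here; in Python 2 it lingered in sys.exc_info().
```

Under CPython 3 this prints True and then Destructing; saving the exception or its traceback past the handler would keep obj alive instead.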
 
Piet van Oostrum

Nikolaus Rath said:
NR> Is there a way to have the obj variable (that is created in dostuff())
NR> destroyed earlier than at the end of the program? As you can see, I
NR> already tried to explicitly call the garbage collector, but this does
NR> not help.

The exact time of the destruction of objects is an implementation detail
and should not be relied upon.
 
Aahz

I think I managed to narrow down the problem a bit. It seems that when
a function returns normally, its local variables are immediately
destroyed. However, if the function is left due to an exception, the
local variables remain alive:

Correct. You need to get rid of the stack trace somehow; the simplest
way is to wrap things in layers of functions (i.e. return from the
function with try/except and *don't* save the traceback). Note that if
your goal is to ensure finalization rather than recovering memory, you
need to do that explicitly rather than relying on garbage collection.
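
In sketch form (Python 3 syntax; Resource and its close() are hypothetical stand-ins for a real handle such as a database cursor), explicit finalization looks like this:

```python
class Resource:
    def close(self):
        print("closed")

def dostuff(fail):
    obj = Resource()
    try:
        if fail:
            raise TypeError
    finally:
        # Finalize explicitly instead of waiting for the collector;
        # this runs before the exception and its traceback reach the caller.
        obj.close()

try:
    dostuff(fail=True)
except TypeError:
    pass
```

The resource is closed deterministically even though the traceback still pins dostuff's frame and obj.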
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
 
Mike Kazantsev

Nikolaus Rath said:
Hi,

Please consider this example:
[....]

I think I managed to narrow down the problem a bit. It seems that when
a function returns normally, its local variables are immediately
destroyed. However, if the function is left due to an exception, the
local variables remain alive:
....

Is there a way to have the obj variable (that is created in dostuff())
destroyed earlier than at the end of the program? As you can see, I
already tried to explicitly call the garbage collector, but this does
not help.

Strange thing is that no one has suggested contextlib, which was made
_exactly_ for this purpose:


#!/usr/bin/env python
import gc

class testclass(object):
    def __init__(self):
        self.alive = True # just for example
        print "Initializing"

    def __del__(self):
        if self.alive:
            # guard, so destruction won't run twice
            # if it has already been done
            print "Destructing"
            self.alive = False

    def __enter__(self):
        return self  # bind the instance to the "as" target

    def __exit__(self, ex_type, ex_val, ex_trace):
        self.__del__()
        return False  # let any exception propagate unchanged


def dostuff(fail):
    with testclass() as obj:
        # some stuff
        if fail:
            raise TypeError
        # some more stuff
        print "success"


print "Calling dostuff"
dostuff(fail=False)
print "dostuff returned"

try:
    print "Calling dostuff"
    dostuff(fail=True)
except TypeError:
    pass

gc.collect()
print "dostuff returned"


And it doesn't matter where you use "with": it creates a transient
context, which is torn down before anything else happens at a higher level.

Another simplified case, similar to yours is file objects:


with open(tmp_path, 'w') as file:
    # write_ops
os.rename(tmp_path, path)

So whatever happens inside "with", the file ends up closed; otherwise
os.rename might replace a valid path with a zero-length file.

It should be easy to use a cursor with contextlib; consider the
contextmanager decorator:


from contextlib import contextmanager

@contextmanager
def get_cursor():
    cursor = conn.cursor()
    try:
        yield cursor
    finally:
        cursor.close()

with get_cursor() as cursor:
    pass # whatever ;)
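
For the plain close-on-exit case the stdlib already ships contextlib.closing, which saves writing the generator at all (sqlite3 stands in for apsw here):

```python
import sqlite3
from contextlib import closing

conn = sqlite3.connect(':memory:')
with closing(conn.cursor()) as cursor:
    cursor.execute('CREATE TABLE t (x INTEGER)')
# cursor.close() has run here, whether or not the body raised
```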



--
Mike Kazantsev // fraggod.net

 
Lawrence D'Oliveiro

The exact time of the destruction of objects is an implementation detail
and should not be relied upon.

That may be true in Java and other corporate-herd-oriented languages, but we
know that dynamic languages like Perl and Python make heavy use of
reference-counting wherever they can. If it's easy to satisfy yourself that
the lifetime of an object will be delimited in this way, I don't see why you
can't rely upon it.
 
Steven D'Aprano

That may be true in Java and other corporate-herd-oriented languages,
but we know that dynamic languages like Perl and Python make heavy use
of reference-counting wherever they can. If it's easy to satisfy
yourself that the lifetime of an object will be delimited in this way, I
don't see why you can't rely upon it.

Reference counting is an implementation detail used by CPython but not
IronPython or Jython. I don't know about the dozen or so other minor/new
implementations, like CLPython, PyPy, Unladen Swallow or CapPython.

In other words, if you want to write *Python* code rather than CPython
code, don't rely on ref-counting.
 
Lawrence D'Oliveiro

Steven said:
That may be true in Java and other corporate-herd-oriented languages,
but we know that dynamic languages like Perl and Python make heavy use
of reference-counting wherever they can. If it's easy to satisfy
yourself that the lifetime of an object will be delimited in this way, I
don't see why you can't rely upon it.

Reference counting is an implementation detail used by CPython but not
[implementations built on runtimes designed for corporate-herd-oriented
languages, like] IronPython or Jython.

I rest my case.
 
Paul Rubin

Lawrence D'Oliveiro said:
Reference counting is an implementation detail used by CPython but not
[implementations built on runtimes designed for corporate-herd-oriented
languages, like] IronPython or Jython.

I rest my case.

You're really being pretty ignorant. I don't know of any serious Lisp
system that uses reference counting, both for performance reasons and
to make sure cyclic structures are reclaimed properly. Lisp is
certainly not a corporate herd language.

Even CPython doesn't rely completely on reference counting (it has a
fallback gc for cyclic garbage). Python introduced the "with"
statement to get away from the kludgy CPython programmer practice of
opening files and relying on the file being closed when the last
reference went out of scope.
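
That fallback is easy to observe; a sketch (gc.collect() returns the number of unreachable objects it found):

```python
import gc

class Node(object):
    pass

def make_cycle():
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # refcounts can now never drop to zero

gc.disable()          # rule out an automatic collection in the meantime
make_cycle()
found = gc.collect()  # the cyclic collector reclaims the orphaned pair
print(found > 0)
gc.enable()
```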
 
Steven D'Aprano

The exact time of the destruction of objects is an implementation
detail and should not be relied upon.

That may be true in Java and other corporate-herd-oriented languages,
but we know that dynamic languages like Perl and Python make heavy use
of reference-counting wherever they can. If it's easy to satisfy
yourself that the lifetime of an object will be delimited in this way,
I don't see why you can't rely upon it.

Reference counting is an implementation detail used by CPython but not
[implementations built on runtimes designed for corporate-herd-oriented
languages, like] IronPython or Jython.

I rest my case.

CLPython and Unladen Swallow do not use reference counting. I suppose you
might successfully argue that Lisp is a corporate-herd-oriented language,
and that Google (the company behind Unladen Swallow) is a corporate-herd.
But PyPy doesn't use reference counting either. Perhaps you think that
Python is a language designed for corporate-herds too?
 
Lawrence D'Oliveiro

Lawrence D'Oliveiro said:
Reference counting is an implementation detail used by CPython but not
[implementations built on runtimes designed for corporate-herd-oriented
languages, like] IronPython or Jython.

I rest my case.

You're really being pretty ignorant. I don't know of any serious Lisp
system that uses reference counting, both for performance reasons and
to make sure cyclic structures are reclaimed properly.

Both of which, oddly enough, more modern dynamic languages like Python
manage perfectly well.
 
Charles Yeomans

Lawrence D'Oliveiro said:
Reference counting is an implementation detail used by CPython but not
[implementations built on runtimes designed for corporate-herd-oriented
languages, like] IronPython or Jython.

I rest my case.

You're really being pretty ignorant. I don't know of any serious Lisp
system that uses reference counting, both for performance reasons and
to make sure cyclic structures are reclaimed properly. Lisp is
certainly not a corporate herd language.

Even CPython doesn't rely completely on reference counting (it has a
fallback gc for cyclic garbage). Python introduced the "with"
statement to get away from the kludgy CPython programmer practice of
opening files and relying on the file being closed when the last
reference went out of scope.

I'm curious as to why you consider this practice to be kludgy; my
experience with RAII is pretty good.

Charles Yeomans
 
Steven D'Aprano

I'm curious as to why you consider this practice to be kludgy; my
experience with RAII is pretty good.

Because it encourages harmful laziness. Laziness is only a virtue when it
leads to good code for little effort, but in this case, it leads to non-
portable code. Worse, if your data structures include cycles, it also
leads to resource leaks.
 
Steven D'Aprano

Lawrence D'Oliveiro said:
Reference counting is an implementation detail used by CPython but
not [implementations built on runtimes designed for
corporate-herd-oriented languages, like] IronPython or Jython.

I rest my case.

You're really being pretty ignorant. I don't know of any serious Lisp
system that uses reference counting, both for performance reasons and
to make sure cyclic structures are reclaimed properly.

Both of which, oddly enough, more modern dynamic languages like Python
manage perfectly well.

*Python* doesn't have a ref counter. That's an implementation detail of
*CPython*. There is nothing in the specifications for the language Python
which requires a ref counter.

CPython's ref counter is incapable of dealing with cyclic structures, and
so it has a second garbage collector specifically for that purpose. The
only reason Python manages perfectly well is by NOT relying on a ref
counter: some implementations don't have one at all, and the one which
does, uses a second gc.

Additionally, while I'm a fan of the simplicity of CPython's ref counter,
one serious side effect of it is that it requires the GIL, which
essentially means CPython is crippled on multi-core CPUs compared to non-
ref counting implementations.
 
Charles Yeomans

Because it encourages harmful laziness. Laziness is only a virtue when it
leads to good code for little effort, but in this case, it leads to
non-portable code. Worse, if your data structures include cycles, it also
leads to resource leaks.


Memory management may be an "implementation detail", but it is
unfortunately one that illustrates the so-called law of leaky
abstractions. So I think that one has to write code that follows the
memory management scheme of whatever language one uses. For code
written for CPython only, as mine is, RAII is an appropriate idiom and
not kludgy at all. Under your assumptions, its use would be wrong, of
course.

Charles Yeomans
 
Steven D'Aprano

Memory management may be an "implementation detail", but it is
unfortunately one that illustrates the so-called law of leaky
abstractions. So I think that one has to write code that follows the
memory management scheme of whatever language one uses. For code
written for CPython only, as mine is, RAII is an appropriate idiom and
not kludgy at all. Under your assumptions, its use would be wrong, of
course.


CPython isn't a language, it's an implementation.

I'm unable to find anything in the Python Reference which explicitly
states that files will be closed when garbage collected, except for one
brief mention in tempfile.TemporaryFile:

"Return a file-like object that can be used as a temporary storage area.
The file is created using mkstemp(). It will be destroyed as soon as it
is closed (including an implicit close when the object is garbage
collected)."

http://docs.python.org/library/tempfile.html

In practical terms, it's reasonably safe to assume Python will close
files when garbage collected (it would be crazy not to!) but that's not
explicitly guaranteed anywhere I can see. In any case, there is no
guarantee *when* files will be closed -- for long-lasting processes that
open and close a lot of files, or for data structures with cycles, you
might easily run out of file descriptors.


The docs for file give two recipes for recommended ways of dealing with
files:

http://docs.python.org/library/stdtypes.html#file.close

Both of them close the file, one explicitly, the other implicitly. In
both cases, they promise to close the file as soon as you are done with
it. Python the language does not.


The tutorials explicitly recommends closing the file when you're done:

"When you’re done with a file, call f.close() to close it and free up any
system resources taken up by the open file."

http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files


In summary: relying on immediate closure of files is implementation
specific behaviour. By all means do so, with your eyes open and with full
understanding that you're relying on platform-specific behaviour with no
guarantee of when, or even if, files will be closed.
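
Both recipes are short enough that there is little reason to lean on the collector; a sketch (tempfile used only to get a scratch path):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, 'w') as f:
    f.write('data')
print(f.closed)  # True the moment the block exits, on every implementation

os.unlink(path)
```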
 
Piet van Oostrum

Charles Yeomans said:
CY> Memory management may be an "implementation detail", but it is
CY> unfortunately one that illustrates the so-called law of leaky
CY> abstractions. So I think that one has to write code that follows the
CY> memory management scheme of whatever language one uses. For code written
CY> for CPython only, as mine is, RAII is an appropriate idiom and not kludgy
CY> at all. Under your assumptions, its use would be wrong, of course.

I dare say that even in CPython it is doomed to disappear, though we
don't yet know on what timescale.
 
Aahz

Additionally, while I'm a fan of the simplicity of CPython's ref counter,
one serious side effect of it is that it requires the GIL, which
essentially means CPython is crippled on multi-core CPUs compared to non-
ref counting implementations.

Your bare "crippled" is an unfair overstatement. What you meant to
write was that computational multi-threaded applications that don't use
NumPy are crippled. Otherwise you're simply spreading FUD.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

 
