Sync database table based on current directory data without losing previous values

Discussion in 'Python' started by Νίκος Γκρ33κ, Mar 6, 2013.

  1. I am using this snippet to read the current directory, insert all filenames into a database, and then display them.

    But what happens when files get removed from the directory?
    The records already inserted into the database remain.
    How can I update the database to contain only the existing filenames without losing the previously stored data?

    Here is what I have so far:

    ==================================
    path = "/home/nikos/public_html/data/files/"

    # read the containing folder and insert new filenames
    for result in os.walk(path):
        for filename in result[2]:
            try:
                # find the needed counter for the page URL
                cur.execute('''SELECT URL FROM files WHERE URL = %s''', (filename,))
                data = cur.fetchone()  # URL is unique, so there should be only one

                if not data:
                    # first time for file; primary key is automatic, hit is defaulted
                    cur.execute('''INSERT INTO files (URL, host, lastvisit) VALUES (%s, %s, %s)''', (filename, host, date))
            except MySQLdb.Error as e:
                print("Query Error: %s" % e)
    ======================

    Thank you.
    Νίκος Γκρ33κ, Mar 6, 2013
    #1

  2. Lele Gaifax (Guest)

    Νίκος Γκρ33κ writes:

    > How can I update the database to contain only the existing filenames without losing the previously stored data?


    Basically you need to keep a list (or better, a set) containing all
    current filenames that you are going to insert, and finally do another
    "inverse" loop where you scan all the records and delete those that are
    not present anymore.

    Of course, this assumes you have a "bidirectional" identity between the
    filenames you are loading and the records you are inserting, which is
    not the case in the code you show:

    > #read the containing folder and insert new filenames
    > for result in os.walk(path):
    >     for filename in result[2]:


    Here "filename" is just that, not the full path: this could result in
    collisions if you are actually loading a *tree* instead of a flat
    directory, that is, multiple source files are squeezed into a single
    record in your database (imagine "/foo/index.html" and
    "/foo/subdir/index.html").

    With that in mind, I would do something like the following:

    # Compute a set of current fullpaths
    current_fullpaths = set()
    for root, dirs, files in os.walk(path):
        for fullpath in files:
            current_fullpaths.add(os.path.join(root, file))

    # Load'em
    for fullpath in current_fullpaths:
        try:
            # find the needed counter for the page URL
            cur.execute('''SELECT URL FROM files WHERE URL = %s''', (fullpath,))
            data = cur.fetchone()  # URL is unique, so there should be only one

            if not data:
                # first time for file; primary key is automatic, hit is defaulted
                cur.execute('''INSERT INTO files (URL, host, lastvisit) VALUES (%s, %s, %s)''', (fullpath, host, date))
        except MySQLdb.Error as e:
            print("Query Error: %s" % e)

    # Delete spurious
    cur.execute('''SELECT url FROM files''')
    for rec in cur:
        fullpath = rec[0]
        if fullpath not in current_fullpaths:
            other_cur.execute('''DELETE FROM files WHERE url = %s''', (fullpath,))

    Of course here I am assuming a lot (a typical thing we do to answer your
    questions :), in particular that the "url" field content matches the
    filesystem layout, which may not be the case. Adapt it to your use case.

    hope this helps,
    ciao, lele.
    --
    nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
    real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
    | -- Fortunato Depero, 1929.
    Lele Gaifax, Mar 6, 2013
    #2
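
The approach described in the post above (collect the current full paths in a set, insert what is new, delete what has vanished) can be sketched as follows. This is only an illustration under stated assumptions: it swaps in the standard-library sqlite3 module for MySQLdb so that it runs without a server, and the schema (a unique `url` column plus a `hits` counter) and the name `sync_files_table` are hypothetical, not taken from the thread.

```python
import os
import sqlite3

# Hypothetical schema for illustration only; the real table in the thread
# also has host and lastvisit columns.
SCHEMA = "CREATE TABLE IF NOT EXISTS files (url TEXT UNIQUE, hits INTEGER DEFAULT 0)"


def sync_files_table(conn, path):
    """Make the files table mirror the tree under path.

    Rows for files that still exist are left untouched, so their stored
    values (here: hits) survive. Returns (added, removed) as sets of paths.
    """
    # Set of full paths currently on disk.
    current = set()
    for root, dirs, files in os.walk(path):
        for name in files:
            current.add(os.path.join(root, name))

    # Set of full paths currently stored in the table.
    cur = conn.cursor()
    cur.execute("SELECT url FROM files")
    stored = set(row[0] for row in cur.fetchall())

    added = current - stored
    removed = stored - current

    # Insert only paths that are new; existing rows are not rewritten.
    cur.executemany("INSERT INTO files (url) VALUES (?)",
                    [(p,) for p in sorted(added)])
    # Delete only rows whose file no longer exists.
    cur.executemany("DELETE FROM files WHERE url = ?",
                    [(p,) for p in sorted(removed)])
    conn.commit()
    return added, removed
```

Reading all stored paths into a set before modifying the table avoids deleting rows while iterating over the same cursor, which is why the post above reaches for a second cursor. With MySQLdb the placeholders would be `%s` rather than `?`.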

  3. On Wednesday, March 6, 2013 at 10:19:06 AM UTC+2, Lele Gaifax wrote:
    > <SNIP>

    You are fantastic! Your straightforward logic amazes me!

    Thank you very much for making things clear to me!!

    But there is a slight problem: when I try to run the code I get an error. You can see its output here:

    http://superhost.gr/cgi-bin/files.py
    Νίκος Γκρ33κ, Mar 6, 2013
    #3
  5. Lele Gaifax (Guest)

    Νίκος Γκρ33κ writes:

    > Thank you very much for making things clear to me!!


    You're welcome, even more so if you spend one second trimming your
    answers and removing unneeded quotations :)

    >
    > But there is a slight problem: when I try to run the code I get an error. You can see its output here:
    >
    > http://superhost.gr/cgi-bin/files.py


    Sorry, this seems completely unrelated, and from the little snippet that
    appears on that page I cannot understand what's going on there.

    ciao, lele.
    Lele Gaifax, Mar 6, 2013
    #5
  6. It's about the following line of code:

    current_fullpaths.add( os.path.join(root, files) )


    that presents the following error:

    <type 'exceptions.AttributeError'>: 'list' object has no attribute 'startswith'
    args = ("'list' object has no attribute 'startswith'",)
    message = "'list' object has no attribute 'startswith'"

    join() fails inside the posixpath module, at this line:

    /usr/lib64/python2.6/posixpath.py in join(a='/home/nikos/public_html/data/files/', *p=(['\xce\x9a\xcf\x8d\xcf\x81\xce\xb9\xce\xb5 \xce\x99\xce\xb7\xcf\x83\xce\xbf\xcf\x8d \xce\xa7\xcf\x81\xce\xb9\xcf\x83\xcf\x84\xce\xad \xce\x95\xce\xbb\xce\xad\xce\xb7\xcf\x83\xce\xbf\xce\xbd \xce\x9c\xce\xb5.mp3', '\xce\xa0\xce\xb5\xcf\x81\xce\xaf \xcf\x84\xcf\x89\xce\xbd \xce\x9b\xce\xbf\xce\xb3\xce\xb9\xcf\x83\xce\xbc\xcf\x8e\xce\xbd.mp3'],))
    63     path = a
    64     for b in p:
    65         if b.startswith('/'):
    Νίκος Γκρ33κ, Mar 6, 2013
    #6
  8. Perhaps this error appears because my filenames are in Greek letters, but I am not sure.

    Maybe we can join root+files and store the result in the set() some other way.
    Νίκος Γκρ33κ, Mar 6, 2013
    #8
  10. Set x to None and del x doesn't release memory in python 2.7.1 (HPUX 11.23, ia64)

    Hello there,

    I am using python 2.7.1 built on HP-UX 11.23, an Itanium 64-bit box.

    I discovered the following behavior, whereby the python process doesn't seem to release the memory it used even after a variable is set to None and "deleted". I use the glance tool to monitor the memory used by this process. After the for loop is executed, the memory used by this process has risen to a few MB. However, after "del" is executed on both the i and str variables, the memory of that process stays where it was.

    Any idea why?


    >>> for i in range(100000L):
    ...     str=str+"%s"%(i,)
    ...
    >>> i=None
    >>> str=None
    >>> del i
    >>> del str
    Wong Wah Meng-R32813, Mar 6, 2013
    #10
  11. On Wednesday, March 6, 2013 9:43:34 AM UTC, Νίκος Γκρ33κ wrote:
    > Perhaps this error appears because my filenames are in Greek letters, but I am not sure.
    >
    > Maybe we can join root+files and store the result in the set() some other way.

    Well, the error refers to the line "if b.startswith('/'):" and states "'list' object has no attribute 'startswith'".

    So b is bound to a list, and a list does not have a 'startswith' method or attribute.

    I thought .startswith() was a string method, but if it's your own method then I apologize (though if it is, I personally would have made a class that inherits from list rather than adding it to list itself).

    Can you show where you are assigning b (and whether it's meant to be a list or a string object)?
    Bryan Devaney, Mar 6, 2013
    #11
  13. Re: Set x to None and del x doesn't release memory in python 2.7.1 (HPUX 11.23, ia64)

    On Wednesday, March 6, 2013 10:11:12 AM UTC, Wong Wah Meng-R32813 wrote:
    > I am using python 2.7.1 built on HP-11.23 a Itanium 64 bit box.
    >
    > <SNIP>
    >
    > Any idea why?
    >
    > >>> for i in range(100000L):
    > ...     str=str+"%s"%(i,)
    > ...
    > >>> i=None
    > >>> str=None
    > >>> del i
    > >>> del str

    Hi, I'm new here so I'm making mistakes too, but I know they don't like it when you ask your question in someone else's question.

    That being said, to answer your question:

    Python uses a 'garbage collector'. When you delete something, all references to the object in memory are removed, but the memory itself will not be freed until the next time the garbage collector runs. When that happens, all objects in memory without references are removed and the memory is freed. If you wait a while you should see that memory free itself.
    Bryan Devaney, Mar 6, 2013
    #13
  15. Lele Gaifax (Guest)

    Νίκος Γκρ33κ writes:

    > It's about the following line of code:
    >
    > current_fullpaths.add( os.path.join(root, files) )


    I'm sorry, typo on my part.

    That should have been "fullpath", not "file" (nor "files", as you
    wrongly reported back!):

    # Compute a set of current fullpaths
    current_fullpaths = set()
    for root, dirs, files in os.walk(path):
        for fullpath in files:
            current_fullpaths.add(os.path.join(root, fullpath))

    ciao, lele.
    Lele Gaifax, Mar 6, 2013
    #15
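
To make the correction above concrete, here is a small self-contained sketch, not taken from the thread: os.walk yields (root, dirs, files) tuples where `files` is a list of bare names, so each name has to be joined with `root` one at a time. Passing the whole list to os.path.join is what produced the 'list' object error in the traceback under Python 2.6; a current Python 3 is expected to raise TypeError instead. The function name `collect_fullpaths` is hypothetical.

```python
import os


def collect_fullpaths(path):
    """Return the set of full paths of all files under path."""
    current_fullpaths = set()
    # os.walk yields one (root, dirs, files) tuple per directory visited.
    for root, dirs, files in os.walk(path):
        # files is a list of bare names; join them with root one by one.
        for name in files:
            current_fullpaths.add(os.path.join(root, name))
    return current_fullpaths
```

Because the full path is stored rather than the bare name, two files called index.html in different directories end up as two distinct entries, which is the collision described earlier in the thread.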
  16. Terry Reedy (Guest)

    Re: Set x to None and del x doesn't release memory in python 2.7.1 (HPUX 11.23, ia64)

    On 3/6/2013 5:11 AM, Wong Wah Meng-R32813 wrote:
    > Hello there,
    >
    > I am using python 2.7.1 built on HP-11.23 a Itanium 64 bit box.
    >
    > I discovered following behavior whereby the python process doesn't
    > seem to release memory utilized even after a variable is set to None,
    > and "deleted". I use glance tool to monitor the memory utilized by
    > this process. Obviously after the for loop is executed, the memory
    > used by this process has hiked to a few MB. However, after "del" is
    > executed to both I and str variables, the memory of that process
    > still stays at where it was.


    Whether memory freed by deleting an object is returned to and taken by
    the OS depends on the OS and other factors like the size and layout
    of the freed memory, probably the history of memory use, and for
    CPython, the C compiler's malloc/free implementation. At various times,
    the Python memory handlers have been rewritten to encourage/facilitate
    memory return, but Python cannot control the process.

    > for i in range(100000L):
    >     str=str+"%s"%(i,)
    > i=None; str=None # not necessary
    > del i; del str


    Reusing built-in names for unrelated purposes is generally a bad idea,
    although the final deletion does restore access to the builtin.

    --
    Terry Jan Reedy
    Terry Reedy, Mar 6, 2013
    #16
  17. On 06/03/2013 07:45, Νίκος Γκρ33κ wrote:
    > I am using this snippet to read the current directory, insert all filenames into a database, and then display them.
    >
    > But what happens when files get removed from the directory?
    > The records already inserted into the database remain.
    > How can I update the database to contain only the existing filenames without losing the previously stored data?
    >
    > Here is what I have so far:
    >
    > ==================================
    > path = "/home/nikos/public_html/data/files/"
    >
    > #read the containing folder and insert new filenames
    > for result in os.walk(path):


    You were told yesterday at least twice that os.walk returns a tuple, but
    you still insist on refusing to take any notice of our replies when it
    suits you, preferring instead to waste everybody's time with these
    questions. Or are you trying to get into the Guinness Book of World
    Records for the laziest bastard on the planet?

    >     for filename in result[2]:
    >         try:
    >             # find the needed counter for the page URL
    >             cur.execute('''SELECT URL FROM files WHERE URL = %s''', (filename,))
    >             data = cur.fetchone()  # URL is unique, so there should be only one
    >
    >             if not data:
    >                 # first time for file; primary key is automatic, hit is defaulted
    >                 cur.execute('''INSERT INTO files (URL, host, lastvisit) VALUES (%s, %s, %s)''', (filename, host, date))
    >         except MySQLdb.Error as e:
    >             print("Query Error: %s" % e)
    > ======================
    >
    > Thank you.
    >


    --
    Cheers.

    Mark Lawrence
    Mark Lawrence, Mar 6, 2013
    #17
  18. RE: Set x to None and del x doesn't release memory in python 2.7.1 (HPUX 11.23, ia64)

    Apologies; having left the group for a while, I had forgotten not to post a question on top of another question. Very sorry, and I appreciate your replies.

    I tried explicitly calling gc.collect() and didn't see the memory footprint reduced. I probably haven't left the process idle long enough to see the internal garbage collection take place, but I will leave it idle for more than 8 hours and check again. Thanks!

    Wong Wah Meng-R32813, Mar 6, 2013
    #18
  19. RE: Set x to None and del x doesn't release memory in python 2.7.1 (HPUX 11.23, ia64)

    Thanks for your reply. I built the python 2.7.1 binary myself on the HP box, and I wasn't aware of any configuration or setup that I need to modify in order to activate or engage the garbage collection (or even to set the memory size used). You are probably right that it is left to the OS itself (in this case HP-UX) to clean up: after python removes the reference to the address of the variables, the OS still thinks the python process owns the memory until the process exits.

    Regards,
    Wah Meng

    Wong Wah Meng-R32813, Mar 6, 2013
    #19
  20. Dave Angel (Guest)

    Re: Set x to None and del x doesn't release memory in python 2.7.1 (HPUX 11.23, ia64)

    On 03/06/2013 05:25 AM, Bryan Devaney wrote:
    > On Wednesday, March 6, 2013 10:11:12 AM UTC, Wong Wah Meng-R32813 wrote:
    >> Hello there,
    >>
    >> I am using python 2.7.1 built on HP-11.23 a Itanium 64 bit box.
    >>
    >> I discovered following behavior whereby the python process doesn't seem to release memory utilized even after a variable is set to None, and "deleted". I use glance tool to monitor the memory utilized by this process. Obviously after the for loop is executed, the memory used by this process has hiked to a few MB. However, after "del" is executed to both I and str variables, the memory of that process still stays at where it was.
    >>
    >> <SNIP>
    >
    > Python uses a 'garbage collector'. When you delete something, all references are removed from the object in memory, the memory itself will not be freed until the next time the garbage collector runs. When that happens, all objects without references in memory are removed and the memory freed. If you wait a while you should see that memory free itself.

    Actually, no. The problem with monitoring memory usage from outside the
    process is that memory "ownership" is hierarchical, and each hierarchy
    deals in bigger chunks. So when the CPython runtime calls free() on a
    particular piece of memory, the C runtime may or may not actually
    release the memory for use by other processes. Since the C runtime
    grabs big pieces from the OS, and parcels out little pieces to CPython,
    a particular big piece can only be freed if ALL the little pieces are
    free. And even then, it may or may not choose to do so.

    Completely separate from that are the two mechanisms that CPython uses
    to free its pieces. It does reference counting, and it does garbage
    collecting. In this case, only the reference counting is relevant, as
    when it's done there's no garbage left to collect. When an object is no
    longer referenced by anything, its count will be zero, and it will be
    freed by calling the C library function. GC is only interesting when
    there are cycles in the references, such as when a list contains as one
    of its elements a tuple, which in turn contains the original list.
    Sound silly? No, it's quite common once complex objects are created
    which reference each other. The counts don't go to zero, and the
    objects wait for garbage collection.

    OP: There's no need to set to None and also to del the name. Since
    there's only one None object, keeping another named reference to that
    object has very little cost.



    --
    DaveA
    Dave Angel, Mar 6, 2013
    #20
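
The distinction drawn in the post above, between reference counting and the cycle collector, can be observed with a weak reference as a probe. This is a sketch under the assumption that it runs on CPython (other implementations may free objects at different times); the class `Node` and both function names are made up for illustration. It says nothing about whether freed memory is returned to the operating system, which the post explains is decided further down, in the C runtime.

```python
import gc
import weakref


class Node(object):
    """Plain object that can hold a reference to another object."""

    def __init__(self):
        self.other = None


def freed_without_cycle():
    """True if an unreferenced object is freed as soon as its name is deleted."""
    obj = Node()
    probe = weakref.ref(obj)   # a weak reference does not keep obj alive
    del obj                    # reference count drops to zero
    return probe() is None


def cycle_needs_collector():
    """Return (alive_before_collect, freed_after_collect) for a two-object cycle."""
    was_enabled = gc.isenabled()
    gc.disable()               # keep the automatic collector out of the way
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a    # a and b now reference each other
        probe = weakref.ref(a)
        del a, b                   # counts stay above zero because of the cycle
        alive_before = probe() is not None
        gc.collect()               # the cycle collector finds and frees the pair
        freed_after = probe() is None
        return alive_before, freed_after
    finally:
        if was_enabled:
            gc.enable()
```

In the loop from the original question there are no cycles, so on CPython the string is released by reference counting alone and gc.collect() has nothing left to do.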