help in debugging file.seek, file.read

Discussion in 'Python' started by Prateek, Apr 29, 2007.

  1. Prateek

    Prateek Guest

    I have a wierd sort of problem.
    I'm writing a bunch of sets to a file (each element is a fixed length
    string).

    I was originally using the built-in sets type but due to a processing
    issue, I had to shift to a Python implementation (see
    http://groups.google.com/group/comp...ead/thread/77e06005e897653c/12270083be9a67f6).
    I'm using Raymond Hettinger's very graciously provided TransactionSet
    recipe from http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511496

    My old code was pretty speedy but the new code is quite slow and
    according to some hotshot profiles it has nothing to do with the new
    TransactionSet. *Some* of the file.seek and file.read calls
    occasionally block for insane amounts (>10s) of time when reading no
    more than 45 bytes of data from the file. So when I'm running load-
    tests, this eventually happens like so:

    Each i_id statement is the time taken for 100 successive commits.
    Each RTH (Read Table Header) statement is the time taken for a single
    read call for 45 bytes of data
    # sz__TABLE_HEADER_FORMAT__ is 45
    hdr = os.read(f.fileno(), sz__TABLE_HEADER_FORMAT__)
    #hdr = f.read(sz__TABLE_HEADER_FORMAT__) # Tried this as well

    Loading items...
    i_id: 0 0.000111103057861
    i_id: 100 1.01557397842
    i_id: 200 1.14013886452
    i_id: 300 1.16142892838
    i_id: 400 1.16356801987
    i_id: 500 1.36410307884
    i_id: 600 1.34421014786
    i_id: 700 1.30385017395
    i_id: 800 1.48079919815
    i_id: 900 1.41147589684
    RTH: 0.582525968552
    RTH: 2.77490496635
    i_id: 1000 5.16863512993
    i_id: 1100 1.73725795746
    i_id: 1200 1.56621193886
    i_id: 1300 1.81338000298
    i_id: 1400 1.69464302063
    i_id: 1500 1.74725604057
    i_id: 1600 2.35222291946
    i_id: 1700 1.85096788406
    i_id: 1800 2.20518493652
    i_id: 1900 1.94831299782
    i_id: 2000 2.03350806236
    i_id: 2100 2.32529306412
    i_id: 2200 2.44498205185
    RTH: 0.105868816376
    i_id: 2300 3.65522289276
    i_id: 2400 4.2119910717
    i_id: 2500 4.21354198456
    RTH: 0.115046024323
    RTH: 0.122591972351
    RTH: 2.88115119934
    RTH: 10.5908679962
    i_id: 2600 18.8498170376
    i_id: 2700 2.42577004433
    i_id: 2800 2.47392010689
    i_id: 2900 2.88293218613

    So I have no idea why this is happening (it is also happening with
    seek operations).
    Any guidance on how to debug this?

    thanks,
    Prateek
    Prateek, Apr 29, 2007
    #1
    1. Advertising

  2. Prateek

    Prateek Guest

    Sorry, I forgot to mention - the RTH line only prints when the time
    taken is > 0.1 second (so that I don't pollute the output with other
    calls that complete normally)
    Prateek, Apr 29, 2007
    #2
    1. Advertising

  3. Prateek

    Prateek Guest

    On Apr 30, 3:20 am, Prateek <> wrote:
    > Sorry, I forgot to mention - the RTH line only prints when the time
    > taken is > 0.1 second (so that I don't pollute the output with other
    > calls that complete normally)


    I have some more information on this problem. Turns out the issue is
    with buffer syncing.
    I was already doing a flush operation on the file after every commit
    (which flushes the user buffers).

    I added an os.fsync call which fixed the erratic behavior. But the
    code is still horribly slow... 122s vs around 100s (without the
    fsync).
    Since fsync flushes the kernel buffers, I'm assuming this has
    something to do with my O/S (Mac OS 10.4.9 Intel - Mac Book Pro
    2.33Ghz Core 2 Duo, 2GB RAM).

    Can anybody help?
    Prateek, Apr 30, 2007
    #3
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Amy G

    Using .seek(0) to read.

    Amy G, Jan 10, 2004, in forum: Python
    Replies:
    1
    Views:
    331
    Tim Roberts
    Jan 12, 2004
  2. Prabhu Ramachandran

    Trouble with file.seek/file.tell on Win32?

    Prabhu Ramachandran, Aug 15, 2004, in forum: Python
    Replies:
    0
    Views:
    430
    Prabhu Ramachandran
    Aug 15, 2004
  3. Martin Durai
    Replies:
    11
    Views:
    323
    James Edward Gray II
    Nov 26, 2007
  4. Robert Klemme

    DATA.seek allows to read file

    Robert Klemme, Jan 11, 2011, in forum: Ruby
    Replies:
    2
    Views:
    155
    James Edward Gray II
    Jan 11, 2011
  5. Replies:
    3
    Views:
    127
    Andreas Perstinger
    May 14, 2013
Loading...

Share This Page