File handling problem.

Discussion in 'Python' started by subhakolkata1234@gmail.com, May 2, 2009.

  1. Guest

    Dear Group,

    I am using Python 2.6 and have created a file in which I would like to
    write some statistical values I am generating. The values are generated
    correctly, but when I try to write them, the file does not take them:
    the file opens and closes properly, but the values are not stored.
    Instead, an arbitrary value from the generated set gets stored. Is
    there any solution for this?

    Best Regards,
    SBanerjee.
     
    , May 2, 2009
    #1

  2. subhakolkata1234 wrote:
    > Dear Group,
    >
    > I am using Python2.6 and has created a file where I like to write some
    > statistical values I am generating. The statistical values are
    > generating in a nice way, but as I am going to write it, it is not
    > taking it, the file is opening or closing properly but the values are
    > not getting stored. It is picking up arbitrary values from the
    > generated set of values and storing it. Is there any solution for it?
    >
    > Best Regards,
    > SBanerjee.

    Hello

    Could you post an excerpt of your file-handling code?
    It might be a buffering problem (although I think buffers get flushed
    when the file is closed); otherwise it's really weird...
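
A minimal sketch of the buffering point above, assuming a throwaway temporary file (the path and the sample values are made up for illustration, not taken from the original program). Text passed to write() may sit in an in-memory buffer for a while; close() flushes it, so after a normal close the complete text should be on disk.

```python
import os
import tempfile

# Hypothetical scratch file; not the poster's Newfile1.txt.
path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

out = open(path, "w")
out.write("0.25\n")
out.write("0.75\n")
# At this point the text may still be held in a buffer, not on disk.
out.flush()   # optional: hand the buffered text to the OS right now
out.close()   # close() also flushes anything still buffered

check = open(path, "r")
contents = check.read()
check.close()
```

If the file ends up incomplete even though close() ran, buffering is unlikely to be the cause and the write logic itself is the next thing to inspect.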

    Regards,
    pascal
     
    Pascal Chambon, May 2, 2009
    #2

  3. On Sat, 02 May 2009 01:26:14 -0700, subhakolkata1234 wrote:

    > Dear Group,
    >
    > I am using Python2.6 and has created a file where I like to write some
    > statistical values I am generating. The statistical values are
    > generating in a nice way, but as I am going to write it, it is not
    > taking it, the file is opening or closing properly but the values are
    > not getting stored. It is picking up arbitrary values from the generated
    > set of values and storing it. Is there any solution for it?


    Yes. Find the bug in your program and fix it.

    If you'd like some help finding the bug, you'll need to give us a little
    bit more information. This website might help you:

    http://www.catb.org/~esr/faqs/smart-questions.html



    --
    Steven
     
    Steven D'Aprano, May 2, 2009
    #3
  4. Chris Rebert Guest

    On Sun, May 3, 2009 at 9:51 AM, SUBHABRATA BANERJEE
    <> wrote:
    > Dear Group,
    >
    >
    >
    > I am working on a code like the following:
    >
    >
    >
    > from decimal import*
    >
    > #SAMPLE TEST PROGRAM FOR FILE
    >
    > def sample_file_test(n):
    >
    >     #FILE FOR STORING PROBABILITY VALUES
    >
    >     open_file=open("/python26/Newfile1.txt","r+")


    Is there a reason you must output the results to the same file the
    input came from? It's possible this is part of your problems.

    >
    >     #OPENING OF ENGLISH CORPUS
    >
    >     open_corp_eng=open("/python26/TOTALENGLISHCORPUS1.txt","r")
    >
    >     #READING THE ENGLISH CORPUS
    >
    >     corp_read=open_corp_eng.read()
    >
    >     #CONVERTING THE CORPUS FILE IN WORDS
    >
    >     corp_word=corp_read.split()
    >
    >     #EXTRACTING WORDS FROM CORPUS FILE OF WORDS
    >
    >     for word in corp_word:
    >
    >         #COUNTING THE WORD
    >
    >         count1=corp_word.count(word)


    Note: Your program is currently O(N^2) rather than O(N) because you
    re-count the number of occurrences of each word /on every occurrence
    of the word/.
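
A minimal sketch of the linear-time alternative, assuming a small made-up word list in place of the real corpus. The function name word_probabilities is hypothetical. It counts every word in one pass with collections.defaultdict instead of calling list.count() once per occurrence.

```python
from collections import defaultdict

def word_probabilities(words):
    """Return {word: relative frequency}, counting in a single pass."""
    counts = defaultdict(int)
    for word in words:          # one pass over the corpus: O(N)
        counts[word] += 1
    total = float(len(words))   # computed once, outside any loop
    return dict((word, n / total) for word, n in counts.items())

# Made-up sample text standing in for the corpus file.
sample = "the cat saw the dog and the bird".split()
probs = word_probabilities(sample)
```

Because the result is keyed by word, each distinct word also appears only once in the output.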

    >         #COUNTING TOTAL NUMBER OF WORDS
    >
    >         count2=len(corp_word)
    >
    >         #COUNTING PROBABILITY OF WORD
    >
    >         count1_dec=Decimal(count1)
    >
    >         count2_dec=Decimal(count2)
    >
    >         getcontext().prec = 6
    >
    >         prob_count=count1_dec/count2_dec
    >
    >         print prob_count
    >
    >         string_of_prob_count=str(prob_count)
    >
    >         file_input_val=open_file.write(string_of_prob_count)
    >
    >         open_file.close()


    You shouldn't be closing the file until the /entire loop/ has finished
    writing to the file. So the previous line should be dedented.
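
A minimal sketch of this bug and one possible fix, assuming a temporary file and made-up values (neither comes from the original program). With close() inside the loop, only the first value is written and the second write() raises ValueError, which would be consistent with symptoms (ii) and (iii) reported below. A with block is one way to make sure the file is closed exactly once, after the loop.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "probs_demo.txt")  # hypothetical path
values = ["0.5", "0.25", "0.25"]                           # made-up data

# Buggy shape: close() sits inside the loop body.
out = open(path, "w")
error = None
try:
    for value in values:
        out.write(value + "\n")
        out.close()              # file is closed after the first value
except ValueError as exc:        # the second write() hits the closed file
    error = str(exc)

with open(path) as check:
    after_bug = check.read()     # only the first value reached the file

# Fixed shape: the with block closes the file once, after the loop.
with open(path, "w") as out:
    for value in values:
        out.write(value + "\n")

with open(path) as check:
    after_fix = check.read()
```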

    >
    >
    >
    > The problems I am getting:
    >
    > (i) The probability values are not being stored properly in file.


    Also, you're currently not putting any separator between consecutive
    entries, so it's all going to run together as one long line.
    Have you considered using one of the standard library modules to output
    the file in a well-defined human-readable format such as JSON or CSV?
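
A minimal sketch of the CSV suggestion, written for Python 3 and using an in-memory io.StringIO buffer as a stand-in for the output file. The word/probability pairs and the header names are made up for illustration. On Python 2.6 the csv module expects a file opened in binary mode ("wb") rather than a text buffer.

```python
import csv
import io

# Made-up results standing in for the real corpus probabilities.
probs = [("the", 0.375), ("cat", 0.125)]

buf = io.StringIO()   # in-memory stand-in for an open output file
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["word", "probability"])   # header row
for word, prob in probs:
    writer.writerow([word, prob])          # one record per line

text = buf.getvalue()
```

Each record lands on its own line with a comma between fields, so the values can no longer run together.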

    > (ii) “Newfile1.txt” is storing not the series of values but
    > an arbitrary value from series 0.00000143096
    >
    > (iii) As I was testing it again it gave me another error
    >
    > Traceback (most recent call last):
    >
    >   File "<pyshell#2>", line 1, in <module>
    >
    >     sample_file_test(1)
    >
    >   File "C:\Python26\testprogramforfiles1.py", line 25, in sample_file_test
    >
    >     file_input_val=open_file.write(string_of_prob_count)
    >
    > ValueError: I/O operation on closed file



    Cheers,
    Chris
    --
    http://blog.rebertia.com
     
    Chris Rebert, May 3, 2009
    #4
  5. On Sun, 3 May 2009 22:21:13 +0530, SUBHABRATA BANERJEE
    <> declaimed the following in
    gmane.comp.python.general:

    <snipping out blank lines>

    >
    > from decimal import*
    > #SAMPLE TEST PROGRAM FOR FILE
    > def sample_file_test(n):


    Is that "n" supposed to mean something? I don't see it used anywhere
    in the following stuff.

    >     #FILE FOR STORING PROBABILITY VALUES
    >     open_file=open("/python26/Newfile1.txt","r+")


    This is your output file, no? So why are you opening it for read
    (with direct positioning option)... Oh, and just out of curiosity --
    that isn't the Python install directory you are using for your data
    files, is it?
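
A minimal sketch of why the mode matters for an output file, assuming a temporary file with made-up contents. Opening with "r+" requires the file to exist, starts writing at offset 0, and does not truncate, so characters left over from an earlier run survive after the new text. Opening with "w" starts from an empty file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mode_demo.txt")  # hypothetical path

# Leftover content from an earlier, longer run.
with open(path, "w") as out:
    out.write("0.123456789")

# "r+" overwrites in place from offset 0 and keeps the old tail.
with open(path, "r+") as out:
    out.write("0.5")
with open(path) as check:
    after_r_plus = check.read()

# "w" creates the file if needed and truncates it before writing.
with open(path, "w") as out:
    out.write("0.5")
with open(path) as check:
    after_w = check.read()
```

Here after_r_plus holds the new "0.5" followed by the stale digits of the old value, while after_w holds only "0.5".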

    >     #OPENING OF ENGLISH CORPUS
    >     open_corp_eng=open("/python26/TOTALENGLISHCORPUS1.txt","r")


    Same comment about directory

    >     #READING THE ENGLISH CORPUS
    >     corp_read=open_corp_eng.read()


    Let's see, read the entire file into one large string...

    >     #CONVERTING THE CORPUS FILE IN WORDS
    >     corp_word=corp_read.split()


    Then create a list of words (so you now have, essentially, two
    copies of the text in memory). And forgive me, but those names aren't
    the most illuminative: open_corp_eng sounds more like a function that is
    meant to open something, not the result from opening it...

    >     #EXTRACTING WORDS FROM CORPUS FILE OF WORDS
    >     for word in corp_word:
    >         #COUNTING THE WORD
    >         count1=corp_word.count(word)


    If any given word appears in the text more than once, you end up
    repeating this counting operation each time it appears.

    >         #COUNTING TOTAL NUMBER OF WORDS
    >         count2=len(corp_word)


    This value should not change and should be obtained outside the word
    loop.

    >         #COUNTING PROBABILITY OF WORD
    >         count1_dec=Decimal(count1)
    >         count2_dec=Decimal(count2)
    >         getcontext().prec = 6
    >         prob_count=count1_dec/count2_dec


    Is there some particular reason for using the decimal package here?
    (And again, count2_dec should be computed outside the loop.)
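
A minimal sketch comparing the two approaches, assuming made-up counts of 1 and 3. It suggests that for a simple ratio printed to six significant digits, plain float division with a format string gives the same text as Decimal with prec = 6. This is an illustration, not a claim that Decimal is never appropriate.

```python
from decimal import Decimal, localcontext

count, total = 1, 3   # made-up counts for illustration

# Decimal division rounded to 6 significant digits, as in the quoted code.
with localcontext() as ctx:
    ctx.prec = 6
    as_decimal = str(Decimal(count) / Decimal(total))

# Plain float division, rounded only when converted to text.
as_float = "%.6g" % (count / float(total))
```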

    >         print prob_count
    >         string_of_prob_count=str(prob_count)
    >         file_input_val=open_file.write(string_of_prob_count)


    Does .write() even return a value? What do you expect
    "file_input_val" to contain after that statement? And, as others have
    mentioned, .write() does not add newlines or other whitespace, so all
    your output would be one long string.
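
A minimal sketch of both points, written for Python 3 with an in-memory io.StringIO buffer standing in for the output file and made-up values. In Python 2.6, file.write() has no return value (it returns None). In Python 3, write() returns the number of characters written. In neither version does it add a newline.

```python
import io

buf = io.StringIO()          # in-memory stand-in for the output file
returned = buf.write("0.5")  # Python 3: number of characters written
buf.write("0.25")            # no separator is added between writes
run_together = buf.getvalue()

buf = io.StringIO()
for value in ["0.5", "0.25"]:
    buf.write(value + "\n")  # add the newline explicitly
one_per_line = buf.getvalue()
```

Either way, the return value of write() is not the data that was written, so there is little reason to keep it in a variable such as file_input_val.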

    >         open_file.close()
    >

    Uhm, the first word it calculates and writes a probability for will
    be followed by closing the output file -- I suspect this should be
    outside the for loop.

    Also note that you will be repeating words in the output as there is
    no provision to create unique entries.

    Does the following seem to do what you need?

    -=-=-=-=-=-=-=-=-
    """
    wordprob.py     relative probabilities for word appearance in text
    may 3 2009      dennis lee bieber

    an alternate approach to a problem posted on C.L.P
    """

    import sys

    def loadData(fid):
        words = {}
        fin = open(fid, "r")
        #only read one line at a time
        for ln in fin:
            #treat upper and lower case words as same
            for wd in ln.lower().split():
                #increment count of specific word
                words[wd] = words.get(wd, 0) + 1
        fin.close()
        return words

    def computeProbabilities(wordCounts):
        #get total count of words (and make it float)
        total = float(sum(wordCounts.values()))
        probs = {}
        #for each word, compute the probability
        for (wd, wc) in wordCounts.items():
            probs[wd] = wc / total
        return probs

    def writeResults(fid, probs):
        if isinstance(fid, str):
            #a string is taken to be a file name
            fout = open(fid, "w")
        else:
            #otherwise assume supplied fid is an open stream
            fout = fid
        #convert dictionary to list and sort into descending
        #probability (most common first)
        ordered = sorted(probs.items(), key=lambda x: x[1], reverse=True)
        for (wd, wp) in ordered:
            #write the word followed by probability, newline
            fout.write("%s : %s\n" % (wd, wp))
        #only close the stream if it was opened here
        if fout is not fid:
            fout.close()

    if __name__ == "__main__":
        theData = loadData(YOUR_FILE_NAME_HERE)
        theProbs = computeProbabilities(theData)
        writeResults(sys.stdout, theProbs)
        ## use a file name if desired output is other than screen
    -=-=-=-=-=-=-=-=-
    --
    Wulfraed Dennis Lee Bieber KD6MOG

    HTTP://wlfraed.home.netcom.com/
    (Bestiaria Support Staff: )
    HTTP://www.bestiaria.com/
     
    Dennis Lee Bieber, May 3, 2009
    #5