File handling problem.

  • Thread starter subhakolkata1234
subhakolkata1234

Dear Group,

I am using Python 2.6 and have created a file where I would like to write
some statistical values I am generating. The values are generated
correctly, but when I try to write them, the file does not take them:
the file opens and closes properly, but the values are not stored.
Instead, it picks up an arbitrary value from the generated set of
values and stores that. Is there any solution for this?

Best Regards,
SBanerjee.
 
Pascal Chambon

Hello

Could you post an excerpt of your file-handling code?
It might be a buffering problem (although when the file closes, I think
buffers get flushed); otherwise it's really weird...
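
To make the buffering point concrete, here is a minimal sketch. The file path and the values are made up for illustration, and the code is written so that it also runs on Python 3. Data passed to write() may sit in a buffer until flush() or close() is called; leaving a with block closes the file, which flushes whatever is still buffered.

```python
import os
import tempfile

# Hypothetical output path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "values.txt")

values = [0.25, 0.5, 0.25]  # stand-in for the generated statistics

# Leaving the with block closes the file, which flushes buffered data.
with open(path, "w") as out:
    for v in values:
        out.write("%s\n" % v)

# Reading the file back shows every value, one per line.
with open(path) as check:
    lines = check.read().split()
```

If values still go missing with code shaped like this, buffering is probably not the cause.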

Regards,
pascal
 
Steven D'Aprano

Is there any solution for it?

Yes. Find the bug in your program and fix it.

If you'd like some help finding the bug, you'll need to give us a little
bit more information. This website might help you:

http://www.catb.org/~esr/faqs/smart-questions.html
 
Chris Rebert

Dear Group,

I am working on a code like the following:

from decimal import*
#SAMPLE TEST PROGRAM FOR FILE
def sample_file_test(n):
    #FILE FOR STORING PROBABILITY VALUES
    open_file=open("/python26/Newfile1.txt","r+")

Is there a reason you must output the results to the same file the
input came from? It's possible this is part of your problems.

    #OPENING OF ENGLISH CORPUS
    open_corp_eng=open("/python26/TOTALENGLISHCORPUS1.txt","r")
    #READING THE ENGLISH CORPUS
    corp_read=open_corp_eng.read()
    #CONVERTING THE CORPUS FILE IN WORDS
    corp_word=corp_read.split()
    #EXTRACTING WORDS FROM CORPUS FILE OF WORDS
    for word in corp_word:
        #COUNTING THE WORD
        count1=corp_word.count(word)

Note: Your program is currently O(N^2) rather than O(N) because you
re-count the number of occurrences of each word /on every occurrence
of the word/.
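
One way to get the O(N) behaviour is to count every word in a single pass and compute the total once. The sketch below is only an illustration: the function name and the sample sentence are invented, and collections.Counter needs Python 2.7 or later (on 2.6 a plain dict with .get(word, 0) + 1 does the same job).

```python
from collections import Counter

def word_probabilities(text):
    """Return {word: relative frequency}, counting each word once."""
    words = text.split()
    total = len(words)        # computed once, outside any loop
    counts = Counter(words)   # single pass over the word list
    return dict((w, c / float(total)) for w, c in counts.items())

probs = word_probabilities("the cat saw the dog")
```

Here "the" appears 2 times out of 5 words, so probs["the"] is 0.4, and each distinct word appears exactly once in the result.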

        #COUNTING TOTAL NUMBER OF WORDS
        count2=len(corp_word)
        #COUNTING PROBABILITY OF WORD
        count1_dec=Decimal(count1)
        count2_dec=Decimal(count2)
        getcontext().prec = 6
        prob_count=count1_dec/count2_dec
        print prob_count
        string_of_prob_count=str(prob_count)
        file_input_val=open_file.write(string_of_prob_count)
        open_file.close()

You shouldn't be closing the file until the /entire loop/ has finished
writing to the file. So the previous line should be dedented.
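
A minimal sketch of that fix, with a made-up helper name and a temporary path: the close() call sits outside the loop, so the file stays open for every write. The last few lines reproduce the "I/O operation on closed file" error by writing to a file object that has already been closed, which is what happens on the second loop iteration in the posted code.

```python
import os
import tempfile

def write_all(path, items):
    """Write each item on its own line; close only after the loop ends."""
    out = open(path, "w")
    try:
        for item in items:
            out.write(str(item) + "\n")  # file stays open between writes
    finally:
        out.close()                      # dedented: runs once, after the loop

tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.txt")  # hypothetical path
write_all(demo_path, [0.1, 0.2, 0.3])
with open(demo_path) as f:
    written = f.read()

# Writing after close() raises ValueError, as in the traceback.
closed = open(os.path.join(tmp_dir, "closed.txt"), "w")
closed.close()
try:
    closed.write("x")
    error = None
except ValueError as exc:
    error = str(exc)
```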

The problems I am getting:

(i) The probability values are not being stored properly in file.

Also, you're currently not putting any separator between consecutive
entries, so it's all going to run together as one long line.
Have you considered using one of the std lib modules to output the
file in a well-defined human-readable format such as JSON or CSV?
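
As an illustration of the CSV suggestion, here is a short sketch using the standard csv module. The rows and the path are invented, and the newline="" argument is the Python 3 form of opening a file for csv (Python 2.6 would open the file in "wb" mode instead).

```python
import csv
import os
import tempfile

rows = [("the", 0.4), ("cat", 0.2)]  # made-up (word, probability) pairs

csv_path = os.path.join(tempfile.mkdtemp(), "probs.csv")  # hypothetical path

# The writer puts one record per line, with a comma between fields.
with open(csv_path, "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["word", "probability"])
    writer.writerows(rows)

# Reading it back returns every field as a string.
with open(csv_path, newline="") as src:
    loaded = list(csv.reader(src))
```

Each record ends up on its own line, so the values can no longer run together.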

(ii) “Newfile1.txt” is storing not the series of values but
an arbitrary value from series 0.00000143096

(iii) As I was testing it again it gave me another error

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    sample_file_test(1)
  File "C:\Python26\testprogramforfiles1.py", line 25, in sample_file_test
    file_input_val=open_file.write(string_of_prob_count)
ValueError: I/O operation on closed file

Cheers,
Chris
 
Dennis Lee Bieber

On Sun, 3 May 2009 22:21:13 +0530, SUBHABRATA BANERJEE
(e-mail address removed) declaimed the following in
gmane.comp.python.general:

from decimal import*
#SAMPLE TEST PROGRAM FOR FILE
def sample_file_test(n):

Is that "n" supposed to mean something? I don't see it used anywhere
in the following stuff.

    #FILE FOR STORING PROBABILITY VALUES
    open_file=open("/python26/Newfile1.txt","r+")

This is your output file, no? So why are you opening it for read
(with direct positioning option)... Oh, and just out of curiosity --
that isn't the Python install directory you are using for your data
files, is it?
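
For readers unsure what "r+" does differently from "w", here is a small sketch with an invented temporary file. "r+" requires the file to exist already, keeps its old contents, and starts writing at position 0; "w" creates or empties the file first. Whether leftover old content explains the "arbitrary value" in the original report is only a guess.

```python
import os
import tempfile

mode_path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical path

# Start with some older, longer content in the file.
with open(mode_path, "w") as f:
    f.write("OLD-OLD-OLD")

# "r+" keeps the old content and overwrites from position 0.
with open(mode_path, "r+") as f:
    f.write("new")
with open(mode_path) as f:
    after_rplus = f.read()

# "w" truncates first, so only the new data remains.
with open(mode_path, "w") as f:
    f.write("new")
with open(mode_path) as f:
    after_w = f.read()
```

After the "r+" write the file holds "new-OLD-OLD"; after the "w" write it holds just "new".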

    #OPENING OF ENGLISH CORPUS
    open_corp_eng=open("/python26/TOTALENGLISHCORPUS1.txt","r")

Same comment about directory.

    #READING THE ENGLISH CORPUS
    corp_read=open_corp_eng.read()

Let's see, read the entire file into one large string...

    #CONVERTING THE CORPUS FILE IN WORDS
    corp_word=corp_read.split()

Then create a list of words (so you now have, essentially, two
copies of the text in memory). And forgive me, but those names aren't
the most illuminative: open_corp_eng sounds more like a function that is
meant to open something, not the result from opening it...

    #EXTRACTING WORDS FROM CORPUS FILE OF WORDS
    for word in corp_word:
        #COUNTING THE WORD
        count1=corp_word.count(word)

If any given word appears in the text more than once, you end up
repeating this counting operation each time it appears.

        #COUNTING TOTAL NUMBER OF WORDS
        count2=len(corp_word)

This value should not change and should be obtained outside the word
loop.

        #COUNTING PROBABILITY OF WORD
        count1_dec=Decimal(count1)
        count2_dec=Decimal(count2)
        getcontext().prec = 6
        prob_count=count1_dec/count2_dec

Is there some particular reason for using the decimal package here? (And
again, count2_dec should be computed outside the loop.)
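
A rough comparison of the two approaches, using an invented total of 3 words so the digits are easy to check. The total and its Decimal conversion are computed once, before any loop would start; ordinary float division followed by "%.6g" formatting gives the same six significant digits without the decimal module.

```python
from decimal import Decimal, getcontext

getcontext().prec = 6       # set once; applies to Decimal arithmetic results

total = 3                   # stand-in for len(corp_word), computed once
total_dec = Decimal(total)  # converted once, outside the loop

as_decimal = Decimal(1) / total_dec  # rounded to 6 significant digits
as_float = 1 / float(total)          # ordinary binary floating point
rounded = "%.6g" % as_float          # similar precision without decimal
```

Both str(as_decimal) and rounded come out as "0.333333".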

        print prob_count
        string_of_prob_count=str(prob_count)
        file_input_val=open_file.write(string_of_prob_count)

Does .write() even return a value? What do you expect
"file_input_val" to contain after that statement? And, as others have
mentioned, .write() does not add newlines or other whitespace, so all
your output would be one long string.
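
To answer the return-value question concretely: on Python 2 the built-in file object's write() returns None, while on Python 3 it returns the number of characters written. Either way it is not the data. The sketch below uses io.StringIO as an in-memory stand-in for a real file (an assumption made so the example needs no path) and shows the run-together output next to a version with explicit newlines.

```python
import io

buf = io.StringIO()          # in-memory text stream, stands in for a file

result = buf.write(u"0.25")  # a character count here, not the data written
buf.write(u"0.5")            # no separator is added automatically
run_together = buf.getvalue()

buf2 = io.StringIO()
for value in (0.25, 0.5):
    buf2.write(u"%s\n" % value)  # explicit newline after each value
separated = buf2.getvalue()
```

The first stream holds "0.250.5", which cannot be split back into the two values; the second holds one value per line.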

        open_file.close()

Uhm, the first word it calculates and writes a probability for will
be followed by closing the output file -- I suspect this should be
outside the for loop.

Also note that you will be repeating words in the output as there is
no provision to create unique entries.

Does the following seem to do what you need?

-=-=-=-=-=-=-=-=-
"""
wordprob.py relative probabilities for word appearance in text
may 3 2009 dennis lee bieber

an alternate approach to a problem posted on C.L.P
"""

import sys

def loadData(fid):
    words = {}
    fin = open(fid, "r")
    #only read one line at a time
    for ln in fin:
        #treat upper and lower case words as same
        for wd in ln.lower().split():
            #increment count of specific word
            words[wd] = words.get(wd, 0) + 1
    fin.close()
    return words

def computeProbabilities(wordCounts):
    #get total count of words (and make it float)
    total = float(sum(wordCounts.values()))
    probs = {}
    #for each word, compute the probability
    for (wd, wc) in wordCounts.items():
        probs[wd] = wc / total
    return probs

def writeResults(fid, probs):
    if type(fid) == type("string"):
        fout = open(fid, "w")
    else:
        #otherwise assume supplied fid is an open stream
        fout = fid
    #convert dictionary to list for sorting
    ordered = probs.items()
    #sort into descending probability (most common first)
    ordered.sort(key=lambda x: x[1], reverse=True)
    for (wd, wp) in ordered:
        #write the word followed by probability, newline
        fout.write("%s : %s\n" % (wd, wp))
    if fout != fid:
        fout.close()

if __name__ == "__main__":
    theData = loadData(YOUR_FILE_NAME_HERE)
    theProbs = computeProbabilities(theData)
    writeResults(sys.stdout, theProbs)
    ## use a file name if desired output is other than screen
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
