File handling problem.

subhakolkata1234 · May 2, 2009

Dear Group,

I am using Python 2.6 and have created a file where I would like to write
some statistical values I am generating. The values are generated
correctly, but when I try to write them, the file does not take them:
the file opens and closes properly, but the values are not stored.
Instead, it picks up an arbitrary value from the generated set and
stores that. Is there any solution for it?

Best Regards,
SBanerjee.

Pascal Chambon · May 2, 2009

Hello

Could you post an excerpt of your file-handling code?
It might be a buffering problem (although I think buffers get flushed
when the file closes); otherwise it's really weird...

Regards,
pascal
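To illustrate the buffering point, here is a minimal sketch (the temporary path is an assumption for the demo, not from the original post). Data passed to write() may sit in a buffer; after flush() or close() it is visible to other readers of the file.

```python
import os
import tempfile

# Hypothetical demo file; the real program would use its own path.
demo_path = os.path.join(tempfile.mkdtemp(), "values.txt")

out = open(demo_path, "w")
out.write("0.25\n")
# flush() pushes buffered data out, so a second handle can now read it.
out.flush()

with open(demo_path) as reader:
    seen_after_flush = reader.read()

# close() also flushes, so nothing written before it is lost.
out.close()
```

If values are missing even after the file has been closed, buffering is unlikely to be the cause.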

Steven D'Aprano · May 2, 2009

subhakolkata1234 wrote:

Is there any solution for it?

Yes. Find the bug in your program and fix it.

If you'd like some help finding the bug, you'll need to give us a little
bit more information. This website might help you:

http://www.catb.org/~esr/faqs/smart-questions.html

Chris Rebert · May 3, 2009

Dear Group,

I am working on a code like the following:

from decimal import*

#SAMPLE TEST PROGRAM FOR FILE
def sample_file_test(n):
    #FILE FOR STORING PROBABILITY VALUES
    open_file=open("/python26/Newfile1.txt","r+")

Is there a reason you must output the results to the same file the
input came from? It's possible this is part of your problems.

    #OPENING OF ENGLISH CORPUS
    open_corp_eng=open("/python26/TOTALENGLISHCORPUS1.txt","r")
    #READING THE ENGLISH CORPUS
    corp_read=open_corp_eng.read()
    #CONVERTING THE CORPUS FILE IN WORDS
    corp_word=corp_read.split()
    #EXTRACTING WORDS FROM CORPUS FILE OF WORDS
    for word in corp_word:
        #COUNTING THE WORD
        count1=corp_word.count(word)

Note: Your program is currently O(N^2) rather than O(N) because you
re-count the number of occurrences of each word /on every occurrence
of the word/.
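A minimal sketch of the O(N) alternative, assuming the corpus is already in memory as a string: count every word in one pass with a dictionary, then derive the probabilities from the counts. The function name and sample text are made up for illustration.

```python
def word_probabilities(text):
    words = text.split()
    total = len(words)  # computed once, outside any loop

    # Single pass over the words: O(N) instead of calling
    # list.count() once per word occurrence.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1

    # One probability per distinct word.
    probs = {}
    for word, count in counts.items():
        probs[word] = count / float(total)
    return probs

sample = "the cat saw the dog and the bird"
probs = word_probabilities(sample)
```

This also gives one entry per distinct word, rather than one per occurrence.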

        #COUNTING TOTAL NUMBER OF WORDS
        count2=len(corp_word)
        #COUNTING PROBABILITY OF WORD
        count1_dec=Decimal(count1)
        count2_dec=Decimal(count2)
        getcontext().prec = 6
        prob_count=count1_dec/count2_dec
        print prob_count
        string_of_prob_count=str(prob_count)
        file_input_val=open_file.write(string_of_prob_count)
        open_file.close()

You shouldn't be closing the file until the /entire loop/ has finished
writing to the file. So the previous line should be dedented.
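As a sketch of that fix (the function name, path and values here are placeholders, not the original program): the file stays open for the whole loop and close() runs exactly once afterwards.

```python
import os
import tempfile

def write_all(path, values):
    out = open(path, "w")
    try:
        for value in values:
            # One value per line; the file stays open for the whole loop.
            out.write(str(value) + "\n")
    finally:
        # Dedented out of the loop: runs once, after every value is written.
        out.close()

# Hypothetical demo path and data.
demo_path = os.path.join(tempfile.mkdtemp(), "probabilities.txt")
write_all(demo_path, ["0.1", "0.2", "0.3"])

with open(demo_path) as check:
    lines = check.read().splitlines()
```

A `with open(...) as out:` block does the same job with less code, since it closes the file when the block exits.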

The problems I am getting:

(i) The probability values are not being stored properly in the file.

Also, you're currently not putting any separator between consecutive
entries, so it's all going to run together as one long line.
Have you considered using one of the std lib modules to output the
file in a well-defined human-readable format such as JSON or CSV?
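For instance, a rough sketch using the standard csv module. This is written for Python 3; under Python 2.6 the file would be opened in "wb" mode and without the newline argument. The function name, column headers and sample data are assumptions for illustration.

```python
import csv
import os
import tempfile

def write_probabilities_csv(path, probs):
    # newline="" lets the csv module control line endings (Python 3).
    with open(path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["word", "probability"])
        # One row per word, sorted so the output order is stable.
        for word, prob in sorted(probs.items()):
            writer.writerow([word, prob])

# Hypothetical demo path and data.
demo_path = os.path.join(tempfile.mkdtemp(), "probabilities.csv")
write_probabilities_csv(demo_path, {"the": 0.375, "cat": 0.125})

with open(demo_path, newline="") as src:
    rows = list(csv.reader(src))
```

Each value lands on its own row next to the word it belongs to, so nothing runs together.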

(ii) "Newfile1.txt" is storing not the series of values but
an arbitrary value from the series: 0.00000143096

(iii) As I was testing it again it gave me another error

Traceback (most recent call last):

  File "<pyshell#2>", line 1, in <module>
    sample_file_test(1)
  File "C:\Python26\testprogramforfiles1.py", line 25, in sample_file_test
    file_input_val=open_file.write(string_of_prob_count)

ValueError: I/O operation on closed file
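That traceback is what one would expect when close() sits inside the loop: the first pass writes and closes, and the second pass writes to a closed file. A minimal reproduction, using a throwaway temporary file rather than the original paths:

```python
import os
import tempfile

# Hypothetical demo file.
demo_path = os.path.join(tempfile.mkdtemp(), "closed_demo.txt")

out = open(demo_path, "w")
out.write("first\n")
out.close()  # what happens at the end of the first loop pass

try:
    out.write("second\n")  # what the second loop pass attempts
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```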

Cheers,
Chris

Dennis Lee Bieber · May 3, 2009

On Sun, 3 May 2009 22:21:13 +0530, SUBHABRATA BANERJEE declaimed the
following in gmane.comp.python.general:

from decimal import*
#SAMPLE TEST PROGRAM FOR FILE
def sample_file_test(n):

Is that "n" supposed to mean something? I don't see it used anywhere
in the following stuff.

    #FILE FOR STORING PROBABILITY VALUES
    open_file=open("/python26/Newfile1.txt","r+")

This is your output file, no? So why are you opening it for read
(with direct positioning option)... Oh, and just out of curiosity --
that isn't the Python install directory you are using for your data
files, is it?
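For what it is worth, the mode may matter here: "r+" does not truncate the file, so anything shorter than the old contents leaves stale data behind, while "w" starts from an empty file. Whether this explains the "arbitrary value" is a guess; the sketch below only shows the difference between the two modes, with a made-up helper and a temporary file.

```python
import os
import tempfile

def overwrite(path, mode, text):
    # Write text using the given mode, then return the file's full contents.
    with open(path, mode) as handle:
        handle.write(text)
    with open(path) as handle:
        return handle.read()

# Hypothetical demo file with some old contents.
demo_path = os.path.join(tempfile.mkdtemp(), "modes.txt")
with open(demo_path, "w") as handle:
    handle.write("ABCDEFGH")

# "r+" writes over the start of the file and keeps the rest.
after_r_plus = overwrite(demo_path, "r+", "xy")

# "w" truncates first, so only the new text remains.
after_w = overwrite(demo_path, "w", "xy")
```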

    #OPENING OF ENGLISH CORPUS
    open_corp_eng=open("/python26/TOTALENGLISHCORPUS1.txt","r")

Same comment about directory

    #READING THE ENGLISH CORPUS
    corp_read=open_corp_eng.read()

Let's see, read the entire file into one large string...

    #CONVERTING THE CORPUS FILE IN WORDS
    corp_word=corp_read.split()

Then create a list of words (so you now have, essentially, two
copies of the text in memory). And forgive me, but those names aren't
the most illuminative: open_corp_eng sounds more like a function that is
meant to open something, not the result from opening it...

    #EXTRACTING WORDS FROM CORPUS FILE OF WORDS
    for word in corp_word:
        #COUNTING THE WORD
        count1=corp_word.count(word)

If any given word appears in the text more than once, you end up
repeating this counting operation each time it appears.

        #COUNTING TOTAL NUMBER OF WORDS
        count2=len(corp_word)

This value should not change and should be obtained outside the word
loop.

        #COUNTING PROBABILITY OF WORD
        count1_dec=Decimal(count1)
        count2_dec=Decimal(count2)
        getcontext().prec = 6
        prob_count=count1_dec/count2_dec

Is there some particular reason for using the decimal package here? (And
again, count2_dec should be computed outside the loop.)

        print prob_count
        string_of_prob_count=str(prob_count)
        file_input_val=open_file.write(string_of_prob_count)

Does .write() even return a value? What do you expect
"file_input_val" to contain after that statement? And, as others have
mentioned, .write() does not add newlines or other whitespace, so all
your output would be one long string.
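A small sketch of both points, using an in-memory io.StringIO stream as a stand-in for the output file (Python 3; the sample values are made up). In Python 2 a file object's write() returns None, while in Python 3 it returns the number of characters written; in neither case does it add a separator.

```python
import io

values = ["0.1", "0.2", "0.3"]

# Without a separator, consecutive writes run together.
joined = io.StringIO()
for value in values:
    joined.write(value)
run_together = joined.getvalue()

# Appending "\n" explicitly puts each value on its own line.
separated = io.StringIO()
for value in values:
    separated.write(value + "\n")
one_per_line = separated.getvalue()

# In Python 3, write() returns the number of characters written.
returned = io.StringIO().write("0.1")
```

Either way, the return value of write() is not the data itself, so there is little reason to store it.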

        open_file.close()

Uhm, the first word it calculates and writes a probability for will
be followed by closing the output file -- I suspect this should be
outside the for loop.

Also note that you will be repeating words in the output as there is
no provision to create unique entries.

Does the following seem to do what you need?

-=-=-=-=-=-=-=-=-
"""
wordprob.py     relative probabilities for word appearance in text
may 3 2009      dennis lee bieber

an alternate approach to a problem posted on C.L.P
"""

import sys

def loadData(fid):
    words = {}
    fin = open(fid, "r")
    #only read one line at a time
    for ln in fin:
        #treat upper and lower case words as same
        for wd in ln.lower().split():
            #increment count of specific word
            words[wd] = words.get(wd, 0) + 1
    fin.close()
    return words

def computeProbabilities(wordCounts):
    #get total count of words (and make it float)
    total = float(sum(wordCounts.values()))
    probs = {}
    #for each word, compute the probability
    for (wd, wc) in wordCounts.items():
        probs[wd] = wc / total
    return probs

def writeResults(fid, probs):
    if isinstance(fid, str):
        #a string is taken to be the name of a file to create
        fout = open(fid, "w")
    else:
        #otherwise assume supplied fid is an open stream
        fout = fid
    #sort into descending probability (most common first)
    ordered = sorted(probs.items(), key=lambda x: x[1], reverse=True)
    for (wd, wp) in ordered:
        #write the word followed by probability, newline
        fout.write("%s : %s\n" % (wd, wp))
    if fout is not fid:
        #only close a file that was opened here
        fout.close()

if __name__ == "__main__":
    theData = loadData(YOUR_FILE_NAME_HERE)
    theProbs = computeProbabilities(theData)
    writeResults(sys.stdout, theProbs)
    ## use a file name if desired output is other than screen
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
