Help With EOF character: URGENT

Discussion in 'Python' started by dont bother, Feb 22, 2004.

  1. dont bother

    dont bother Guest

    Hi Buddies,
    I am facing a problem: I don't know what to use
    as EOF in Python.
    I want to read a file and put all the individual
    words in a dictionary with their index.
    For example, if the file is:

    Hello there I am doing fine
    How are you?

    So I want to make an index like this:

    1 Hello
    2 there
    3 I
    4 am
    5 doing
    6 fine
    7 How
    8 are
    9 you
    10 ?

    To do this, I have written a short script,
    which is here:
    -------------------------------------------------------
    # python code for creating dictionary of words from an
    #input file
    ------------------------------------------------------

    import os
    import sys
    try:
        fread = open('training_data', 'r')
    except IOError:
        print 'Cant open file for reading'
        sys.exit(0)
    print 'Okay reading the file'
    s=""
    a=fread.read(1)
    while (a!="\003"):
        #while 1:
        s=s+a

        if(a=='\012'): #newline
            #print s
            #print 'The Line Ends'
            fwrite=open('dictionary', 'a')
            fwrite.write(s)
            s=""


        if(a=='\040'): #blank character
            #print s
            fwrite=open('dictionary', 'a')
            fwrite.write(s)
            fwrite.write("\n")
            s=""
        a=fread.read(1)

    print 'Wrote to Dictionary\n'
    fwrite.close()
    fread.close()


    ---------------------------------------------------

    My problem is that I don't know what to use in place of
    EOF. I have tried the octal escapes "\003" and "\004", but
    that does not work: the code keeps on running. I want
    it to stop reading when the end of the file is reached.
    Can someone help me out with this?
    Also, I have to create a list (a map kind of thing,
    with an index associated with each word). Can someone
    offer a tip or snippet for that?
    I will be really grateful.

    Thanks
    Dont


     
    dont bother, Feb 22, 2004
    #1

  2. Jorge Godoy

    Jorge Godoy Guest

    Have you tried just using "while a:"? When it can't read anything (i.e.
    EOF was reached), read() returns an empty string, which is false in a
    boolean context, so the loop will end.
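
    A minimal sketch of that suggestion, written for Python 3. It uses
    io.StringIO as a stand-in for the real file, and the function name
    words_by_char is an illustrative assumption, not something from the
    original post. It splits on blanks and newlines only, so "you?" stays
    one word here.

```python
import io

def words_by_char(f):
    """Collect whitespace-separated words, reading one character at a time."""
    words = []
    s = ""
    a = f.read(1)
    while a:  # read(1) returns '' at end of file, and '' is falsy
        if a in " \n":
            if s:  # skip empty runs caused by consecutive separators
                words.append(s)
            s = ""
        else:
            s = s + a
        a = f.read(1)
    if s:  # keep the last word when the file does not end in whitespace
        words.append(s)
    return words

# Stand-in for open('training_data', 'r')
sample = io.StringIO("Hello there I am doing fine\nHow are you?\n")
print(words_by_char(sample))
```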


    Be seeing you,
     
    Jorge Godoy, Feb 23, 2004
    #2

  3. Greg Ewing (using news.cis.dfn.de)

    There's no "EOF character" in Python. When the end of a file is
    reached, reading from it returns an empty string. To process
    a file one character at a time, you can do

    while 1:
        c = f.read(1)
        if not c:
            break
        # process c here
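
    The same loop can also be written with the two-argument form of the
    built-in iter(), which keeps calling a function until it returns a
    sentinel value. This is only a sketch for Python 3: the helper name
    chars and the io.StringIO sample are assumptions made for illustration.

```python
import io

def chars(f):
    """Return an iterator over the characters of f, ending at end of file."""
    # iter(callable, sentinel) calls f.read(1) repeatedly and stops
    # as soon as it returns the sentinel '', i.e. at end of file.
    return iter(lambda: f.read(1), "")

f = io.StringIO("Hi you\n")  # stand-in for a real file object
for c in chars(f):
    print(repr(c))  # process c here
```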

    In your case you seem to be dealing with words, so you can
    take advantage of two Python features: (1) You can read
    a line at a time with the readline() method. (2) You can
    split a string into words with the split() method of strings.

    while 1:
        line = f.readline()
        if not line:
            break
        words = line.split()
        for word in words:
            pass  # process word here

    If you have a recent enough Python (>= 2.2 I think), you can
    also iterate directly over the file, which will iterate over
    its lines, so the above reduces to just

    for line in f:
        words = line.split()
        for word in words:
            pass  # process word here

    Note: The readline() method, and also "for line in f", returns
    lines including the newline character on the end. That doesn't
    matter here, because line.split() gets rid of all the whitespace,
    but you need to be aware of it if you do other things with
    the line. You can use

    line = line.rstrip('\n')

    to remove just the newline if you need to, or line.strip() to
    remove all leading and trailing whitespace.
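
    For the second part of the question (an index associated with each
    word), here is a rough sketch in Python 3. The name build_index and the
    regular expression are assumptions for illustration: the pattern treats
    each punctuation mark as its own entry, so that "you?" becomes "you" and
    "?" as in the example in the first post. line.split() alone would keep
    "you?" together.

```python
import io
import re

def build_index(f):
    """Return a dict mapping 1, 2, 3, ... to the tokens of f in reading order."""
    tokens = []
    for line in f:
        # \w+ matches a run of word characters;
        # [^\w\s] matches a single punctuation character.
        tokens.extend(re.findall(r"\w+|[^\w\s]", line))
    # enumerate(tokens, 1) numbers the tokens starting from 1
    return dict(enumerate(tokens, 1))

# Stand-in for open('training_data', 'r')
sample = io.StringIO("Hello there I am doing fine\nHow are you?\n")
index = build_index(sample)
for number, token in sorted(index.items()):
    print(number, token)
```

    If you need to look up a word's number instead, build the dictionary
    the other way round, keeping in mind that a word occurring twice would
    then need a rule for which position to keep.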
     
    Greg Ewing (using news.cis.dfn.de), Feb 23, 2004
    #3
