String Format Conversion

Discussion in 'Python' started by mcg, Jan 27, 2005.

  1. mcg

    mcg Guest

    Investigating python day 1:

    Data in file:
    x y
    1 2
    3 4
    5 6


    Want to read file into an array of pairs.

    in c: scanf("%d %d",&x,&y)---store x y in array, loop.

    How do I do this in python??
    In the actual application, the pairs are floating pt i.e. -1.003
    mcg, Jan 27, 2005
    #1

  2. On 26 Jan 2005 20:53:02 -0800, mcg wrote:
    > Investigating python day 1:
    >
    > Data in file:
    > x y
    > 1 2
    > 3 4
    > 5 6
    >
    > Want to read file into an array of pairs.
    >
    > in c: scanf("%d %d",&x,&y)---store x y in array, loop.
    >
    > How do I do this in python??
    > In the actual application, the pairs are floating pt i.e. -1.003


    f = file('input', 'r')
    labels = f.readline() # consume the first line of the file.

    Easy Option:
    for line in f.readlines():
        x, y = line.split()
        x = float(x)
        y = float(y)

    Or, more concisely:
    for line in f.readlines():
        x, y = map(float, line.split())

    Regards,
    Stephen Thorne
    Stephen Thorne, Jan 27, 2005
    #2
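
For anyone reading this on Python 3, where the `file()` builtin no longer exists, here is a minimal sketch of the same approach using `open()`. The function name `read_pairs` and the temporary sample file are assumptions made for illustration, not part of the original reply.

```python
import os
import tempfile


def read_pairs(path):
    """Return a list of (x, y) float tuples, skipping the header line."""
    pairs = []
    with open(path) as f:            # open() replaces the Python 2 file() builtin
        f.readline()                 # consume the "x y" header line
        for line in f:
            if not line.strip():     # tolerate blank lines
                continue
            x, y = map(float, line.split())
            pairs.append((x, y))
    return pairs


# Write the sample data from the question to a temporary file and read it back.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x y\n1 2\n3 4\n5 6\n")
sample_pairs = read_pairs(tmp.name)
os.remove(tmp.name)
print(sample_pairs)  # [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
```

Because the fields go through `float()`, the same code handles values such as -1.003 without changes.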

  3. Stephen Thorne wrote:
    > f = file('input', 'r')
    > labels = f.readline() # consume the first line of the file.
    >
    > Easy Option:
    > for line in f.readlines():
    >     x, y = line.split()
    >     x = float(x)
    >     y = float(y)
    >
    > Or, more concisely:
    > for line in f.readlines():
    >     x, y = map(float, line.split())


    Somewhat more memory efficient:

    lines_iter = iter(file('input'))
    labels = lines_iter.next()
    for line in lines_iter:
        x, y = [float(f) for f in line.split()]

    By using the iterator instead of readlines, I read only one line from
    the file into memory at once, instead of all of them. This may or may
    not matter depending on the size of your files, but using iterators is
    generally more scalable, though of course it's not always possible.

    I also opted to use a list comprehension instead of map, but this is
    totally a matter of personal preference -- the performance differences
    are probably negligible.

    Steve
    Steven Bethard, Jan 27, 2005
    #3
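
The line-at-a-time idea above could be sketched in Python 3 roughly as follows. The generator name `iter_pairs` is hypothetical, and `io.StringIO` stands in for a real file so the example is self-contained; the built-in `next()` takes the place of the Python 2 `.next()` method.

```python
import io


def iter_pairs(lines):
    """Lazily yield (x, y) float tuples from an iterable of text lines."""
    lines = iter(lines)
    next(lines, None)                # skip the header; tolerate empty input
    for line in lines:
        if line.strip():             # ignore blank lines
            x, y = (float(field) for field in line.split())
            yield x, y


# StringIO stands in for an open file here; any iterable of lines works.
sample = io.StringIO("x y\n-1.003 2.5\n3 4\n")
pairs = list(iter_pairs(sample))
print(pairs)  # [(-1.003, 2.5), (3.0, 4.0)]
```

Only one line is held at a time until the caller decides to collect the results, which is what `list(...)` does in the last step.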
  4. mcg wrote:
    > Investigating python day 1:
    >
    > Data in file:
    > x y
    > 1 2
    > 3 4
    > 5 6
    >
    >
    > Want to read file into an array of pairs.
    >
    > in c: scanf("%d %d",&x,&y)---store x y in array, loop.
    >
    > How do I do this in python??
    > In the actual application, the pairs are floating pt i.e. -1.003
    >


    Either do what the other posters wrote, or if you really like scanf
    try the following Python module:

    Scanf --- a pure Python scanf-like module
    http://hkn.eecs.berkeley.edu/~dyoo/python/scanf/

    Bye,
    Dennis
    Dennis Benzinger, Jan 27, 2005
    #4
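
If a scanf-style pattern is what you are after and you would rather stay in the standard library, a regular expression can play a similar role. This is only a rough sketch: it does not use or imitate the API of the scanf module linked above, and the names `PAIR_RE` and `scan_pair` are made up for illustration.

```python
import re

# Rough stand-in for scanf's "%f %f": two signed decimal numbers
# separated by whitespace, with an optional exponent.
_FLOAT = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
PAIR_RE = re.compile(r"\s*(%s)\s+(%s)\s*$" % (_FLOAT, _FLOAT))


def scan_pair(line):
    """Return (x, y) as floats, or None if the line does not match."""
    m = PAIR_RE.match(line)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))


print(scan_pair("-1.003 2\n"))  # (-1.003, 2.0)
print(scan_pair("x y\n"))       # None
```

Returning `None` for non-matching lines means the header row can be skipped without special handling.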
  5. On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard wrote:
    > Stephen Thorne wrote:
    > > f = file('input', 'r')
    > > labels = f.readline() # consume the first line of the file.
    > >
    > > Easy Option:
    > > for line in f.readlines():
    > >     x, y = line.split()
    > >     x = float(x)
    > >     y = float(y)
    > >
    > > Or, more concisely:
    > > for line in f.readlines():
    > >     x, y = map(float, line.split())

    >
    > Somewhat more memory efficient:
    >
    > lines_iter = iter(file('input'))
    > labels = lines_iter.next()
    > for line in lines_iter:
    >     x, y = [float(f) for f in line.split()]
    >
    > By using the iterator instead of readlines, I read only one line from
    > the file into memory at once, instead of all of them. This may or may
    > not matter depending on the size of your files, but using iterators is
    > generally more scalable, though of course it's not always possible.


    I just did a teensy test. All three options used exactly the same
    amount of total memory.

    I did all I did in the name of clarity, considering the OP was on his
    first day with python. How I would actually write it would be:

    inputfile = file('input','r')
    inputfile.readline()
    data = [map(float, line.split()) for line in inputfile]

    Notice how you don't have to call iter() on it, you can treat it as an
    iterable to begin with.

    Stephen.
    Stephen Thorne, Jan 27, 2005
    #5
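
A Python 3 sketch of the compact version, for comparison. One difference worth knowing: in Python 3 `map()` returns a lazy iterator, so each row has to be wrapped in `tuple()` (or `list()`) to get actual pairs. The name `load_pairs` is hypothetical, and `io.StringIO` stands in for the input file.

```python
import io


def load_pairs(f):
    """Read the header, then build a list of float tuples from the rest."""
    f.readline()                     # discard the "x y" header
    # tuple(...) is needed because map() is lazy in Python 3.
    return [tuple(map(float, line.split())) for line in f if line.strip()]


data = load_pairs(io.StringIO("x y\n1 2\n3 4\n5 6\n"))
print(data)  # [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
```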
  6. Stephen Thorne wrote:
    > I did all I did in the name of clarity, considering the OP was on his
    > first day with python. How I would actually write it would be:
    >
    > inputfile = file('input','r')
    > inputfile.readline()
    > data = [map(float, line.split()) for line in inputfile]
    >
    > Notice how you don't have to call iter() on it, you can treat it as an
    > iterable to begin with.


    Beware of mixing iterator methods and readline:

    http://docs.python.org/lib/bltin-file-objects.html

    next( )
    ...In order to make a for loop the most efficient way of looping
    over the lines of a file (a very common operation), the next() method
    uses a hidden read-ahead buffer. As a consequence of using a read-ahead
    buffer, combining next() with other file methods (like readline()) does
    not work right.

    I haven't tested your code in particular, but this warning was enough to
    make me generally avoid mixing iter methods and other methods.

    Steve
    Steven Bethard, Jan 27, 2005
    #6
  7. Steven Bethard wrote:
    ...
    > Beware of mixing iterator methods and readline:


    _mixing_, yes. But -- starting the iteration after some other kind of
    reading (readline, or read(N), etc) -- is OK...


    > http://docs.python.org/lib/bltin-file-objects.html
    >
    > next( )
    > ...In order to make a for loop the most efficient way of looping
    > over the lines of a file (a very common operation), the next() method
    > uses a hidden read-ahead buffer. As a consequence of using a read-ahead
    > buffer, combining next() with other file methods (like readline()) does
    > not work right.
    >
    > I haven't tested your code in particular, but this warning was enough to
    > make me generally avoid mixing iter methods and other methods.


    Yeah, I know... it's hard to explain exactly what IS a problem and what
    isn't -- not to mention that this IS to some extent a matter of the file
    object's implementation and the docs can't/don't want to constrain the
    implementer's future freedom, should it turn out to matter. Sigh.

    In the Nutshell (2nd ed), which is not normative and thus gives me a tad
    more freedom, I have tried to be a tiny bit more specific, taking
    advantage, also, of the fact that I'm now addressing the 2.3 and 2.4
    implementations, only. Quoting from my current draft (pardon the XML
    markup...):

    """
    interrupting such a loop prematurely (e.g., with <c>break</c>), or
    calling <r>f</r><c>.next()</c> instead of <r>f</r><c>.readline()</c>,
    leaves the file's current position at an arbitrary value. If you want
    to switch from using <r>f</r> as an iterator to calling other reading
    methods on <r>f</r>, be sure to set the file's current position to a
    known value by appropriately calling <r>f</r><c>.seek</c>.
    """

    I hope this concisely indicates that the problem (in today's current
    implementations) is only with switching FROM iteration TO other
    approaches to reading, and (if the file is seekable) there's nothing so
    problematic here that a good old 'seek' won't cure...


    Alex
    Alex Martelli, Jan 27, 2005
    #7
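
The two safe patterns described above (ordinary reads first and iteration afterwards, or a `seek` back to a known position after iterating) might look like this. The sketch is written for Python 3 and uses a throwaway temporary file; the file name and variable names are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "input.txt")
    with open(path, "w") as out:
        out.write("x y\n1 2\n3 4\n5 6\n")

    with open(path) as f:
        header = f.readline()        # ordinary read first...
        first_row = next(f)          # ...then start iterating, and stop early

        f.seek(0)                    # return to a known position
        header_again = f.readline()  # other read methods are reliable again
        second_pass = [line.split() for line in f]

print(header == header_again)  # True
print(second_pass)             # [['1', '2'], ['3', '4'], ['5', '6']]
```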
  8. Alex Martelli wrote:
    > Steven Bethard wrote:
    > ...
    >
    >>Beware of mixing iterator methods and readline:

    >

    [snip]
    >
    > I hope this concisely indicates that the problem (in today's current
    > implementations) is only with switching FROM iteration TO other
    > approaches to reading, and (if the file is seekable) there's nothing so
    > problematic here that a good old 'seek' won't cure...


    Thanks for the clarification!

    Steve
    Steven Bethard, Jan 27, 2005
    #8
  9. Jeff Shannon

    Jeff Shannon Guest

    Stephen Thorne wrote:

    > On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard wrote:
    >
    >>By using the iterator instead of readlines, I read only one line from
    >>the file into memory at once, instead of all of them. This may or may
    >>not matter depending on the size of your files, but using iterators is
    >>generally more scalable, though of course it's not always possible.

    >
    > I just did a teensy test. All three options used exactly the same
    > amount of total memory.


    I would presume that, for a small file, the entire contents of the
    file will be sucked into the read buffer implemented by the underlying
    C file library. An iterator will only really save memory consumption
    when the file size is greater than that buffer's size.

    Actually, now that I think of it, there's probably another copy of the
    data at Python level. For readlines(), that copy is the list object
    itself. For iter and iter.next(), it's in the iterator's read-ahead
    buffer. So perhaps memory savings will occur when *that* buffer size
    is exceeded. It's also quite possible that both buffers are the same
    size...

    Anyhow, I'm sure that the fact that they use the same size for your
    test is a reflection of buffering. The next question is, which
    provides the most *conceptual* simplicity? (The answer to that one, I
    think, depends on how your brain happens to see things...)

    Jeff Shannon
    Technician/Programmer
    Credit International
    Jeff Shannon, Jan 27, 2005
    #9
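
One way to check the buffering explanation on a file larger than any read buffer is to measure peak allocations directly. The sketch below uses the standard `tracemalloc` module on Python 3; the helper names, the 50,000-line file, and the choice to sum only the first column are all assumptions made for illustration. The exact byte counts vary by platform and Python version, so none are shown.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(read_func, path):
    """Run read_func(path) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    read_func(path)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def sum_with_readlines(path):
    with open(path) as f:
        f.readline()                 # skip header
        # readlines() builds a list holding every remaining line at once.
        return sum(float(line.split()[0]) for line in f.readlines())


def sum_with_iteration(path):
    with open(path) as f:
        f.readline()                 # skip header
        # Iterating keeps only the current line (plus buffers) alive.
        return sum(float(line.split()[0]) for line in f)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "big.txt")
    with open(path, "w") as out:
        out.write("x y\n")
        out.writelines("%d %d\n" % (i, i + 1) for i in range(50000))

    peak_readlines = peak_bytes(sum_with_readlines, path)
    peak_iteration = peak_bytes(sum_with_iteration, path)

print("readlines peak:", peak_readlines, "bytes")
print("iteration peak:", peak_iteration, "bytes")
```

With a three-line file like the one in the question, the difference is too small to notice, which is consistent with the "teensy test" result quoted above.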
  10. enigma

    enigma Guest

    Do you really need to use the iter function here? As far as I can
    tell, a file object is already an iterator. The file object
    documentation says that, "[a] file object is its own iterator, for
    example iter(f) returns f (unless f is closed)." It doesn't look like
    it makes a difference one way or the other, I'm just curious.
    enigma, Jan 27, 2005
    #10
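
The quoted documentation claim is easy to check. The sketch below is for Python 3, where the claim still holds for objects returned by `open()`; a list is included for contrast as an object that is iterable but is not its own iterator. The temporary file and variable names are assumptions for illustration.

```python
import os
import tempfile

pairs = [(1.0, 2.0), (3.0, 4.0)]
list_is_own_iterator = iter(pairs) is pairs   # a list hands out a separate iterator

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "temp.txt")
    with open(path, "w") as out:
        out.write("x y\n1 2\n")
    with open(path) as f:
        file_is_own_iterator = iter(f) is f   # a file object returns itself

print(list_is_own_iterator)  # False
print(file_is_own_iterator)  # True
```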
  11. enigma wrote:
    > Do you really need to use the iter function here? As far as I can
    > tell, a file object is already an iterator. The file object
    > documentation says that, "[a] file object is its own iterator, for
    > example iter(f) returns f (unless f is closed)." It doesn't look like
    > it makes a difference one way or the other, I'm just curious.


    Nope, you're right -- that's just my obsessive-compulsive disorder
    kicking in. ;) A lot of objects aren't their own iterators, so I tend
    to ask for an iterator with iter() when I know I want one. But for
    files, this definitely isn't necessary:

    py> file('temp.txt', 'w').write("""\
    ... x y
    ... 1 2
    ... 3 4
    ... 5 6
    ... """)
    py> f = file('temp.txt')
    py> f.next()
    'x y\n'
    py> for line in f:
    ...     print [float(f) for f in line.split()]
    ...
    [1.0, 2.0]
    [3.0, 4.0]
    [5.0, 6.0]

    And to illustrate Alex Martelli's point that using readline, etc. before
    using the file as an iterator is fine:

    py> f = file('temp.txt')
    py> f.readline()
    'x y\n'
    py> for line in f:
    ...     print [float(f) for f in line.split()]
    ...
    [1.0, 2.0]
    [3.0, 4.0]
    [5.0, 6.0]

    But using readline, etc. after using the file as an iterator is *not*
    fine, generally:

    py> f = file('temp.txt')
    py> f.next()
    'x y\n'
    py> f.readline()
    ''

    In this case, if my understanding's right, the entire file contents have
    been read into the iterator buffer, so readline thinks the entire file's
    been read in and gives you '' to indicate this.

    Steve
    Steven Bethard, Jan 27, 2005
    #11
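
The sessions above are from Python 2, whose file objects kept a separate read-ahead buffer for iteration. As a point of comparison, here is the last experiment sketched for Python 3, assuming a text-mode file from `open()`: there, `next(f)` and `f.readline()` draw from the same buffer, so the `readline()` call returns the second line rather than an empty string. The temporary file is an assumption for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "temp.txt")
    with open(path, "w") as out:
        out.write("x y\n1 2\n3 4\n5 6\n")

    with open(path) as f:
        first = next(f)              # Python 3 spelling of f.next()
        second = f.readline()        # continues where next() left off
        rest = [[float(v) for v in line.split()] for line in f]

print(repr(first))   # 'x y\n'
print(repr(second))  # '1 2\n'
print(rest)          # [[3.0, 4.0], [5.0, 6.0]]
```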