Is a Pythonic version of scanf() or sscanf() planned?

Discussion in 'Python' started by ryniek90, Oct 3, 2009.

  1. ryniek90

    ryniek90 Guest

    Hi

    I know that in Python we can do the same with regexps or *.split()*,
    but that's a longer and less practical method than *scanf()*. I also
    found this recipe ( http://code.activestate.com/recipes/502213/ ), but
    the code doesn't look so simple for beginners. So, is a core Python
    implementation of *scanf()* planned, or has one ever been planned?
    (preferably as a batteries-included method)

    Thanks for any helpful answers.
    ryniek90, Oct 3, 2009
    #1

  2. On 2009-10-03, ryniek90 <> wrote:

    > So, whether it is or has been planned the core Python
    > implementation of *scanf()* ?


    One of the first things I remember being taught as a C programmer
    was to never use scanf. Programs that use scanf tend to fail
    in rather spectacular ways when presented with simple typos and
    other forms of unexpected input.

    Given the bad behavior and general fragility of scanf(), I
    doubt there's much demand for something equally broken for
    Python.

    > Thanks for any helpful answers.


    Not sure if mine was helpful...
    Grant Edwards, Oct 4, 2009
    #2

  3. On Sun, 4 Oct 2009 01:17:18 +0000 (UTC),
    Grant Edwards <> wrote:
    > On 2009-10-03, ryniek90 <> wrote:
    >
    >> So, whether it is or has been planned the core Python
    >> implementation of *scanf()* ?

    >
    > One of the first things I remember being taught as a C programmer
    > was to never use scanf. Programs that use scanf tend to fail
    > in rather spectacular ways when presented with simple typos and
    > other forms of unexpected input.


    That's right. One shouldn't use scanf() if the input is unpredictable
    or comes from people, because the pitfalls are many and hard to avoid.
    However, for input that is formatted, scanf() is perfectly fine, and
    nice and fast.

    fgets() with sscanf() gives you better control if your input is not
    as well guaranteed.

    > Given the bad behavior and general fragility of scanf(), I
    > doubt there's much demand for something equally broken for
    > Python.


    scanf() is not broken. It's just hard to use correctly for unpredictable
    input.

    Having something equivalent in Python would be nice where most or all of
    your input is numerical, i.e. floats or integers. Using the re module,
    or splitting and converting everything with int() or float() slows down
    your program rather spectacularly. If those conversions could be done
    internally, and before storing the input as Python strings, the speed
    improvements could be significant.

    There is too much storing, splitting, regex matching and converting
    going on if you need to read numerical data from columns in a file.
    scanf() and friends make this sort of task rather quick and easy.

    For example, if your data is the output of a numerical analysis program
    or data coming from a set of measuring probes, it often takes the form
    of one or more columns of numbers, and there can be many of them. If you
    want to take one of these output files, and transform the data, Python
    can be terribly slow.

    It doesn't have to be scanf(), but something that would allow the direct
    reading of text input as numerical data would be nice.

    On the other hand, if something really needs to be fast, I generally
    write it in C anyway :)

    Martien
    --
    |
    Martien Verbruggen | Unix is user friendly. It's just
    | selective about its friends.
    |
    Martien Verbruggen, Oct 4, 2009
    #3
  4. ryniek90

    Simon Forman Guest

    On Sun, Oct 4, 2009 at 5:29 AM, Martien Verbruggen
    <> wrote:
    > On Sun, 4 Oct 2009 01:17:18 +0000 (UTC),
    >        Grant Edwards <> wrote:
    >> On 2009-10-03, ryniek90 <> wrote:
    >>
    >>> So, whether it is or has been planned the core Python
    >>> implementation of *scanf()* ?

    >>
    >> One of the first things I remember being taught as a C programmer
    >> was to never use scanf.  Programs that use scanf tend to fail
    >> in rather spectacular ways when presented with simple typos and
    >> other forms of unexpected input.

    >
    > That's right. One shouldn't use scanf() if the input is unpredictable
    > or comes from people, because the pitfalls are many and hard to avoid.
    > However, for input that is formatted, scanf() is perfectly fine, and
    > nice and fast.
    >
    > fgets() with sscanf() is better to control if your input is not as
    > guaranteed.
    >
    >> Given the bad behavior and general fragility of scanf(), I
    >> doubt there's much demand for something equally broken for
    >> Python.

    >
    > scanf() is not broken. It's just hard to use correctly for unpredictable
    > input.
    >
    > Having something equivalent in Python would be nice where most or all of
    > your input is numerical, i.e. floats or integers. Using the re module,
    > or splitting and converting everything with int() or float() slows down
    > your program rather spectacularly. If those conversions could be done
    > internally, and before storing the input as Python strings, the speed
    > improvements could be significant.
    >
    > There is too much storing, splitting, regex matching and converting
    > going on if you need to read numerical data from columns in a file.
    > scanf() and friends make this sort of task rather quick and easy.
    >
    > For example, if your data is the output of a numerical analysis program
    > or data coming from a set of measuring probes, it often takes the form
    > of one or more columns of numbers, and there can be many of them. If you
    > want to take one of these output files, and transform the data, Python
    > can be terribly slow.
    >
    > It doesn't have to be scanf(), but something that would allow the direct
    > reading of text input as numerical data would be nice.
    >
    > On the other hand, if something really needs to be fast, I generally
    > write it in C anyway :)
    >
    > Martien


    I haven't tried it but couldn't you use scanf from ctypes?
    Simon Forman, Oct 4, 2009
    #4
  5. On Sun, 4 Oct 2009 13:18:22 -0400,
    Simon Forman <> wrote:
    > On Sun, Oct 4, 2009 at 5:29 AM, Martien Verbruggen
    ><> wrote:
    >> On Sun, 4 Oct 2009 01:17:18 +0000 (UTC),
    >>        Grant Edwards <> wrote:
    >>> On 2009-10-03, ryniek90 <> wrote:
    >>>
    >>>> So, whether it is or has been planned the core Python
    >>>> implementation of *scanf()* ?


    >>> Given the bad behavior and general fragility of scanf(), I
    >>> doubt there's much demand for something equally broken for
    >>> Python.

    >>
    >> scanf() is not broken. It's just hard to use correctly for unpredictable
    >> input.
    >>
    >> Having something equivalent in Python would be nice where most or all of
    >> your input is numerical, i.e. floats or integers. Using the re module,
    >> or splitting and converting everything with int() or float() slows down
    >> your program rather spectacularly. If those conversions could be done
    >> internally, and before storing the input as Python strings, the speed
    >> improvements could be significant.


    > I haven't tried it but couldn't you use scanf from ctypes?


    I have just tried it. I wasn't aware of ctypes, being relatively new to
    Python. :)

    However, using ctypes makes the simple test program I wrote
    actually slower, rather than faster. Probably the extra conversions
    needed between ctypes' internal types and Python's eat up more time.
    Built-in scanf()-like functionality would not need to convert the same
    information two or three times. It would parse the bytes coming in from
    the input stream directly, and set the values of the appropriate Python
    variables directly.

    A contrived example:
    Assume an input file with two integers and three floats per line,
    separated by spaces. The output should be the same two integers,
    followed by the average of the three floats.

    In pure Python, there is string manipulation (file.readline() and
    split()) and conversion of floats going on:

    from sys import *
    for line in stdin:
        a, b, u, v, w = line.split()
        print a, " ", b, " ", (float(u) + float(v) + float(w))/3.0

    (17.57s user 0.07s system 99% cpu 17.728 total)

    With ctypes, it becomes something like:

    from sys import *
    from ctypes import *
    from ctypes.util import find_library

    libc = cdll.LoadLibrary(find_library('c'))
    a = c_int()
    b = c_int()
    u = c_float()
    v = c_float()
    w = c_float()
    for line in stdin:
        libc.sscanf(line, "%d%d%f%f%f",
                    byref(a), byref(b), byref(u), byref(v), byref(w))
        print "{0} {1} {2}".format(a.value, b.value,
                                   (u.value + v.value + w.value)/3.0)

    (22.21s user 0.10s system 98% cpu 22.628)

    We no longer need split(), and the three conversions from string to
    float(), but now we have the 5 c_types(), and the .value dereferences at
    the end. And that makes it slower, unfortunately. (Maybe I'm still doing
    things a bit clumsily and it could be faster)

    It's not really a big deal: As I said before, if I really need the
    speed, I'll write C:

    #include <stdio.h>
    int main(void)
    {
        int a, b;
        float u, v, w;

        while (scanf("%d%d%f%f%f", &a, &b, &u, &v, &w) == 5)
            printf("%d %d %f\n", a, b, (u + v + w)/3.0);

        return 0;
    }

    (5.96s user 0.06s system 99% cpu 6.042 total)

    Martien
    --
    |
    Martien Verbruggen | There is no reason anyone would want a
    | computer in their home. -- Ken Olson,
    | president DEC, 1977
    Martien Verbruggen, Oct 4, 2009
    #5
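For readers on Python 3, the pure-Python loop from the post above might look roughly like the sketch below. The function name `summarize_line` is made up for illustration, and no timing claims are made for it; the figures quoted in the post were measured on the poster's own 2009 setup.

```python
def summarize_line(line):
    """Return 'a b average' for a line holding two ints and three floats."""
    a, b, u, v, w = line.split()
    average = (float(u) + float(v) + float(w)) / 3.0
    # The two integers are passed through as text, as in the original loop.
    return f"{a} {b} {average}"


# Hypothetical sample lines standing in for the input file.
for sample in ["1 2 3.0 4.0 5.0", "7 8 1.5 2.5 3.5"]:
    print(summarize_line(sample))
```

A line with the wrong number of fields, or with a field that is not a number, raises ValueError here rather than being silently misread.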
  6. ryniek90

    TerryP Guest

    In the last 4 years, I have never missed functions like .*scanf() or
    atoi().

    It's probably a greeaaat thing that Python provides neither as
    built-ins (per se).
    TerryP, Oct 4, 2009
    #6
  7. On Sun, 4 Oct 2009 15:48:16 -0700 (PDT), TerryP <>
    declaimed the following in gmane.comp.python.general:

    > In the last 4 years, I have never missed functions like .*scanf() or
    > atoi().
    >
    > It's probably a greeaaat thing that Python provides neither as
    > built-ins (per se).


    Uhm... Isn't the second one spelled "int()" <G>
    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Oct 6, 2009
    #7
  8. ryniek90

    ryniek90 Guest

    On Oct 6, 06:37, Dennis Lee Bieber <> wrote:
    > On Sun, 4 Oct 2009 15:48:16 -0700 (PDT), TerryP <>
    > declaimed the following in gmane.comp.python.general:
    >
    > > In the last 4 years, I have never missed functions like .*scanf() or
    > > atoi().

    >
    > > It's probably a greeaaat thing that Python provides neither as
    > > built-ins (per se).

    >
    >         Uhm... Isn't the second one spelled "int()" <G>
    > --
    >         Wulfraed         Dennis Lee Bieber               KD6MOG
    >              HTTP://wlfraed.home.netcom.com/




    OK, thanks all for the answers. Apart from .split() and regexps,
    there's nothing of interest.
    But I remember that the lambda function was also unwelcome in Python,
    yet in the end it is there and doing well. So maybe someday someone
    will decide to put an alternative, really great implementation of
    scanf() into Python?
    ryniek90, Oct 8, 2009
    #8
  9. ryniek90

    Ben Sizer Guest

    On Oct 3, 11:06 pm, ryniek90 <> wrote:
    > Hi
    >
    > I know that in Python we can do the same with regexps or *.split()*,
    > but that's a longer and less practical method than *scanf()*. I also
    > found this recipe (http://code.activestate.com/recipes/502213/), but
    > the code doesn't look so simple for beginners. So, is a core Python
    > implementation of *scanf()* planned, or has one ever been planned?
    > (preferably as a batteries-included method)


    Perhaps struct.unpack is close to what you need? Admittedly that
    doesn't read from a file, but that might not be a problem in most
    cases.

    --
    Ben Sizer
    Ben Sizer, Oct 8, 2009
    #9
  10. 2009/10/8 ryniek90 <>:
    > Ok thanks all for answers. Not counting .split() methods and regexps,
    > there's nothing interesting.
    > But I remember that lambda function also was unwelcome in Python, but
    > finally it is and is doing well. So maybe someone, someday decide to
    > put in Python an alternative, really great implementation of scanf() ?


    Write one, post it on Google Code, the Python cookbook or somewhere,
    and if the world beats a path to your door then we'll talk.

    --
    Cheers,
    Simon B.
    Simon Brunning, Oct 8, 2009
    #10
  11. ryniek90

    Terry Reedy Guest

    ryniek90 wrote:
    > On Oct 6, 06:37, Dennis Lee Bieber <> wrote:
    >> On Sun, 4 Oct 2009 15:48:16 -0700 (PDT), TerryP <>
    >> declaimed the following in gmane.comp.python.general:
    >>
    >>> In the last 4 years, I have never missed functions like .*scanf() or
    >>> atoi().
    >>> It's probably a greeaaat thing that Python provides neither as
    >>> built-ins (per se).

    >> Uhm... Isn't the second one spelled "int()" <G>
    >> --
    >> Wulfraed Dennis Lee Bieber KD6MOG
    >> HTTP://wlfraed.home.netcom.com/

    >
    >
    >
    > Ok thanks all for answers. Not counting .split() methods and regexps,
    > there's nothing interesting.
    > But I remember that lambda function also was unwelcome in Python, but
    > finally it is and is doing well. So maybe someone, someday decide to
    > put in Python an alternative, really great implementation of scanf() ?


    scanf does three things: parses string fields out of text, optionally
    converts strings to numbers, and puts the results into pointed-to boxes.
    Since Python does not have pointer types, a Python function cannot very
    well do the last, but has to return the tuple of objects. However, if a
    format string has named rather than positional fields, a Python function
    could either return a dict or set attributes on an object. That could be
    useful.

    If I were doing this, I would look into using the new str.format()
    strings rather than %-formatted strings.
    Terry Reedy, Oct 8, 2009
    #11
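The named-field idea in the post above could be sketched along these lines. Everything here is an assumption made for illustration: the name `scan_named`, and a tiny format syntax that borrows the look of str.format() fields, with `{name:d}` for an integer and `{name:f}` for a float. It is not an existing library API.

```python
import re

# Hypothetical mini-format: "{name:d}" is an int field, "{name:f}" a float.
_FIELD = re.compile(r"\{([A-Za-z_]\w*):([df])\}")
_PATTERNS = {
    "d": r"[-+]?\d+",
    "f": r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?",
}
_CONVERTERS = {"d": int, "f": float}


def _literal(chunk):
    # Escape literal text; each whitespace run must match some whitespace.
    return r"\s+".join(re.escape(piece) for piece in re.split(r"\s+", chunk))


def scan_named(fmt, text):
    """Return {field name: converted value}, or None if text does not match."""
    converters = {}
    parts = []
    pos = 0
    for m in _FIELD.finditer(fmt):
        parts.append(_literal(fmt[pos:m.start()]))
        name, kind = m.group(1), m.group(2)
        parts.append("(?P<%s>%s)" % (name, _PATTERNS[kind]))
        converters[name] = _CONVERTERS[kind]
        pos = m.end()
    parts.append(_literal(fmt[pos:]))
    match = re.match(r"\s*" + "".join(parts), text)
    if match is None:
        return None
    return {name: conv(match.group(name)) for name, conv in converters.items()}


print(scan_named("{a:d} {b:d} {avg:f}", "3 4 2.5"))
```

Returning a dict sidesteps the missing pointer types; setting attributes on an object instead would be a small change, for example by passing the dict to types.SimpleNamespace.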
  12. On Thu, 8 Oct 2009 09:33:56 -0700 (PDT), Ben Sizer <>
    declaimed the following in gmane.comp.python.general:

    >
    > Perhaps struct.unpack is close to what you need? Admittedly that
    > doesn't read from a file, but that might not be a problem in most
    > cases.
    >

    I suspect the biggest drawback is that it doesn't do string->numeric
    conversions, so one still has to run int(), float(), whatever() on the
    fields.

    It works great though if one needs to split up fixed-width records
    which may not have delimiters, or is working with binary records in
    which the data is already numeric.

    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Oct 8, 2009
    #12
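A small sketch of the fixed-width case described above: struct.unpack slices the record into fields, and int() and float() still have to be called on the pieces afterwards. The record layout (4-character id, 6-character count, 8-character reading) and the name `parse_record` are invented for this example.

```python
import struct

# Hypothetical fixed-width layout: 4-byte id, 6-byte count, 8-byte reading.
RECORD = struct.Struct("4s6s8s")


def parse_record(raw):
    """Split an 18-byte record into fields, then convert the numeric ones."""
    ident, count, reading = (field.decode("ascii") for field in RECORD.unpack(raw))
    # struct only slices the bytes; the string-to-number step is still manual.
    return ident.strip(), int(count), float(reading)


print(parse_record(b"AB12    42   3.250"))
```

struct.unpack raises struct.error if the record is not exactly RECORD.size bytes long, so short or overlong lines are caught before any conversion happens.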
  13. ryniek90 wrote:
    > So maybe someone, someday decide to
    > put in Python an alternative, really great implementation of scanf() ?


    My idea of a "great scanf() function" would be a clever combination of
    re.match(), int(), and float().

    j
    Joshua Kugler, Oct 9, 2009
    #13
  14. ryniek90

    TerryP Guest

    On Oct 9, 5:59 pm, Joshua Kugler <> wrote:
    > ryniek90 wrote:
    > > So maybe someone, someday decide to
    > > put in Python an alternative, really great implementation of scanf() ?

    >
    > My idea of a "great scanf() function" would be a clever combination of
    > re.match(), int(), and float().
    >
    > j


    Actually, the Python documentation has something interesting:
    http://docs.python.org/3.1/library/re.html#simulating-scanf
    TerryP, Oct 12, 2009
    #14
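The combination of re.match(), int() and float() suggested in the last two posts might be sketched as follows. The regular expressions for %d, %f and %s follow the table in the linked "Simulating scanf()" section of the re documentation; the function name `sscanf` and the rest of the scaffolding are assumptions for illustration. Only those three conversions are handled, and it is not a full emulation of C's behaviour (for instance, adjacent %d%d fields can split a single run of digits).

```python
import re

# Regex equivalents for a few scanf conversions, paired with a converter.
_CONVERSIONS = {
    "d": (r"[-+]?\d+", int),
    "f": (r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?", float),
    "s": (r"\S+", str),
}
_SPEC = re.compile(r"%([dfs])")


def _literal(chunk):
    # Escape literal text; whitespace in the format matches any whitespace run.
    return r"\s*".join(re.escape(piece) for piece in re.split(r"\s+", chunk))


def sscanf(text, fmt):
    """Return a tuple of converted values, or None if text does not match fmt."""
    casts = []
    pattern = []
    pos = 0
    for m in _SPEC.finditer(fmt):
        pattern.append(_literal(fmt[pos:m.start()]))
        regex, cast = _CONVERSIONS[m.group(1)]
        # Like scanf, skip leading whitespace before each conversion.
        pattern.append(r"\s*(%s)" % regex)
        casts.append(cast)
        pos = m.end()
    pattern.append(_literal(fmt[pos:]))
    match = re.match("".join(pattern), text)
    if match is None:
        return None
    return tuple(cast(value) for cast, value in zip(casts, match.groups()))


print(sscanf("1 2 3.0 4.0 5.0", "%d%d%f%f%f"))
```

Unlike C's sscanf(), a mismatch gives None instead of a partial result, which makes malformed input easy to detect.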
  15. ryniek90

    r Guest

    On Oct 3, 8:17 pm, Grant Edwards <> wrote:
    (--snip--)
    > One of the fist things I remember being taught as a C progrmmer
    > was to never use scanf.  Programs that use scanf tend to fail
    > in rather spectacular ways when presented with simple typos and
    > other forms of unexpected input.  
    >
    > Given the bad behavior and general fragility of scanf(), I
    > doubt there's much demand for something equally broken for
    > Python.


    I don't think you can blame scanf() for that. More the "bad behavior"
    of humans and "uncanny" ability of human fingers to press the
    wrong damn keys.
    r, Oct 12, 2009
    #15
  16. ryniek90

    Aahz Guest

    In article <>,
    ryniek90 <> wrote:
    >
    >But I remember that lambda function also was unwelcome in Python, but
    >finally it is and is doing well. So maybe someone, someday decide to
    >put in Python an alternative, really great implementation of scanf() ?


    How long have you been using Python? lambda has been there almost from
    the beginning.
    --
    Aahz () <*> http://www.pythoncraft.com/

    "To me vi is Zen. To use vi is to practice zen. Every command is a
    koan. Profound to the user, unintelligible to the uninitiated. You
    discover truth everytime you use it."
    Aahz, Oct 13, 2009
    #16