Enormous Input and Output Test

Discussion in 'Python' started by n00m, Oct 3, 2009.

  1. n00m

    n00m Guest

    Hi, py.folk!

    I need your help to understand how
    http://www.spoj.pl/problems/INOUTEST/
    can be passed in Python.

    I see two guys who managed to get accepted:
    http://www.spoj.pl/ranks/INOUTEST/lang=PYTH

    My code for this is:
    ===========================================
    import psyco
    psyco.full()

    import sys

    def noo(b):
    b = b.split()
    return str(int(b[0]) * int(b[1])) + '\n'

    def foo():
    ##sys.stdin = open('D:/1583.txt', 'rt')
    a = sys.stdin.readlines()
    a = a[1:int(a[0]) + 1]
    a = map(noo, a)
    sys.stdout.writelines(a)

    foo()
    ===========================================

    But it gets "Time Limit Exceeded" verdict.
    Any ideas?
    n00m, Oct 3, 2009
    #1
    1. Advertising

  2. n00m

    Chris Rebert Guest

    On Sat, Oct 3, 2009 at 6:54 AM, n00m <> wrote:
    > Hi, py.folk!
    >
    > I need your help to understand how
    > http://www.spoj.pl/problems/INOUTEST/
    > can be passed in Python.

    <snip>
    > def foo():
    >    ##sys.stdin = open('D:/1583.txt', 'rt')
    >    a = sys.stdin.readlines()


    That line is probably a Very Bad Idea (TM) as it reads the *entire*
    enormous file into memory *at once*. It would probably be much better
    to iterate over the file, thus only reading one individual line at a
    time. I'm betting the massive malloc()ing involved with .readlines()
    is a large part of the slowness.

    <snip>
    > But it gets "Time Limit Exceeded" verdict.
    > Any ideas?


    Cheers,
    Chris
    --
    http://blog.rebertia.com
    Chris Rebert, Oct 4, 2009
    #2
    1. Advertising

  3. n00m

    Terry Reedy Guest

    n00m wrote:
    > Hi, py.folk!
    >
    > I need your help to understand how
    > http://www.spoj.pl/problems/INOUTEST/
    > can be passed in Python.
    >
    > I see two guys who managed to get accepted:
    > http://www.spoj.pl/ranks/INOUTEST/lang=PYTH
    >
    > My code for this is:
    > ===========================================
    > import psyco
    > psyco.full()
    >
    > import sys
    >
    > def noo(b):
    > b = b.split()
    > return str(int(b[0]) * int(b[1])) + '\n'
    >
    > def foo():
    > ##sys.stdin = open('D:/1583.txt', 'rt')
    > a = sys.stdin.readlines()
    > a = a[1:int(a[0]) + 1]
    > a = map(noo, a)
    > sys.stdout.writelines(a)
    >
    > foo()
    > ===========================================
    >
    > But it gets "Time Limit Exceeded" verdict.
    > Any ideas?


    Don't waste your time with problem sites that judge raw-clock time over
    (and before) accuracy, thereby greatly favoring low-level languages and
    hack tricks over clear high-level code.
    Terry Reedy, Oct 4, 2009
    #3
  4. n00m

    n00m Guest

    On Oct 4, 2:29 am, Chris Rebert <> wrote:

    >
    > That line is probably a Very Bad Idea (TM) as it reads the *entire*
    > enormous file into memory *at once*. It would probably be much better
    > to iterate over the file, thus only reading one individual line at a
    > time. I'm betting the massive malloc()ing involved with .readlines()
    > is a large part of the slowness.


    Certainly not.
    The culprit is line "a = map(noo, a)".
    Without it execution time = 2.59s (I've just checked it).
    n00m, Oct 4, 2009
    #4
  5. n00m

    n00m Guest

    PS
    Time Limit for this problem = 20s
    n00m, Oct 4, 2009
    #5
  6. n00m

    alex23 Guest

    On Oct 3, 11:54 pm, n00m <> wrote:
    > I need your help to understand howhttp://www.spoj.pl/problems/INOUTEST/
    > can be passed in Python.
    >
    > My code for this is:
    > ===========================================
    > import psyco
    > psyco.full()
    >
    > import sys
    >
    > def noo(b):
    >     b = b.split()
    >     return str(int(b[0]) * int(b[1])) + '\n'
    >
    > def foo():
    >     ##sys.stdin = open('D:/1583.txt', 'rt')
    >     a = sys.stdin.readlines()
    >     a = a[1:int(a[0]) + 1]
    >     a = map(noo, a)
    >     sys.stdout.writelines(a)
    >
    > foo()
    > ===========================================
    >
    > But it gets "Time Limit Exceeded" verdict.
    > Any ideas?


    map() is really at its most efficient when used with built-ins, not
    user defined functions. In those cases, I believe you're better off
    using a for-loop or list comprehension. Also: are you sure they
    support psyco? It's not part of stdlib...

    I thought something simpler might work:

    import sys

    sys.stdin.readline()

    for line in sys.stdin.readlines():
    nums = map(int, line.split())
    print nums[0] * nums[1]

    But although this processes 5.5MB < 2 secs on my PC (which meets the
    2.5MB/sec criteria outlined in INTEST), I'm getting the same result
    that you are.

    Do you know how big the input data set actually is?
    alex23, Oct 4, 2009
    #6
  7. n00m

    n00m Guest

    > Do you know how big the input data set actually is?

    Of course, I don't know exact size of input.
    It's several MBs, I guess. And mind the fact:
    their testing machines are PIII (750MHz).
    n00m, Oct 4, 2009
    #7
  8. n00m

    n00m Guest

    PS
    Yes, they support psyco since long time ago
    (otherwise I'd get Compilitation Error verdict).
    I used Psyco there many many times.
    n00m, Oct 4, 2009
    #8
  9. n00m

    alex23 Guest

    On Oct 4, 1:58 pm, n00m <> wrote:
    > > Do you know how big the input data set actually is?

    >
    > Of course, I don't know exact size of input.
    > It's several MBs, I guess. And mind the fact:
    > their testing machines are PIII (750MHz).


    Well, then, that's moved the problem from "challenging" to
    "ludicrous". Being asked to conform to one constraint with no
    knowledge of the others seems kind of ridiculous.

    On my machine, the above code handles ~50MB in ~10sec.
    alex23, Oct 4, 2009
    #9
  10. n00m

    n00m Guest

    > On my machine, the above code handles ~50MB in ~10sec.

    Means their input > 40-50MB
    2.
    I just see: two guys did it in Python
    and I feel myself curious "how on earth?".
    n00m, Oct 4, 2009
    #10
  11. n00m

    n00m Guest

    This time limits too:

    =====================================================
    import psyco
    psyco.full()

    import sys

    def foo():
    ##sys.stdin = open('D:/1583.txt', 'rt')
    a = sys.stdin.readlines()
    a = a[1:int(a[0]) + 1]
    for ai in a:
    x, y = ai.split()
    sys.stdout.write(str(int(x) * int(y)) + '\n')

    foo()
    =====================================================
    n00m, Oct 4, 2009
    #11
  12. n00m

    n00m Guest

    And *without* Psyco the above code gets TLE verdict...

    A kind of mystery :(
    n00m, Oct 4, 2009
    #12
  13. n00m

    John Yeung Guest

    On Oct 3, 11:58 pm, n00m <> wrote:
    > > Do you know how big the input data set actually is?

    >
    > Of course, I don't know exact size of input.
    > It's several MBs, I guess. And mind the fact:
    > their testing machines are PIII (750MHz).


    You know the maximum size of the input, if you can trust the problem
    definition. The maximum number of lines in the input is 10**6 + 1.
    The first line is at most 7 characters, plus EOL. The subsequent
    lines are at most 13 characters each, plus EOL.

    John
    John Yeung, Oct 4, 2009
    #13
  14. n00m

    n00m Guest

    It can be not so "simple".
    There can be multiple input files,
    with *total* size ~30-50-80 MB.
    n00m, Oct 4, 2009
    #14
  15. n00m

    n00m Guest

    And this code time limits (no matter with or without Psyco):

    import psyco
    psyco.full()

    import sys

    def foo():
    ##sys.stdin = open('D:/1583.txt', 'rt')
    sys.stdin.readline()
    while 1:
    try:
    x, y = sys.stdin.readline().split()
    sys.stdout.write(str(int(x) * int(y)) + '\n')
    except:
    break

    foo()
    n00m, Oct 4, 2009
    #15
  16. n00m

    John Yeung Guest

    On Oct 4, 1:50 am, n00m <> wrote:
    > It can be not so "simple".
    > There can be multiple input files,
    > with *total* size ~30-50-80 MB.


    According to one of the global moderators, the 20s time limit is for
    each input file:

    https://www.spoj.pl/forum/viewtopic.php?f=6&t=4667

    John
    John Yeung, Oct 4, 2009
    #16
  17. n00m

    n00m Guest

    I've given up :)
    n00m, Oct 4, 2009
    #17
  18. n00m

    John Yeung Guest

    On Oct 4, 4:20 am, n00m <> wrote:
    > I've given up :)


    Well, that numerix user (who already had the top Python solution) just
    submitted a ton of new ones to that problem, apparently trying to get
    a faster time. I don't think he can squeeze much more out of that
    stone, but unlike us, he's routinely under 11s. Crazy.

    I wish they had an option to let you keep running your program past
    the limit (at least two or three times the limit) to give you more
    feedback, even if they still consider your solution unacceptable.
    Especially in the tutorial section, which doesn't seem to contribute
    to your score anyway.

    John
    John Yeung, Oct 4, 2009
    #18
  19. n00m

    Bearophile Guest

    Terry Reedy:

    > Don't waste your time with problem sites that judge raw-clock time over
    > (and before) accuracy, thereby greatly favoring low-level languages and
    > hack tricks over clear high-level code.


    I usually don't like to solve the kind of problems shown by those
    sites because those problems are too much artificial (and often too
    much difficult). But sometimes I have written some solutions.

    But those sites never judge "raw" running time over accuracy: in most
    or all such sites the programs are tested with tons of possible
    inputs, and if even one output is a bit wrong, the program is totally
    refused. This is a hard rule that encourages programmers to take a
    very good care of program correctness.

    Some sites add a little more "noise" in the inputs, simulating a bit
    more real-world inputs, while most of those online contests give clean
    inputs (the input bounds are well specified in the problem statement).

    From what I've seen from some of the best solutions submitted to those
    sites (some sites allow people to see the source of the contest
    entries), the programs usually don't (need to) use "hack tricks" as
    you say (even if probably some program uses them). Using hacks is
    often unsafe so people usually prefer safer ways to code, because just
    a little bug may fully compromise the score of the program.

    I agree that the timing scores in such sites often encourage low level
    languages, like C (and sometimes C++, that's a multilevel language),
    but on the other hand such languages exist, C is used in many real-
    world places, so designing sites where people compete with such
    languages is legit. C allows people to understand better what's going
    on inside the computer, this is valuable and positive. Bashing low-
    level languages is silly. Even CPython is written in C. A good
    programmer has to know both higher and lower level languages.

    And in real-life sometimes you need performance. This thread shows
    that a normal Python program is not up to those timings for the
    enormous input problem (even if there are ways to write a Python
    program to solve this problem). People at Google are trying to create
    a 5 times faster Python (Unladen Swallow project) because they use lot
    of real-world Python code and they think Python is slow. I've found
    plenty of situations where CPython code is not fast enough for my
    purposes.

    Bye,
    bearophile
    Bearophile, Oct 4, 2009
    #19
  20. n00m

    n00m Guest

    > but unlike us, he's routinely under 11s. Crazy.

    No wonder!
    50-80%% of users from the 1st page in ranklist are
    super-extra-brilliant (young students) programmers.
    They are winners of numerous competitions, national
    and international olympiads on informatics, etc.
    Some of them are/were even true wunderkinders.
    E.g. Tomek Czajka from Poland (now he lives and works,
    in some university, in America)
    n00m, Oct 4, 2009
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Lonifasiko
    Replies:
    20
    Views:
    872
    Patrick May
    Oct 21, 2006
  2. cyberco
    Replies:
    6
    Views:
    676
    John Machin
    Nov 20, 2006
  3. ais523

    Are enormous malloc()s conforming?

    ais523, Dec 13, 2006, in forum: C Programming
    Replies:
    3
    Views:
    291
    Spiros Bousbouras
    Dec 13, 2006
  4. Anarki
    Replies:
    1
    Views:
    282
    Alf P. Steinbach
    Aug 12, 2007
  5. Skybuck Flying

    Call oddities: &Test() vs &Test vs Test

    Skybuck Flying, Oct 4, 2009, in forum: C Programming
    Replies:
    1
    Views:
    680
    Skybuck Flying
    Oct 4, 2009
Loading...

Share This Page