building an online judge to evaluate Python programs

Discussion in 'Python' started by Jabba Laci, Sep 20, 2013.

  1. Jabba Laci

    Jabba Laci Guest

    Hi,

    In our school I have an introductory Python course. I have collected a
    large list of exercises for the students and I would like them to be
    able to test their solutions with an online judge (
    http://en.wikipedia.org/wiki/Online_judge ). At the moment I have a
    very simple web application that is similar to Project Euler: you
    provide the ID of the exercise and the output of the program, and it
    tells you if it's correct or not. However, it can only be used with
    programs that produce an output (usually a short string or a number).

    As the next step I would like to do the following: the user uploads
    his/her script, and the system tests it with various inputs and tells
    him/her whether it's OK or not (like checkio.org, for instance). How
    should I get started with this?

    There are several questions:
    * What if someone sends an infinite loop? There should be a time limit.
    * What if someone sends malicious code? The script should be run in a sandbox.

    All tips are appreciated.

    Thanks,

    Laszlo
     
    Jabba Laci, Sep 20, 2013
    #1

  2. Jabba Laci

    Aseem Bansal Guest

    > However, it can only be used with programs that produce an output

    Just interested, what else are you thinking of checking?
     
    Aseem Bansal, Sep 20, 2013
    #2

  3. Jabba Laci

    Jabba Laci Guest

    Let's take this simple exercise:

    "Write a function that receives a list and decides whether the list is
    sorted or not."

    Here the output of the function is either True or False, so I cannot
    test it with my current method.

    Laszlo
     
    Jabba Laci, Sep 20, 2013
    #3
  4. Jabba Laci

    John Gordon Guest

    Make a master input file and a master output file for each exercise. If
    the student program's output matches the master output when run from the
    master input, then it is correct.
     
    John Gordon, Sep 20, 2013
    #4
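
A minimal sketch of the master input/output comparison suggested in #4. The function names `normalize` and `judge` and the verdict strings are hypothetical, chosen for illustration. It assumes the submission reads from standard input and writes to standard output, and it does no time limiting or sandboxing on its own.

```python
import subprocess
import sys


def normalize(text):
    """Ignore trailing whitespace on each line and trailing blank lines."""
    return "\n".join(line.rstrip() for line in text.rstrip().splitlines())


def judge(script_path, master_input, master_output):
    """Run the student's script on the master input and compare its stdout."""
    result = subprocess.run(
        [sys.executable, script_path],
        input=master_input,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return "runtime error"
    if normalize(result.stdout) == normalize(master_output):
        return "accepted"
    return "wrong answer"
```

Normalizing whitespace before comparing is a design choice: it keeps a stray trailing newline from failing an otherwise correct solution.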
  5. Jabba Laci

    John Gordon Guest

    You could run the judge as a background process, and kill it after ten
    seconds if it hasn't finished.

    You could run the judge from its own account that doesn't have access to
    anything else. For extra security, make the judge program itself owned by
    a separate account (but readable/executable by the judge account.)

    I suppose you'd have to disable mail access from the judge account too.
    Not sure how to easily do that.
     
    John Gordon, Sep 20, 2013
    #5
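
One way the "kill it after ten seconds" idea from #5 could look in Python, using the standard `subprocess` module. The function name `run_with_time_limit` and its return shape are assumptions made for this sketch. It only enforces the time limit; running under a restricted account is a system configuration step that is not shown.

```python
import subprocess
import sys


def run_with_time_limit(script_path, stdin_text="", limit_seconds=10):
    """Return (finished, stdout); finished is False if the script was killed."""
    proc = subprocess.Popen(
        [sys.executable, script_path],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    try:
        out, _ = proc.communicate(stdin_text, timeout=limit_seconds)
        return True, out
    except subprocess.TimeoutExpired:
        proc.kill()         # stop the runaway process
        proc.communicate()  # reap it and close the pipes
        return False, ""
```

Note that this is a wall-clock limit: a script that sleeps forever is killed just like one that loops forever.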
  6. Jabba Laci

    Jabba Laci Guest

    Jabba Laci, Sep 20, 2013
    #6
  7. At edX, I wrote CodeJail (https://github.com/edx/codejail) to use
    AppArmor to run Python securely.

    For grading Python programs, we use a unit-test-like series of
    challenges. The student writes solutions as functions (or classes), and
    we execute them with unit tests (not literally unittest, but a similar
    idea). We also tokenize the code to check for simple things, such as
    whether you used a while loop when the requirement was to write a
    recursive function. The grading code is not open-source, unfortunately,
    because it is part of the MIT courseware.

    --Ned.
     
    Ned Batchelder, Sep 20, 2013
    #7
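
The edX grading code is not public, so the following is only a sketch of the general idea described in #7, applied to the "is the list sorted?" exercise from #3. The names `grade` and `uses_keyword`, the sample submission, and the test cases are all hypothetical. The `exec` call here runs the submission in the grader's own process, which is only acceptable inside a sandbox.

```python
import io
import tokenize


def uses_keyword(source, keyword):
    """Return True if `keyword` appears as a real token (not in a string or comment)."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return any(tok.type == tokenize.NAME and tok.string == keyword for tok in tokens)


def grade(source, func_name, cases):
    """Load the submission and call func_name on each (args, expected) case."""
    namespace = {}
    exec(source, namespace)  # NOT safe for untrusted code outside a sandbox
    func = namespace[func_name]
    return all(func(*args) == expected for args, expected in cases)


# Hypothetical submission for the "is the list sorted?" exercise.
submission = """
def is_sorted(items):
    return all(a <= b for a, b in zip(items, items[1:]))
"""

cases = [(([1, 2, 3],), True), (([3, 1, 2],), False), (([],), True)]
print(grade(submission, "is_sorted", cases))  # True
print(uses_keyword(submission, "while"))      # False
```

Tokenizing rather than searching the raw text means the word "while" inside a string or comment does not count as a loop.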
  8. Jabba Laci

    Jabba Laci Guest

    Hi Ned,

    Could you please post here your AppArmor profile for restricted Python scripts?

    Thanks,

    Laszlo
     
    Jabba Laci, Sep 21, 2013
    #8
  9. Laszlo, the instructions are in the README, including the AppArmor
    profile. It isn't much:

    #include <tunables/global>

    <SANDENV>/bin/python {
        #include <abstractions/base>
        #include <abstractions/python>

        <SANDENV>/** mr,
        # If you have code that the sandbox must be able to access, add lines
        # pointing to those directories:
        /the/path/to/your/sandbox-packages/** r,

        /tmp/codejail-*/ rix,
        /tmp/codejail-*/** rix,
    }

    Note that there are other protections beyond AppArmor: setrlimit is also used to limit some resource use.

    --Ned.

    BTW: Top-posting makes it harder to follow threads of conversation; better form is to add your comments below the person you're replying to.
     
    Ned Batchelder, Sep 21, 2013
    #9
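
To illustrate the setrlimit remark in #9, here is a rough sketch using Python's standard `resource` module on a Unix system. The helper name `run_limited` and the one-second CPU cap are assumptions for illustration; they are not CodeJail's actual settings, and other limits (memory, open files, process count) would be set the same way.

```python
import resource
import subprocess
import sys


def run_limited(script_args, cpu_seconds=1):
    """Run `python <script_args>` with a hard cap on CPU time (Unix only)."""

    def apply_limits():
        # Runs in the child process just before the interpreter starts.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(
        [sys.executable, *script_args],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
    )


result = run_limited(["-c", "while True: pass"])
print(result.returncode)  # negative if the child was killed by a signal
```

Unlike a wall-clock timeout, `RLIMIT_CPU` counts only CPU time, so it catches busy loops but not a script that just sleeps; the two kinds of limit complement each other.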
  10. As long as the student doesn't have access to the master in/out data,
    but only examples...

    Hearsay in my junior year at college was of a senior who couldn't
    manage to get his program to work -- so he embedded lots of output
    statements which simply wrote the expected output, based on access to
    the test input data.
     
    Dennis Lee Bieber, Sep 22, 2013
    #10
