Taint (like in Perl) as a Python module: taint.py

Discussion in 'Python' started by Johann C. Rocholl, Feb 5, 2007.

  1. The following is my first attempt at adding a taint feature to Python
    to prevent os.system() from being called with untrusted input. What do
    you think of it?

    # taint.py - Emulate Perl's taint feature in Python
    # Copyright (C) 2007 Johann C. Rocholl <>
    #
    # Permission is hereby granted, free of charge, to any person
    # obtaining a copy of this software and associated documentation files
    # (the "Software"), to deal in the Software without restriction,
    # including without limitation the rights to use, copy, modify, merge,
    # publish, distribute, sublicense, and/or sell copies of the Software,
    # and to permit persons to whom the Software is furnished to do so,
    # subject to the following conditions:
    #
    # The above copyright notice and this permission notice shall be
    # included in all copies or substantial portions of the Software.
    #
    # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
    # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
    # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
    # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
    # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
    # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
    # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    # SOFTWARE.


    """
    Emulate Perl's taint feature in Python

    This module replaces all functions in the os module (except stat) with
    wrappers that will raise an Exception called TaintError if any of the
    parameters is a tainted string.

    All strings are tainted by default, and you have to call untaint on a
    string to create a safe string from it.

    Stripping, zero-filling, and changes to lowercase or uppercase don't
    taint a safe string.

    If you combine strings with + or join or replace, the result will be a
    tainted string unless all its parts are safe.

    It is probably a good idea to run some checks on user input before you
    call untaint() on it. The safest way is to design a regex that matches
    legal input only. A regex that tries to match illegal input is very
    hard to prove complete.

    You can run the following examples with the command
    python taint.py -v
    to test if this module works as designed.

    >>> unsafe = 'test'
    >>> tainted(unsafe)
    True
    >>> os.system(unsafe)
    Traceback (most recent call last):
    TaintError
    >>> safe = untaint(unsafe)
    >>> tainted(safe)
    False
    >>> os.system(safe)
    256
    >>> safe + unsafe
    u'testtest'
    >>> safe.join([safe, unsafe])
    u'testtesttest'
    >>> tainted(safe + unsafe)
    True
    >>> tainted(safe + safe)
    False
    >>> tainted(unsafe.join([safe, safe]))
    True
    >>> tainted(safe.join([safe, unsafe]))
    True
    >>> tainted(safe.join([safe, safe]))
    False
    >>> tainted(safe.replace(safe, unsafe))
    True
    >>> tainted(safe.replace(safe, safe))
    False
    >>> tainted(safe.capitalize()) or tainted(safe.title())
    False
    >>> tainted(safe.lower()) or tainted(safe.upper())
    False
    >>> tainted(safe.strip()) or tainted(safe.rstrip()) or tainted(safe.lstrip())
    False
    >>> tainted(safe.zfill(8))
    False
    >>> tainted(safe.expandtabs())
    True
    """

    import os
    import types


    class TaintError(Exception):
        """
        This exception is raised when you try to call a function in the os
        module with a string parameter that isn't a SafeString.
        """
        pass


    class SafeString(unicode):
        """
        A string class that you must use for parameters to functions in
        the os module.
        """

        def __add__(self, other):
            """Create a safe string if the other string is also safe."""
            if tainted(other):
                return unicode.__add__(self, other)
            return untaint(unicode.__add__(self, other))

        def join(self, sequence):
            """Create a safe string if all components are safe."""
            for element in sequence:
                if tainted(element):
                    return unicode.join(self, sequence)
            return untaint(unicode.join(self, sequence))

        def replace(self, old, new, *args):
            """Create a safe string if the replacement text is also
            safe."""
            if tainted(new):
                return unicode.replace(self, old, new, *args)
            return untaint(unicode.replace(self, old, new, *args))

        def strip(self, *args):
            return untaint(unicode.strip(self, *args))

        def lstrip(self, *args):
            return untaint(unicode.lstrip(self, *args))

        def rstrip(self, *args):
            return untaint(unicode.rstrip(self, *args))

        def zfill(self, *args):
            return untaint(unicode.zfill(self, *args))

        def capitalize(self):
            return untaint(unicode.capitalize(self))

        def title(self):
            return untaint(unicode.title(self))

        def lower(self):
            return untaint(unicode.lower(self))

        def upper(self):
            return untaint(unicode.upper(self))


    # Alias to the constructor of SafeString,
    # so that untaint('abc') gives you a safe string.
    untaint = SafeString


    def tainted(param):
        """
        Check if a string is tainted.
        If param is a sequence or dict, all elements will be checked.
        """
        if isinstance(param, (tuple, list)):
            for element in param:
                if tainted(element):
                    return True
            # No tainted element found: return False explicitly, not None.
            return False
        elif isinstance(param, dict):
            return tainted(param.values())
        elif isinstance(param, (str, unicode)):
            return not isinstance(param, SafeString)
        else:
            return False


    def wrapper(function):
        """Create a new function that checks its parameters first."""
        def check_first(*args, **kwargs):
            """Check all parameters for unsafe strings, then call."""
            if tainted(args) or tainted(kwargs):
                raise TaintError
            return function(*args, **kwargs)
        return check_first


    def install_wrappers(module, innocent):
        """
        Replace each function in the os module with a wrapper that checks
        the parameters first, except if the name of the function is in the
        innocent list.
        """
        for name, function in module.__dict__.iteritems():
            if name in innocent:
                continue
            if type(function) in [types.FunctionType,
                                  types.BuiltinFunctionType]:
                module.__dict__[name] = wrapper(function)


    install_wrappers(os, innocent = ['stat'])


    if __name__ == '__main__':
        import doctest
        doctest.testmod()
    Johann C. Rocholl, Feb 5, 2007
    #1
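
    The module above targets Python 2 (unicode, iteritems). As a rough illustration of the same idea, here is a minimal sketch written for Python 3. It is not the author's code: it keeps only __add__ and a simplified tainted(), and run_command is a hypothetical stand-in for an os function so that nothing is actually executed.

```python
class TaintError(Exception):
    """Raised when a checked function receives a tainted string."""


class SafeString(str):
    """A str subclass marking text that has been validated."""

    def __add__(self, other):
        result = str.__add__(self, other)
        # The sum stays safe only if the right-hand operand is safe too.
        return SafeString(result) if isinstance(other, SafeString) else result


def tainted(value):
    """Plain str is tainted; SafeString and non-strings are not
    (the permissive default of the first post)."""
    return isinstance(value, str) and not isinstance(value, SafeString)


def wrapper(function):
    """Create a new function that checks its parameters first."""
    def check_first(*args, **kwargs):
        if any(tainted(a) for a in args) or any(tainted(v) for v in kwargs.values()):
            raise TaintError
        return function(*args, **kwargs)
    return check_first


def run_command(command):
    """Hypothetical stand-in for os.system; it only builds a message."""
    return "would run: " + command


checked = wrapper(run_command)
safe = SafeString("ls")
print(checked(safe))                         # accepted: the argument is a SafeString
print(isinstance(safe + safe, SafeString))   # safe + safe stays safe
print(isinstance(safe + " -l", SafeString))  # safe + plain str is tainted again
```

    Calling checked("ls") with a plain string raises TaintError, which is the behaviour the doctests above describe for os.system(unsafe).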

  2. On Mon, 05 Feb 2007 19:13:04 -0300, Johann C. Rocholl
    <> wrote:

    > The following is my first attempt at adding a taint feature to Python
    > to prevent os.system() from being called with untrusted input. What do
    > you think of it?


    A simple reload(os) will drop all your wrapped functions, leaving the
    original ones.
    I suppose you don't intend to publish the SafeString class - but if anyone
    can get a SafeString instance in any way or another, he can convert
    *anything* into a SafeString trivially.
    And tainted() returns False by default?????

    Sorry but in general, this won't work :(

    --
    Gabriel Genellina
    Gabriel Genellina, Feb 6, 2007
    #2
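
    The reload() objection can be reproduced in a few lines. This is a sketch for Python 3, where reload lives in importlib; it uses textwrap as a harmless stand-in for os, and the is_wrapper attribute is a hypothetical marker added only so the wrapper can be recognised afterwards.

```python
import importlib
import textwrap


def wrapper(function):
    """Return a pass-through wrapper carrying a recognisable marker."""
    def check_first(*args, **kwargs):
        return function(*args, **kwargs)
    check_first.is_wrapper = True  # hypothetical marker for this demo
    return check_first


# Install a wrapper the same way install_wrappers() does for os.
textwrap.dedent = wrapper(textwrap.dedent)
wrapped_before = getattr(textwrap.dedent, "is_wrapper", False)

# Reloading re-executes the module source and rebinds dedent
# to a freshly defined, unwrapped function.
importlib.reload(textwrap)
wrapped_after = getattr(textwrap.dedent, "is_wrapper", False)

print(wrapped_before, wrapped_after)
```

    After the reload the marker is gone, so any check installed by monkey-patching the module has silently disappeared.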

  3. Ben Finney (Guest)

    "Gabriel Genellina" <> writes:

    > I suppose you don't intend to publish the SafeString class - but if
    > anyone can get a SafeString instance in any way or another, he can
    > convert *anything* into a SafeString trivially.


    The point (in Perl) of detecting taint isn't to prevent a programmer
    from deliberately removing the taint. It's to help the programmer find
    places in the code where taint accidentally remains.

    > And tainted() returns False by default?????
    > Sorry but in general, this won't work :(


    I'm inclined to agree that the default should be to flag an object as
    tainted unless known otherwise.

    --
    "On the other hand, you have different fingers." -- Steven Wright
    Ben Finney
    Ben Finney, Feb 6, 2007
    #3
  4. On Mon, 05 Feb 2007 23:01:51 -0300, Ben Finney
    <> wrote:

    > "Gabriel Genellina" <> writes:
    >
    >> I suppose you don't intend to publish the SafeString class - but if
    >> anyone can get a SafeString instance in any way or another, he can
    >> convert *anything* into a SafeString trivially.
    >
    > The point (in Perl) of detecting taint isn't to prevent a programmer
    > from deliberately removing the taint. It's to help the programmer find
    > places in the code where taint accidentally remains.


    I'm not convinced at all of the usefulness of tainting.
    How do you "untaint" a string? By checking some conditions?
    Let's say you validate and untaint a string with regard to its future use
    on a command line, so you assume it's safe to use in os.system calls - but
    perhaps it still contains a sql injection trap (and being untainted you
    use it anyway!).
    Tainting may be useful for a short-lived string, one that is used in the
    *same* process in which it was created. And in this case, unit testing may
    be a good way to validate how the string is used throughout the program.
    But if you store input text in a database or configuration file (username,
    password, address...) it may get used again by *another* process, maybe a
    *different* program, even months later. What to do? Validate all input for
    any possible type of unsafe usage before storing them in the database, so
    it is untainted? Maybe... but I'd say it's better to ensure things are
    *done* *safely* instead of trusting a flag. (Uhmm, perhaps it's like "have
    safe sex; use a condom" instead of "require an HIV certificate")

    That is:
    - for sql injection, use parametrized queries, don't build SQL statements
    by hand.
    - for html output, use any safe template engine, always quoting inputs.
    - for os.system and similar, validate the command line and arguments right
    before being executed.
    and so on.

    --
    Gabriel Genellina
    Gabriel Genellina, Feb 6, 2007
    #4
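
    The three recommendations in the list above can each be illustrated with the Python 3 standard library. This is only a sketch: the table name, the hostile input, and the choice of sqlite3, html.escape, and shlex.quote are assumptions made for the example, not something the post prescribes.

```python
import html
import shlex
import sqlite3

hostile = "x'); DROP TABLE users; --"

# SQL: pass the value as a bound parameter instead of building the statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))
rows = conn.execute("SELECT name FROM users WHERE name = ?", (hostile,)).fetchall()
print(rows)  # the hostile text is stored and matched as plain data

# HTML: escape the value at the point of output.
print(html.escape("<script>alert(1)</script>"))

# Shell: quote the argument right before it goes onto a command line.
print(shlex.quote("a; rm -rf /"))
```

    In each case the safety comes from how the value is used, not from a flag attached to the value earlier in the program.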
  5. Paul Rubin (Guest)

    "Gabriel Genellina" <> writes:
    > I'm not convinced at all of the usefulness of tainting.
    > How do you "untaint" a string? By checking some conditions?


    In Perl? I don't think you can untaint a string, but you can make a
    new untainted string by extracting a regexp match from the tainted
    string's contents.

    > Let's say, you validate and untaint a string, regarding its future
    > usage on a command line, so you assume it's safe to use on os.system
    > calls - but perhaps it still contains a sql injection trap (and being
    > untainted you use it anyway!).


    Well, ok, you didn't check it carefully enough, but at least you made
    an attempt. Taint checking is a useful feature in Perl.

    > Tainting may be useful for a short lived string, one that is used on
    > the *same* process as it was created. And in this case, unit testing
    > may be a good way to validate the string usage along the program.


    Unit testing is completely overrated for security testing. It checks
    the paths through the program that you've written tests for. Taint
    checking catches errors in paths that you never realized existed.

    > - for sql injection, use parametrized queries, don't build SQL
    > statements by hand.
    > - for html output, use any safe template engine, always quoting inputs.
    > - for os.system and similar, validate the command line and arguments
    > right before being executed. and so on.


    Right, but it's easy to make errors and overlook things, and taint
    checking catches a lot of such mistakes.
    Paul Rubin, Feb 6, 2007
    #5
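
    The Perl idiom mentioned above, where a new untainted string is produced only by extracting a regexp match, could be approximated in Python along these lines. This is a sketch for Python 3; untaint_match is a hypothetical helper, and the SafeString here is a bare stand-in for the class in the first post.

```python
import re


class SafeString(str):
    """Marker type for strings that passed validation (stand-in class)."""


def untaint_match(pattern, value):
    """Return the matched text as a SafeString, or raise ValueError.

    The pattern describes legal input only and must match the whole
    value, so anything unexpected is rejected rather than repaired.
    """
    match = re.fullmatch(pattern, value)
    if match is None:
        raise ValueError("input does not match the allowed pattern")
    return SafeString(match.group(0))


filename = untaint_match(r"[A-Za-z0-9_]+\.txt", "report_2007.txt")
print(type(filename).__name__, filename)

try:
    untaint_match(r"[A-Za-z0-9_]+\.txt", "report.txt; rm -rf /")
except ValueError as error:
    print("rejected:", error)
```

    Routing every untaint through a pattern of legal input makes it harder to mark a string safe without having looked at its contents, which is the weakness of calling the SafeString constructor directly.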
  6. On Feb 6, 3:01 am, Ben Finney <>
    wrote:
    > "Gabriel Genellina" <> writes:
    > > And tainted() returns False by default?????
    > > Sorry but in general, this won't work :(
    >
    > I'm inclined to agree that the default should be to flag an object as
    > tainted unless known otherwise.


    That's true. For example, my first attempt didn't prevent this:
    os.open(buffer('/etc/passwd'), os.O_RDONLY)

    Here's a stricter version:

    def tainted(param):
        """
        Check if a parameter is tainted. If it's a sequence or dict, all
        values will be checked (but not the keys).
        """
        if isinstance(param, unicode):
            return not isinstance(param, SafeString)
        elif isinstance(param, (bool, int, long, float, complex, file)):
            return False
        elif isinstance(param, (tuple, list)):
            for element in param:
                if tainted(element):
                    return True
            # No tainted element found: return False explicitly, not None.
            return False
        elif isinstance(param, dict):
            return tainted(param.values())
        else:
            return True
    Johann C. Rocholl, Feb 6, 2007
    #6
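
    To see what the stricter default changes, here is a rough Python 3 rendering of the function above with a few probes. It is a sketch, not a port of the whole module: long, file, and buffer do not exist in Python 3, so the type list is shortened and memoryview stands in for the buffer object from the example.

```python
class SafeString(str):
    """Marker type for validated strings (stand-in class)."""


def tainted(param):
    """Treat everything as tainted unless it is known to be harmless."""
    if isinstance(param, str):
        return not isinstance(param, SafeString)
    if isinstance(param, (bool, int, float, complex)):
        return False
    if isinstance(param, (tuple, list)):
        return any(tainted(element) for element in param)
    if isinstance(param, dict):
        return tainted(list(param.values()))
    # Unknown types (bytes, memoryview, custom objects) are rejected.
    return True


print(tainted(memoryview(b"/etc/passwd")))   # unknown type, so tainted
print(tainted(SafeString("/tmp/report")))    # explicitly untainted
print(tainted([SafeString("a"), 3, "b"]))    # one plain str taints the list
```

    With the permissive default from the first post, the memoryview argument would have passed the check unnoticed.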