Proposal: runtime validation statement

Discussion in 'Python' started by Paul Rubin, Jul 11, 2004.

  1. Paul Rubin

    Paul Rubin Guest

    I frequently find myself writing stuff like

    # compute frob function, x has to be nonnegative
    x = read_input_data()
    assert x >= 0, x # mis-use of "assert" statement
    frob = sqrt(x) + x/2. + 3.

    This is not really correct because the assert statement is supposed to
    validate the logical consistency of the program itself, not the input
    data. So, for example, when you compile with optimization on, assert
    statements become no-ops. And yet, it's generally desirable to
    validate input whenever you can, and raising an exception is
    frequently the right thing to do with bad data. A function like

    class ValidationError(Exception): pass
    def _validate(cond, message):
        if not cond: raise ValidationError(message)

    takes care of it, of course, so it's slightly redundant to add a
    special statement like

    validate x >= 0, (x, "must not be negative")

    which works exactly like assert but raises a different exception and
    is never optimized away. But the same can be said of the print
    statement (use sys.stdout.write or a print function instead) and for
    that matter the addition operator (use "x - (-y)" instead of x+y), the
    bool type (use 1 and 0 instead of True and False), etc.
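To make the difference concrete, here is a minimal sketch in present-day Python 3 (an assumption; the thread predates it). The names `validate`, `frob` and `run_snippet` are illustrative only, not an existing API. It runs the same failing `assert` in a child interpreter with and without `-O`, and shows that a plain function check is never stripped.

```python
import math
import subprocess
import sys


class ValidationError(Exception):
    """Raised when input data fails a runtime check."""


def validate(cond, message):
    # Unlike assert, this check is executed regardless of the -O flag.
    if not cond:
        raise ValidationError(message)


def frob(x):
    # x comes from outside the program, so it is validated, not asserted.
    validate(x >= 0, (x, "must not be negative"))
    return math.sqrt(x) + x / 2.0 + 3.0


def run_snippet(snippet, optimized):
    """Run snippet in a child interpreter, optionally with -O; return the exit code."""
    args = [sys.executable]
    if optimized:
        args.append("-O")
    args += ["-c", snippet]
    return subprocess.run(args, stderr=subprocess.DEVNULL).returncode


# The same failing assert, used below with and without optimization.
ASSERT_SNIPPET = "x = -1\nassert x >= 0, x"
```

Calling `run_snippet(ASSERT_SNIPPET, optimized=False)` should give a nonzero exit code because the assert fires, while `optimized=True` should exit cleanly because the assert has been compiled away. `frob(-1)` raises `ValidationError` either way.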

    We have to conclude that choosing what statements the language
    supports is not just a matter of making things possible, but also of
    steering what the common idioms should be. Using a user-defined
    function to check input means a couple more program-specific things to
    remember (the function itself and the exception class it raises),
    clutters up the code, etc. And so I've come to feel that a "validate"
    statement (maybe with some different keyword) like the above is in the
    Pythonic spirit and should be considered for some forthcoming release.

    Thoughts?
     
    Paul Rubin, Jul 11, 2004
    #1

  2. Paul Rubin

    F. GEIGER Guest

    Paul Rubin wrote:

    > I frequently find myself writing stuff like
    >
    > # compute frob function, x has to be nonnegative
    > x = read_input_data()
    > assert x >= 0, x # mis-use of "assert" statement
    > frob = sqrt(x) + x/2. + 3.
    >
    > This is not really correct because the assert statement is supposed to
    > validate the logical consistency of the program itself, not the input
    > data. So, for example, when you compile with optimization on, assert
    > statements become no-ops. And yet, it's generally desirable to
    > validate input whenever you can, and raising an exception is
    > frequently the right thing to do with bad data. A function like
    >
    > class ValidationError(Exception): pass
    > def _validate(cond, message):
    >     if not cond: raise ValidationError(message)
    >
    > takes care of it, of course, so it's slightly redundant to add a
    > special statement like
    >
    > validate x >= 0, (x, "must not be negative")
    >
    > which works exactly like assert but raises a different exception and
    > is never optimized away. But the same can be said of the print
    > statement (use sys.stdout.write or a print function instead) and for
    > that matter the addition operator (use "x - (-y)" instead of x+y), the
    > bool type (use 1 and 0 instead of True and False), etc.
    >
    > We have to conclude that choosing what statements the language
    > supports is not just a matter of making things possible, but also of
    > steering what the common idioms should be. Using a user-defined
    > function to check input means a couple more program-specific things to
    > remember (the function itself and the exception class it raises),
    > clutters up the code, etc. And so I've come to feel that a "validate"
    > statement (maybe with some different keyword) like the above is in the
    > Pythonic spirit and should be considered for some forthcoming release.
    >
    > Thoughts?


    I use assert to protect my software from me, i.e. to catch programming
    errors, e.g. to catch cases where I called a method the wrong way. If
    this can happen under production conditions too, then it's a user error,
    not a programming error.

    So, I use raise to protect my software from user errors.

    Your sample seems to be a case of the latter, i.e. you have to write
    some sort of exception handling anyway. Issuing an error message
    anywhere in your code might not be what you really want.

    Kind regards
    Franz GEIGER
     
    F. GEIGER, Jul 12, 2004
    #2

  3. Paul Rubin

    Dave Brueck Guest

    Paul Rubin wrote:
    [snip]
    > frequently the right thing to do with bad data. A function like
    >
    > class ValidationError(Exception): pass
    > def _validate(cond, message):
    >     if not cond: raise ValidationError(message)
    >
    > takes care of it, of course, so it's slightly redundant to add a
    > special statement like
    >
    > validate x >= 0, (x, "must not be negative")
    >
    > which works exactly like assert but raises a different exception and
    > is never optimized away. But the same can be said of the print
    > statement (use sys.stdout.write or a print function instead) and for
    > that matter the addition operator (use "x - (-y)" instead of x+y), the
    > bool type (use 1 and 0 instead of True and False), etc.
    >
    > We have to conclude that choosing what statements the language
    > supports is not just a matter of making things possible, but also of
    > steering what the common idioms should be. Using a user-defined
    > function to check input means a couple more program-specific things to
    > remember (the function itself and the exception class it raises),
    > clutters up the code, etc. And so I've come to feel that a "validate"
    > statement (maybe with some different keyword) like the above is in the
    > Pythonic spirit and should be considered for some forthcoming release.
    >
    > Thoughts?


    Agreed!

    I usually end up subclassing Exception and writing a validation function
    like you show above. At first I liked the fact that a module threw a
    module-specific family of exceptions that could be caught downstream,
    but after having used this approach for some time I've come to the
    conclusion that the vast majority of the time the exceptions thrown are
    of the generic "ValidationError" variety, and that having them defined
    in a module-specific way added no value. By extension, the validation
    function itself adds no value and is just a nuisance.

    Also, a developer-defined function doesn't stand out as well as a
    statement would - a statement sets it apart from normal function calls
    which are doing the actual work to solve the problem at hand - and it'd
    be easy for syntax-highlighting editors to color it differently too.

    IMO 'validate' isn't too bad a choice for a keyword. Sorta long but it's
    quick to type.

    -Dave
     
    Dave Brueck, Jul 12, 2004
    #3
  4. Paul Rubin

    Ville Vainio Guest

    >>>>> "Paul" == Paul Rubin <http://> writes:

    Paul> takes care of it, of course, so it's slightly redundant to
    Paul> add a special statement like

    Paul> validate x >= 0, (x, "must not be negative")

    Paul> which works exactly like assert but raises a different
    Paul> exception and is never optimized away. But the same can be
    Paul> said of the print statement (use sys.stdout.write or a print
    Paul> function instead) and for that matter the addition operator
    Paul> (use "x - (-y)" instead of x+y), the bool type (use 1 and 0
    Paul> instead of True and False), etc.

    Yes, and I (and many others, I feel) consider the print statement a wart
    in the language. Let's not make any more of these... Too bad it's so
    widely used that it can't be deprecated outright.

    Paul> steering what the common idioms should be. Using a
    Paul> user-defined function to check input means a couple more
    Paul> program-specific things to remember (the function itself and
    Paul> the exception class it raises), clutters up the code, etc.
    Paul> And so I've come to feel that a "validate" statement (maybe
    Paul> with some different keyword) like the above is in the
    Paul> Pythonic spirit and should be considered for some
    Paul> forthcoming release.

    Any specific reason not to make it a builtin function instead of
    statement? I wouldn't mind a validation function that could also
    verify the data types of the arguments, which could then be used for
    code completion assistance and type inference... since we don't know
    when "real" type declarations will happen. Expecting them to hit 2.5 is
    probably a bit too optimistic ;-).

    def a(x,y):
        validate((x,int), (y,str), x > int(y))

    (validate checks every tuple with isinstance(t[0],t[1]), every arg to
    validate that is "false" in the Pythonic falsehood sense fails the
    validation)
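A rough sketch of what such a function could look like, written in present-day Python 3 (an assumption). The rule used here, treating only a 2-tuple whose second element is a type as a type check, is a narrowing chosen for this illustration and is not part of the proposal; `ValidationError` is likewise a hypothetical name.

```python
class ValidationError(Exception):
    """Raised when a validate() check fails."""


def validate(*checks):
    """Check each argument: (value, type) pairs via isinstance, anything else by truth value."""
    for check in checks:
        is_type_check = (
            isinstance(check, tuple)
            and len(check) == 2
            and isinstance(check[1], type)
        )
        if is_type_check:
            value, expected = check
            if not isinstance(value, expected):
                raise ValidationError(
                    "%r is not an instance of %s" % (value, expected.__name__)
                )
        elif not check:
            # Any other falsy argument counts as a failed condition.
            raise ValidationError("condition failed: %r" % (check,))


def a(x, y):
    validate((x, int), (y, str), x > int(y))
```

One thing the sketch makes visible: all arguments are evaluated before `validate` is entered, so `x > int(y)` can itself raise (for example `ValueError` from `int("abc")`) before any type check runs.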

    Before something like that goes "official", help tools and IDEs can't
    use the type information.

    --
    Ville Vainio http://tinyurl.com/2prnb
     
    Ville Vainio, Jul 12, 2004
    #4
  5. Paul Rubin

    Ville Vainio Guest

    >>>>> "Dave" == Dave Brueck <> writes:

    Dave> Also, a developer-defined function doesn't stand out as well
    Dave> as a statement would - a statement sets it apart from normal
    Dave> function calls which are doing the actual work to solve the
    Dave> problem at hand - and it'd be easy for syntax-highlighting
    Dave> editors to color it differently too.

    It's as easy to color a function.

    We have too many statements that don't need to be statements
    already. "validate" is obvious library stuff...

    --
    Ville Vainio http://tinyurl.com/2prnb
     
    Ville Vainio, Jul 12, 2004
    #5
  6. Paul Rubin

    Dave Brueck Guest

    Ville Vainio wrote:

    > >>>>> "Dave" == Dave Brueck <> writes:
    >
    > Dave> Also, a developer-defined function doesn't stand out as well
    > Dave> as a statement would - a statement sets it apart from normal
    > Dave> function calls which are doing the actual work to solve the
    > Dave> problem at hand - and it'd be easy for syntax-highlighting
    > Dave> editors to color it differently too.
    >
    > It's as easy to color a function.
    >
    > We have too many statements that don't need to be statements
    > already. "validate" is obvious library stuff...


    I disagree - there's a clear distinction between solving the problem and
    e.g. validating inputs to the problem solver, and having such checks as
    a statement is a good way to implement that distinction. That's why
    'assert' as a statement makes sense to me too - it and validate are sort
    of "out of band" with getting the actual work done, but useful nonetheless.

    Whether or not a validate keyword is a good idea should be judged
    independently of your opinion of whether or not 'print' is a wart.

    It's definitely not "obvious library stuff" IMO - if nothing else,
    making you import a library just to validate parameters is goofy. It
    would be semi-tolerable (though less than ideal) as a builtin.

    -Dave
     
    Dave Brueck, Jul 12, 2004
    #6
  7. Christopher T King Guest

    On Mon, 12 Jul 2004, Dave Brueck wrote:

    > I disagree - there's a clear distinction between solving the problem and
    > e.g. validating inputs to the problem solver, and having such checks as
    > a statement is a good way to implement that distinction. That's why
    > 'assert' as a statement makes sense to me too - it and validate are sort
    > of "out of band" with getting the actual work done, but useful nonetheless.


    I like to think 'print' falls into this same "out of band" category,
    assuming it's used for debugging purposes. Perhaps 'print' should be
    deprecated for 'normal' uses (i.e. file IO, user interaction) in favor of
    file IO operations (which one would hope any nontrivial program uses
    anyway), and, in some future Python, tossed away in optimized bytecode
    (much like assert statements).
     
    Christopher T King, Jul 12, 2004
    #7
  8. Paul Rubin

    Ville Vainio Guest

    >>>>> "Dave" == Dave Brueck <> writes:

    Dave> distinction. That's why 'assert' as a statement makes sense
    Dave> to me too - it and validate are sort of "out of band" with
    Dave> getting the actual work done, but useful nonetheless.

    Perhaps calling it _validate could imply out-of-bandness? I don't like
    the idea of making the "less important" constructs statements, only
    the most fundamental things should be statements.

    And we already have assert.

    Dave> It's definitely not "obvious library stuff" IMO - if nothing
    Dave> else, making you import a library just to validate
    Dave> parameters is goofy. It would be semi-tolerable (though less
    Dave> than ideal) as a builtin.

    +1 on making it a builtin.

    --
    Ville Vainio http://tinyurl.com/2prnb
     
    Ville Vainio, Jul 12, 2004
    #8
  9. Paul Rubin

    Paul Rubin Guest

    "F. GEIGER" <> writes:
    > I use assert to protect my software from me, i.e. to catch programming
    > errors, e.g. to catch cases where I called a method the wrong way. If
    > this can happen under production conditions too, then it's a user
    > error, not a programming error.
    >
    > So, I use raise to protect my software from user errors.


    But I write a lot of code for my own use, which means the user and the
    programmer are the same person, and any user error is also a
    programming error. Lots of times also, the data came from some other
    part of the program, so if the data is invalid, that's still a
    programming error.

    > Your sample seems to be a case of the latter, i.e. you have to write
    > some sort of exception handling anyway. Issuing an error message
    > anywhere in your code might not be what you really want.


    If you want to keep running after an AssertionError, you have to handle
    that too, but that doesn't make the assert statement useless.
     
    Paul Rubin, Jul 13, 2004
    #9
  10. Paul Rubin

    Paul Rubin Guest

    Ville Vainio <> writes:
    > Yes, and I (and many others, I feel) consider the print statement a wart
    > in the language. Let's not make any more of these... Too bad it's so
    > widely used that it can't be deprecated outright.


    I can sympathize with the notion that print and assert are warts, but
    I think they're considered to be important to Python's
    newbie-friendliness or something like that. As such, "validate" ought
    to be considered about the same way.

    > Any specific reason not to make [validate] a builtin function
    > instead of statement?


    It's similar enough to assert that for consistency in the language I
    think it ought to be done the same way. But a builtin function would
    be ok.

    > I wouldn't mind a validation function that could also verify the
    > data types of the arguments, which could then be used for code
    > completion assistance and type inference... since we don't know when
    > "real" type declarations will happen. Expecting them to hit 2.5 is
    > probably a bit too optimistic ;-).
    >
    > def a(x,y):
    >     validate((x,int), (y,str), x > int(y))


    If the compiler is going to rely on something like that, then it
    should definitely be a statement, rather than a function that the user
    can shadow with his own function that does something completely
    different. But if there's going to be type declarations, they ought
    to go into some new construction that cleans up the current scoping
    mess at the same time:

    local x:int, y:str # type declarations
    assert x > int(y) # compiler can use this as advice

    > (validate checks every tuple with isinstance(t[0],t[1]), every arg to
    > validate that is "false" in the Pythonic falsehood sense fails the
    > validation)


    This is a little bit bogus: first of all, data validation should be
    able to check arbitrary conditions including those that happen to be
    tuples. Second, validation must always be performed and must throw an
    exception at runtime, while compiler advice can be optimized away.
    So your example is more like "assert" than what I meant by validate.
     
    Paul Rubin, Jul 13, 2004
    #10
  11. Paul Rubin

    Ville Vainio Guest

    >>>>> "Paul" == Paul Rubin <http://> writes:

    Paul> I can sympathize with the notion that print and assert are
    Paul> warts, but I think they're considered to be important to
    Paul> Python's newbie-friendliness or something like that. As
    Paul> such, "validate" ought to be considered about the same way.

    How is

    print("hello",42)

    less newbie friendly than

    print "hello",42?

    To me, the former actually seems *more* newbie friendly because there
    is nothing special about it. The same applies for assert, validate and
    other statements that don't need to be statements.

    >> Any specific reason not to make [validate] a builtin function
    >> instead of statement?


    Paul> It's similar enough to assert that for consistency in the
    Paul> language I think it ought to be done the same way. But a
    Paul> builtin function would be ok.

    "foolish consistency..." ;-).

    >> validate((x,int), (y,str), x > int(y))


    Paul> If the compiler is going to rely on something like that,
    Paul> then it should definitely be a statement, rather than a
    Paul> function that the user

    Of course the compiler wouldn't rely on it. I was thinking of an
    interim solution for auxiliary tools like doc generators - but then I
    remembered we are going to get decorators soon, and they are better
    for this purpose.
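For illustration, a decorator along these lines might look like the following sketch. The name `accepts` and the `expected_types` attribute are hypothetical, and the code is present-day Python 3 (an assumption); it checks positional arguments only.

```python
import functools


def accepts(*expected_types):
    """Hypothetical decorator: check positional argument types on every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            for value, expected in zip(args, expected_types):
                if not isinstance(value, expected):
                    raise TypeError(
                        "%s: expected %s, got %r"
                        % (func.__name__, expected.__name__, value)
                    )
            return func(*args)
        # Exposed so that doc generators or IDEs could read the declared types.
        wrapper.expected_types = expected_types
        return wrapper
    return decorator


@accepts(int, str)
def a(x, y):
    return x > int(y)
```

Because the declared types live on the function object (`a.expected_types`), an auxiliary tool can inspect them without parsing the function body, which is roughly the advantage over an inline `validate` call.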

    >> (validate checks every tuple with isinstance(t[0],t[1]), every arg to
    >> validate that is "false" in the Pythonic falsehood sense fails the
    >> validation)


    Paul> This is a little bit bogus: first of all, data validation should be

    Yes, it is - as I said, decorators will work better. My bad.

    --
    Ville Vainio http://tinyurl.com/2prnb
     
    Ville Vainio, Jul 13, 2004
    #11
  12. Paul Rubin

    John Roth Guest

    "Paul Rubin" <http://> wrote in message
    news:...
    > Ville Vainio <> writes:
    > > Yes, and I (and many others, I feel) consider the print statement a wart
    > > in the language. Let's not make any more of these... Too bad it's so
    > > widely used that it can't be deprecated outright.

    >
    > I can sympathize with the notion that print and assert are warts, but
    > I think they're considered to be important to Python's
    > newbie-friendliness or something like that. As such, "validate" ought
    > to be considered about the same way.


    Both statements have specific functions. The problem arises
    with people that try to use them for other purposes than
    they were intended.

    Print is for debugging prints. I use it a lot for that purpose,
    and never use it for any "user" output, even though I am
    both the programmer and the user in most cases. It
    works well when used for the purpose it was intended.

    Assert is intended for debugging and program validation.
    Using it for standard program logic is using it outside of
    its intended purpose. All the complaints I've ever seen
    about assert revolve around that.

    I don't like the notion of a validate statement for a number
    of reasons. One of them is that the proposal doesn't say
    what it would do in enough detail for me to figure out
    whether I could actually use it. My suspicion is that it
    wouldn't fit in with the way I write validation routines
    anyway. Clue: I generally don't throw exceptions.
    For me, validation logic is tightly tied to the (g)UI
    logic, since it's the UI that's going to have to tell the
    user that he mucked up.
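As a guess at what exception-free validation might look like (the post does not spell it out, so every name below is hypothetical), here is a small Python 3 sketch in which the check returns a list of problems that the UI layer can display instead of raising.

```python
import math


def check_frob_input(x):
    """Return a list of human-readable problems with x; an empty list means valid."""
    problems = []
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        problems.append("value must be a number")
    elif x < 0:
        problems.append("value must not be negative (got %r)" % (x,))
    return problems


def frob_or_errors(x):
    """Hypothetical UI-facing helper: return (result, problems) without raising."""
    problems = check_frob_input(x)
    if problems:
        return None, problems
    return math.sqrt(x) + x / 2.0 + 3.0, []
```

The caller, presumably the UI code, decides how to report the problems, which keeps the validation logic next to the place where the user is told about the mistake.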

    John Roth
     
    John Roth, Jul 14, 2004
    #12
