Re: String substitution VS proper mysql escaping

Discussion in 'Python' started by Cameron Simpson, Aug 18, 2010.

  1. On 18Aug2010 12:07, Nik Gr <> wrote:
    | Στις 18/8/2010 7:31 πμ, ο/η Cameron Simpson έγÏαψε:
    | >On 17Aug2010 20:15, Îίκος<> wrote:
    | >| ===============================
    | >| cursor.execute( ''' SELECT host, hits, date FROM visitors WHERE page =
    | >| '%s' ORDER BY date DESC ''' % (page) )
    | >| ===============================
    | >|
    | >| Someone told me NOT to do string substitution ("%") on SQL statements
    | >| and to let MySQLdb do it
    | >| for me, with proper escaping like the following
    | >|
    | >| ===============================
    | >| cursor.execute('''SELECT host, hits, date FROM visitors WHERE page=%s
    | >| ORDER BY date DESC''', (page,))
    | >| ===============================
    | >|
    | >| The difference is that if some external source can control "page",
    | >| and
    | >| they put in a value like
    | >| 100 ; DELETE FROM visitors; SELECT * FROM visitors
    | >| i will be losing my database table data.
    | >
    | >That other difference is that the mysql dialect support knows how to
    | >correctly escape a string for insertion into an SQL statement. You may
    | >not, or may forget to pre-escape the string, etc. Using the MySQLdb
    | >stuff do it for you is reliable and robust.
    |
    | Can you please tell me what escaping means by giving me an example
    | of what is escaped and whats isn't?

    In your plain substitution example above:

    cursor.execute( ''' SELECT host, hits, date FROM visitors WHERE page =
    '%s' ORDER BY date DESC ''' % (page) )

    Supposing page is the string "100". This will produce the SQL statement:

    SELECT host, hits, date FROM visitors WHERE page = '100' ORDER BY date DESC

    which looks ok. But suppose page was the string:

    bill o'reilly

    Then your SQL statement looks like this:

    SELECT host, hits, date FROM visitors WHERE page = 'bill o'reilly' ORDER BY date DESC

    To the SQL engine this looks like the string "bill o" followed by an SQL
    instruction named "reilly", and then the opening quote for another string.
    Invalid SQL.

    The procedure used to avoid this problem (to insert an _arbitrary_
    string into the SQL statement) is to "escape" problematic characters in
    strings when placing them into SQL statements. In this case, the quote
    character in the string is the SQL "end string" character. Therefore the
    string must be modified in the SQL statement to be correctly expressed.

    IIRC, SQL uses the quote doubling convention for strings, so this:

    SELECT host, hits, date FROM visitors WHERE page = 'bill o''reilly' ORDER BY date DESC

    is how one would write the literal SQL for that.

    The MySQLdb library will do this and a host of other equivalent things
    automatically and correctly and consistently when you pass page as a
    parameter to the execute() method, needing no special attention or
    detailed syntactic knowledge on your part when you write your program.

    [...snip...]
    | >The "%" operator has special knowledge that is it is passed as string instead
    | >of a list or tuple or other sequence then it should act _as_ _if_ it had been
    | >passed a single element tuple containing the string.
    | >
    | %s and %d is behaving the same due to % expecting a string instead
    | of an integer?

    I haven't checked. I make a point of using the (page,) form (i.e.
    always use a tuple, even with just one argument) these days. That way
    there is no ambiguity.

    | >Otherwise, because a string _is_ a sequence the "%" might want to treat
    | >the string "foo" as the sequence:
    | >
    | > ("f", "o", "o")
    | cursor.execute('''SELECT host, hits, date FROM visitors WHERE
    | page=%s ORDER BY date DESC''', page)
    |
    | But it alss might treat it an entity, i mean since 'page' is a
    | variable containing a string why not just 'page' as it is expecting
    | 'page' variable to give its value when asked?

    A string is also a sequence of characters.

    | >Run these three loops to see the difference:
    | >
    | > for s in "foo":
    | > print s
    | > for s in ("foo"):
    | > print s
    | > for s in ("foo",):
    | > print s
    | >
    | >Cheers,
    | >>> for s in "nikos":
    | print s
    |
    |
    | n
    | i
    | k
    | o
    | s
    |
    | # this handles the string "nikos" as a series of chars right?

    Yes.

    | >>> for s in ("nikos"):
    | print s
    |
    |
    | n
    | i
    | k
    | o
    | s
    |
    | # this handles the string "nikos" as a series of chars too but what
    | si the difference with the above in htis with the parentheses? is
    | "nikos" is handles still as string here?

    It is exactly the same as the first loop. Just as:

    1 + 3

    is exactly the same as:

    (1) + (3)

    | >>> for s in ("nikos",):
    | print s
    |
    | nikos
    |
    | # Here yes it handles "nikos" as the 1st item of a tuple

    Yep.

    | >>> for s in ["nikos"]:
    | print s
    |
    | nikos
    |
    | # Here? why is it behaving fifferent than the above ("nikos") and is
    | proccessign it all chars in one?

    ("nikos",) is a single element tuple.
    ["nikos"] is a single element list.
    ["nikos",] is also a single element list, just written like the tuple.

    You don't see the ["nikos",] form very often because ["nikos"] is not
    ambiguous. It is only because ("nikos") gets reduced to plain "nikos"
    just like the arithmetic above that you see the ("nikos",) form - the
    comma makes it a tuple.

    | >>> for s in ["nikos",]:
    | print s
    |
    |
    | nikos
    |
    | # Here it handles "nikos" as the 1st item of a list right?

    Yes.

    Cheers,
    --
    Cameron Simpson <> DoD#743
    http://www.cskk.ezoshosting.com/cs/

    Everything is gone;
    Your life's work has been destroyed.
    Squeeze trigger (yes/no)?
    - Haiku Error Messages http://www.salonmagazine.com/21st/chal/1998/02/10chal2.html
    Cameron Simpson, Aug 18, 2010
    #1
    1. Advertising

  2. I would expect that:

    "nikos" is a string, while,
    ("nikos") is a single element tuple.
    ["nikos"] is a single element list.

    That way we wouldn't be needing comma seperators.
    I just don't like it when "nikos" and ("nikos") is the same thing
    exactly. Parentheses are to be used to define a tuple and square
    brackets to define a list.

    Also i want to ask whats the difference between 'nikos', "nikos" and
    '''nikos''' for Python and whats the best way to use to enclose a
    string.


    ==================================
    if in my example instead of

    | cursor.execute('''SELECT host, hits, date FROM visitors WHERE
    | page=%s ORDER BY date DESC''', (page,) )

    i use

    | cursor.execute('''SELECT host, hits, date FROM visitors WHERE
    | page=%s ORDER BY date DESC''', page)

    instead which i tend to like more, would there i have a problem?

    Also how how this part here page=%s doesn't need to be written as
    page='%s' which is the way i used to have it written with myssql
    string substitution?

    ==========================================

    And last but not least :) is that

    http://webville.gr/index.html?page="100 ; DELETE FROM visitors;
    SELECT
    * FROM visitors"
    ===============================
    don't reproduce the problem of actual deleting my data. I don't care
    losing it!
    I just want to see that happening with my own eyes!

    but if you try it you get an error. Maybe die to whitespace in the
    value of the page?
    How to write that propetly?

    Thank again fellows for ALL your help!
    Îίκος, Aug 19, 2010
    #2
    1. Advertising

  3. On 18 ΑÏγ, 12:50, Cameron Simpson <> wrote:

    >
    > ("nikos",) is a single element tuple.
    > ["nikos"] is a single element list.
    > ["nikos",] is also a single element list, just written like the tuple.



    It makes more sense if i:

    "nikos" is just a string
    ("nikos") is a single element tuple
    ["nikos"] is also a single element list


    After all () used to define tuples and [] usedd to define lists. Why
    commas?

    Also is there a difference between 'nikos' or "nikos" or '''nikos''' ?
    What's and why best to use to enclose strings?

    ===========================

    Why in mysql string substitution example i have to use page='%s' and
    in the comma way(automatic mysql converetion i dont need the single
    quotes page=%s ?
    What is the diff?

    ===========================

    Why http://webville.gr/index.html?page="100 ; DELETE FROM visitors;
    SELECT * FROM visitors"

    don't reproduce the problem of actual deleting my data to demonstrate
    the wrongness of string substitution in mysql queries?

    I don't care losing my data! The page is there to helpe me learn
    python and mysql.

    I just want to see that happening with my own eyes!

    Thanks again fols for all your precious help and explanations.
    Îίκος, Aug 19, 2010
    #3
  4. Cameron Simpson

    Tim Chase Guest

    On 08/19/10 02:10, Îίκος wrote:
    >> ("nikos",) is a single element tuple.
    >> ["nikos"] is a single element list.
    >> ["nikos",] is also a single element list, just written like the tuple.

    >
    > It makes more sense if i:
    >
    > "nikos" is just a string
    > ("nikos") is a single element tuple
    > ["nikos"] is also a single element list
    >
    > After all () used to define tuples and [] usedd to define lists. Why
    > commas?


    You have to look at the other side: what *else* they're used
    for. Python also uses () to override order of operations (and to
    call functions, but that's contextually different) which can
    occur in the same context as tuples, while [] are used only
    within contexts where they can be disambiguated. Going back to
    one of the originals example posted in this thread:

    (1) + (2)

    do you think this should yield (1, 2) or 3? It would be crazy if
    evaluation of

    (3*14) + (7*21)

    was reduced, treated as "(42) + (147)" and then reduced to
    "(42,147)" instead of 189.

    So Python needs a way to express that you *explicitly* mean "this
    is one of those rare one-element tuples, not an order of
    operations prioritization":

    (1,) + (2,)

    to return "(1,2)"

    > Also is there a difference between 'nikos' or "nikos" or '''nikos''' ?
    > What's and why best to use to enclose strings?


    Internally, there's no difference other than how easily you can
    include " or ' characters in your string. Thus you might write:

    with_dquote = 'He said "Hello"'
    with_apos = "It's 2:00am"
    with_both1 = """She said "Don't touch me" to her boss"""
    with_both2 = '''She said "Don't touch me" to her boss'''

    You can also prefix any of them with "r" such as

    file_path = r"c:\path\to\file.txt"
    file_path = r'c:\path\to\file.txt
    file_path = r"""c:\path\to\file.txt"""
    file_path = r'''c:\path\to\file.txt'''

    to alter how "\" are treated.

    Otherwise, if it doesn't make a difference, I tend to use C-ish
    conventions of using " for strings and ' for single characters:

    if 'w' in "hello world":

    but the important/kind thing is to be internally consistent to
    make your own life easier. :)

    -tkc
    Tim Chase, Aug 19, 2010
    #4
  5. Hi Îίκος,

    On 2010-08-19 09:10, Îίκος wrote:
    > On 18 ΑÏγ, 12:50, Cameron Simpson <> wrote:
    >>
    >> ("nikos",) is a single element tuple.
    >> ["nikos"] is a single element list.
    >> ["nikos",] is also a single element list, just written like the tuple.

    >
    > It makes more sense if i:
    >
    > "nikos" is just a string
    > ("nikos") is a single element tuple
    > ["nikos"] is also a single element list
    >
    > After all () used to define tuples and [] usedd to define lists. Why
    > commas?


    Because parentheses are also used to group expressions.
    For example,

    >>> 2 * (1+2)

    6

    If it were as you would like, the result would have been

    (3, 3)

    So because <something_without_commas> inside parentheses
    already denotes an expression you have to add a comma to
    distinguish a one-element tuple from an expression.

    Stefan
    Stefan Schwarzer, Aug 19, 2010
    #5
  6. Cameron Simpson

    Nik Gr Guest

    Στις 19/8/2010 2:32 μμ, ο/η Tim Chase έγÏαψε:
    > (1,) + (2,)
    >
    > to return "(1,2)"

    This is actually joining two single element tuples (1,) and (2, ) to a
    new bigger tuple of two elements, correct?
    Nik Gr, Aug 19, 2010
    #6
  7. Cameron Simpson

    Tim Chase Guest

    On 08/19/10 10:42, Nik Gr wrote:
    >> You can also prefix any of them with "r" such as
    >>
    >> file_path = r"c:\path\to\file.txt"
    >> file_path = r'c:\path\to\file.txt
    >> file_path = r"""c:\path\to\file.txt"""
    >> file_path = r'''c:\path\to\file.txt'''

    >
    > 'r' is to avoid escaping backslashes only or other special charcaters as
    > well?


    Yes, just backslashes.

    > As for the string i noticed that if i'am to mix single quotes and double
    > quotes(any number of them not just always pairs)
    > and backslashes and other special stuff in them then i'm best off using
    > 3-sinlge-quotes like
    >
    > name='''My name is "Nikos" and i'am from Thessaloniki\Greece'''
    >
    > The above example can only be written by using 3-single quoting right?
    > Not by pairs of single or double quotes, correct?


    It can be written as a non-3-quote string, you just have to
    escape the inner quotes (single & double) and the backslash to be
    seen:

    name = 'My name is Nikos and I\'m from Thessaloniki\\Greece'
    name = "My name is \"Nikos\" and I'm from Thessaloniki\\Greece"

    > And i dont have to use the 'r' in fornt of it too.


    Using the 'r' in front would make it much more challenging,
    because it would prevent the backslashes from being seen as
    escaping. :)

    >> (1,) + (2,)
    >>
    >> to return "(1,2)"

    >
    > This is actually joining two single element tuples (1,) and (2, ) to a
    > new bigger tuple of two elements, correct?


    Correct.

    > Also if you please comment on my mysql string substitution example i've
    > posted in my previous post just to make it work.


    There's a number of variables which can impact the exact string
    that would need to be passed, so it's not a trivial thing to do.
    You may or may not be un-escaping HTML entities in the GET
    parameters ("%20" -> a space, etc), and I don't have a readily
    available way to duplicate your environment, so testing becomes a
    bit harder. Hopefully others on the list can give you a hand on
    breaking your code.

    -tkc
    Tim Chase, Aug 19, 2010
    #7
  8. Cameron Simpson

    MRAB Guest

    Nik Gr wrote:
    [snip]
    > Why does the page variable which is actually a string needs to be a
    > tuple or a list and not just as a string which is what it actually is?
    > I have a strong desire to use it like this:
    >
    > cursor.execute( '''SELECT hits FROM counters WHERE page = %s''' , page )
    >
    > opposed to tuple. Would i might facing a problem? Of what? MySQLdb
    > instead of give the whole value to the placeholder to give just a single
    > char?


    When there's more than one value you provide a tuple. It's makes sense
    from the point of view of consistency that you also provide a tuple when
    there's only one value.

    > Also do i need 3-single-quoting here as well or it can be written qith
    > signle/double quotes?
    > What appleis to strings apply to mysql queries as well?
    >

    It's just a normal string literal.
    MRAB, Aug 19, 2010
    #8
  9. Cameron Simpson

    John Nagle Guest

    On 8/18/2010 2:50 AM, Cameron Simpson wrote:
    > On 18Aug2010 12:07, Nik Gr<> wrote:
    > | Στις 18/8/2010 7:31 πμ, ο/η Cameron Simpson έγÏαψε:
    > |>On 17Aug2010 20:15, Îίκος<> wrote:
    > |>| ===============================
    > |>| cursor.execute( ''' SELECT host, hits, date FROM visitors WHERE page =
    > |>| '%s' ORDER BY date DESC ''' % (page) )
    > |>| ===============================
    > |>|
    > |>| Someone told me NOT to do string substitution ("%") on SQL statements
    > |>| and to let MySQLdb do it
    > |>| for me, with proper escaping like the following
    > |>|
    > |>| ===============================
    > |>| cursor.execute('''SELECT host, hits, date FROM visitors WHERE page=%s
    > |>| ORDER BY date DESC''', (page,))
    > |>| ===============================
    > |>|
    > |>| The difference is that if some external source can control "page",
    > |>| and
    > |>| they put in a value like
    > |>| 100 ; DELETE FROM visitors; SELECT * FROM visitors
    > |>| i will be losing my database table data.
    > |>
    > |>That other difference is that the mysql dialect support knows how to
    > |>correctly escape a string for insertion into an SQL statement. You may
    > |>not, or may forget to pre-escape the string, etc. Using the MySQLdb
    > |>stuff do it for you is reliable and robust.
    > |
    > | Can you please tell me what escaping means by giving me an example
    > | of what is escaped and whats isn't?
    >
    > In your plain substitution example above:
    >
    > cursor.execute( ''' SELECT host, hits, date FROM visitors WHERE page =
    > '%s' ORDER BY date DESC ''' % (page) )
    >
    > Supposing page is the string "100". This will produce the SQL statement:
    >
    > SELECT host, hits, date FROM visitors WHERE page = '100' ORDER BY date DESC
    >
    > which looks ok. But suppose page was the string:
    >
    > bill o'reilly
    >
    > Then your SQL statement looks like this:
    >
    > SELECT host, hits, date FROM visitors WHERE page = 'bill o'reilly' ORDER BY date DESC
    >
    > To the SQL engine this looks like the string "bill o" followed by an SQL
    > instruction named "reilly", and then the opening quote for another string.
    > Invalid SQL.
    >
    > The procedure used to avoid this problem (to insert an _arbitrary_
    > string into the SQL statement) is to "escape" problematic characters in
    > strings when placing them into SQL statements. In this case, the quote
    > character in the string is the SQL "end string" character. Therefore the
    > string must be modified in the SQL statement to be correctly expressed.
    >
    > IIRC, SQL uses the quote doubling convention for strings, so this:
    >
    > SELECT host, hits, date FROM visitors WHERE page = 'bill o''reilly' ORDER BY date DESC
    >
    > is how one would write the literal SQL for that.
    >
    > The MySQLdb library will do this and a host of other equivalent things
    > automatically and correctly and consistently when you pass page as a
    > parameter to the execute() method, needing no special attention or
    > detailed syntactic knowledge on your part when you write your program.


    Right. There's much mystery about this, but it's really simple.
    All MySQLdb is doing for

    cursor.execute(sqlstring, values)

    is

    quotedstr = sqlstring % map(MySQLdb.escape_string, values)
    cursor.execute(quotedstr)

    It just applies "MySQLdb.escape_string to each arg.

    If PHP did that, we'd have far fewer "SQL injection attacks".

    Most programs shouldn't use "MySQLdb.escape", and should let
    the built-in call do it. It's useful, though, if you're constructing
    a data file for LOAD DATA to do a bulk database load. The
    same escaping works for LOAD DATA INFILE with the default
    input format.

    John Nagle
    John Nagle, Aug 19, 2010
    #9
  10. In message
    <>, Îίκος
    wrote:

    > I would expect that:
    >
    > ("nikos") is a single element tuple.


    Then how would you do a simple parenthesized expression?
    Lawrence D'Oliveiro, Aug 21, 2010
    #10
  11. Cameron Simpson

    Aahz Guest

    In article <>,
    =?UTF-8?B?zp3Or866zr/Pgg==?= <> wrote:
    >
    >After all () used to define tuples and [] usedd to define lists. Why
    >commas?


    No, "()" does *not* define tuples, except for the empty tuple. The comma
    defines tuples, with parentheses simply used for visual effect:

    >>> 1, 2, 3

    (1, 2, 3)
    --
    Aahz () <*> http://www.pythoncraft.com/

    "...if I were on life-support, I'd rather have it run by a Gameboy than a
    Windows box." --Cliff Wells
    Aahz, Sep 4, 2010
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Amy G

    escaping % in a string???

    Amy G, Feb 27, 2004, in forum: Python
    Replies:
    3
    Views:
    4,209
    Duncan Booth
    Feb 27, 2004
  2. Grant Olson

    Safe string escaping?

    Grant Olson, Mar 8, 2005, in forum: Python
    Replies:
    2
    Views:
    405
    Bengt Richter
    Mar 8, 2005
  3. Íßêïò
    Replies:
    2
    Views:
    467
    Nik Gr
    Aug 18, 2010
  4. Nik Gr
    Replies:
    21
    Views:
    644
    Dennis Lee Bieber
    Aug 31, 2010
  5. sunckell

    Proper Technique for DBD::mysql install

    sunckell, Apr 7, 2009, in forum: Perl Misc
    Replies:
    0
    Views:
    131
    sunckell
    Apr 7, 2009
Loading...

Share This Page