Lie Hetland book: Beginning Python..

Discussion in 'Python' started by Vittorio, Nov 7, 2005.

  1. Vittorio

    Vittorio Guest

    I am reading "Beginning Python from Novice to Professional" and the book
    is really awesome. Nonetheless on ch 13 "Database Support" I found this
    code to import data (in a txt file) into a SQLite Database:

    #this was corrected because original "import sqlite" does not work
    from pysqlite2 import dbapi2 as sqlite

    #this function strips the txt file from special chars
    def convert(value):
    if value.startswith('~'):
    return value.strip('~')
    if not value:
    value = '0'
    return float(value)

    conn = sqlite.connect('food.db')
    curs = conn.cursor()

    curs.execute('''
    CREATE TABLE food (
    id TEXT PRIMARY KEY,
    desc TEXT,
    water FLOAT,
    kcal FLOAT,
    protein FLOAT,
    fat FLOAT,
    ash FLOAT,
    carbs FLOAT,
    fiber FLOAT,
    sugar FLOAT
    )
    ''')

    field_count = 10

    #following is the line I suspect mistyped
    markers = ', '.join(['%s']*field_count)

    query = 'INSERT INTO food VALUES (%s)' % markers


    for line in open('ABBREV.txt'):
    fields = line.split('^')
    vals = [convert(f) for f in fields[:field_count]]
    #the following line raises error
    curs.execute(query,vals)

    conn.commit()
    conn.close


    The error was "Traceback (most recent call last):
    File "C:\Python24\food.py", line 39, in ?
    curs.execute(query,vals)
    pysqlite2.dbapi2.OperationalError: near "%": syntax error"

    After two hours of trying (did I say I am a beginner?) and after some
    documentation about PySqlite I suspect the error is in:
    markers = ', '.join(['%s']*field_count)

    I think Magnus intended:
    markers = ', '.join(['?']*field_count)


    Did I found an errata or my Python is still too green?
     
    Vittorio, Nov 7, 2005
    #1
    1. Advertising

  2. Vittorio

    Steve Holden Guest

    Vittorio wrote:
    > I am reading "Beginning Python from Novice to Professional" and the book
    > is really awesome. Nonetheless on ch 13 "Database Support" I found this
    > code to import data (in a txt file) into a SQLite Database:
    >
    > #this was corrected because original "import sqlite" does not work
    > from pysqlite2 import dbapi2 as sqlite
    >
    > #this function strips the txt file from special chars
    > def convert(value):
    > if value.startswith('~'):
    > return value.strip('~')
    > if not value:
    > value = '0'
    > return float(value)
    >
    > conn = sqlite.connect('food.db')
    > curs = conn.cursor()
    >
    > curs.execute('''
    > CREATE TABLE food (
    > id TEXT PRIMARY KEY,
    > desc TEXT,
    > water FLOAT,
    > kcal FLOAT,
    > protein FLOAT,
    > fat FLOAT,
    > ash FLOAT,
    > carbs FLOAT,
    > fiber FLOAT,
    > sugar FLOAT
    > )
    > ''')
    >
    > field_count = 10
    >
    > #following is the line I suspect mistyped
    > markers = ', '.join(['%s']*field_count)
    >
    > query = 'INSERT INTO food VALUES (%s)' % markers
    >
    >
    > for line in open('ABBREV.txt'):
    > fields = line.split('^')
    > vals = [convert(f) for f in fields[:field_count]]
    > #the following line raises error
    > curs.execute(query,vals)
    >
    > conn.commit()
    > conn.close
    >
    >
    > The error was "Traceback (most recent call last):
    > File "C:\Python24\food.py", line 39, in ?
    > curs.execute(query,vals)
    > pysqlite2.dbapi2.OperationalError: near "%": syntax error"
    >
    > After two hours of trying (did I say I am a beginner?) and after some
    > documentation about PySqlite I suspect the error is in:
    > markers = ', '.join(['%s']*field_count)
    >
    > I think Magnus intended:
    > markers = ', '.join(['?']*field_count)
    >
    >
    > Did I found an errata or my Python is still too green?
    >
    >

    No, you actually did quite a creditable piece of debugging. The DB-API
    specifications allow database modules to substitute parameters into SQL
    commands in a number of different ways, and they are supposed to
    indicate the technique they use by setting a module variable
    "paramstyle" to one of five possible values.

    Magnus' original code was written to use a different (but valid)
    paramstyle, so I'm guessing that his sqlite module and your sqlite2
    simply use different paramstyles. Whether that's because a change was
    made in developing the pysqlite code or because pysqlite and pysqlite2
    come from different developers I couldn't say, but you have nailed the
    problem. Well done!

    regards
    Steve
    --
    Steve Holden +44 150 684 7255 +1 800 494 3119
    Holden Web LLC www.holdenweb.com
    PyCon TX 2006 www.python.org/pycon/
     
    Steve Holden, Nov 7, 2005
    #2
    1. Advertising

  3. Vittorio

    Vittorio Guest

    Steve Holden <> wrote in
    news::

    > No, you actually did quite a creditable piece of debugging. The DB-API
    > specifications allow database modules to substitute parameters into
    > SQL commands in a number of different ways, and they are supposed to
    > indicate the technique they use by setting a module variable
    > "paramstyle" to one of five possible values.
    >
    > Magnus' original code was written to use a different (but valid)
    > paramstyle, so I'm guessing that his sqlite module and your sqlite2
    > simply use different paramstyles. Whether that's because a change was
    > made in developing the pysqlite code or because pysqlite and pysqlite2
    > come from different developers I couldn't say, but you have nailed the
    > problem. Well done!


    Thanks Steve for your encouragement.

    Actually, subsequently to my posting, I had realised that the point was
    in the different version of the Pysqlite interface: in fact I found many
    other pieces of code which were affected the same way, and each of them
    made use of "import pysqlite".

    Nonetheless, I was unable to find any documentation about such a
    different behaviour between Pysqlite and Pysqlite2; from my beginner
    point of view the Pysqlite (Magnus' version) paramstyle looks a better
    and more pythonic choice and I don't grasp the Pysqlite2 developers'
    intentions deviating from that way.

    I would be very grateful if someone would cast a light over
    Pysqlite/Pysqlite2 discrepancies.
     
    Vittorio, Nov 8, 2005
    #3
  4. Vittorio

    Magnus Lycka Guest

    Vittorio wrote:
    > Nonetheless, I was unable to find any documentation about such a
    > different behaviour between Pysqlite and Pysqlite2; from my beginner
    > point of view the Pysqlite (Magnus' version) paramstyle looks a better
    > and more pythonic choice and I don't grasp the Pysqlite2 developers'
    > intentions deviating from that way.


    Please note that the DB-APIs let you use a foreign language, SQL,
    in Python strings. Having SQL look Pythonic is hardly a virtue.
    SQL should look SQLic! The SQL standards clearly state that '?'
    is the correct symbol for dynamic SQL placeholders. For embedded
    SQL (which is really moot for Python) it's ':NAME', but '%s' has
    nothing to do with SQL. Pysqlite supports both '?' and ':NAME',
    but no longer '%s', which is a blessing in my book.

    Please note that while there is a rough correspondence between
    the placeholders in SQL and %s and friends in Python strings,
    they are far from the same. With SQL placeholders and separately
    passed parameters, proper implementations of database servers will
    prevent SQL injection attacks and provde a much better performance
    than if you build an SQL string with Python's %-operator and %s etc
    in the SQL string. Proper SQL parameter passing will also mean that
    parameter quoting is handled for you. On the other hand, you can only
    use placeholders in certain positions in SQL, so you might need %s
    as well in SQL strings too, if you for instance need to determine
    the table to search from in runtime.

    Using the same symbol for both string substitutions and SQL placeholder
    such as pysqlite 1 and the MySQL interface does, is not really a bright
    idea in my opinion. Who thinks this is pretty?

    sql = "SELECT %s FROM %s WHERE %s = %%s"
    cur.execute(sql % (col,table,search_col), (param,))

    I think it's less confusing with:

    sql = "SELECT %s FROM %s WHERE %s = ?"
    cur.execute(sql % (col,table,search_col), (param,))

    With %s as placeholder, it's easy to do either...

    sql = "SELECT %s FROM %s WHERE %s = %s"
    cur.execute(sql % (col,table,search_col,param))

    If you do this, you won't have any help with quoting,
    you are suceptible to SQL injection attacks, and your
    performance won't improve if the same query is performed
    repeatedly with different values for param, since the
    database server will make a new query execution plan
    every time. :(

    or...

    sql = "SELECT %s FROM %s WHERE %s = %s"
    cur.execute(sql, (col,table,search_col,param))

    If this works with your DB driver, it's likely to be really
    broken and just work as the previous example. In other
    words you don't have the benefits in performance, convenience
    or security that parameter passing provides in dynamic SQL.

    Of course, the "proper" way, with %s-substitution for e.g. table
    names and ? for parameters is also open for SQL injection attacks
    if the values in the strings col, table and search_col above are
    user input, but since they are plain SQL identifiers, they are much
    easier to check than arbitrary search values. You'd probably have
    a set of allowed values, and check that the input was in that
    set. They are also less likely to come from an untrusted source.

    The DB-API spec is available at http://python.org/peps/pep-0249.html
    It's a good read. You could also look at:
    http://www.thinkware.se/epc2004db/

    > I would be very grateful if someone would cast a light over
    > Pysqlite/Pysqlite2 discrepancies.


    I'm afraid I haven't seen that anywhere. Some of the more subtle
    changes probably results from the difference between SQLite 2 and
    SQLite 3, since these are the versions those Python libraries wrap.

    As you can see in
    http://initd.org/tracker/pysqlite/wiki/PysqliteVersions
    you can use pysqlite 1.1 if you want to use the old pysqlite 1
    API.

    Pysqlite2 is documented here:
    http://initd.org/pub/software/pysqlite/doc/usage-guide.html

    It contains a fair amount of examples, but unfortunately no
    direct comparision with pysqlite 1.
     
    Magnus Lycka, Nov 8, 2005
    #4
  5. Vittorio

    Vittorio Guest

    Magnus Lycka <> wrote in news:dkqdf9$17g$:

    Thank you for your message I found really illuminating.

    > Using the same symbol for both string substitutions and SQL placeholder
    > such as pysqlite 1 and the MySQL interface does, is not really a bright
    > idea in my opinion. Who thinks this is pretty?
    >
    > sql = "SELECT %s FROM %s WHERE %s = %%s"
    > cur.execute(sql % (col,table,search_col), (param,))
    >
    > I think it's less confusing with:
    >
    > sql = "SELECT %s FROM %s WHERE %s = ?"
    > cur.execute(sql % (col,table,search_col), (param,))


    I had never thought about it, I really agree.


    > Of course, the "proper" way, with %s-substitution for e.g. table
    > names and ? for parameters is also open for SQL injection attacks
    > if the values in the strings col, table and search_col above are
    > user input, but since they are plain SQL identifiers, they are much
    > easier to check than arbitrary search values. You'd probably have
    > a set of allowed values, and check that the input was in that
    > set. They are also less likely to come from an untrusted source.


    undoubtedly.

    > As you can see in
    > http://initd.org/tracker/pysqlite/wiki/PysqliteVersions
    > you can use pysqlite 1.1 if you want to use the old pysqlite 1
    > API.
    >
    > Pysqlite2 is documented here:
    > http://initd.org/pub/software/pysqlite/doc/usage-guide.html


    yes I had already found the docs and noticed there was no explanation
    about such a remarkable difference from pysqlite1 and pysqlite2.

    After your message, I find even more strange that Magnus' book reported
    pysqlite1 examples as Sqlite 3 was a great step forward in my opinion.
    But this can not prevent me from saying that it is the best Python book
    around in the beginner to intermediate range.
     
    Vittorio, Nov 9, 2005
    #5
  6. Vittorio wrote:
    > [...]
    > Nonetheless, I was unable to find any documentation about such a
    > different behaviour between Pysqlite and Pysqlite2; from my beginner
    > point of view the Pysqlite (Magnus' version) paramstyle looks a better
    > and more pythonic choice and I don't grasp the Pysqlite2 developers'
    > intentions deviating from that way.


    The reason why pysqlite 0.x/1.x used paramstyle "pyformat", based on
    Python string substitution for SQL parameters is that at the time
    pysqlite was started, SQLite 2.x did not have any support for parameter
    binding. So we had to "fake" it in Python, just like the MySQL interface
    does (for the same reasons).

    Later SQLite 2.x versions and of course SQLite 3.x supported real bound
    parameters and pysqlite2 was developed from scratch to benefit from
    them. SQLite 3.x supports both qmark and named paramstyles, so you can
    use question marks *or* named parameters:

    >>> from pysqlite2 import dbapi2 as sqlite
    >>> con = sqlite.connect(":memory:")
    >>> cur = con.cursor()
    >>> cur.execute("select 2*?", (14,))
    >>> cur.fetchone()

    (28,)
    >>>
    >>> cur.execute("select 2 * :x", {"x": 14})
    >>> cur.fetchone()

    (28,)
    >>>
    >>> x = 14
    >>> cur.execute("select 2 * :x", locals())
    >>> cur.fetchone()

    (28,)
    >>>


    I've also once written a wrapper using pysqlite 2.x's hooks that allows
    you to use the "format" paramstyle with pysqlite 2.x, so you can reuse
    more code that was originally written against pysqlite 0.x/1.x:

    from pysqlite2 import dbapi2 as sqlite

    class PyFormatConnection(sqlite.Connection):
    def cursor(self):
    return sqlite.Connection.cursor(self, PyFormatCursor)

    class PyFormatCursor(sqlite.Cursor):
    def execute(self, sql, args=None):
    if args:
    qmarks = ["?"] * len(args)
    sql = sql % tuple(qmarks)
    sqlite.Cursor.execute(self, sql, args)
    else:
    sqlite.Cursor.execute(self, sql)

    con = sqlite.connect(":memory:", factory=PyFormatConnection)
    cur = con.cursor()
    cur.execute("create table test(a, b, c)")
    cur.execute("insert into test(a, b, c) values (%s, %s, %s)", ('asdf', 4,
    5.2))
    cur.execute("select a, b, c from test where c <> %s", (4.27,))
    print cur.fetchone()
    cur.close()
    con.close()

    > I would be very grateful if someone would cast a light over
    > Pysqlite/Pysqlite2 discrepancies.


    I think about the only place I wrote a bit about the differences was in
    the pysqlite 2.0 final announcement:

    http://lists.initd.org/pipermail/pysqlite/2005-May/000043.html

    -- Gerhard
     
    =?ISO-8859-1?Q?Gerhard_H=E4ring?=, Nov 9, 2005
    #6
  7. Vittorio

    Steve Holden Guest

    Gerhard Häring wrote:
    > Vittorio wrote:
    >
    >>[...]
    >>Nonetheless, I was unable to find any documentation about such a
    >>different behaviour between Pysqlite and Pysqlite2; from my beginner
    >>point of view the Pysqlite (Magnus' version) paramstyle looks a better
    >>and more pythonic choice and I don't grasp the Pysqlite2 developers'
    >>intentions deviating from that way.

    >
    >
    > The reason why pysqlite 0.x/1.x used paramstyle "pyformat", based on
    > Python string substitution for SQL parameters is that at the time
    > pysqlite was started, SQLite 2.x did not have any support for parameter
    > binding. So we had to "fake" it in Python, just like the MySQL interface
    > does (for the same reasons).
    >
    > Later SQLite 2.x versions and of course SQLite 3.x supported real bound
    > parameters and pysqlite2 was developed from scratch to benefit from
    > them. SQLite 3.x supports both qmark and named paramstyles, so you can
    > use question marks *or* named parameters:
    >
    > >>> from pysqlite2 import dbapi2 as sqlite
    > >>> con = sqlite.connect(":memory:")
    > >>> cur = con.cursor()
    > >>> cur.execute("select 2*?", (14,))
    > >>> cur.fetchone()

    > (28,)
    > >>>
    > >>> cur.execute("select 2 * :x", {"x": 14})
    > >>> cur.fetchone()

    > (28,)
    > >>>
    > >>> x = 14
    > >>> cur.execute("select 2 * :x", locals())
    > >>> cur.fetchone()

    > (28,)
    > >>>

    >
    > I've also once written a wrapper using pysqlite 2.x's hooks that allows

    [...]
    >
    >>I would be very grateful if someone would cast a light over
    >>Pysqlite/Pysqlite2 discrepancies.

    >
    >
    > I think about the only place I wrote a bit about the differences was in
    > the pysqlite 2.0 final announcement:
    >
    > http://lists.initd.org/pipermail/pysqlite/2005-May/000043.html
    >

    Unfortunately this appears to mean that pysqlite2 isn't fully DB
    API-conformant.

    >>> import pysqlite2
    >>> pysqlite2.paramstyle

    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    AttributeError: 'module' object has no attribute 'paramstyle'
    >>>


    Of course, given the module's flexibility it's difficult to know what
    you *would* put in paramstyle. I take it mixing different paramstyles in
    the same query will fail?

    regards
    Steve
    --
    Steve Holden +44 150 684 7255 +1 800 494 3119
    Holden Web LLC www.holdenweb.com
    PyCon TX 2006 www.python.org/pycon/
     
    Steve Holden, Nov 9, 2005
    #7
  8. Vittorio

    carlo Guest

    Gerhard Häring ha scritto:

    > The reason why pysqlite 0.x/1.x used paramstyle "pyformat", based on
    > Python string substitution for SQL parameters is that at the time
    > pysqlite was started, SQLite 2.x did not have any support for parameter
    > binding. So we had to "fake" it in Python, just like the MySQL interface
    > does (for the same reasons).


    Thanks Gerhard for your valuable help.
     
    carlo, Nov 9, 2005
    #8
  9. Magnus Lycka wrote:
    > Vittorio wrote:
    > Using the same symbol for both string substitutions and SQL placeholder
    > such as pysqlite 1 and the MySQL interface does, is not really a bright
    > idea in my opinion. Who thinks this is pretty?
    >
    > sql = "SELECT %s FROM %s WHERE %s = %%s"
    > cur.execute(sql % (col,table,search_col), (param,))
    >
    > I think it's less confusing with:
    >
    > sql = "SELECT %s FROM %s WHERE %s = ?"
    > cur.execute(sql % (col,table,search_col), (param,))
    >

    or you could use:

    sql = "SELECT %s FROM %s WHERE %s = %s"
    cur.execute(sql % (col,table,search_col, '%s'), (param,))

    which I like better, because you don't have to read
    extremely carefully for the double-percents.

    --Scott David Daniels
     
    Scott David Daniels, Nov 11, 2005
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ilias Lazaridis
    Replies:
    0
    Views:
    936
    Ilias Lazaridis
    Dec 12, 2004
  2. axter
    Replies:
    16
    Views:
    3,262
    Thomas G. Marshall
    Feb 27, 2005
  3. Jeff Epler
    Replies:
    10
    Views:
    686
    Anton Vredegoor
    Aug 20, 2003
  4. Kriv
    Replies:
    8
    Views:
    1,318
    bruno at modulix
    May 30, 2006
  5. Jesse B.
    Replies:
    9
    Views:
    259
    Jesse B.
    Mar 27, 2010
Loading...

Share This Page