Prepare Statements VS Statements

Discussion in 'Java' started by Vince, Jan 20, 2008.

  1. Vince

    Vince Guest

    Hi everybody

    for one of my customer I need to program an adapter that basically
    receives about 25'000 MQ Messages, parses them and then finally writes
    the data into an Oracle DB.

    I must say at this point that every message has a variable length and
    thus variable amount of data/fields that must be returned to the DB.

    During a workshop I got an argue with another developer because he said
    we should use PrepareStatements instead of normal Statements, they've
    got a better performance - compiling once on DB Server. I generally
    agreed, but said that in our case the PrepareStatements won't help in
    terms of performance since we have a variable amount of columns for each
    SQL Insert Statement. What IMHO would lead to an overhead because we
    would have to concatenate the PrepareStatements individually (25000
    times) plus we would then have to assign a value for each question
    mark.. We basically better concatenate the Insert Statement once for
    each row of data and then use a batch and commit it to the DB...

    My question now: How would you solve it? Would the PrepareStatement
    still improve the performance, considering though the variable amount of
    columns?

    Thanks for your opinion and help

    Vince

    --
    Posted via a free Usenet account from http://www.teranews.com
    Vince, Jan 20, 2008
    #1
    1. Advertising

  2. Vince

    Arne Vajhøj Guest

    Vince wrote:
    > for one of my customer I need to program an adapter that basically
    > receives about 25'000 MQ Messages, parses them and then finally writes
    > the data into an Oracle DB.
    >
    > I must say at this point that every message has a variable length and
    > thus variable amount of data/fields that must be returned to the DB.
    >
    > During a workshop I got an argue with another developer because he said
    > we should use PrepareStatements instead of normal Statements, they've
    > got a better performance - compiling once on DB Server. I generally
    > agreed, but said that in our case the PrepareStatements won't help in
    > terms of performance since we have a variable amount of columns for each
    > SQL Insert Statement. What IMHO would lead to an overhead because we
    > would have to concatenate the PrepareStatements individually (25000
    > times) plus we would then have to assign a value for each question
    > mark.. We basically better concatenate the Insert Statement once for
    > each row of data and then use a batch and commit it to the DB...
    >
    > My question now: How would you solve it? Would the PrepareStatement
    > still improve the performance, considering though the variable amount of
    > columns?


    Some comments:

    1) PreparedStatement usually but not always performs better than
    Statement went executed multiple times.

    2) My experience is that the gain for Oracle of using prepared
    statements can be really huge.

    3) I will assume that you can reuse some prepared statements. 25000
    distinct INSERT statements does not sound likely.

    4) Besides the performance issue PreparedStatement is also much
    better to handle date formats and strings with quotes in.

    5) Multiple prepare statement calls can be batched just like
    multiple ordinary statement calls can.

    So based on the little info available I will recommend you to
    look into prepared statements.

    Arne
    Arne Vajhøj, Jan 20, 2008
    #2
    1. Advertising

  3. Vince

    Lew Guest

    Vince wrote:
    >> My question now: How would you solve it? Would the PrepareStatement
    >> still improve the performance, considering though the variable amount
    >> of columns?


    Arne Vajhøj wrote:
    > 1) PreparedStatement usually but not always performs better than
    > Statement [when] executed multiple times.
    >
    > 2) My experience is that the gain for Oracle of using prepared
    > statements can be really huge.
    >
    > 3) I will assume that you can reuse some prepared statements. 25000
    > distinct INSERT statements does not sound likely.


    All statements get "prepared" by the database engine, in the sense that the
    SQL string gets sort-of-compiled by the engine just to run it. The
    performance advantage of PreparedStatement comes perhaps from the ability of
    the JDBC driver to understand that a statement will be reused, and certainly
    from any ability the DBMS engine has to reuse its sort-of-compiled version of
    the statement.

    Some engines, e.g., Postgres, only take full advantage of a prepared (in its
    sense) statement within a single "session", which I believe corresponds to the
    lifetime of the Java connection to the database. If understand the
    implications correctly, closing a connection would flush Postgres's cache of
    prepared statements from that connection. This is likely not a problem if the
    JDBC connections are pooled, as they should be in production. Pooled
    connections rarely close from the DBMS's point of view. The point of this
    whole paragraph is twofold: that getting the full performance benefit of
    PreparedStatement depends on the tuning of a lot of operational factors, and
    that operations is an art unto itself.

    > 4) Besides the performance issue PreparedStatement is also much
    > better to handle date formats and strings with quotes in.


    Often it seems that performance is pursued before correctness is assured.
    Arne makes the point here that PreparedStatement has advantages irrespective
    of its performance comparison to other kinds of Statement.

    > 5) Multiple prepare statement calls can be batched just like
    > multiple ordinary statement calls can.
    >
    > So based on the little info available I will recommend you to
    > look into prepared statements.


    In addition to Arne's excellent advice, and expanding on point 4 above,
    PreparedStatements provide a degree of type safety to run-time data in SQL
    actions. This is particularly important for data provided by the public -
    well-known "SQL injection" attacks will not work on PreparedStatement forms
    because the text is better encapsulated than in the more casual forms.

    Even in more internal usage, it's a lot harder to insert, say, a misconfigured
    date to a query when the argument must be of type java.sql.Date to start.
    PreparedStatement lets you enforce certain guarantees on dynamic data.

    --
    Lew
    Lew, Jan 20, 2008
    #3
  4. Lew wrote:
    > Vince wrote:
    >>> My question now: How would you solve it? Would the PrepareStatement
    >>> still improve the performance, considering though the variable amount
    >>> of columns?

    >
    > Arne Vajhøj wrote:
    >> 1) PreparedStatement usually but not always performs better than
    >> Statement [when] executed multiple times.
    >>
    >> 2) My experience is that the gain for Oracle of using prepared
    >> statements can be really huge.
    >>
    >> 3) I will assume that you can reuse some prepared statements. 25000
    >> distinct INSERT statements does not sound likely.

    >
    > All statements get "prepared" by the database engine, in the sense that
    > the SQL string gets sort-of-compiled by the engine just to run it. The
    > performance advantage of PreparedStatement comes perhaps from the
    > ability of the JDBC driver to understand that a statement will be
    > reused, and certainly from any ability the DBMS engine has to reuse its
    > sort-of-compiled version of the statement.
    >
    > Some engines, e.g., Postgres, only take full advantage of a prepared (in
    > its sense) statement within a single "session", which I believe
    > corresponds to the lifetime of the Java connection to the database. If
    > understand the implications correctly, closing a connection would flush
    > Postgres's cache of prepared statements from that connection. This is
    > likely not a problem if the JDBC connections are pooled, as they should
    > be in production. Pooled connections rarely close from the DBMS's point
    > of view. The point of this whole paragraph is twofold: that getting the
    > full performance benefit of PreparedStatement depends on the tuning of a
    > lot of operational factors, and that operations is an art unto itself.
    >
    >> 4) Besides the performance issue PreparedStatement is also much
    >> better to handle date formats and strings with quotes in.

    >
    > Often it seems that performance is pursued before correctness is
    > assured. Arne makes the point here that PreparedStatement has advantages
    > irrespective of its performance comparison to other kinds of Statement.
    >
    >> 5) Multiple prepare statement calls can be batched just like
    >> multiple ordinary statement calls can.
    >>
    >> So based on the little info available I will recommend you to
    >> look into prepared statements.

    >
    > In addition to Arne's excellent advice, and expanding on point 4 above,
    > PreparedStatements provide a degree of type safety to run-time data in
    > SQL actions. This is particularly important for data provided by the
    > public - well-known "SQL injection" attacks will not work on
    > PreparedStatement forms because the text is better encapsulated than in
    > the more casual forms.
    >
    > Even in more internal usage, it's a lot harder to insert, say, a
    > misconfigured date to a query when the argument must be of type
    > java.sql.Date to start. PreparedStatement lets you enforce certain
    > guarantees on dynamic data.
    >

    Adding to the Arne and Lew's good advice, I'd add that the issue of NULL
    columns is not a problem for PreparedStatement usage because it provides
    a setNull() method.

    As a table has a fixed number of columns, you can declare a
    PreparedStatement for each table that stores MQ messages. As each
    message is stored you:
    - select the appropriate table's PreparedStatement
    - put each field in the message into the appropriate column
    - set the rest to NULL
    - use executeUpdate() to store the row.

    Apart from that, its up to you, your system recovery requirements and
    performance considerations whether you let transaction commitment
    default to automatic (commit after every SQL statement) or setAutoCommit
    off and handle commits yourself. Committing batches of inserts should be
    faster than committing every transaction.

    Database committal can be tied to MQ's transactional capability, which
    should centralise application restart & recovery in the program sending
    the MQ messages.


    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
    Martin Gregorie, Jan 20, 2008
    #4
  5. Vince

    Roedy Green Guest

    On Sun, 20 Jan 2008 15:46:36 +0100, Vince <> wrote,
    quoted or indirectly quoted someone who said :

    >My question now: How would you solve it? Would the PrepareStatement
    >still improve the performance, considering though the variable amount of
    >columns?


    Presumably there would be a number of common patterns.
    Each one could have its own prepared statement.
    --
    Roedy Green, Canadian Mind Products
    The Java Glossary, http://mindprod.com
    Roedy Green, Jan 20, 2008
    #5
  6. Vince

    Roedy Green Guest

    On Sun, 20 Jan 2008 15:46:36 +0100, Vince <> wrote,
    quoted or indirectly quoted someone who said :

    >Hi everybody
    >
    >for one of my customer I need to program an adapter that basically
    >receives about 25'000 MQ Messages, parses them and then finally writes
    >the data into an Oracle DB.
    >
    >I must say at this point that every message has a variable length and
    >thus variable amount of data/fields that must be returned to the DB.
    >
    >During a workshop I got an argue with another developer because he said
    >we should use PrepareStatements instead of normal Statements, they've
    >got a better performance - compiling once on DB Server. I generally
    >agreed, but said that in our case the PrepareStatements won't help in
    >terms of performance since we have a variable amount of columns for each
    >SQL Insert Statement. What IMHO would lead to an overhead because we
    >would have to concatenate the PrepareStatements individually (25000
    >times) plus we would then have to assign a value for each question
    >mark.. We basically better concatenate the Insert Statement once for
    >each row of data and then use a batch and commit it to the DB...
    >
    >My question now: How would you solve it? Would the PrepareStatement
    >still improve the performance, considering though the variable amount of
    >columns?


    You might invent a PreparedStatementCache. You can lookup using a key
    based on the string you would feed to construct the Prepared
    Statement. You would also maintain an LRU chain of the
    PreparedStatements.. When you used a prepared stament you would
    migrate it to the head of the chain. When the cache got too fat, you
    would prune the least recently used element.

    If it turned out using PreparedStatement had a higher overhead that
    plain, you could avoid creating a PreparedStatement until you had seen
    n instances where optimal n is to be found by experiment. Until
    then, you use a regular Stament and record the usage.
    --
    Roedy Green, Canadian Mind Products
    The Java Glossary, http://mindprod.com
    Roedy Green, Jan 20, 2008
    #6
  7. Vince

    Vince Guest

    Martin Gregorie wrote:
    > Lew wrote:
    >> Vince wrote:
    >>>> My question now: How would you solve it? Would the PrepareStatement
    >>>> still improve the performance, considering though the variable
    >>>> amount of columns?

    >>
    >> Arne Vajhøj wrote:
    >>> 1) PreparedStatement usually but not always performs better than
    >>> Statement [when] executed multiple times.
    >>>
    >>> 2) My experience is that the gain for Oracle of using prepared
    >>> statements can be really huge.
    >>>
    >>> 3) I will assume that you can reuse some prepared statements. 25000
    >>> distinct INSERT statements does not sound likely.

    >>
    >> All statements get "prepared" by the database engine, in the sense
    >> that the SQL string gets sort-of-compiled by the engine just to run
    >> it. The performance advantage of PreparedStatement comes perhaps from
    >> the ability of the JDBC driver to understand that a statement will be
    >> reused, and certainly from any ability the DBMS engine has to reuse
    >> its sort-of-compiled version of the statement.
    >>
    >> Some engines, e.g., Postgres, only take full advantage of a prepared
    >> (in its sense) statement within a single "session", which I believe
    >> corresponds to the lifetime of the Java connection to the database.
    >> If understand the implications correctly, closing a connection would
    >> flush Postgres's cache of prepared statements from that connection.
    >> This is likely not a problem if the JDBC connections are pooled, as
    >> they should be in production. Pooled connections rarely close from
    >> the DBMS's point of view. The point of this whole paragraph is
    >> twofold: that getting the full performance benefit of
    >> PreparedStatement depends on the tuning of a lot of operational
    >> factors, and that operations is an art unto itself.
    >>
    >>> 4) Besides the performance issue PreparedStatement is also much
    >>> better to handle date formats and strings with quotes in.

    >>
    >> Often it seems that performance is pursued before correctness is
    >> assured. Arne makes the point here that PreparedStatement has
    >> advantages irrespective of its performance comparison to other kinds
    >> of Statement.
    >>
    >>> 5) Multiple prepare statement calls can be batched just like
    >>> multiple ordinary statement calls can.
    >>>
    >>> So based on the little info available I will recommend you to
    >>> look into prepared statements.

    >>
    >> In addition to Arne's excellent advice, and expanding on point 4
    >> above, PreparedStatements provide a degree of type safety to run-time
    >> data in SQL actions. This is particularly important for data provided
    >> by the public - well-known "SQL injection" attacks will not work on
    >> PreparedStatement forms because the text is better encapsulated than
    >> in the more casual forms.
    >>
    >> Even in more internal usage, it's a lot harder to insert, say, a
    >> misconfigured date to a query when the argument must be of type
    >> java.sql.Date to start. PreparedStatement lets you enforce certain
    >> guarantees on dynamic data.
    >>

    > Adding to the Arne and Lew's good advice, I'd add that the issue of NULL
    > columns is not a problem for PreparedStatement usage because it provides
    > a setNull() method.
    >
    > As a table has a fixed number of columns, you can declare a
    > PreparedStatement for each table that stores MQ messages. As each
    > message is stored you:
    > - select the appropriate table's PreparedStatement
    > - put each field in the message into the appropriate column
    > - set the rest to NULL
    > - use executeUpdate() to store the row.
    >
    > Apart from that, its up to you, your system recovery requirements and
    > performance considerations whether you let transaction commitment
    > default to automatic (commit after every SQL statement) or setAutoCommit
    > off and handle commits yourself. Committing batches of inserts should be
    > faster than committing every transaction.
    >
    > Database committal can be tied to MQ's transactional capability, which
    > should centralise application restart & recovery in the program sending
    > the MQ messages.
    >
    >


    I'm not sure if I'll be capable to prepare a preparedStatement for each
    type of product that goes through this adapter. One requirement of the
    adapter is that whenever a new product is introduced, the application
    must react dynamically. Basically no change to the code (thus no new
    release)is allowed.

    Nevertheless, I'm convinced now that the preparedStatement will be the
    best choice. Not because of the performance but much more because of the
    accuracy of the data (setString, setInt, setNull a.s.o). Latter will
    simplify the process distinctively..

    Thanks for your more than valuable opinions guys!

    --
    Posted via a free Usenet account from http://www.teranews.com
    Vince, Jan 21, 2008
    #7
  8. Vince

    Lew Guest

    Vince wrote:
    > I'm not sure if I'll be capable to prepare a preparedStatement for each
    > type of product that goes through this adapter. One requirement of the
    > adapter is that whenever a new product is introduced, the application
    > must react dynamically. Basically no change to the code (thus no new
    > release)is allowed.


    A new product should be only a data change, therefore no need to change code.
    The same PreparedStatement should work for any products, new, former,
    current, whatever - the data structure should not change with the new
    products, and it's the data structure that is reflected in the
    PreparedStatement, not the data content.

    So feel free to use PreparedStatement - it should not need any modification as
    you add new rows to your tables.

    --
    Lew
    Lew, Jan 21, 2008
    #8
  9. Lew wrote:
    > Vince wrote:
    >> I'm not sure if I'll be capable to prepare a preparedStatement for
    >> each type of product that goes through this adapter. One requirement
    >> of the adapter is that whenever a new product is introduced, the
    >> application must react dynamically. Basically no change to the code
    >> (thus no new release)is allowed.

    >
    > A new product should be only a data change, therefore no need to change
    > code. The same PreparedStatement should work for any products, new,
    > former, current, whatever - the data structure should not change with
    > the new products, and it's the data structure that is reflected in the
    > PreparedStatement, not the data content.


    "should" does not always mean "will".

    Arne
    Arne Vajhøj, Jan 21, 2008
    #9
  10. Vince

    Lew Guest

    Arne Vajhøj wrote:
    > Lew wrote:
    >> Vince wrote:
    >>> I'm not sure if I'll be capable to prepare a preparedStatement for
    >>> each type of product that goes through this adapter. One requirement
    >>> of the adapter is that whenever a new product is introduced, the
    >>> application must react dynamically. Basically no change to the code
    >>> (thus no new release)is allowed.

    >>
    >> A new product should be only a data change, therefore no need to
    >> change code. The same PreparedStatement should work for any products,
    >> new, former, current, whatever - the data structure should not change
    >> with the new products, and it's the data structure that is reflected
    >> in the PreparedStatement, not the data content.

    >
    > "should" does not always mean "will".


    And thus we provide the OP an opportunity to explain why data changes would
    require code changes in their particular scenario.

    They shouldn't, so if they do there is probably a mistake there.

    --
    Lew
    Lew, Jan 21, 2008
    #10
  11. Vince

    Vince Guest

    Lew wrote:
    > Arne Vajhøj wrote:
    >> Lew wrote:
    >>> Vince wrote:
    >>>> I'm not sure if I'll be capable to prepare a preparedStatement for
    >>>> each type of product that goes through this adapter. One requirement
    >>>> of the adapter is that whenever a new product is introduced, the
    >>>> application must react dynamically. Basically no change to the code
    >>>> (thus no new release)is allowed.
    >>>
    >>> A new product should be only a data change, therefore no need to
    >>> change code. The same PreparedStatement should work for any
    >>> products, new, former, current, whatever - the data structure should
    >>> not change with the new products


    Not true in my case, I've got 1600 possible attributes. A product
    (financial transactions like options, swaps, futures etc) will only use
    a subset of the 1600 possible attributes. Thus the data structure will
    change for each product (~400 different product available). The point is
    that I don't get a package of product X first and then Y and then Z. The
    products will come through MQ fully mixed up.

    --
    Posted via a free Usenet account from http://www.teranews.com
    Vince, Jan 21, 2008
    #11
  12. Vince

    Mark Rafn Guest

    Vince <> wrote:
    >I must say at this point that every message has a variable length and
    >thus variable amount of data/fields that must be returned to the DB.
    >
    >During a workshop I got an argue with another developer because he said
    >we should use PrepareStatements instead of normal Statements, they've
    >got a better performance - compiling once on DB Server.


    He's right. The other MAJOR advantage you get is automatic quoting. If
    you're executing statments via string concatenation, you have to deal with crap
    like quotes and SQL reserved words in the data.

    >We basically better concatenate the Insert Statement once for
    >each row of data and then use a batch and commit it to the DB...

    ....
    >My question now: How would you solve it? Would the PrepareStatement
    >still improve the performance, considering though the variable amount of
    >columns?


    I'd still plan on using as few distinct statements, even if it means you have
    a number of messages that are explicitly inserting nulls as opposed to
    generating a custom insert that ignores irrelevant columns (which still
    inserts nulls). Even if you generate them by concatenation, you can make use
    of caching - Oracle and some others will only have to do a soft-parse rather
    than a full parse if the text of the statement is identical to one in
    the cache. C3p0 or other pooling system will do client-side caching based
    on string comparisons.
    --
    Mark Rafn <http://www.dagon.net/>
    Mark Rafn, Jan 21, 2008
    #12
  13. Vince wrote:
    > Lew wrote:
    >> Arne Vajhøj wrote:
    >>> Lew wrote:
    >>>> Vince wrote:
    >>>>> I'm not sure if I'll be capable to prepare a preparedStatement for
    >>>>> each type of product that goes through this adapter. One
    >>>>> requirement of the adapter is that whenever a new product is
    >>>>> introduced, the application must react dynamically. Basically no
    >>>>> change to the code (thus no new release)is allowed.
    >>>>
    >>>> A new product should be only a data change, therefore no need to
    >>>> change code. The same PreparedStatement should work for any
    >>>> products, new, former, current, whatever - the data structure should
    >>>> not change with the new products

    >
    > Not true in my case, I've got 1600 possible attributes. A product
    > (financial transactions like options, swaps, futures etc) will only use
    > a subset of the 1600 possible attributes. Thus the data structure will
    > change for each product (~400 different product available). The point is
    > that I don't get a package of product X first and then Y and then Z. The
    > products will come through MQ fully mixed up.
    >

    We never said you needed a PreparedStatement per product - just one per
    table which gets re-used for each message that's stored in the table it
    accesses.

    We never mentioned any dependence on message/product ordering either.

    Maybe some pseudo code will make things a bit clearer:

    INITIALIZE
    FOR EACH TABLE
    CREATE PreparedStatement
    LABEL with the associated product code(s) for this table
    END-FOR
    SET AUTO-COMMIT off
    END-INITIALIZE

    FOR EACH MESSAGE
    SELECT PreparedStatement USING product code
    PARSE message
    FOR EACH FIELD
    ATTACH it to the prepared statement column
    END-PARSE
    FOR EACH UNUSED COLUMN
    SET prepared statement column to null
    END-FOR
    INSERT ROW
    COMMIT
    END-FOR


    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
    Martin Gregorie, Jan 21, 2008
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?Utf-8?B?SmVyZW15?=

    Help - How to prepare for release?

    =?Utf-8?B?SmVyZW15?=, Jan 21, 2004, in forum: ASP .Net
    Replies:
    1
    Views:
    285
    Kevin Spencer
    Jan 21, 2004
  2. Steve Franks
    Replies:
    12
    Views:
    605
    Scott Allen
    Oct 25, 2005
  3. Tor Inge Rislaa

    Prepare IIS for .NET Framework 2.0

    Tor Inge Rislaa, Sep 25, 2006, in forum: ASP .Net
    Replies:
    3
    Views:
    342
    Juan T. Llibre
    Sep 25, 2006
  4. Steve
    Replies:
    7
    Views:
    606
  5. Dave Cardwell
    Replies:
    4
    Views:
    149
    Anno Siegel
    Feb 22, 2004
Loading...

Share This Page