Query 10 times slower (and results reordered!) from ASP.NET page than console app.

Discussion in 'ASP .Net' started by wizofaus@hotmail.com, Oct 25, 2006.

  1. Guest

    I have a chunk of code which is essentially

    IDbCommand cmd = db.CreateCommand();
    cmd.CommandText = "SELECT X, Y, Count(*) FROM Foo WHERE Z = 1 GROUP BY
    X, Y";
    using (IDataReader reader = cmd.ExecuteReader())
    while (reader.Read())
    {
    // grab values from query
    }

    The database is SQL server (MSDE 2000), and "Foo" has well over a
    million records. The query however only runs about 12 rows.

    When I compile this code into a Windows console application, and run
    it, it completes the query in under 2 seconds. But when I run exactly
    the same code from an ASP.NET class, it takes over 20 seconds!
    Furthermore, even though the values returned are exactly the same, the
    ORDER of the rows is quite different: specifically, under ASP.NET the
    rows are all ordered first by Y then by X, even though there's no
    logical reason they should be. I actually tried adding an "ORDER BY Y,
    X" to the end, which did cause the console app version to print out the
    rows in the same order, but made no difference to the execution speed.

    I've tried using an OdbcConnection, and OleDbConnection and an
    SqlConnection - neither make any difference there either. I've made
    sure both are compiled in release mode, and restarted IIS, and made
    sure that my ASP.NET application is doing nothing else except this one
    query.

    Any suggestions most welcome!

    Thanks

    Dylan
    , Oct 25, 2006
    #1
    1. Advertising

  2. Chris Lim Guest

    Use Profiler to compare execution plans. It sounds like different
    indexes are being used, hence the different ordering as well. The
    Console app's query may be using a better index than the ASP.NET one
    (which may or may not be using an index at all).

    wrote:
    > I have a chunk of code which is essentially
    >
    > IDbCommand cmd = db.CreateCommand();
    > cmd.CommandText = "SELECT X, Y, Count(*) FROM Foo WHERE Z = 1 GROUP BY
    > X, Y";
    > using (IDataReader reader = cmd.ExecuteReader())
    > while (reader.Read())
    > {
    > // grab values from query
    > }
    >
    > The database is SQL server (MSDE 2000), and "Foo" has well over a
    > million records. The query however only runs about 12 rows.
    >
    > When I compile this code into a Windows console application, and run
    > it, it completes the query in under 2 seconds. But when I run exactly
    > the same code from an ASP.NET class, it takes over 20 seconds!
    > Furthermore, even though the values returned are exactly the same, the
    > ORDER of the rows is quite different: specifically, under ASP.NET the
    > rows are all ordered first by Y then by X, even though there's no
    > logical reason they should be. I actually tried adding an "ORDER BY Y,
    > X" to the end, which did cause the console app version to print out the
    > rows in the same order, but made no difference to the execution speed.
    >
    > I've tried using an OdbcConnection, and OleDbConnection and an
    > SqlConnection - neither make any difference there either. I've made
    > sure both are compiled in release mode, and restarted IIS, and made
    > sure that my ASP.NET application is doing nothing else except this one
    > query.
    >
    > Any suggestions most welcome!
    >
    > Thanks
    >
    > Dylan
    Chris Lim, Oct 25, 2006
    #2
    1. Advertising

  3. Jeff Dillon Guest

    No need to cross-post so much!

    Does it still take 20 seconds on the "second" invocation of the same ASP.NET
    page?

    Jeff
    <> wrote in message
    news:...
    >I have a chunk of code which is essentially
    >
    > IDbCommand cmd = db.CreateCommand();
    > cmd.CommandText = "SELECT X, Y, Count(*) FROM Foo WHERE Z = 1 GROUP BY
    > X, Y";
    > using (IDataReader reader = cmd.ExecuteReader())
    > while (reader.Read())
    > {
    > // grab values from query
    > }
    >
    > The database is SQL server (MSDE 2000), and "Foo" has well over a
    > million records. The query however only runs about 12 rows.
    >
    > When I compile this code into a Windows console application, and run
    > it, it completes the query in under 2 seconds. But when I run exactly
    > the same code from an ASP.NET class, it takes over 20 seconds!
    > Furthermore, even though the values returned are exactly the same, the
    > ORDER of the rows is quite different: specifically, under ASP.NET the
    > rows are all ordered first by Y then by X, even though there's no
    > logical reason they should be. I actually tried adding an "ORDER BY Y,
    > X" to the end, which did cause the console app version to print out the
    > rows in the same order, but made no difference to the execution speed.
    >
    > I've tried using an OdbcConnection, and OleDbConnection and an
    > SqlConnection - neither make any difference there either. I've made
    > sure both are compiled in release mode, and restarted IIS, and made
    > sure that my ASP.NET application is doing nothing else except this one
    > query.
    >
    > Any suggestions most welcome!
    >
    > Thanks
    >
    > Dylan
    >
    Jeff Dillon, Oct 25, 2006
    #3
  4. Guest

    "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    Chris Lim wrote:
    > Use Profiler to compare execution plans. It sounds like different
    > indexes are being used, hence the different ordering as well. The
    > Console app's query may be using a better index than the ASP.NET one
    > (which may or may not be using an index at all).
    >

    That wouldn't make sense - it's exactly the same database.

    The problem, believe it or not, was the case of the word "by" in the
    GROUP BY clause.

    If you use uppercase 'BY' the query takes 20 seconds, if you use "By",
    it takes 2 seconds.

    Pretty close to the damn weirdest bug I've come across any time
    recently, especially because it happens under all three IDbConnection
    providers (but not using the command-line osql tool).
    , Oct 25, 2006
    #4
  5. Guest

    Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    wrote:
    > Chris Lim wrote:
    > > Use Profiler to compare execution plans. It sounds like different
    > > indexes are being used, hence the different ordering as well. The
    > > Console app's query may be using a better index than the ASP.NET one
    > > (which may or may not be using an index at all).
    > >

    > That wouldn't make sense - it's exactly the same database.
    >
    > The problem, believe it or not, was the case of the word "by" in the
    > GROUP BY clause.
    >
    > If you use uppercase 'BY' the query takes 20 seconds, if you use "By",
    > it takes 2 seconds.
    >
    > Pretty close to the damn weirdest bug I've come across any time
    > recently, especially because it happens under all three IDbConnection
    > providers (but not using the command-line osql tool).


    Did some more poking around, the above isn't quite true. Firstly, it
    only happens with OdbcDbConnection and OleDbConnection - not
    SqlConnection. So I'm fairly sure it's not an SQL server problem per
    se.

    Furthermore, it specifically only occurs when I'm using Parameters that
    need their types converting, and using "IN" for matching:

    So the following code:

    IDbCommand cmd = db.CreateCommand();
    cmd.CommandText = "SELECT X, Y, Count(*) FROM Foo WHERE Z IN (?) GROUP
    BY X, Y";
    IDbDataParameter param = cmd.CreateParameter();
    param.Value = "1";
    cmd.Parameters.Add(param);
    using (IDataReader reader = cmd.ExecuteReader())
    while (reader.Read())
    {
    // grab values
    }

    Will reproduce the problem.

    Change the query to be "WHERE Z = ?", or change param.Value to be just
    1 (instead of the string literal "1"), or change "GROUP BY" to "GROUP
    By" and the problem goes away.

    Odd, huh.
    , Oct 25, 2006
    #5
  6. > Furthermore, even though the values returned are exactly the same, the
    > ORDER of the rows is quite different


    Well, you don't have an ORDER BY clause, so SQL Server is free to return the
    rows in any row it chooses. There is nothing wrong with even the same piece
    of code running multiple times and returning a different order every time...
    neglecting to have an ORDER BY clause is basically telling SQL Server that
    you don't care about the order. So, long story short, if you want/expect a
    certain order, SAY SO.

    >: specifically, under ASP.NET the
    > rows are all ordered first by Y then by X, even though there's no
    > logical reason they should be.


    You could say the same for the ordering no matter what order they came out:
    without an order by clause, there is no logical orderto expect. SQL Server
    will simply return the rows in whatever order it deems most efficient.

    > sure both are compiled in release mode, and restarted IIS


    And are you only testing the ASP.Net page once after restarting IIS? See,
    the first time you load an ASP.Net page in IIS, it does a bunch of
    background activity with it. Running Profiler might point out how much of
    that time IIS is spent doing its work with the ASP.Net file, as opposed to
    running the query.

    Remember too that there is more overhead in the display time used by the
    ASP.Net page to render results to HTML than there would be for a console app
    returning plain text to the command line. And if you are running the
    console app on the same machine as SQL Server, and the web site is on a
    different machine, you need to factor network latency in there as well.

    A
    Aaron Bertrand [SQL Server MVP], Oct 25, 2006
    #6
  7. Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    >> Use Profiler to compare execution plans. It sounds like different
    >> indexes are being used, hence the different ordering as well. The
    >> Console app's query may be using a better index than the ASP.NET one
    >> (which may or may not be using an index at all).
    >>

    > That wouldn't make sense - it's exactly the same database.
    >
    > The problem, believe it or not, was the case of the word "by" in the
    > GROUP BY clause.
    >
    > If you use uppercase 'BY' the query takes 20 seconds, if you use "By",
    > it takes 2 seconds.


    That's not exactly why. And I think Chris was more on track than you might
    think "makes sense."

    If you clear out the procedure cache and the buffers, you will find that
    whether you use By or BY the first time makes no difference. The problem is
    that SQL Server stores execution plans for ad hoc SQL statements, and treats
    the plan with case sensitivity (and even for whitespace differences). So it
    might be that when you ran one version you got a certain plan because of
    data size or who knows what, and when you ran a version with a different
    casing, a different plan got stored (again, who knows why, we don't know
    enough about your system).

    There's a guaranteed way around this: STOP USING AD HOC SQL. Is there any
    reason you're not using stored procedures?

    Another way around this is to use consistent casing and spacing/indenting
    for T-SQL keywords. You'd be amazed how much more readable and maintainable
    your code will become if you pick a convention and stick with it.

    A
    Aaron Bertrand [SQL Server MVP], Oct 25, 2006
    #7
  8. Guest

    Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    Aaron Bertrand [SQL Server MVP] wrote:
    > >> Use Profiler to compare execution plans. It sounds like different
    > >> indexes are being used, hence the different ordering as well. The
    > >> Console app's query may be using a better index than the ASP.NET one
    > >> (which may or may not be using an index at all).
    > >>

    > > That wouldn't make sense - it's exactly the same database.
    > >
    > > The problem, believe it or not, was the case of the word "by" in the
    > > GROUP BY clause.
    > >
    > > If you use uppercase 'BY' the query takes 20 seconds, if you use "By",
    > > it takes 2 seconds.

    >
    > That's not exactly why. And I think Chris was more on track than you might
    > think "makes sense."
    >
    > If you clear out the procedure cache and the buffers, you will find that
    > whether you use By or BY the first time makes no difference. The problem is
    > that SQL Server stores execution plans for ad hoc SQL statements, and treats
    > the plan with case sensitivity (and even for whitespace differences). So it
    > might be that when you ran one version you got a certain plan because of
    > data size or who knows what, and when you ran a version with a different
    > casing, a different plan got stored (again, who knows why, we don't know
    > enough about your system).
    >

    Might be, but my additional investigation seemed to reveal that it was
    a bug in the Odbc and OleDb providers, given that the case of "By"
    wasn't the only thing causing the problem. I have to admit I didn't
    try chaning the case of other parts of the query, but it was absolutely
    consistent: I could change it to "BY", run it 3 times, and have it take
    20 seconds each time, then change it to "By" and it would run at < 2
    seconds.

    > There's a guaranteed way around this: STOP USING AD HOC SQL. Is there any
    > reason you're not using stored procedures?


    Yup, we need to support Access, Oracle, MySql etc. etc.

    I certainly want to look at using more parameterized views, which would
    probably help in this case. But I actually build up the SQL statements
    dynamically based on a number of user-configurable parameters -
    including even which fields are selected/grouped on, so it's hard to
    see how to avoid some amount of ad hoc SQL.
    >
    > Another way around this is to use consistent casing and spacing/indenting
    > for T-SQL keywords. You'd be amazed how much more readable and maintainable
    > your code will become if you pick a convention and stick with it.


    I always use all upper-case - I've actually no idea how I ended up with
    the mixed case "By", but it was a good job I did, or I might never have
    found the problem.
    , Oct 25, 2006
    #8
  9. Chris Lim Guest

    Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    wrote:
    > Might be, but my additional investigation seemed to reveal that it was
    > a bug in the Odbc and OleDb providers, given that the case of "By"
    > wasn't the only thing causing the problem. I have to admit I didn't
    > try chaning the case of other parts of the query, but it was absolutely
    > consistent: I could change it to "BY", run it 3 times, and have it take
    > 20 seconds each time, then change it to "By" and it would run at < 2
    > seconds.


    Strange problem. Still, using Profiler you could at least see where the
    delay is and it might give some more clues as to what's actually going
    on.
    Chris Lim, Oct 25, 2006
    #9
  10. Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    > wasn't the only thing causing the problem. I have to admit I didn't
    > try chaning the case of other parts of the query, but it was absolutely
    > consistent: I could change it to "BY", run it 3 times, and have it take
    > 20 seconds each time, then change it to "By" and it would run at < 2
    > seconds.


    You could run BY 8000 times and By 8000 times and it would still execute the
    same in each case. Unless you follow my earlier advice to clear the
    procedure cache and drop the buffers, which will get rid of both plans and
    you may see that the behavior was due to a bad plan stored for one of the
    cases, not due to a bug in ODBC or due to the case itself.

    A
    Aaron Bertrand [SQL Server MVP], Oct 25, 2006
    #10
  11. Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    > wasn't the only thing causing the problem. I have to admit I didn't
    > try chaning the case of other parts of the query, but it was absolutely
    > consistent: I could change it to "BY", run it 3 times, and have it take
    > 20 seconds each time, then change it to "By" and it would run at < 2
    > seconds.


    You could run BY 8000 times and By 8000 times and it would still execute the
    same in each case. Unless you follow my earlier advice to clear the
    procedure cache and drop the buffers, which will get rid of both plans and
    you may see that the behavior was due to a bad plan stored for one of the
    cases, not due to a bug in ODBC or due to the case itself.

    A
    Aaron Bertrand [SQL Server MVP], Oct 25, 2006
    #11
  12. Guest

    Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    Aaron Bertrand [SQL Server MVP] wrote:
    > > wasn't the only thing causing the problem. I have to admit I didn't
    > > try chaning the case of other parts of the query, but it was absolutely
    > > consistent: I could change it to "BY", run it 3 times, and have it take
    > > 20 seconds each time, then change it to "By" and it would run at < 2
    > > seconds.

    >
    > You could run BY 8000 times and By 8000 times and it would still execute the
    > same in each case. Unless you follow my earlier advice to clear the
    > procedure cache and drop the buffers, which will get rid of both plans and
    > you may see that the behavior was due to a bad plan stored for one of the
    > cases, not due to a bug in ODBC or due to the case itself.
    >

    Possible, but it wouldn't explain why a) it didn't happen when using
    the .NET class "SqlConnection" and b) it didn't happen when using an
    integer C# datatype for the parameter.

    BTW, I have another question concerning efficiency - I ended up having
    to delete all those records from the "Foo" table on my development
    machine because it was causing even unrelated queries, i.e. queries
    that were pulling records from that table, but that explicitly filtered
    out almost of them in the WHERE clause via a key field, to run too
    slow. Now that key field was marked with a foreign key constraint to
    another table, but I hadn't explicitly created an index on it (mainly
    because I thought the foreign key constraint would implicitly define
    one, as it does under Access and even Oracle, I believe). Should I
    reasonably expect a query like

    SELECT Count(*) FROM Foo WHERE Z = 2

    To run at much the same speed no matter how many records there are in
    Foo that have Z <> 2, provided there is an index on Z?
    , Oct 25, 2006
    #12
  13. Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    > Possible, but it wouldn't explain why a) it didn't happen when using
    > the .NET class "SqlConnection" and b) it didn't happen when using an
    > integer C# datatype for the parameter.


    Changing the case from GROUP BY to GROUP By does not magically change the
    way the query works, unless you had a bad plan cached for one case.

    Maybe in your scenario the settings for "SqlConnection" or some other
    environmental thing prevented the cached plan from being used.

    Did you bother running a profiler trace to see what is going on? How about
    examining the plan for both queries (copied verbatim) in Query Analyzer /
    Management Studio?

    I have no idea how to reproduce your scenario, so I don't know what else to
    tell you, except that chaning BY to By alone is not the primary / sole
    culprit for the difference you're seeing.

    A
    Aaron Bertrand [SQL Server MVP], Oct 25, 2006
    #13
  14. Guest

    Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    Aaron Bertrand [SQL Server MVP] wrote:
    > > Possible, but it wouldn't explain why a) it didn't happen when using
    > > the .NET class "SqlConnection" and b) it didn't happen when using an
    > > integer C# datatype for the parameter.

    >
    > Changing the case from GROUP BY to GROUP By does not magically change the
    > way the query works, unless you had a bad plan cached for one case.


    Or there's something odd in the Odbc/OleDb drivers that somehow picks
    up on the case of the query. Anyway, having deleted and re-inserted
    all the rows from the table, I can't reproduce the problem anymore. I
    suppose the "bad plan" theory makes the most sense, although it
    concerns me how it happened in the first place.
    >
    > Maybe in your scenario the settings for "SqlConnection" or some other
    > environmental thing prevented the cached plan from being used.
    >
    > Did you bother running a profiler trace to see what is going on? How about
    > examining the plan for both queries (copied verbatim) in Query Analyzer /
    > Management Studio?


    I'm only using MSDE at this point - I don't have access to a full blown
    SQL installation, so I don't believe I have all those tools (and I
    certainly don't pretend to be familiar with them).
    , Oct 25, 2006
    #14
  15. Guest

    Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    wrote:
    >
    > BTW, I have another question concerning efficiency - I ended up having
    > to delete all those records from the "Foo" table on my development
    > machine because it was causing even unrelated queries, i.e. queries
    > that were pulling records from that table, but that explicitly filtered
    > out almost of them in the WHERE clause via a key field, to run too
    > slow. Now that key field was marked with a foreign key constraint to
    > another table, but I hadn't explicitly created an index on it (mainly
    > because I thought the foreign key constraint would implicitly define
    > one, as it does under Access and even Oracle, I believe). Should I
    > reasonably expect a query like
    >
    > SELECT Count(*) FROM Foo WHERE Z = 2
    >
    > To run at much the same speed no matter how many records there are in
    > Foo that have Z <> 2, provided there is an index on Z?


    Further on this, I determined that

    a) sql server definitely does create an implicit index when creating a
    foreign key constraint (you can see it in the sysindexes table)
    but
    b) adding an explicit index definitely speeds up queries that filter
    out most of the records - by a factor or 4 or 5!
    however
    c) adding the explicit index makes no difference to queries that do
    need to process most of the records

    To clarify:

    Table 'foo' has over 1.5 million rows with Z = 1, and about 2000 with Z
    = 2

    Z has a foreign key constraint to another table.

    The query

    SELECT X, Y, Count(*) FROM Foo WHERE Z = 1 GROUP BY X, Y

    Takes about the same time (just under 2 seconds) whether or not there
    is an explicit index (using CREATE INDEX) on Z, but the query

    SELECT X, Y, Count(*) FROM Foo WHERE Z = 2 GROUP BY X, Y

    Takes about 5 times longer (200ms vs 40ms) when there is no explicit
    index on Z.

    Note that both queries return the same number of rows (12).
    Does this seem normal?
    , Oct 25, 2006
    #15
  16. Roy Harvey Guest

    Re: "BY" vs "By" (was Re: Query 10 times slower (and results reordered!) from ASP.NET page than console app.)

    On 25 Oct 2006 15:21:09 -0700, wrote:

    >Further on this, I determined that
    >
    >a) sql server definitely does create an implicit index when creating a
    >foreign key constraint (you can see it in the sysindexes table)


    No, it does not. It creates index statistics, which is what you were
    seeing, but not an index.

    >but
    >b) adding an explicit index definitely speeds up queries that filter
    >out most of the records - by a factor or 4 or 5!


    Yes, having an index that fits a selective query is way faster, but
    remember this is comparing to no index at all.

    >however
    >c) adding the explicit index makes no difference to queries that do
    >need to process most of the records


    The optimizer knows better to use an index - and thus go jumping all
    over the table - when it has to look at most of the table in any case.

    Roy Harvey
    Beacon Falls, CT
    Roy Harvey, Oct 26, 2006
    #16
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. John Rivers

    VS.NET is 10 times slower than VB6

    John Rivers, Aug 23, 2005, in forum: ASP .Net
    Replies:
    91
    Views:
    3,086
    =?Utf-8?B?TWFuaWthbmRhbg==?=
    Nov 2, 2005
  2. croeltgen
    Replies:
    1
    Views:
    485
    Andrew Thompson
    Oct 25, 2004
  3. Jack Steven
    Replies:
    2
    Views:
    413
    Chris Rebert
    Mar 9, 2009
  4. Philip Semanchuk

    Re: C-extension 2 times slower than exe

    Philip Semanchuk, Jun 23, 2009, in forum: Python
    Replies:
    3
    Views:
    397
  5. sanket
    Replies:
    7
    Views:
    976
    Tsung
    Nov 3, 2011
Loading...

Share This Page