How Much Data to Provide?

Discussion in 'ASP .Net' started by Kevin Frey, Nov 29, 2007.

  1. Kevin Frey

    Kevin Frey Guest

    At our work a bit of a "philosophical" debate is underway as to how much
    data (in terms of number of rows of data) should be considered "enough" data
    for presenting to a user, via web application. This is not a web-based
    report (an entirely different issue) but a UI that allows a user to pick a
    record and work on it in a transactional system.

    What I mean by this is: if a particular table is extremely large (say 3
    MILLION rows) should we consider it "reasonable" to allow them page through
    all 3 million rows of data, if that's what they want to do?

    In the past I have built desktop applications that would happily let you
    page/scroll through millions of records (the concept is relatively similar).
    But nowadays I'm of the opinion that there is a fairly small number (let's
    say a few thousand, certainly unlikely > 5,000) that if the user needs to
    see beyond this amount they are either using the wrong mechanism (should be
    using a report) or they should not be doing that due to being so
    unproductive. In the second case, the user should instead be using a search
    mechanism to increase the selectivity of what they need require and narrow
    the number of records displayed.

    My technical reason for asking such a question is that when you want to
    allow paging through X million records, it is only feasible to do this when
    you are following well-defined navigational paths (indexes) in the database.
    This allows you to use a "TOP N" approach and grab small chunks of data to
    support each "page-full" of data being displayed.

    Such a technical "restriction" produces a dichotomy when you also want to
    present the data to a user in a grid and have them click a column heading to
    sort the data - my way of thinking is if someone sorts the data and then
    clicks "Next Page" they *should* see the next chunk of data based on that
    sort order. To permit that for every possible column heading of a data table
    would either require a bucket-load of indexes. But even then it is often the
    case that an index cannot help you because the user wants to sort on a
    foreign-key non-primary-key field (for example, Customer connects to
    Customer Type, and user wants to sort by Customer Type Description [which is
    not the key]).

    I know their exist strategies like materialised views and indexable views,
    but are these really appropriate solutions to this kind of problem? To make
    matters worse in our case, we want users to have some degree of
    customisation of what data they see, which means [within the limits of what
    makes logical sense in the model] they can "join up" more foreign keys or
    remove them. Even if we did use indexed views, simply indexing "everything"
    would be wasteful.

    Appreciate any comments you wish to make regarding the various strategies
    available.
     
    Kevin Frey, Nov 29, 2007
    #1
    1. Advertising

  2. I would not allow more than a few hundred in most cases. I would also
    provide a way to set up a hierarchy to reduce numbers (search with
    controlled constraints), like alphabetically by last name, currently under
    treatment, today's visitors, and the like. The means of whittling down
    depend on the nature of the business and the application.

    I would also have an advanced search type of feature that allows search for
    items by some or all fields that can be searched.

    I have retooled tons of BAD user interfaces that allowed people to get far
    more data than they would ever be able to use that day (sometimes even in a
    lifetime). Allowing a user to page through millions of records would fit
    that kind of UI.

    --
    Gregory A. Beamer
    MVP, MCP: +I, SE, SD, DBA

    *************************************************
    | Think outside the box!
    |
    *************************************************
    "Kevin Frey" <> wrote in message
    news:%...
    >
    > At our work a bit of a "philosophical" debate is underway as to how much
    > data (in terms of number of rows of data) should be considered "enough"
    > data for presenting to a user, via web application. This is not a
    > web-based report (an entirely different issue) but a UI that allows a user
    > to pick a record and work on it in a transactional system.
    >
    > What I mean by this is: if a particular table is extremely large (say 3
    > MILLION rows) should we consider it "reasonable" to allow them page
    > through all 3 million rows of data, if that's what they want to do?
    >
    > In the past I have built desktop applications that would happily let you
    > page/scroll through millions of records (the concept is relatively
    > similar). But nowadays I'm of the opinion that there is a fairly small
    > number (let's say a few thousand, certainly unlikely > 5,000) that if the
    > user needs to see beyond this amount they are either using the wrong
    > mechanism (should be using a report) or they should not be doing that due
    > to being so unproductive. In the second case, the user should instead be
    > using a search mechanism to increase the selectivity of what they need
    > require and narrow the number of records displayed.
    >
    > My technical reason for asking such a question is that when you want to
    > allow paging through X million records, it is only feasible to do this
    > when you are following well-defined navigational paths (indexes) in the
    > database. This allows you to use a "TOP N" approach and grab small chunks
    > of data to support each "page-full" of data being displayed.
    >
    > Such a technical "restriction" produces a dichotomy when you also want to
    > present the data to a user in a grid and have them click a column heading
    > to sort the data - my way of thinking is if someone sorts the data and
    > then clicks "Next Page" they *should* see the next chunk of data based on
    > that sort order. To permit that for every possible column heading of a
    > data table would either require a bucket-load of indexes. But even then it
    > is often the case that an index cannot help you because the user wants to
    > sort on a foreign-key non-primary-key field (for example, Customer
    > connects to Customer Type, and user wants to sort by Customer Type
    > Description [which is not the key]).
    >
    > I know their exist strategies like materialised views and indexable views,
    > but are these really appropriate solutions to this kind of problem? To
    > make matters worse in our case, we want users to have some degree of
    > customisation of what data they see, which means [within the limits of
    > what makes logical sense in the model] they can "join up" more foreign
    > keys or remove them. Even if we did use indexed views, simply indexing
    > "everything" would be wasteful.
    >
    > Appreciate any comments you wish to make regarding the various strategies
    > available.
    >
     
    Cowboy \(Gregory A. Beamer\), Nov 29, 2007
    #2
    1. Advertising

  3. Kevin Frey

    Dave Bush Guest

    My general strategy is to provide a text box and submit button above the
    gridview where the user can type part of the information he's looking
    for. The submit then triggers a query against the db using WHERE x LIKE
    '%' + data + '%' to retrieve a subset of the data the user is looking
    for.

    You may need to modify this somewhat for your circumstance, but my
    experience is that the user normally has some idea what they are looking
    for and allowing them to pre-filter the data and then refine by paging
    is a lot more efficient than either displaying it all on one page or
    making them page through the data.

    As for "How much data?" I generally show 10 row at a time.

    I'd also recommend implementing paging at the stored procedure instead
    of letting .NET do it for you. If they are looking for something that
    returns a lot of records, this will return less data, which will reduce
    the amount of time it will take to display the results on the browser.


    Dave Bush
    http://blog.dmbcllc.com




    -----Original Message-----
    From: Kevin Frey [mailto:]
    Posted At: Thursday, November 29, 2007 5:30 PM
    Posted To: microsoft.public.dotnet.framework.aspnet
    Conversation: How Much Data to Provide?
    Subject: How Much Data to Provide?


    At our work a bit of a "philosophical" debate is underway as to how much

    data (in terms of number of rows of data) should be considered "enough"
    data
    for presenting to a user, via web application. This is not a web-based
    report (an entirely different issue) but a UI that allows a user to pick
    a
    record and work on it in a transactional system.

    What I mean by this is: if a particular table is extremely large (say 3
    MILLION rows) should we consider it "reasonable" to allow them page
    through
    all 3 million rows of data, if that's what they want to do?

    In the past I have built desktop applications that would happily let you

    page/scroll through millions of records (the concept is relatively
    similar).
    But nowadays I'm of the opinion that there is a fairly small number
    (let's
    say a few thousand, certainly unlikely > 5,000) that if the user needs
    to
    see beyond this amount they are either using the wrong mechanism (should
    be
    using a report) or they should not be doing that due to being so
    unproductive. In the second case, the user should instead be using a
    search
    mechanism to increase the selectivity of what they need require and
    narrow
    the number of records displayed.

    My technical reason for asking such a question is that when you want to
    allow paging through X million records, it is only feasible to do this
    when
    you are following well-defined navigational paths (indexes) in the
    database.
    This allows you to use a "TOP N" approach and grab small chunks of data
    to
    support each "page-full" of data being displayed.

    Such a technical "restriction" produces a dichotomy when you also want
    to
    present the data to a user in a grid and have them click a column
    heading to
    sort the data - my way of thinking is if someone sorts the data and then

    clicks "Next Page" they *should* see the next chunk of data based on
    that
    sort order. To permit that for every possible column heading of a data
    table
    would either require a bucket-load of indexes. But even then it is often
    the
    case that an index cannot help you because the user wants to sort on a
    foreign-key non-primary-key field (for example, Customer connects to
    Customer Type, and user wants to sort by Customer Type Description
    [which is
    not the key]).

    I know their exist strategies like materialised views and indexable
    views,
    but are these really appropriate solutions to this kind of problem? To
    make
    matters worse in our case, we want users to have some degree of
    customisation of what data they see, which means [within the limits of
    what
    makes logical sense in the model] they can "join up" more foreign keys
    or
    remove them. Even if we did use indexed views, simply indexing
    "everything"
    would be wasteful.

    Appreciate any comments you wish to make regarding the various
    strategies
    available.
     
    Dave Bush, Nov 29, 2007
    #3
  4. Kevin Frey

    Kevin Frey Guest

    > I'd also recommend implementing paging at the stored procedure instead
    > of letting .NET do it for you. If they are looking for something that
    > returns a lot of records, this will return less data, which will reduce
    > the amount of time it will take to display the results on the browser.


    So do I take from your comments that if there was a substantive amount of
    data (let's say 500 records) you would only retrieve 10 or so at a time?

    How would you then handle the situation if the user wants a different
    sorting order for their data, but still wants to page through it?
     
    Kevin Frey, Nov 30, 2007
    #4
  5. "Kevin Frey" <> wrote in message
    news:%...

    > So do I take from your comments that if there was a substantial amount of
    > data (let's say 500 records) you would only retrieve 10 or so at a time?


    I certainly would - the trick with web performance is many small requests...

    > How would you then handle the situation if the user wants a different
    > sorting order for their data, but still wants to page through it?


    Sorting changes the sort order, and sets the page of data back to 1.

    Paging retains the sort order.


    --
    Mark Rae
    ASP.NET MVP
    http://www.markrae.net
     
    Mark Rae [MVP], Nov 30, 2007
    #5
  6. Kevin Frey

    Dave Bush Guest

    Agree

    -----Original Message-----
    From: Mark Rae [MVP] [mailto:]
    Posted At: Friday, November 30, 2007 4:22 AM
    Posted To: microsoft.public.dotnet.framework.aspnet
    Conversation: How Much Data to Provide?
    Subject: Re: How Much Data to Provide?

    "Kevin Frey" <> wrote in message
    news:%...

    > So do I take from your comments that if there was a substantial amount

    of
    > data (let's say 500 records) you would only retrieve 10 or so at a

    time?

    I certainly would - the trick with web performance is many small
    requests...

    > How would you then handle the situation if the user wants a different
    > sorting order for their data, but still wants to page through it?


    Sorting changes the sort order, and sets the page of data back to 1.

    Paging retains the sort order.


    --
    Mark Rae
    ASP.NET MVP
    http://www.markrae.net
     
    Dave Bush, Nov 30, 2007
    #6
  7. Kevin Frey

    Kevin Frey Guest

    > Paging retains the sort order.

    So you will only allow sorting where there is a corresponding index on the
    database?
     
    Kevin Frey, Dec 2, 2007
    #7
  8. "Kevin Frey" <> wrote in message
    news:...

    >> Paging retains the sort order.

    >
    > So you will only allow sorting where there is a corresponding index on the
    > database?


    Precisely the other way round, otherwise you have IT driving the business
    requirements which is total lunacy...

    If a set of data needs to be sortable on a column, then that column needs to
    be indexed otherwise you risk forcing your RDBMS to perform a table scan,
    which can seriously cripple your performance...


    --
    Mark Rae
    ASP.NET MVP
    http://www.markrae.net
     
    Mark Rae [MVP], Dec 2, 2007
    #8
  9. Kevin Frey

    Kevin Frey Guest

    > Precisely the other way round, otherwise you have IT driving the business
    > requirements which is total lunacy...
    >
    > If a set of data needs to be sortable on a column, then that column needs
    > to be indexed otherwise you risk forcing your RDBMS to perform a table
    > scan, which can seriously cripple your performance...


    I didn't mean to imply that IT was driving the business requirement, simply
    that there needed to be an index in place for any column which was to be
    sortable.
     
    Kevin Frey, Dec 3, 2007
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Troll
    Replies:
    35
    Views:
    2,349
    Troll
    Sep 4, 2003
  2. Daniel Mark
    Replies:
    4
    Views:
    24,153
    Brian Blais
    Sep 23, 2006
  3. Froefel

    Howto provide form data to a BLL/DAL?

    Froefel, Jul 5, 2007, in forum: ASP .Net
    Replies:
    0
    Views:
    372
    Froefel
    Jul 5, 2007
  4. cpp4ever
    Replies:
    3
    Views:
    377
    Francesco
    Sep 8, 2009
  5. Raymond Schanks
    Replies:
    0
    Views:
    533
    Raymond Schanks
    Apr 11, 2010
Loading...

Share This Page