Database Statistics... Best way to maintain stats???


Lucas Tam

Hi all,


I have an application which logs a considerable amount of data. Each day,
we log about 50,000 to 100,000 rows of data.

We'd like to report on this data... currently I'm using a stored procedure to
calculate the statistics, but since the reports are ad hoc, they take a
while to generate.

So how do you guys handle large amounts of data? Is there a good way to
precalculate a set of statistics to handle ad hoc queries (i.e. by hour,
by day, by week, by month)? Our application also provides near-realtime
statistics... so precalculation has to be done on a continual basis. Does
.NET have any statistics classes that might help out with this sort of
thing? I don't think Performance counters will work, since they don't log
persistent data.

Any ideas?

Thanks!
 

Bruce Barker

You might look at using cubes for rollups. If you need realtime ad hoc stats
(with slice and dice), then you want to use a snowflake or star schema.
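
For instance, a minimal star schema for this kind of logging might look
roughly like the sketch below (DimDate, DimSource and FactLog are made-up
names; swap in whatever dimensions you actually slice by):

    -- purely illustrative star schema: one fact table plus a couple of dimensions
    create table DimDate (
        DateKey      int primary key,      -- e.g. 20240115
        FullDate     datetime not null,
        [DayOfWeek]  tinyint not null,
        [WeekOfYear] tinyint not null,
        [Month]      tinyint not null,
        [Year]       smallint not null
    )

    create table DimSource (
        SourceKey  int identity primary key,
        SourceName varchar(100) not null
    )

    create table FactLog (
        DateKey   int not null references DimDate(DateKey),
        HourOfDay tinyint not null,
        SourceKey int not null references DimSource(SourceKey),
        Value     decimal(18,4) not null
    )

"By hour / by day / by week / by month" then just becomes a join and a
group-by on the dimension columns, e.g.:

    select d.FullDate, s.SourceName, count(*) as RowCnt, sum(f.Value) as TotalValue
    from FactLog f
    join DimDate d on d.DateKey = f.DateKey
    join DimSource s on s.SourceKey = f.SourceKey
    group by d.FullDate, s.SourceName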

Since you are only doing 100k rows a day (pretty small, actually), a simple
star schema should do it (with some rollups). With volume this low, I'd
update the rollup tables in real time (a simple trigger). I'd expect no query
to take over a second or two (unless it covered years of detail).
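
A rough sketch of such a trigger-maintained rollup against the fact table
above (RollupHourly and the trigger name are again made up; this is a
plain update-then-insert upsert pattern, not the only way to do it):

    -- hypothetical hourly rollup kept current by an insert trigger
    create table RollupHourly (
        DateKey   int not null,
        HourOfDay tinyint not null,
        SourceKey int not null,
        RowCnt    int not null,
        ValueSum  decimal(18,4) not null,
        primary key (DateKey, HourOfDay, SourceKey)
    )

    create trigger trg_FactLog_Rollup on FactLog after insert
    as
    begin
        set nocount on

        -- aggregate just the newly inserted detail rows
        select DateKey, HourOfDay, SourceKey,
               count(*) as Cnt, sum(Value) as ValSum
        into #agg
        from inserted
        group by DateKey, HourOfDay, SourceKey

        -- bump rollup rows that already exist
        update r
        set RowCnt = r.RowCnt + a.Cnt,
            ValueSum = r.ValueSum + a.ValSum
        from RollupHourly r
        join #agg a on a.DateKey = r.DateKey
                   and a.HourOfDay = r.HourOfDay
                   and a.SourceKey = r.SourceKey

        -- add rollup rows for hour/source combinations seen for the first time
        insert into RollupHourly (DateKey, HourOfDay, SourceKey, RowCnt, ValueSum)
        select a.DateKey, a.HourOfDay, a.SourceKey, a.Cnt, a.ValSum
        from #agg a
        where not exists (select 1 from RollupHourly r
                          where r.DateKey = a.DateKey
                            and r.HourOfDay = a.HourOfDay
                            and r.SourceKey = a.SourceKey)
    end

Near-realtime "by hour" reports can then read RollupHourly directly; coarser
grains (day, week, month) can be aggregated from it on the fly, or kept in
further rollup tables maintained the same way.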

-- bruce (sqlwork.com)
 

John Rivers

Or, if you don't want to get into hypercubes, you can "squash" the data from
the first table into a second table with an aggregate query that adds, at
minimum, a RecordCount column and removes as much unnecessary detail as
possible.

Then, if you need to include both tables in a query, you can create a union
query that simply adds a RecordCount of 1 to the first table so it has the
same number of columns, i.e.:

select field1, field2, field3, 1 as recordcount from table1
union all
select field1, field2, field3, recordcount from table2

There is an efficient way to do the transfer: you create a transactional
stored procedure that uses a temporary table to index the primary keys of the
records to be transferred (usually based on date, say 10,000 rows every 5
minutes or whatever fits). It looks roughly like this:

-- stage the keys of the oldest detail rows to transfer
select top 10000 primarykey into #temp from table1 order by date

-- roll the staged detail up into the summary table
insert into table2 (field1, field2, field3, recordcount)
select field1, field2, field3, count(*)
from table1
where primarykey in (select primarykey from #temp)
group by field1, field2, field3

-- remove the detail rows that have just been summarized
delete from table1
where primarykey in (select primarykey from #temp)

(you will have to add the transaction and error handling bits yourself)
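
One way those bits could be wrapped, as a sketch assuming SQL Server 2005 or
later (for TRY/CATCH) and the same hypothetical table and column names as
above (the procedure name is made up):

    create procedure dbo.TransferDetailToSummary
    as
    begin
        set nocount on

        begin try
            begin transaction

            -- stage the keys of the oldest detail rows to transfer
            select top 10000 primarykey
            into #temp
            from table1
            order by date

            insert into table2 (field1, field2, field3, recordcount)
            select field1, field2, field3, count(*)
            from table1
            where primarykey in (select primarykey from #temp)
            group by field1, field2, field3

            delete from table1
            where primarykey in (select primarykey from #temp)

            commit transaction
        end try
        begin catch
            -- undo the partial transfer and surface the error to the caller
            if @@trancount > 0
                rollback transaction

            declare @msg nvarchar(2048)
            set @msg = error_message()
            raiserror(@msg, 16, 1)
        end catch
    end

On older versions you would check @@ERROR after each statement and roll back
manually instead of using TRY/CATCH.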

I used that when getting around 100,000 records per day with heaps of detail;
it compressed nicely, about 10 to 1, and my nastiest OLAP-style T-SQL queries
(12 months of data) took around 15 seconds.

You can also easily simulate OLAP's intermediate calculations with a single
summary table refreshed every day or on demand.
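
A sketch of that kind of refresh, assuming the squashed table (table2 above)
kept the datetime column from table1 (called field1 here for illustration)
and that dailysummary is the hypothetical intermediate table:

    -- rebuild the daily intermediate table nightly or on demand
    truncate table dailysummary

    insert into dailysummary (summaryday, field2, field3, recordcount)
    select dateadd(day, datediff(day, 0, field1), 0), field2, field3, sum(recordcount)
    from table2
    group by dateadd(day, datediff(day, 0, field1), 0), field2, field3

Ad hoc daily reports then read dailysummary rather than re-aggregating the
detail every time.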
 
