seeking highly efficient caches scheme for demanding engineering computing?

L

Luna Moon

seeking highly efficient caches scheme for demanding engineering computing?

HI all,

To same the time of costly function evaluation, I want to explore the
possibility of caching.

Suppose in millions of calls to the function evaluation, there are some
computations, in which only a portion of the parameters change, others
remain the same as the parameters that are used in some other function
evaluations some time before. To save time, I can group the varying
parameters into sub-groups, and cache the results for later use. This needs
a highly efficient cache and a efficient organzation and look-up. Also, the
decision of whether a group of parameters has been seen before should be
based on data trunk in binary form, instead of decimal formats, so I can
compare memory data trunks directly using memory comparison.

Does anybody have good suggestions and pointers on this approach?

Thanks!
 
V

Victor Bazarov

Luna said:
To same the time of costly function evaluation, I want to explore the
possibility of caching.

Good idea.
Suppose in millions of calls to the function evaluation, there are
some computations, in which only a portion of the parameters change,
others remain the same as the parameters that are used in some other
function evaluations some time before. To save time, I can group the
varying parameters into sub-groups, and cache the results for later
use. This needs a highly efficient cache and a efficient organzation
and look-up. Also, the decision of whether a group of parameters has
been seen before should be based on data trunk in binary form,
instead of decimal formats, so I can compare memory data trunks
directly using memory comparison.
Does anybody have good suggestions and pointers on this approach?

In some earlier posts you indicated that you spent 1+ years on the
algorithm. I won't pretend that in a few minutes reading about it
and typing my answer I can solve the problem, BUT it does sound like
your data structures would probably benefit from a redesign. Don't
take it personally, and don't dismiss it right away. It is very hard
to introduce a decent (and working) caching scheme into a very rigid
code structure (from both the data and procedural point of view) and
that's why you might want to reorganize the entire calculation to pull
the common bits out and perform them either beforehand (if guaranteed
that all of them are needed) or as you go, if you can record the fact
of them having been calculated (lazy calculation). It's a sheer
speculation on my part, no doubt, but what if you could identify the
common bits by means of the object being passed by a reference into
all those functions (thus making the whole thing OO, but slightly
differently than what you have now). It would be *like* caching,
yet you won't base it on the values of the arguments themselves, but
instead you'll build it into the algorithm...

If it all sounds awfully difficult or insane, just disregard.

V
 
B

Bob Masta

seeking highly efficient caches scheme for demanding engineering computing?

HI all,

To same the time of costly function evaluation, I want to explore the
possibility of caching.

Suppose in millions of calls to the function evaluation, there are some
computations, in which only a portion of the parameters change, others
remain the same as the parameters that are used in some other function
evaluations some time before. To save time, I can group the varying
parameters into sub-groups, and cache the results for later use. This needs
a highly efficient cache and a efficient organzation and look-up. Also, the
decision of whether a group of parameters has been seen before should be
based on data trunk in binary form, instead of decimal formats, so I can
compare memory data trunks directly using memory comparison.

Does anybody have good suggestions and pointers on this approach?

Thanks!


The problem with all caching schemes is determining whether
you have the required data in the cache, and fetching it or
updating it.

There are hardware solutions to help with CPU and disk caching, but
in your case you would be doing this all yourself. So it would
seem this is only feasible if the savings in computation time
is much more than the cost in housekeeping overhead.
Seems doubtful for most things...

Best regards,


Bob Masta

D A Q A R T A
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Scope, Spectrum, Spectrogram, Signal Generator
Science with your sound card!
 
B

Bob Masta

The problem with all caching schemes is determining whether
you have the required data in the cache, and fetching it or
updating it.

There are hardware solutions to help with CPU and disk caching, but
in your case you would be doing this all yourself. So it would
seem this is only feasible if the savings in computation time
is much more than the cost in housekeeping overhead.
Seems doubtful for most things...

On further thought, your original description is a pretty
good description of the Fast Fourier Transform. It
accomplishes the same ultimate calculation as the
Discrete Fourier Transform, but it avoids redundant
computation of sub-products. It doesn't use a specific
cache structure, but accomplishes the same result.

So you might want to apply the same sort of strategy
to your computation. Analyze the heck out of it, and
figure out where those redundant portions are, and
rearrange things to use the partial results most efficiently.
In the FFT case that means doing things in an order
different from the "obvious" DFT order. You may find
it helpful to use arrays of indices that point to what
sub-results to use when, and where to put the results.

Best regards,




Bob Masta

D A Q A R T A
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Scope, Spectrum, Spectrogram, Signal Generator
Science with your sound card!
 
A

Alan Woodland

Luna said:
To same the time of costly function evaluation, I want to explore the
possibility of caching.

Suppose in millions of calls to the function evaluation, there are some
computations, in which only a portion of the parameters change, others
remain the same as the parameters that are used in some other function
evaluations some time before. To save time, I can group the varying
parameters into sub-groups, and cache the results for later use. This needs
a highly efficient cache and a efficient organzation and look-up. Also, the
decision of whether a group of parameters has been seen before should be
based on data trunk in binary form, instead of decimal formats, so I can
compare memory data trunks directly using memory comparison.

Does anybody have good suggestions and pointers on this approach?
Caching the results of calculations is always a trade-off between
several things. If I had a computer with infinite memory I'd pre-compute
the result of every function and then never actually calculate anything
at runtime. In the real world though you find that you can't cache
absolutely everything, and therefore it's best to cache only "the most
worthwhile" results, where "worthwhile" is usually some function of
frequency of hits and cost to calculate.

The other big thing to bear in mind is the cost of accessing the cache -
if the function is really cheap to compute (and may get inlined
automatically by any half decent optimising compiler if it makes sense
on your platform to do so) then there's no point adding a complicated
caching mechanism that will never be inlined and requires you getting a
global lock that's being contended for by all your threads. And you
could still find that on your architecture a (CPU) cache miss which
results in a fetch from RAM or worse still swap will be more expensive
than the function you're trying to cache.

It's been said a million times before, but profiling really is the only
sane way to start optimizing things.

I believe hash_map either is shortly will be standardised, with a struct
for that enables you to use your function parameters as a key with some
kind of added mask, a suitable hashing function, and some kind of RW
locking mechanism for MT support it should be possible to do what you want.

Alan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,776
Messages
2,569,603
Members
45,189
Latest member
CryptoTaxSoftware

Latest Threads

Top