Using C++ template metaprogramming for high performance number crunching?

Discussion in 'C++' started by Ted, Sep 21, 2007.

  1. Ted

    Ted Guest

    I have cross-posted this to comp.lang.c++ and to sci.math.num-analysis
    in the belief that the topic is of interest to some in both groups.

    I am building my toolkit in support of my efforts to produce
    high-performance C++ code; code that is provably correct. The issue
    here is orthogonal to the question of expression templates, which at
    present seem to me to be relatively simple.


    I began with the dimensional analysis as described in Abrahams and
    Gurtovoy's book on template metaprogramming. From there, I developed a
    template class hierarchy specific to simulation that avoids
    unnecessary copying of data during a simulation. All works fine, and
    my simulation templates interact with the dimensional analysis
    templates perfectly. The cost of using my templates is about the same
    as using the corresponding PODs. At some point, if there is interest,
    I can provide a little zip archive showing what I have done so far.
    Just tell me how you want to receive it if you want to see some
    trivially simple code.
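
    To give a flavour of the dimensional-analysis part, here is a greatly
    simplified sketch of the general idea from Abrahams and Gurtovoy (not
    my actual code): the dimension exponents live in the type, so
    dimensionally wrong expressions fail to compile, while the generated
    arithmetic is the same as for bare doubles.

    #include <iostream>

    // dimension exponents (mass, length, time) carried in the type
    template<int M, int L, int T>
    struct dim {};

    template<typename T, typename Dim>
    class quantity
    {
    public:
        explicit quantity(T v) : value_(v) {}
        T value() const { return value_; }
    private:
        T value_;
    };

    // division subtracts the exponents at compile time
    template<typename T, int M1, int L1, int T1, int M2, int L2, int T2>
    quantity<T, dim<M1 - M2, L1 - L2, T1 - T2> >
    operator/(quantity<T, dim<M1, L1, T1> > const& a,
              quantity<T, dim<M2, L2, T2> > const& b)
    {
        return quantity<T, dim<M1 - M2, L1 - L2, T1 - T2> >(a.value() / b.value());
    }

    typedef quantity<double, dim<1, 0, 0> >  mass;      // kg
    typedef quantity<double, dim<0, 3, 0> >  volume;    // m^3
    typedef quantity<double, dim<1, -3, 0> > density;   // kg per m^3

    int main()
    {
        mass    m(2.0);
        volume  v(4.0);
        density rho = m / v;    // OK: the dimensions work out
        // mass oops = m / v;   // would not compile: wrong dimensions
        std::cout << rho.value() << '\n';
        return 0;
    }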


    Anyway, the problem I am wrestling with, so far unsuccessfully,
    relates to the question of units of measurement, and especially how to
    avoid what ought to be unnecessary flops during the simulation.
    Suppose I have a volume variable measured in cubic meters, a density
    variable expressed in ounces per cubic centimeter, and a mass
    expressed in kilograms. Contrived though this example obviously is, it
    is simple enough to illustrate what I need to do. Templates adequate
    from the perspective of dimensional analysis, as described in Abrahams
    and Gurtovoy's book, would allow me to assign a quotient involving my
    mass and volume variables to the density variable. Just as obviously,
    such an assignment, which ignores the units of measurement, is likely
    to be numerically wrong in most cases.


    What I want to do, ideally, is to do a little extra metaprogramming to
    provide for implicit conversions, both for situations where I have one
    length in miles and another in kilometers and need to convert between
    the two, and for composite variables (e.g. converting between one
    density expressed in ounces per cubic foot and another expressed in
    kilograms per cubic meter), while correctly rejecting obviously wrong
    conversions, e.g. between a length in feet and a mass in kilograms.
    And since all valid conversions are known a priori, well before any
    attempt to compile the code, one would want the conversions to happen,
    ideally, at compile time, or at worst before any simulation begins. If
    a function is iterated a million times and a particular conversion of
    a constant used in that function costs several flops, the simulation
    wastes millions of flops unless I find a way to automate the
    conversions so that they happen either at compile time or, at a
    minimum, before the simulation begins.


    I am not sure I can get what I want, because most of the conversions I
    am after require floating-point numbers, and IIRC those cannot be used
    as template arguments.
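
    One workaround for that restriction, sketched below purely for
    illustration (the names are made up), is to carry each conversion
    factor as a compile-time ratio of integers rather than as a
    floating-point value; whether something like this scales to composite
    units is exactly what I am unsure about.

    // illustrative sketch only: encode a conversion factor as an integer
    // ratio, so no floating-point template argument is needed
    template<long Num, long Den>
    struct scale
    {
        static double value() { return static_cast<double>(Num) / Den; }
    };

    // a length tagged, at compile time, with its scale relative to the metre
    template<typename Scale>
    struct length
    {
        explicit length(double v) : value(v) {}
        double value;
    };

    typedef length<scale<1, 1> >        metres;
    typedef length<scale<254, 10000> >  inches;   // 1 in = 254/10000 m

    // the factor is fixed by the two types, so a decent optimizer can fold
    // it into a single multiply (or drop it when the scales are identical)
    template<typename To, typename From>
    length<To> convert(length<From> const& x)
    {
        return length<To>(x.value * (From::value() / To::value()));
    }

    Whether the division of the two factors is always folded away at
    compile time, and how to generalize this to products and quotients of
    units, is where I keep getting stuck.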


    While one might argue that this is a waste of time, since one can
    always do the conversions manually while writing the code, failing to
    find a way to automate them results in code that seems to me
    unnecessarily rigid. Imagine a simulation engine developed using this,
    embedded within an application that takes equations provided by the
    user, parses them into a DLL that can then be loaded and used
    (obviously such an application would need to be distributed with a
    decent compiler able to produce a DLL, but that is another matter),
    and concomitantly takes data from the user for parameterizing his
    model. If the model involves energy, and the equations use it in
    joules, while the only source of data the user has at hand expresses
    values in calories, one imposes extra, error-prone work on the user.


    Is what I am after possible? If you think so, how would you do it? My
    own efforts, along the lines of a traits template parameter
    representing units and the operations on them, have not yet paid off.
    I find myself floundering, in part on the problem of producing a
    suitably flexible template class that provides correct implicit
    conversions while rejecting incorrect ones. I am also having trouble
    figuring out the partial specializations required. I am now not
    convinced I am even on the right track for solving this particular
    problem. It is frustrating to have achieved everything else I wanted
    in my library intended to support high-performance simulation, and not
    to see clearly the solution to this problem. :-(


    Any ideas?


    Ted
    Ted, Sep 21, 2007
    #1

  2. Ted

    Guest

    Ted,

    You might want to take a look at the Boost.Units library (written by
    myself and Steven Watanabe). It is currently available on the Boost
    sandbox SVN repository (http://svn.boost.org/svn/boost/sandbox/). The
    problem of zero-overhead compile time units is both very interesting
    and surprisingly subtle and complex. We have come up with an
    implementation that is significantly more flexible than, e.g. the
    outline given in Gurtovoy & Abrahams' book, allowing definition and
    mixing of essentially arbitrary units of measure (assuming that they
    obey the conventional rules of dimensional analysis). Our
    implementation allows fine-grained control on a per base unit level
    over implicit conversion, and performs unit conversions at compile
    time. The code is quite demanding of compilers, so you should have a
    recent one (gcc 4/VC8) if you are hoping to achieve zero-overhead.
    Documentation is not complete, but there is an extensive set of
    examples. You might also want to follow the threads on the Boost
    mailing list, especially the discussion during the formal review of
    the library.
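
    To give a quick flavour of the syntax (N.B. not tested):

    using namespace boost::units;

    // a quantity carries its unit in the type, so dimensional analysis
    // happens at compile time
    quantity<double,SI::length> a(2.0*SI::meters);
    quantity<double,SI::length> b(30.0*CGS::centimeters);  // explicit cm -> m

    quantity<double,SI::length> c = a + b;  // fine: both are SI lengths
    // double d = a;   // would not compile: no implicit conversion to a raw double
    // a = 2.0;        // would not compile: a bare number has no units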

    Regards,

    Matthias
    , Sep 22, 2007
    #2

  3. Ted

    Ted Guest

    On Sep 21, 7:21 pm, wrote:
    > Ted,
    >
    > You might want to take a look at the Boost.Units library (written by
    > myself and Steven Watanabe). It is currently available on the Boost
    > sandbox SVN repository (http://svn.boost.org/svn/boost/sandbox/). The
    > problem of zero-overhead compile time units is both very interesting
    > and surprisingly subtle and complex. We have come up with an
    > implementation that is significantly more flexible than, e.g. the
    > outline given in Gurtovoy & Abrahams' book, allowing definition and
    > mixing of essentially arbitrary units of measure (assuming that they
    > obey the conventional rules of dimensional analysis). Our
    > implementation allows fine-grained control on a per base unit level
    > over implicit conversion, and performs unit conversions at compile
    > time. The code is quite demanding of compilers, so you should have a
    > recent one (gcc 4/VC8) if you are hoping to achieve zero-overhead.
    > Documentation is not complete, but there is an extensive set of
    > examples. You might also want to follow the threads on the Boost
    > mailing list, especially the discussion during the formal review of
    > the library.
    >
    > Regards,
    >
    > Matthias


    Thanks Matthias,

    Is there a simple way to download your library, or do I have to do it
    a file at a time?

    BTW: I have both gcc 4.2.1 and VC8.

    Thanks again,

    Ted
    Ted, Sep 22, 2007
    #3
  4. Re: Using C++ template metaprogramming for high performance number crunching?

    Ted wrote:
    ....
    > Is there a simple way to download your library, or do I have to do it
    > a file at a time?


    You can:

    a) Use Subversion (TortoiseSVN).
    b) Use a DAV client (WinXP has one).
    Gianni Mariani, Sep 22, 2007
    #4
  5. Ted

    Ted Guest

    On Sep 21, 7:21 pm, wrote:
    > Ted,
    >
    > You might want to take a look at the Boost.Units library (written by
    > myself and Steven Watanabe). It is currently available on the Boost
    > sandbox SVN repository (http://svn.boost.org/svn/boost/sandbox/). The
    > problem of zero-overhead compile time units is both very interesting
    > and surprisingly subtle and complex. We have come up with an
    > implementation that is significantly more flexible than, e.g. the
    > outline given in Gurtovoy & Abrahams' book, allowing definition and
    > mixing of essentially arbitrary units of measure (assuming that they
    > obey the conventional rules of dimensional analysis). Our
    > implementation allows fine-grained control on a per base unit level
    > over implicit conversion, and performs unit conversions at compile
    > time. The code is quite demanding of compilers, so you should have a
    > recent one (gcc 4/VC8) if you are hoping to achieve zero-overhead.
    > Documentation is not complete, but there is an extensive set of
    > examples. You might also want to follow the threads on the Boost
    > mailing list, especially the discussion during the formal review of
    > the library.
    >
    > Regards,
    >
    > Matthias


    Thanks, Matthias.

    I now have it, and will be working through it. It appears, though, to
    be quite large and complex. You clearly put a lot of time and effort
    into it. Any chance of getting a quick and dirty explanation of your
    design rationale?

    I find it quite interesting that you mention in your message the
    "problem of zero-overhead compile time". I can understand why, but I
    am much more concerned with the question of run time overhead. I have
    worked on simulation problems, related to contaminant transport, in
    which the original code could take a day or more to complete, and by
    the time I had reimplemented the simulation model, and verified it to
    be correct, the same simulation would run to completion in a matter of
    minutes. A contaminant transport model might consist of a system of
    rate equations combining the influence of half a dozen state
    variables, or more, each measured in different units. Of course, the
    model parameters would be constructed in a way that allows you to
    include, e.g., a temperature in an equation used to calculate a change
    in concentration. If that is badly done, one could waste countless
    mflops over the course of a simulation. The cost of that is of much
    greater concern to me than the chance I might have to wait an extra
    ten minutes for a compile to complete.

    I see you support computing conversions at compile time. I am
    wondering how flexible that is. For example, I may find myself in a
    situation where the conversion needed is not known until run time, so
    I may need a library structured so that I can call for the conversion
    early during a run, to set up a parameter to be used for a simulation,
    but still have the simulation run as fast as it would if I specified
    everything at compile time. For example, have the simulation hard
    coded to run using MKS, but allow the user to provide input in cgs or
    even imperial units, and, after the simulation is over, get the output
    in the units the user favours (or specifically requests). Do you see
    this as being practicable using your units library?

    Thanks again.

    Ted
    Ted, Sep 23, 2007
    #5
  6. Ted

    Guest

    > > Ted,
    >
    > > You might want to take a look at the Boost.Units library (written by
    > > myself and Steven Watanabe). It is currently available on the Boost
    > > sandbox SVN repository (http://svn.boost.org/svn/boost/sandbox/). The
    > > problem of zero-overhead compile time units is both very interesting
    > > and surprisingly subtle and complex. We have come up with an
    > > implementation that is significantly more flexible than, e.g. the
    > > outline given in Gurtovoy & Abrahams' book, allowing definition and
    > > mixing of essentially arbitrary units of measure (assuming that they
    > > obey the conventional rules of dimensional analysis). Our
    > > implementation allows fine-grained control on a per base unit level
    > > over implicit conversion, and performs unit conversions at compile
    > > time. The code is quite demanding of compilers, so you should have a
    > > recent one (gcc 4/VC8) if you are hoping to achieve zero-overhead.
    > > Documentation is not complete, but there is an extensive set of
    > > examples. You might also want to follow the threads on the Boost
    > > mailing list, especially the discussion during the formal review of
    > > the library.

    >
    > > Regards,

    >
    > > Matthias

    >
    > Thanks Matthias
    >
    > I now have it, and will be working through it. It appears, though, to
    > be quite large and complex. You clearly put a lot of time and effort
    > into it. Any chance of getting a quick and dirty explanation of your
    > design rationale?


    The basic ideas guiding the library design were:

    1) errors arising from use of dimensionally inconsistent units should
    be caught at compile time
    2) use of compile-time dimension checking should have zero runtime
    overhead on a good compiler
    3) the system should not mandate any specific set of units and should
    be extensible to user-defined units

    If, after looking at the examples, you have more specific questions
    about the design, feel free to contact me. I only read this newsgroup
    irregularly, so if you cc me, I should get back to you faster...

    > I find it quite interesting that you mention in your message the
    > "problem of zero-overhead compile time". I can understand why, but I
    > am much more concerned with the question of run time overhead. I have


    I probably stated things in a confusing way. What I meant was that the
    library imposes zero runtime overhead for unit analysis (assuming you
    do not allow implicit unit conversions), and performs compile-time
    checking for dimensional correctness...

    > worked on simulation problems, related to contaminant transport, in
    > which the original code could take a day or more to complete, and by
    > the time I had reimplemented the simulation model, and verified it to
    > be correct, the same simulation would run to completion in a matter of
    > minutes.


    The whole idea behind this library is to allow simulation codes to run
    as efficiently as they do with POD variables, while providing the
    added safety of checking all unit computations to verify that the
    results are dimensionally correct. So, if you took one of your codes
    and replaced the double-precision variables with the appropriate
    boost::quantity<double,unit>, you should see no performance
    degradation and gain the benefit of having your equations checked for
    dimensional correctness.
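
    Concretely (again, not tested, and the CGS::length spelling is just
    for illustration), the change is purely in the declarations:

    // before: bare PODs, nothing checks the units
    double x  = 2.0;    // metres, by convention only
    double dx = 30.0;   // centimetres -- oops, nothing complains
    x += dx;            // silently wrong by a factor of 100

    // after: the same update with quantities
    using namespace boost::units;
    quantity<double,SI::length> y(2.0*SI::meters);
    quantity<double,SI::length> dy(30.0*CGS::centimeters);  // explicit cm -> m conversion
    y += dy;            // correct, and dimension checked

    // quantity<double,CGS::length> dz(30.0*CGS::centimeters);
    // y += dz;   // would not compile: implicit SI/CGS conversion is off by default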

    > in concentration. If that is badly done, one could waste countless
    > mflops over the course of a simulation. The cost of that is of much
    > greater concern to me than the chance I might have to wait an extra
    > ten minutes for a compile to complete.


    Then Boost.Units is for you! ;^) There is quite a lot of
    metaprogramming going on behind the scenes, so codes with lots of
    different units will definitely see added compilation time. But, as I
    already pointed out, the benefit is dimension checking without runtime
    overhead.

    > I see you support computing
    > conversions at compile time. I am wondering how flexible that is.
    > For example, I may find myself in a situation where the conversion
    > needed is not known until run time, so I may need a library structured
    > so that I can call for the conversion early during a run, to set up a
    > parameter to be used for a simulation, but still have the simulation
    > run as fast as it would if I specified everything at compile time.
    > For example, have the simulation hard coded to run using MKS, but
    > allow the user to provide input in cgs or even imperial units, and,
    > after the simulation is over, get the output in the units the user
    > favours (or specifically requests). Do you see this as being
    > practicable using your units library?


    This is easy to do with Boost.Units. For example (N.B. not tested):

    boost::units::quantity<double,SI::length> get_length_from_cin()
    {
        using namespace boost::units;

        double value;
        std::string unit_string;

        std::cout << "Enter a quantity: ";
        std::cin >> value >> unit_string;

        if (unit_string == "cm")
            return quantity<double,SI::length>(value*CGS::centimeters);
        else if (unit_string == "m")
            return quantity<double,SI::length>(value*SI::meters);
        else
            throw Err("unknown input unit");
    }

    While, by default, implicit unit conversions are prohibited, explicit
    conversion is allowed, so the first constructor converts cm into m.
    Naturally, at the end you can do the reverse to get output in your
    desired units...

    Hope this helps.

    Matthias
    , Sep 24, 2007
    #6
