Algorithmic vs design complexity

Discussion in 'C++' started by Vidar Hasfjord, Mar 27, 2011.

  1. Hi all,

    In my field of work I am often faced with algorithms where the
    traversal of one data structure (or series) involves the traversal of
    a secondary structure. For example:

    // Calculate heights for bus stops along a route.
    BOOST_FOREACH (const Stop& s, stops)
    output (s.distance, profile.height (s.distance));

    Here the query "height" makes a linear traversal of the profile
    structure to calculate the height at the given distance. This leads to
    an algorithmic complexity of O(n*m) where n and m is the size of the
    structures.

    The interesting point is that very often the traversal is dependent
    (ordered) and could progress in parallel for both structures, leading
    to linear complexity O(n+m). E.g. in the example above, bus stops are
    sorted on distance.

    But parallel traversal involves complicating the design; either in
    effect reimplementing the inner traversal in-place, inverting the
    calling sequence (callback), or in some other way exchanging the
    parallel state of the traversal.

    Any advice on how to attack this problem, suitable design patterns and
    C++ idioms, would be very much appreciated.

    Regards,
    Vidar Hasfjord
    Vidar Hasfjord, Mar 27, 2011
    #1
    1. Advertising

  2. Vidar Hasfjord

    ralph Guest

    On Sun, 27 Mar 2011 08:35:54 -0700 (PDT), Vidar Hasfjord
    <> wrote:

    >Hi all,
    >
    >In my field of work I am often faced with algorithms where the
    >traversal of one data structure (or series) involves the traversal of
    >a secondary structure. For example:
    >
    > // Calculate heights for bus stops along a route.
    > BOOST_FOREACH (const Stop& s, stops)
    > output (s.distance, profile.height (s.distance));
    >
    >Here the query "height" makes a linear traversal of the profile
    >structure to calculate the height at the given distance. This leads to
    >an algorithmic complexity of O(n*m) where n and m is the size of the
    >structures.
    >
    >The interesting point is that very often the traversal is dependent
    >(ordered) and could progress in parallel for both structures, leading
    >to linear complexity O(n+m). E.g. in the example above, bus stops are
    >sorted on distance.
    >
    >But parallel traversal involves complicating the design; either in
    >effect reimplementing the inner traversal in-place, inverting the
    >calling sequence (callback), or in some other way exchanging the
    >parallel state of the traversal.
    >
    >Any advice on how to attack this problem, suitable design patterns and
    >C++ idioms, would be very much appreciated.
    >


    I'm not sure I understand what you are trying to do... Well... strike
    that. I'm positive I don't understand what you are trying to do. But
    since I've never let not knowing all the facts deter me from providing
    an opinion ...

    It appears you have some kind of View thingy that needs to render Stop
    thingies differently based on their position known by the View thingy.

    The first solution that presents itself is to have each Stop thingy
    provide a Rendering method that accepts positon information to return
    a properly scaled attribute for rendering. Then all the View thingy
    needs to do is reach into its bag of Stops and call each of their
    Rendering methods. It is as simple as that.

    If these thingies need to traverse complex data structures, or in
    anyway get involved with "exchanging the parallel state of the
    traversal" or other mystical processes, then I suggest you might have
    wonderful structures and algorithms but you don't have any first-class
    objects nor thought-out rules for working with them.

    Go back to the whiteboard.

    -ralph
    ralph, Mar 29, 2011
    #2
    1. Advertising

  3. On 27 mar, 17:35, Vidar Hasfjord <> wrote:
    > Hi all,
    >
    > In my field of work I am often faced with algorithms where the
    > traversal of one data structure (or series) involves the traversal of
    > a secondary structure. For example:
    >
    >   // Calculate heights for bus stops along a route.
    >   BOOST_FOREACH (const Stop& s, stops)
    >     output (s.distance, profile.height (s.distance));
    >
    > Here the query "height" makes a linear traversal of the profile
    > structure to calculate the height at the given distance. This leads to
    > an algorithmic complexity of O(n*m) where n and m is the size of the
    > structures.
    >
    > The interesting point is that very often the traversal is dependent
    > (ordered) and could progress in parallel for both structures, leading
    > to linear complexity O(n+m). E.g. in the example above, bus stops are
    > sorted on distance.
    >
    > But parallel traversal involves complicating the design; either in
    > effect reimplementing the inner traversal in-place, inverting the
    > calling sequence (callback), or in some other way exchanging the
    > parallel state of the traversal.
    >
    > Any advice on how to attack this problem, suitable design patterns and
    > C++ idioms, would be very much appreciated.


    You could use a proxy for computing the height that memoize the result/
    state of the previous computation and call

    ProfileHeight height(profile);
    BOOST_FOREACH (const Stop& s, stops)
    output (s.distance, height (s.distance));

    The ProfileHeight class would simply keep an iterator on its current
    position in height data with relevant associated partial sums
    (distance, heigh).

    Another solution is to have profile provides the calculation of
    distance from a sorted iterator range, written in output iterator (or
    functor):

    template<class TFwdItInput, class TOutputIt>
    TOutputIt Profile::height(TFwdItInput it, TFwdItInput end, TOutputIt
    out )
    {
    for( ; it != end ; ++it )
    {
    // compute height for it keeping context for next iteration
    *out++ = std::make_pair(*it,height);
    // functor: out(*it;height);
    }
    return out;
    }

    --
    Michael
    Michael Doubez, Mar 29, 2011
    #3
  4. Hi Ralph,

    On Mar 29, 3:20 am, ralph <> wrote:
    > I'm not sure I understand what you are trying to do... Well... strike
    > that. I'm positive I don't understand what you are trying to do. But
    > since I've never let not knowing all the facts deter me from providing
    > an opinion ...


    :)

    Thank you for your reply; it is very kind of you to take the time,
    despite my failure to clearly put across the problem. I was concious
    of making my post too long-winded and complex, so in my attempt to
    simplify, I probably made it unclear and too abstract.

    > It appears you have some kind of View thingy that needs to render Stop
    > thingies differently based on their position known by the View thingy.


    Actually, the bus-stop example is just theoretical. My field involves
    similar things though; such as calculating road centre-lines on the
    map, longitudinal sections (peaks and troughs), cross-sections,
    earthwork and that kind of stuff. So it involves a lot of work with
    geometrical objects and queries on these objects; such as the height
    of a point on a profile given the horizontal distance along it. To do
    various calculations I write code that combine these queries, and a
    lot of the time the functions combine in the way described in my
    original post.

    > If these thingies need to traverse complex data structures, or in
    > anyway get involved with "exchanging the parallel state of the
    > traversal" or other mystical processes, then I suggest you might have
    > wonderful structures and algorithms but you don't have any first-class
    > objects nor thought-out rules for working with them.


    The thing is that I have very nicely encapsulated object-oriented
    structures with simple implementations of key queries; such as the
    height of a profile. The problem is one of efficiency, and it occurs
    when one function runs in a loop (n) where each iteration makes a
    query that itself needs to loop (m) to calculate the result. The
    result is a quadratic number of iterations (n*m).

    But, sometimes there is an opportunity for the loops to progress in
    parallel (along distance in my example), thus reducing the number of
    iterations drastically (n+m). The problem is: my nice encapsulated
    design doesn't support that, and it is unclear what the cleanest
    solution to the problem is. In my bus-stop example, the brute-force
    solution would be to re-implement the height function within the bus-
    stop loop, keeping track of where we are on the profile along the way.
    This totally breaks the encapsulation of the profile height
    calculation.

    To make things more concrete, I've included a complete code example
    below based on a simpler scenario. Note the messy implementation of
    the optimized parallel traversal. The aim is to somehow structure the
    code so that the query remains encapsulated and usage remains simple
    and clean; while running just as fast.

    > Go back to the whiteboard.


    In spirit, that has been my location throughout. :)

    My aim is to rewrite and improve a lot of my code, so thanks again for
    your reply. It has forced me to more clearly formulate the problem,
    and it has helped me better characterize it and recognize when it
    applies. In fact, a better subject title might be "Optimizing ordered
    queries" or something like that.

    Here is an attempt at a more concise characterization of the problem:

    * Function A makes a series of calls (queries) to function B.
    * The queries are ordered in terms of a query argument x.
    * B iterates in the same domain and order as x.

    The above characteristics indicate that the code can be optimized by
    using a single traversal over the domain of the query argument and
    keeping parallel track of the state of A and B.

    Here's the code example:

    <code>
    #include <iostream>
    #include <vector>
    #include <algorithm>
    #include <numeric>
    using namespace std;

    struct Floor
    {
    int height;
    };

    struct Building
    {
    typedef vector <Floor> Floors;
    Floors floors;

    int floor (int height)
    {
    int hf = 0;
    const int nf = floors.size ();
    for (int i = 0; i != nf; ++i)
    {
    hf += floors .height;
    if (height < hf) return i;
    }
    return nf;
    }
    };

    struct Object
    {
    int height;
    };

    bool less_height (const Object& a, const Object& b)
    {
    return a.height < b.height;
    }

    void output (int height, int floor)
    {
    cout << "object height: " << height
    << ", floor: " << floor << "\n";
    }

    int main ()
    {
    Building building;
    const int nf = 6;
    Floor floors [nf] = {5, 4, 4, 4, 3, 3};
    building.floors.assign (floors, floors + nf);

    const int no = 5;
    Object objects [no] = {21, 7, 14, 11, 8};

    cout << "Unordered traversal:\n\n";
    for (int i = 0; i != no; ++i)
    {
    int h = objects .height;
    output (h, building.floor (h));
    }

    // Now let's sort the objects on height.

    sort (objects, objects + no, &less_height);
    cout << "\nOrdered traversal:\n\n";
    for (int i = 0; i != no; ++i)
    {
    int h = objects .height;
    output (h, building.floor (h));
    }

    // Keeping the objects sorted opens an opportunity
    // to optimize algorithms that traverse them.

    cout << "\nParallel traversal:\n\n";
    {
    int hf = 0; // absolute height of floor
    int io = 0; // object index
    for (int i = 0; i != nf; ++i)
    {
    hf += floors .height;
    for (; io != no; ++io)
    {
    int h = objects [io].height;
    if (h > hf) break;
    output (h, i);
    }
    }
    }

    cin.get (); // pause
    return 0;
    }
    </code>

    Regards,
    Vidar Hasfjord
    Vidar Hasfjord, Mar 30, 2011
    #4
  5. Hi Michael,

    On Mar 29, 9:20 am, Michael Doubez <> wrote:
    > You could use a proxy for computing the height that memoize the result/
    > state of the previous computation and call
    >
    > ProfileHeight height(profile);
    > BOOST_FOREACH (const Stop& s, stops)
    > output (s.distance, height (s.distance));


    Nice. This is the solution I'm leaning towards. The most important
    characteristics are encapsulation and scalability. It scores better in
    the latter respect compared to your alternative suggestion (iterator/
    callback). E.g. if there is a second query involved in the calculation
    which also iterates in the same domain, this solution will support it
    nicely.

    Here's an implementation for the example I posted to Ralph:

    <code>
    struct FloorQuery
    {
    Building* b;
    int floor;
    int height;

    FloorQuery (Building& br)
    : b (&br), floor (0), height (0)
    {height += b->floors [0].height;}

    int operator () (int h)
    {
    while (h > height)
    {
    ++floor;
    height = (floor == b->floors.size ()) ?
    numeric_limits <int>::max () :
    height + b->floors [floor].height;
    }
    return floor;
    }
    };

    cout << "\nParallel traversal 2:\n\n";
    {
    FloorQuery get_floor (building);
    for (int i = 0; i != no; ++i)
    {
    int h = objects .height;
    output (h, get_floor (h));
    }
    }
    </code>

    > Another solution is to have profile provides the calculation of
    > distance from a sorted iterator range, written in output iterator (or
    > functor):


    You mean "calculation of height for a sorted sequence of distances
    given by an iterator range", I presume. It is an interesting
    alternative, but it seems less scalable than your previous suggestion.
    For example, what if our function involved querying both the height
    and curvature of the profile? You could use result vectors to hold the
    results of the queries and then compute on those, but this requires
    extra memory and seems less elegant than your previous suggestion. The
    callback alternative would not scale at all, as far as I can see,
    although the outer function could be implemented as a functor and
    passed to a generic algorithm such as for_each. For example,

    for_each(objects, objects + no, CalcFloor (building));

    Thanks for your excellent suggestions!

    Regards,
    Vidar Hasfjord
    Vidar Hasfjord, Mar 30, 2011
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Xah Lee

    algorithmic mathematical art

    Xah Lee, May 7, 2006, in forum: Java
    Replies:
    4
    Views:
    1,128
    Alex Hunsley
    May 7, 2006
  2. Generic Usenet Account

    Algorithmic complexity & STL

    Generic Usenet Account, Apr 11, 2005, in forum: C++
    Replies:
    3
    Views:
    4,937
    Mark P
    Apr 11, 2005
  3. Roy Smith

    Algorithmic complexity of StringIO?

    Roy Smith, Sep 13, 2003, in forum: Python
    Replies:
    2
    Views:
    335
    anton muhin
    Sep 13, 2003
  4. Roy Smith

    Algorithmic complexity of len (list)?

    Roy Smith, Jul 3, 2004, in forum: Python
    Replies:
    8
    Views:
    337
    Leif K-Brooks
    Jul 6, 2004
  5. Xah Lee

    algorithmic mathematical art

    Xah Lee, May 7, 2006, in forum: Python
    Replies:
    4
    Views:
    317
    Alex Hunsley
    May 7, 2006
Loading...

Share This Page