Profiler class, suggestions and comments welcome

F

Francesco

Hi everybody,
with this very post I'm starting my first thread into this NG.

I know that my behavior so far has been a bit uncommon. It's uncommon
to drop into a technical NG and start giving comments and advices
before asking for comments and advices in first place - even more if
the person doing it is a hobbyist like me.

I hope you'll be so kind to step over the above mentioned facts and
comment my code without any kind of prejudgment.

So then, here is the point.

I lack the knowledge to properly use debuggers and profilers that come
with Code::Blocks. Actually, I'm not even sure about whether it
includes a profiler... never mind.

Hence, I usually debug my code by printing out variable values to the
console and the alike.

About measuring the performance of my code, I used functions like
"StartTimer(string timername)" and "StopTimer(string timername)".

After a short while I got tired of calling such functions at the begin
and at the end of the functions I was profiling - along with having to
recall the string IDs or creating local strings to avoid recalling
them.

Hence I created a profiler class that automatically logs the start/
stop times upon construction/destruction.

Here is my profiler class. A couple of questions right after it:

-------
// combining header and implementation for easy copy&paste

// header (double-inclusion check snipped out)
#include <string>

namespace profiler {
class Profile {
public:
Profile(std::string = "");
~Profile();
private:
std::string id;
};
std::string report();
}

// implementation
#include <ctime>
#include <map>
#include <sstream>
// profiler's header inclusion snipped

struct time_pair {
clock_t start;
clock_t stop;
};

typedef std::map<std::string, time_pair> time_map_type;

time_map_type time_map;

profiler::profile::profile(std::string s) {
id = s;
clock_t& rct = time_map[id].start;
rct = clock();
}

profiler::profile::~Profile() {
clock_t ct = clock();
time_map[id].stop = ct;
}

std::string profiler::report() {
time_map_type::const_iterator iter;
std::stringstream res;
for(iter = time_map.begin(); iter != time_map.end(); ++iter) {
res << iter->first << "\n" << "Start: ";
res << iter->second.start << ", stop: ";
res << iter->second.stop << ", total: ";
res << (iter->second.stop - iter->second.start) << "\n";
}
return res.str();
}
-------

I use it in this way:

-------
void f() {
profiler::profile p("f()");
// ... code to be profiled ...
}
-------

And here are the questions:

- Why must I set "~Profile()" as public? (apart from the fact that the
compiler refuses to compile it if I set it to private or protected)
Shouldn't it be something to be left out from the public interface so
that users can not mess with it? <--- academic question, I don't have
such issues (unfortunately...)

- Notice how I used an additional variable and a reference to avoid
including the map lookup time into the logged time, in the ctor and
the dtor. Is this a smart move or is it just paranoia?

I don't have any other specific question about this very simple piece
of code, nonetheless I'll be happy to read your comments and your
advices on any subject - naming style included.

Some time later I'll ask for your advice about something more
interesting (more interesting to me: a simulation of ants harvesting
food around their nest). I'm not so sure, though. It will involve
posting the SDL interface code, asking questions about generic OOP and
about ants' behavior, hence I'm not sure this will be the right place
to post it - assuming I'll be ever brave enough to show my
"abomination" to the world ;-)

Thank you for your attention,
best regards,
Francesco
 
V

Victor Bazarov

Francesco said:
Hi everybody,
with this very post I'm starting my first thread into this NG.

I know that my behavior so far has been [..]

If you didn't write this, nobody would notice. Everybody here couldn't
care less about your behavior unless it's against the spirit of the
forum or the rules of netiquette.
So then, here is the point.

I lack the knowledge to properly use debuggers and profilers that come
with Code::Blocks.

Don't worry. I don't even know what "Code::Blocks" is.
> Actually, I'm not even sure about whether it
includes a profiler... never mind.

Hence, I usually debug my code by printing out variable values to the
console and the alike.

About measuring the performance of my code, I used functions like
"StartTimer(string timername)" and "StopTimer(string timername)".

After a short while I got tired of calling such functions at the begin
and at the end of the functions I was profiling - along with having to
recall the string IDs or creating local strings to avoid recalling
them.

Hence I created a profiler class that automatically logs the start/
stop times upon construction/destruction.

Here is my profiler class. A couple of questions right after it:

-------
// combining header and implementation for easy copy&paste

// header (double-inclusion check snipped out)
#include <string>

namespace profiler {
class Profile {
public:
Profile(std::string = "");

For the sake of performance (and with a profiler it's kind of important,
wouldn't you say?) you should *never* pass objects of class type by
value. *Always* prefer passing by a const ref, unless there is a strong
reason not to.
~Profile();
private:
std::string id;

Since your individual object survives only until the end of the scope in
which it was created, consider storing the start clock value in it,
instead of in a separate table. Then in the destructor calculate the
difference and store it in the table (also see below for more
recommendations).
};
std::string report();
}

// implementation
#include <ctime>
#include <map>
#include <sstream>
// profiler's header inclusion snipped

struct time_pair {
clock_t start;
clock_t stop;
};

typedef std::map<std::string, time_pair> time_map_type;

time_map_type time_map;

profiler::profile::profile(std::string s) {
id = s;

*Always* prefer initialisation (using the initialiser list) over
assignment. There is a FAQ about that. Have you actually read the FAQ?
clock_t& rct = time_map[id].start;
rct = clock();
}

So, your function here could be written as


profiler::profile::profile(std::string const & s)
: id(s)
{
time_map[id].start = clock();
}
profiler::profile::~Profile() {
clock_t ct = clock();
time_map[id].stop = ct;

Since you don't really know in what order the operands of an assignment
expression are going to be evaluated, it is probably a good idea to
enforce the order like you did here.
}

std::string profiler::report() {
time_map_type::const_iterator iter;
std::stringstream res;
for(iter = time_map.begin(); iter != time_map.end(); ++iter) {
res << iter->first << "\n" << "Start: ";
res << iter->second.start << ", stop: ";
res << iter->second.stop << ", total: ";
res << (iter->second.stop - iter->second.start) << "\n";
}
return res.str();
}
-------

I use it in this way:

-------
void f() {
profiler::profile p("f()");
// ... code to be profiled ...
}
-------

And here are the questions:

- Why must I set "~Profile()" as public?

Because a local object 'p' is created *AND* destroyed from a scope that
has no special access to your 'Profile' class.
> (apart from the fact that the
compiler refuses to compile it if I set it to private or protected)
Shouldn't it be something to be left out from the public interface so
that users can not mess with it? <--- academic question, I don't have
such issues (unfortunately...)

Destructors aren't usually called explicitly. It's object creation and
their disposal that are controlled by the user.
- Notice how I used an additional variable and a reference to avoid
including the map lookup time into the logged time, in the ctor and
the dtor. Is this a smart move or is it just paranoia?

Pat yourself on the back. After you have stood in the corner for not
passing arguments by const ref.
I don't have any other specific question about this very simple piece
of code, nonetheless I'll be happy to read your comments and your
advices on any subject - naming style included.

See above.

Also, consider that the results of profiling of any particular function
are overridden by a subsequent calls, so you might consider improving
your system to (a) count the number of calls to that particular function
and (b) accumulate the overall time spent in it (instead of recording
start-stop ticks):

struct timing_result {
int hit_count;
clock_t total_time_spent;
};
Some time later I'll ask for your advice about something more
interesting (more interesting to me: a simulation of ants harvesting
food around their nest). I'm not so sure, though. It will involve
posting the SDL interface code, asking questions about generic OOP and
about ants' behavior, hence I'm not sure this will be the right place
to post it - assuming I'll be ever brave enough to show my
"abomination" to the world ;-)

Remember that we often do not have time to delve into the intricacies of
your model domain. If you have a language problem or a question, we're
happy to help. Modelling and OOA/OOD are discussed in comp.object.

V
 
F

Francesco

Francesco said:
Hi everybody,
with this very post I'm starting my first thread into this NG.
I know that my behavior so far has been [..]

If you didn't write this, nobody would notice. Everybody here couldn't
care less about your behavior unless it's against the spirit of the
forum or the rules of netiquette.

Uh... well, OK, I'll stop worrying about such things. Now I was about
to explain why I did it, but it's pretty obvious and nobody would
care, again. Sort of a good thing to know, I must remark ;-)

[snip]
For the sake of performance (and with a profiler it's kind of important,
wouldn't you say?) you should *never* pass objects of class type by
value. *Always* prefer passing by a const ref, unless there is a strong
reason not to.

You're right, of course performance is important, even more in this
particular case - self-slap on the forehead :-/

[snip]
*Always* prefer initialisation (using the initialiser list) over
assignment. There is a FAQ about that. Have you actually read the FAQ?

Yes, I read it. Actually, my intention here was to hide the fact of
using "clock()" and "clock_t" by not including <ctime> into the
profiler's header. Now I realize that hiding that implementation
detail is way less important than writing the constructor in the most
efficient way.
clock_t& rct = time_map[id].start;
rct = clock();
}

So, your function here could be written as

profiler::profile::profile(std::string const & s)
: id(s)
{
time_map[id].start = clock();
}

I've followed the suggestion and extended it a bit, see the new code
below.

[snip]
Because a local object 'p' is created *AND* destroyed from a scope that
has no special access to your 'Profile' class.

Yes, I've realized it just after posting the "how to ensure dynamic
allocation" question elsewhere and after reading Alf Steinbach's
response. But your details helped me refining further this new thing
I'm learning.

[snip]
Pat yourself on the back. After you have stood in the corner for not
passing arguments by const ref.

Eheheheh, thank you very much for your "slaps" - luckily, this wasn't
a real-life code review ;-)

[snip]
Also, consider that the results of profiling of any particular function
are overridden by a subsequent calls, so you might consider improving
your system to (a) count the number of calls to that particular function
and (b) accumulate the overall time spent in it (instead of recording
start-stop ticks):

struct timing_result {
int hit_count;
clock_t total_time_spent;
};

This is really an important thing I missed, thank you very much for
pointing it out.
Remember that we often do not have time to delve into the intricacies of
your model domain. If you have a language problem or a question, we're
happy to help. Modelling and OOA/OOD are discussed in comp.object.

I guessed right, thanks for the pointers and finally thank you very
much for your quick and constructive reply, Victor.

Here is the improved version, applying all of your fixes and
suggestions:

-------
// header (double-inclusion check snipped out)
#include <ctime>
#include <string>

namespace profiler {
class Profile {
public:
Profile(const std::string& sid)
: id(sid), start_time(clock()) {}
~Profile();
private:
std::string id;
clock_t start_time;
};
std::string report();
}

// implementation
#include <map>
#include <sstream>
// profiler header inclusion snipped

namespace {
struct timing_record {
clock_t used_time;
int hits;
timing_record() : used_time(0), hits(0) {}
};
typedef std::map<std::string, timing_record> timing_map_type;
timing_map_type t_map;
}

profiler::profile::~Profile() {
clock_t ct = clock() - start_time;
timing_record& tr = t_map[id];
tr.used_time += ct;
++tr.hits;
}

std::string profiler::report() {
timing_map_type::const_iterator iter;
std::stringstream res;
for(iter = t_map.begin(); iter != t_map.end(); ++iter) {
res << iter->first << "\n";
const timing_record& tr = iter->second;
res << "Calls: " << tr.hits << ", ";
res << "avg time: " << tr.used_time / tr.hits << "\n";
}
return res.str();
}
-------

Any further comments and fixes - and slaps ;-) - welcome.

Thank you for your attention,
Francesco
 
V

Victor Bazarov

Francesco said:
[..]
std::string profiler::report() {
timing_map_type::const_iterator iter;
std::stringstream res;
for(iter = t_map.begin(); iter != t_map.end(); ++iter) {
res << iter->first << "\n";
const timing_record& tr = iter->second;
res << "Calls: " << tr.hits << ", ";
res << "avg time: " << tr.used_time / tr.hits << "\n";
}
return res.str();
}
-------

A style nitpick. Since 'iter' is not used after the loop, I would
declare it in the 'for' statement itself (so that the scope is not
polluted by the variables that aren't used). And since 't_map' is not
changing during this, a *potential* performance improvement might be to
get the 'end' once, like so:

std::eek:stringstream res;
for (timing_map_type::const_iterator iter = t_map.begin(),
e = t_map.end();
iter != e;
++iter) {
....
}

(you don't have to keep linebreaks like that, of course).

That actually comes from the usual way of iterating over a vector.
Instead of

for (int i = 0; i < vec.size(); ++i)

do

for (int i = 0, s = vec.size(); i < s; ++i)

(Disclaimer: yes, I know of signed-unsigned mismatch, that's not the
point here)

For some reason even when the call to 'size' isn't expensive, our
environment (and my measurements) have showed that if I avoid calling a
function too many times, I can shave off sometimes quite a few
milliseconds off some algorithm. Just something to consider...

V
 
F

Francesco

Francesco said:
[..]
std::string profiler::report() {
  timing_map_type::const_iterator iter;
  std::stringstream res;
  for(iter = t_map.begin(); iter != t_map.end(); ++iter) {
    res << iter->first << "\n";
    const timing_record& tr = iter->second;
    res << "Calls: " << tr.hits << ", ";
    res << "avg time: " << tr.used_time / tr.hits << "\n";
  }
  return res.str();
}
-------

A style nitpick.  Since 'iter' is not used after the loop, I would
declare it in the 'for' statement itself (so that the scope is not
polluted by the variables that aren't used).  And since 't_map' is not
changing during this, a *potential* performance improvement might be to
get the 'end' once, like so:

    std::eek:stringstream res;
    for (timing_map_type::const_iterator iter = t_map.begin(),
                                            e = t_map.end();
         iter != e;
         ++iter) {
      ....
    }

Good idea. I'll perform the change you suggested to my code.

I recall doing something like this somewhere else in my past coding.
But I did it declaring and setting those variables before the loop -
just like I here above. Thank you for recalling me of that
optimization.

About the rest of my code I assume "news.none() == news.good()" ;-)

Cheers,
Francesco
 
F

Francesco

Francesco said:
[..]
About the rest of my code I assume "news.none() == news.good()" ;-)

Hehehe.  Now you're entering the best phase and the worst phase: bug
finding time.  Exercise the hell out of your class to see if there is
something you might want or something that doesn't work...  The code
wasn't that extensive, and I'm not so eager to pick any more nits.
Somebody else will undoubtedly feel generous, just you wait.

Uhm. OK, just leave me some time and I'll do my best to find them by
myself. Once I'll feel I cannot really find anything, I'll let you
(all) know, then feel free to start over with the slaps.

Honestly, I like being criticized, I cannot fix the errors that I
cannot see by myself. That's of course valid for everybody, but that's
quite more valid for an isolated programmer like me.

Cheers,
Francesco
 
J

James Kanze

Hence I created a profiler class that automatically logs the
start/ stop times upon construction/destruction.
Here is my profiler class. A couple of questions right after it:

Just a note: most functions will take so little time that the
results of measuring one execution isn't reliable. And what
your class does isn't so much profiling, but benchmarking.
Which has it's own set of problems. (I'd recommend your having
a look at my BenchHarness class, but my site still isn't
working. I'd say next week-end, but I'm in the middle of
moving, which takes up a lot of time.)
// header (double-inclusion check snipped out)
#include <string>
namespace profiler {
class Profile {
public:
Profile(std::string = "");
~Profile();
private:
std::string id;
};
std::string report();
}
// implementation
#include <ctime>
#include <map>
#include <sstream>
// profiler's header inclusion snipped
struct time_pair {
clock_t start;
clock_t stop;
};
typedef std::map<std::string, time_pair> time_map_type;
time_map_type time_map;
profiler::profile::profile(std::string s) {
id = s;
clock_t& rct = time_map[id].start;
rct = clock();
}
profiler::profile::~Profile() {
clock_t ct = clock();
time_map[id].stop = ct;
}
std::string profiler::report() {
time_map_type::const_iterator iter;
std::stringstream res;
for(iter = time_map.begin(); iter != time_map.end(); ++iter) {
res << iter->first << "\n" << "Start: ";
res << iter->second.start << ", stop: ";
res << iter->second.stop << ", total: ";
res << (iter->second.stop - iter->second.start) << "\n";
}
return res.str();
}

Both time_map (and it's type) and the function report should be
static members of profiler. I'd also make the time_pair type a
member (and give it a more significant name, something like
Interval).

(And while I'm at it, your naming convention doesn't seem to
regular: some types start with capital letters, others don't.
As a general rule, it's probably a good idea to distinguish
between types and things like variables and functions in the
naming convention---starting types with a capital letter is
probably the most frequent convention. I use the same naming
convention for namespaces, since they tend to occur in similar
contexts as class names, e.g. to the left of the :: operator.)
I use it in this way:
-------
void f() {
profiler::profile p("f()");
// ... code to be profiled ...}

And here are the questions:
- Why must I set "~Profile()" as public? (apart from the fact that the
compiler refuses to compile it if I set it to private or protected)

The simple reason is because the standard says so. But it's
logical---you're calling the destructor in the non-member
function f. In general, when defining a variable with static,
automatic or temporary scope, the destructor must be accessible
at the site of the definition.
Shouldn't it be something to be left out from the public
interface so that users can not mess with it? <--- academic
question, I don't have such issues (unfortunately...)

What do you mean by: "that users cannot mess with it"? If the
user constructs an instance, they have to be able to destruct
it (with a few exceptions).
- Notice how I used an additional variable and a reference to
avoid including the map lookup time into the logged time, in
the ctor and the dtor. Is this a smart move or is it just
paranoia?

It's one solution. I'd probably have had Profile contain a
private (non-static) time_pair, and only inserted in the map in
the destructor. But given that for many smaller functions, the
lookup in the map could easily be longer than the function, you
obviously have to move that out of the interval being timed.
 
J

James Kanze

Don't worry. I don't even know what "Code::Blocks" is.

Still, learning to use a debugger and a profiler is essential.
Unless you never need them. (It's not that frequent to need a
debugger; I probably use one about twice a year. A profiler is
almost essential in any application where CPU time is
relevant---which isn't all that many, either.)
For the sake of performance (and with a profiler it's kind of
important, wouldn't you say?) you should *never* pass objects
of class type by value. *Always* prefer passing by a const
ref, unless there is a strong reason not to.

Well, his class isn't really a profiler:), and the pass by
value occurs outside of the code being benchmarked, so it
doesn't matter. But while it's probably a blatant example of
premature optimization, the usual convention is to pass class
types and (usually) unknown types (e.g. in templates) as const
references, other types by value. And as usual, deviating from
an established convention causes confusion in the reader, and
should be avoided if not necessary.
Since your individual object survives only until the end of
the scope in which it was created, consider storing the start
clock value in it, instead of in a separate table. Then in
the destructor calculate the difference and store it in the
table (also see below for more recommendations).
*Always* prefer initialisation (using the initialiser list)
over assignment. There is a FAQ about that. Have you
actually read the FAQ?
clock_t& rct = time_map[id].start;
rct = clock();
}
So, your function here could be written as
profiler::profile::profile(std::string const & s)
: id(s)
{
time_map[id].start = clock();
}

Except that, as he mentions, this might result in the time used
to lookup the value in the map being included in the measured
time. Or not, depending on the compiler, the various
optimization options, etc.

Your suggestion to store the start time in a member was, IMHO,
the right one. In particular, it probably means that he doesn't
need to store the start/stop times in the map, only the
interval.
Also, consider that the results of profiling of any particular
function are overridden by a subsequent calls, so you might
consider improving your system to (a) count the number of
calls to that particular function and (b) accumulate the
overall time spent in it (instead of recording start-stop
ticks):
struct timing_result {
int hit_count;
clock_t total_time_spent;
};

This is, I think, a very good idea. At least if he wants a
profiler, and not a benchmark harness:). (His solution
implements more of a benchmark harness. Your's really is a
profiler.)
 
J

James Kanze

For the sake of performance (and with a profiler it's kind
of important, wouldn't you say?) you should *never* pass
objects of class type by value. *Always* prefer passing by
a const ref, unless there is a strong reason not to.
[/QUOTE]
You're right, of course performance is important, even more in
this particular case

Yes and no. I suspect that the difference in performance here
would be negligible in most applications; performance typically
isn't important, at least at this level. (Not using O(n^2)
algorithms when you know you have a lot of data is important,
for example.) And even here, the copy doesn't occur in the part
of the code being measured.

Still, conventions are conventions, and the usual convension is
to pass class types by const reference, so I'd do it. (Even in
cases where it might result in a slight loss of performance,
e.g. for the copy assignment operator.)
[snip]
*Always* prefer initialisation (using the initialiser list)
over assignment. There is a FAQ about that. Have you
actually read the FAQ?
Yes, I read it. Actually, my intention here was to hide the
fact of using "clock()" and "clock_t" by not including <ctime>
into the profiler's header.

But the constructor isn't in the profiler's header.
Now I realize that hiding that implementation detail is way
less important than writing the constructor in the most
efficient way.

No. Hiding the implementation details is far more important
than performance (until actual measurements show otherwise). On
the other hand, the rules of C++ don't allow perfect hiding, and
<time.h> isn't exactly a header that's constantly changing, so
using clock_t in the class wouldn't bother me. (Using some
complicated user class might, depending on the context, and for
all but the simplest classes, I tend to adopt the compilation
firewall idiom automatically. But I don't avoid dependencies on
standard headers.)

[...]
-------
// header (double-inclusion check snipped out)
#include <ctime>
#include <string>

namespace profiler {
class Profile {
public:
Profile(const std::string& sid)
: id(sid), start_time(clock()) {}

Does this really have to be inline.
~Profile();
private:
std::string id;
clock_t start_time;

And the order of the declarations here is extremely important!
You got it right, but since it's so exceptional for the order to
be important, I'd add a comment stating why, so that a
maintenance programmer doesn't accidentally change it.
};
std::string report();
}

What happens if you copy this class? Do you want to allow that?
(IMHO, it doesn't make sense.) If not, you need to inhibit it.
// implementation
#include <map>
#include <sstream>
// profiler header inclusion snipped

The comment takes as much place as the inclusion:).

Also, it's a good idea to include the header for the class
you're implementing first, in order to ensure that it doesn't
accidentally contain other dependencies (e.g. that it only
compiles if the user has included said:
namespace {
struct timing_record {
clock_t used_time;
int hits;
timing_record() : used_time(0), hits(0) {}};
typedef std::map<std::string, timing_record> timing_map_type;
timing_map_type t_map;
}
profiler::profile::~Profile() {
clock_t ct = clock() - start_time;
timing_record& tr = t_map[id];
tr.used_time += ct;
++tr.hits;
}
std::string profiler::report() {
timing_map_type::const_iterator iter;

Never define a variable until you're ready to initialize it.
std::stringstream res;

Unless you can't avoid it:).
for(iter = t_map.begin(); iter != t_map.end(); ++iter) {

I'd have defined the iterator in the for.
res << iter->first << "\n";
const timing_record& tr = iter->second;
res << "Calls: " << tr.hits << ", ";
res << "avg time: " << tr.used_time / tr.hits << "\n";

There are two things wrong with this statement. First, you
don't scale the time by CLOCKS_PER_SEC, so you're outputting
in some implementation defined unit. (Posix requires that
CLOCKS_PER_SEC be 1000000, but I've seen other values.
Including 60 on some older systems.) The second is that you
probably want the results in a floating point type; for a very
short function with a lot of hits, the used_time might actually
be less than hits, and even in less extreme cases, the results
may not be very accurate. Try something like:

res << "avg time (ms): "
<< 1000.0 * tr.used_time / CLOCKS_PER_SEC / tr.hits
<< std::endl ;

(In general, until you really, really know what you're doing,
and can write practically error free code from the start, you
should prefer std::endl to '\n'.)
}
return res.str();}

More generally, I'd have written this function:

void Profiler::report(
std::eek:stream& dest )
{
// ...
}

It seems likely to me that most of the time, it will be called
for direct output, so you might as well format directly into the
target stream. This also gives the user the possibility of
setting various format flags (probably not very useful here, but
you never know---if you adopt floating point, precision could be
relevant). If you really do want to offer the possibility to
get a string, you can always overload with:

std::string Profiler::report()
{
std::eek:stringstream result ;
report( result ) ;
return result.str() ;
}
 
J

James Kanze

Francesco said:
[..]
std::string profiler::report() {
timing_map_type::const_iterator iter;
std::stringstream res;
for(iter = t_map.begin(); iter != t_map.end(); ++iter) {
res << iter->first << "\n";
const timing_record& tr = iter->second;
res << "Calls: " << tr.hits << ", ";
res << "avg time: " << tr.used_time / tr.hits << "\n";
}
return res.str();
}
-------
A style nitpick. Since 'iter' is not used after the loop, I
would declare it in the 'for' statement itself (so that the
scope is not polluted by the variables that aren't used).

And if it was used after the loop, I'd still initialize it the
definition.
And since 't_map' is not changing during this, a *potential*
performance improvement might be to get the 'end' once, like
so:
std::eek:stringstream res;
for (timing_map_type::const_iterator iter = t_map.begin(),
e = t_map.end();
iter != e;
++iter) {
....
}
(you don't have to keep linebreaks like that, of course).
That actually comes from the usual way of iterating over a vector.
Instead of
for (int i = 0; i < vec.size(); ++i)

for (int i = 0, s = vec.size(); i < s; ++i)
(Disclaimer: yes, I know of signed-unsigned mismatch, that's
not the point here)
For some reason even when the call to 'size' isn't expensive,
our environment (and my measurements) have showed that if I
avoid calling a function too many times, I can shave off
sometimes quite a few milliseconds off some algorithm. Just
something to consider...

Or not. That's really a case of premature optimization. Their
is a measurable difference (unlike the distinction ++iter and
iter++), but it's generally now worth worrying about.
Especially here, where he's outputing in the loop (which will
take so much time, any differences in the loop management will
be lost in the noise).
 
F

Francesco

Just a note: most functions will take so little time that the
results of measuring one execution isn't reliable.  And what
your class does isn't so much profiling, but benchmarking.
Which has it's own set of problems.  (I'd recommend your having
a look at my BenchHarness class, but my site still isn't
working.  I'd say next week-end, but I'm in the middle of
moving, which takes up a lot of time.)

Well, I just ignored the correct definition of profiling and
benchmarking - not a big surprise, I suppose.

I'll have a look to your website once it will be online again.

[snip]
Both time_map (and it's type) and the function report should be
static members of profiler.  I'd also make the time_pair type a
member (and give it a more significant name, something like
Interval).

Unless there is a good reason for doing otherwise, I'd prefer leaving
the report function separate, as well I'd like to keep the map detail
hidden in the module (the first version actually added names to the
global namespace, I've solved this using the anonymous namespace
inside the implementation file, in the second version).
(And while I'm at it, your naming convention doesn't seem to
regular: some types start with capital letters, others don't.
As a general rule, it's probably a good idea to distinguish
between types and things like variables and functions in the
naming convention---starting types with a capital letter is
probably the most frequent convention.   I use the same naming
convention for namespaces, since they tend to occur in similar
contexts as class names, e.g. to the left of the :: operator.)

This is something I must really get coherent. I fear that the third
version I'm about to post later will suffer of the same problem - as a
minimum.

[ already resolved issues snipped ]

Thanks a lot,
Francesco
 
F

Francesco

Yes and no.  I suspect that the difference in performance here
would be negligible in most applications; performance typically
isn't important, at least at this level.  (Not using O(n^2)
algorithms when you know you have a lot of data is important,
for example.)  And even here, the copy doesn't occur in the part
of the code being measured.

Yes, you're right. With "particular case" I meant "creating a
profiler" not the specific case of the Profile ctor. I suppose that I
could profile nested functions, hence I suppose it would be good to
save as many CPU cycles as possible. Sure, profiling nested functions
will lead to "inflated" results on the outermost functions, but those
results could have a use anyway - easy on my "suppositions", they're
sincere suppositions, not sarcasm
Still, conventions are conventions, and the usual convension is
to pass class types by const reference, so I'd do it.  (Even in
cases where it might result in a slight loss of performance,
e.g. for the copy assignment operator.)

Just like for the pointer arithmetic issue, I surely will take on
these good habits. Please point out as many of these as you see
fitting.
[snip]
profiler::profile::profile(std::string s) {
  id = s;
*Always* prefer initialisation (using the initialiser list)
over assignment.  There is a FAQ about that.  Have you
actually read the FAQ?
Yes, I read it. Actually, my intention here was to hide the
fact of using "clock()" and "clock_t" by not including <ctime>
into the profiler's header.

But the constructor isn't in the profiler's header.

Fine, you spotted another misconception of mine. I thought I could use
the initializer list only when defining the function inside of the
class itself.
No.  Hiding the implementation details is far more important
than performance (until actual measurements show otherwise).  On
the other hand, the rules of C++ don't allow perfect hiding, and
<time.h> isn't exactly a header that's constantly changing, so
using clock_t in the class wouldn't bother me.  (Using some
complicated user class might, depending on the context, and for
all but the simplest classes, I tend to adopt the compilation
firewall idiom automatically.  But I don't avoid dependencies on
standard headers.)

As above, I meant to refer to this very case of making a profiler.
But for the general rules I definitely agree with you.
    [...]
namespace profiler {
  class Profile {
    public:
      Profile(const std::string& sid)
        : id(sid), start_time(clock()) {}

Does this really have to be inline.

I don't know, maybe I'll try to profile it to check whether it makes
any difference.
And the order of the declarations here is extremely important!
You got it right, but since it's so exceptional for the order to
be important, I'd add a comment stating why, so that a
maintenance programmer doesn't accidentally change it.

Very good point. I'll add that comment.
What happens if you copy this class?  Do you want to allow that?
(IMHO, it doesn't make sense.)  If not, you need to inhibit it.

Good point, I will.

The comment takes as much place as the inclusion:).

Also, it's a good idea to include the header for the class
you're implementing first, in order to ensure that it doesn't
accidentally contain other dependencies (e.g. that it only
compiles if the user has included <map> first).

I never considered this, thanks a lot for pointing it out. Another
good habit to take on.
namespace {
struct timing_record {
  clock_t used_time;
  int hits;
  timing_record() : used_time(0), hits(0) {}};
typedef std::map<std::string, timing_record> timing_map_type;
timing_map_type t_map;
}
profiler::profile::~Profile() {
  clock_t ct = clock() - start_time;
  timing_record& tr = t_map[id];
  tr.used_time += ct;
  ++tr.hits;
}
std::string profiler::report() {
  timing_map_type::const_iterator iter;

Never define a variable until you're ready to initialize it.
  std::stringstream res;

Unless you can't avoid it:).
  for(iter = t_map.begin(); iter != t_map.end(); ++iter) {

I'd have defined the iterator in the for.

Good points, taken all of them after Victor's and your advices.
There are two things wrong with this statement.  First, you
don't scale the time by CLOCKS_PER_SEC, so you're outputting
in some implementation defined unit.  (Posix requires that
CLOCKS_PER_SEC be 1000000, but I've seen other values.
Including 60 on some older systems.)  The second is that you
probably want the results in a floating point type; for a very
short function with a lot of hits, the used_time might actually
be less than hits, and even in less extreme cases, the results
may not be very accurate.  Try something like:

    res << "avg time (ms): "
        << 1000.0 * tr.used_time / CLOCKS_PER_SEC / tr.hits
        << std::endl ;

Very good point, I'll take on the scaling.
(In general, until you really, really know what you're doing,
and can write practically error free code from the start, you
should prefer std::endl to '\n'.)

Sorry but I don't understand this, what do you mean exactly?
Do you mean typo-free (about the escaping) or something else related
to streams and flushing them with endl?
More generally, I'd have written this function:

    void Profiler::report(
        std::eek:stream&       dest )
    {
        //  ...
    }

It seems likely to me that most of the time, it will be called
for direct output, so you might as well format directly into the
target stream.  This also gives the user the possibility of
setting various format flags (probably not very useful here, but
you never know---if you adopt floating point, precision could be
relevant).  If you really do want to offer the possibility to
get a string, you can always overload with:

    std::string Profiler::report()
    {
        std::eek:stringstream  result ;
        report( result ) ;
        return result.str() ;
    }

Good features to consider, I'll take them on.

Thanks a lot for your in-depth advices, once more.

Francesco
 
J

Jorgen Grahn

Hi everybody,
with this very post I'm starting my first thread into this NG.

I know that my behavior so far has been a bit uncommon. It's uncommon
to drop into a technical NG and start giving comments and advices
before asking for comments and advices in first place - even more if
the person doing it is a hobbyist like me.

I don't find it uncommon at all. That's what I do; I find other
people's questions (and the discussions they start) much more
interesting than my own.

And hopefully a lot of people here are hobbyists. Healthy languages
have hobbyists. Hobbyists wrote most of the programs I use and rely on
daily (but yes, I know many key Linux developers etc get paid today).

/Jorgen
 
F

Francesco

Here is the third version of my profiler, added at the bottom of my
post.

GoogleGroups will eventually cut it and add a "read more" link, hence
I'm posting my questions here.

- Shall I start adding "throw()" to all functions - and ensure that
none of them actually throws by wrapping the code into try / catch
blocks?

- Does it make any difference to use clock_t against std::clock_t (and
the alike, using C functions incorporated into the std namespace)?

- Why do I get strange warnings about uninitialized variables about
the instantiation of the STL containers? Is my code faulty or it is
due to some gcc oddity?

One warning for all:
-------
C:/Programmi/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../
include/c++/3.4.5/bits/stl_vector.h: In member function
`std::vector<_Tp, _Alloc>& std::vector<_Tp, _Alloc>::eek:perator=(const
std::vector<_Tp, _Alloc>&) [with _Tp = profiler_utils::TimingRec,
_Alloc = std::allocator<profiler_utils::TimingRec>]':
C:/Programmi/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../
include/c++/3.4.5/bits/stl_vector.h:715: warning: '__result' might be
used uninitialized in this function
-------

As for the 2nd version, the only real issue I was able to find by
myself was about a potential division by zero, now made harmless -
although, sticking to the code I posted, wasn't really likely to
happen.

About the new version, the new features are quite straightforward to
understand, here is the code, I'll be happy to read your comments,
fixes and, once more, to receive your slaps for my bad habits.

No disclaimer. If I left something in, it means I meant to leave it in
or, otherwise, I missed to get the point, hence feel free to dissect.

Thank you very much for your attention and for your help,
best regards,
Francesco

-------
// **********************************
// profiler utility, 3rd version
// all files thrown together
// **********************************

// **********************************
// profiler_utils.h
// (double-incl checks snipped)
// **********************************

#include <iostream>
#include <string>
#include <vector>
#include <ctime>

namespace profiler_utils {

class Profiler {
public:
Profiler(const std::string& sid)
: id(sid),
start_time(std::clock()) {} // start_time must be initialized
last
~Profiler();
private:
Profiler(const Profiler&);
const Profiler& operator=(const Profiler&);
std::string id;
std::clock_t start_time; // start_time must be declared last
};

struct TimingRec {
std::string id;
double avg_time;
double total_time;
int calls_count;
TimingRec() : avg_time(0), total_time(0), calls_count(0) {}
};

inline std::eek:stream& operator<<(std::eek:stream& os, const TimingRec& r)
{
os << " Id: " << r.id << "\n";
os << " Avg time: " << r.avg_time << " msecs\n";
os << " Total time: " << r.total_time << " msecs\n";
os << "Calls count: " << r.calls_count << "\n";
return os;
}

void report(std::eek:stream&);
std::string report();
void reset();
TimingRec get_record(const std::string&);
std::vector<TimingRec> most_called();
std::vector<TimingRec> most_expensive();
std::vector<TimingRec> most_timespent();

} // end profiler namespace

// **********************************
// profiler.cpp
// **********************************

//#include "profiler_utils.h"
#include <map>
#include <sstream>

namespace {

struct timing_record {
std::clock_t total_time;
int calls_count;
timing_record(): total_time(0), calls_count(0) {}
};

inline profiler_utils::TimingRec makeTimingRec(const std::string& s,
const timing_record& r) {
profiler_utils::TimingRec res;
res.id = s;
if (r.calls_count) {
res.avg_time = 1000.0 * r.total_time / CLOCKS_PER_SEC /
r.calls_count;
} else {
res.id += " [bad record]";
}
res.total_time = 1000.0 * r.total_time / CLOCKS_PER_SEC;
res.calls_count = r.calls_count;
return res;
}

typedef std::map<std::string, timing_record> timing_map_type;
typedef timing_map_type::const_iterator t_map_iter;

// main data depot
timing_map_type t_map;

// stats indexes
std::map<double, std::vector<std::string> > avg_t_map;
std::map<std::clock_t, std::vector<std::string> > total_t_map;
std::map<int, std::vector<std::string> > calls_map;

bool must_rebuild;

void rebuild_indexes() {
if (must_rebuild) {
avg_t_map.clear();
total_t_map.clear();
calls_map.clear();
for (t_map_iter it = t_map.begin(), end = t_map.end(); it != end; +
+it) {
const std::string& id = it->first;
const timing_record& tr = it->second;
if (tr.calls_count) {
avg_t_map[double(tr.total_time) / tr.calls_count].push_back
(id);
}
total_t_map[tr.total_time].push_back(id);
calls_map[tr.calls_count].push_back(id);
}
must_rebuild = false;
}
}

void fill_vector(const std::vector<std::string>& ids,
std::vector<profiler_utils::TimingRec>* recs) {
for (size_t i = 0, end = ids.size(); i < end; ++i) {
t_map_iter it = t_map.find(ids);
if (it != t_map.end()) {
recs->push_back(makeTimingRec(it->first, it->second));
}
}
}

} // end anonymous namespace

profiler_utils::profiler::~Profiler() {
std::clock_t ct = std::clock() - start_time;
timing_record& tr = t_map[id];
tr.total_time += ct;
++tr.calls_count;
must_rebuild = true;
}

void profiler_utils::report(std::eek:stream& stream) {
stream << "\n---[ profiler report ]---\n\n";
if (t_map.size()) {
for (t_map_iter it = t_map.begin(), end = t_map.end(); it != end; +
+it) {
stream << makeTimingRec(it->first, it->second) << "\n";
}
} else {
stream << "Nothing to report.\n\n";
}
stream << "---[ end report ]--------\n";
}

std::string profiler_utils::report() {
std::eek:stringstream res;
report(res);
return res.str();
}

void profiler_utils::reset() {
t_map.clear();
must_rebuild = true;
}

profiler_utils::TimingRec profiler_utils::get_record(const
std::string& id) {
t_map_iter it = t_map.find(id);
if (it != t_map.end()) {
return makeTimingRec(it->first, it->second);
} else {
return makeTimingRec("", timing_record());
}
}

std::vector<profiler_utils::TimingRec> profiler_utils::most_called() {
rebuild_indexes();
std::vector<TimingRec> res;
if (calls_map.size()) {
fill_vector(calls_map.rbegin()->second, &res);
}
return res;
}

std::vector<profiler_utils::TimingRec> profiler_utils::most_expensive
() {
rebuild_indexes();
std::vector<TimingRec> res;
if (avg_t_map.size()) {
fill_vector(avg_t_map.rbegin()->second, &res);
}
return res;
}

std::vector<profiler_utils::TimingRec> profiler_utils::most_timespent
() {
rebuild_indexes();
std::vector<TimingRec> res;
if (total_t_map.size()) {
fill_vector(total_t_map.rbegin()->second, &res);
}
return res;
}

// **********************************
// main.cpp - test case
// **********************************

//#include "profiler_utils.h"

using namespace std;
using namespace profiler_utils;

const string sMC = "Most called function";
const string sME = "Most expensive function";
const string sMT = "Most timespent function";

void Test();
void MostCalled();
void MostExpensive();
void MostTimeSpent();

int main()
{
for (size_t i = 0; i < 50; ++i) {
MostCalled();
}
MostExpensive();
for (size_t i = 0; i < 20; ++i) {
MostTimeSpent();
}
const string sep = string(50, '*') + "\n";
cout << sep << " Test results:\n" << sep;
Test();
report(cout);
reset();
cout << endl;
cout << sep << " Test results after resetting:\n" << sep;
Test();
cout << report() << endl;
return 0;
}

void Test() {
cout << "\nGetting most_called()..." << endl;
vector<TimingRec> res = most_called();
if (res.size()) {
if (res[0].id == sMC) {
cout << res[0];
} else {
cout << "Check failed!" << endl;
}
} else {
cout << "Nothing got!" << endl;
}
cout << endl;
cout << "Getting most_expensive()..." << endl;
res = most_expensive();
if (res.size()) {
if (res[0].id == sME) {
cout << res[0];
} else {
cout << "Check failed!" << endl;
}
} else {
cout << "Nothing got!" << endl;
}
cout << endl;
cout << "Getting most_timespent()..." << endl;
res = most_timespent();
if (res.size()) {
if (res[0].id == sMT) {
cout << res[0];
} else {
cout << "Check failed!" << endl;
}
} else {
cout << "Nothing got!" << endl;
}
cout << endl;
cout << "Getting get_record(\"" << sMC << "\")..." << endl;
cout << get_record(sMC) << endl;
cout << "Getting get_record(\"" << sME << "\")..." << endl;
cout << get_record(sME) << endl;
cout << "Getting get_record(\"" << sMT << "\")..." << endl;
cout << get_record(sMT) << endl;
}

void MostCalled() {
Profiler p(sMC);
}

void MostExpensive() {
Profiler p(sME);
for (size_t i = 0; i < 10000; ++i) {
cout << ".\b";
}
}

void MostTimeSpent() {
Profiler p(sMT);
for (size_t i = 0; i < 1000; ++i) {
cout << ".\b";
}
}
-------
 
J

Jorgen Grahn

I lack the knowledge to properly use debuggers and profilers that come
with Code::Blocks. Actually, I'm not even sure about whether it
includes a profiler... never mind.

Hence, I usually debug my code by printing out variable values to the
console and the alike.

I don't care much for debuggers, but IMHO you should expect a real
profiler to be available, and spend the time it takes to learn what it
can do, and to learn how to interpret it. In my case, I'm pretty happy
with Gnu gprof.

For benchmarking I always time a complete program, make sure startup
time is negligable, and let system tools tell me things. In my case
it's Unix time(1), which breaks down wall-clock time into user, system
and waiting time with pretty decent resolution.

Code::Blocks appears to be just an IDE, which can use one of many
different compilers/linkers/etc. If it doesn't expose important tools
like profilers and timing utilities, maybe you should learn to live
without it.

/Jorgen
 
F

Francesco

I don't find it uncommon at all. That's what I do; I find other
people's questions (and the discussions they start) much more
interesting than my own.

And hopefully a lot of people here are hobbyists. Healthy languages
have hobbyists. Hobbyists wrote most of the programs I use and rely on
daily (but yes, I know many key Linux developers etc get paid today).

Well, your experience in such groups is surely longer than mine.
That's comforting to read that my behavior isn't uncommon.

I don't care much for debuggers, but IMHO you should expect a real
profiler to be available, and spend the time it takes to learn what it
can do, and to learn how to interpret it. In my case, I'm pretty happy
with Gnu gprof.

For benchmarking I always time a complete program, make sure startup
time is negligable, and let system tools tell me things. In my case
it's Unix time(1), which breaks down wall-clock time into user, system
and waiting time with pretty decent resolution.

Code::Blocks appears to be just an IDE, which can use one of many
different compilers/linkers/etc. If it doesn't expose important tools
like profilers and timing utilities, maybe you should learn to live
without it.

About the debugger, I'm sure it's there - and I'm starting to fiddle
with it. The profiler seems to be missing - one reason more to
continue the task of this thread ;-)

So far Code::Blocks fulfills my needs, but I admit it would be a very
good thing to get accustomed with different development environments,
hence I'll try to get some in the next future.

Thanks a lot for the details about the way you do benchmarks. I'll add
them to my profiling method(s), along with the suggestions I got from
another poster by e-mail - i.e. saving the profiler data in a file to
recover it between different program runs, as well as replacing the
actual instantiation of the Profiler classes with a macro, so that I
can throw out all the profiling code at once when I no longer need it.

Cheers,
Francesco
 
F

Francesco

I don't know, maybe I'll try to profile it to check whether it makes
any difference.

Thinking again I've got your question right. You didn't mean whether
it made any difference, but whether the savings where signifying.

I used my profiler onto the two ctor versions using this code:

-------
#include "profiler_utils.h" // 3rd version as posted before

using namespace profiler_utils;
using namespace std;

struct inline_ctor {
string s;
clock_t t;
inline_ctor(const string& rs = "") : s(rs), t(clock()) {}
};

struct separate_ctor {
string s;
clock_t t;
separate_ctor(const string& = "");
};

const size_t arr_max = 10000;

template<class T> void test(const string& s) {
Profiler p(s);
T* ptr = new T[arr_max];
for(size_t i = 0; i < arr_max; ++i) {
ptr = T("text");
}
delete[] ptr;
}

separate_ctor::separate_ctor(const string& rs) : s(rs), t(clock()) {}

int main() {
for(size_t i = 0; i < arr_max; ++i) {
test<separate_ctor>("separate");
test<inline_ctor>("inline");
}
report(cout);
return 0;
}
-------

Here is the output I got:

---[ profiler report ]---

Id: inline
Avg time: 14.752 msecs
Total time: 147520 msecs
Calls count: 10000

Id: separate
Avg time: 15.2241 msecs
Total time: 152241 msecs
Calls count: 10000

---[ end report ]--------

Now the problem is: how should I interpret this result?

If I got it right, this means that the inline ctor beats the the
separated one by just...

(15.2241 - 14.752) / 20000 == 0.000023605 milliseconds per constructor
call

....which seems a ridiculously small saving.

Any comment welcome, this is all new stuff for me and I'd like to
understand if I'm getting things right.

Francesco
 
J

James Kanze

On 20 Ago, 10:49, James Kanze <[email protected]> wrote:

[...]
Sorry but I don't understand this, what do you mean exactly?
Do you mean typo-free (about the escaping) or something else
related to streams and flushing them with endl?

Unless you're a really, really good programmer (a lot better
than me), you're code will crash from time to time. If you're
flushing after each line, you'll see everything through the last
line output on standard out; this can be a very big help in
understanding where and why you're crashing.

Using std::endl does have significant performance implications.
With experience, you'll get a more or less intuitive feel about
when it's appropriate, and when to prefer '\n'. But until then,
std::endl will cause you a lot less problems.
 
J

James Kanze

Thinking again I've got your question right. You didn't mean
whether it made any difference, but whether the savings where
signifying.

Exactly. Or more to the point, whether you knew that it would
make a significant difference.
I used my profiler onto the two ctor versions using this code:
using namespace profiler_utils;
using namespace std;
struct inline_ctor {
string s;
clock_t t;
inline_ctor(const string& rs = "") : s(rs), t(clock()) {}
};
struct separate_ctor {
string s;
clock_t t;
separate_ctor(const string& = "");
};
const size_t arr_max = 10000;
template<class T> void test(const string& s) {
Profiler p(s);
T* ptr = new T[arr_max];
for(size_t i = 0; i < arr_max; ++i) {
ptr = T("text");
}
delete[] ptr;
}


Note that for such a small array, the allocation could easily
take up a significant amount of time, and affect your
measurements. And that your new T[ arr_max ] also calls the
default constructor. And that your loop includes an assignment
operator as well. Also, a good compiler will note that elements
in the array are never accessed after the assign, and end up
suppressing the entire loop.

Benchmarking is a tricky art---you want the compiler to apply
all relevant optimizations, but not to optimize away what you're
actually trying to measure. My BenchHarness class uses a
virtual function to encapsulate what I'm trying to measure; in
the base class, this function is empty, and in the derived
classes, I put the code I'm trying to measure. When I set the
iteration count (via a static member function in BenchHarness),
I run the test on the base class, and save the results (which
correspond to the time necessary to call the empty function).
When I run benchmarks with the derived classes, I subtract this
saved value from the total, so that (hopefully), I end up with
only the time taken for what I'm trying to measure. And at
least with the compilers I generally use, calling through a
virtual function seems to stump the optimizer, so my loops don't
get optimized out. I also run each test once immediately before
starting the measurements, so that I'm not measuring cache
misses the first time through.
separate_ctor::separate_ctor(const string& rs) : s(rs), t(clock()) {}

Given that this definition is in the same file, a lot (most?)
compilers will generate it inline anyway. (There are some
compilers which will generate it inline even if the definition
is in a different file, but they aren't legion.)
int main() {
for(size_t i = 0; i < arr_max; ++i) {
test<separate_ctor>("separate");
test<inline_ctor>("inline");
}
report(cout);
return 0;}

Here is the output I got:
---[ profiler report ]---
Id: inline
Avg time: 14.752 msecs
Total time: 147520 msecs
Calls count: 10000
Id: separate
Avg time: 15.2241 msecs
Total time: 152241 msecs
Calls count: 10000
---[ end report ]--------
Now the problem is: how should I interpret this result?
If I got it right, this means that the inline ctor beats the
the separated one by just...
(15.2241 - 14.752) / 20000 == 0.000023605 milliseconds per constructor
call

That's about 24 nanoseconds. That sounds about right for the
cost of a function call on a modern machine. On the other hand,
the difference in the measurements is about 3%. I'm not sure
your harness provides that sort of precision.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,042
Latest member
icassiem

Latest Threads

Top