Code critique

  • Thread starter Martin Eisenberg
  • Start date
M

Martin Eisenberg

Hi,

I've written a program that I'd like to hear some opinions on
regarding my C++ usage. As a program it's smallish, but on Usenet 300
lines seem a bit much. Do you think it's appropriate to post the
code? Alternatively, maybe someone would be willing to put it on the
web for some days? I'm reluctant to sign up with a freespace provider
just for this one occasion; besides, most disallow use of their
space as pure file storage.

Thanks,
Martin
 
P

Petec

Martin said:
Hi,

I've written a program that I'd like to hear some opinions on
regarding my C++ usage. As a program it's smallish, but on Usenet 300
lines seem a bit much. Do you think it's appropriate to post the
code? Alternatively, maybe someone would be willing to put it on the
web for some days? I'm reluctant to sign up with a freespace provider
just for this one occasion; besides, most disallow use of their
space as pure file storage.

Thanks,
Martin

I'd be happy to critique your code, and I think as long as you put (long) in
the subject it'd be fine. :)

- Pete
 
D

David Harmon

On 6 Apr 2004 22:31:34 GMT in comp.lang.c++, Martin Eisenberg
I've written a program that I'd like to hear some opinions on
regarding my C++ usage. As a program it's smallish, but on Usenet 300
lines seem a bit much. Do you think it's appropriate to post the
code?

I think you should ask the specific questions that you wish to ask, and
post as much code as is needed for the context for the questions. The
more specific the questions, the better answers you will get. The more
relevant the code is to the specific questions, the better answers you
will get.
 
M

Martin Eisenberg

David said:
I think you should ask the specific questions that you wish to
ask,

Is my code in good style from the Standard C++ viewpoint? It doesn't
get any more specific than that.
and post as much code as is needed for the context for the
questions.

Which would be the whole program.


Martin
 
A

Alf P. Steinbach

* Martin Eisenberg said:
Is my code in good style from the Standard C++ viewpoint? It doesn't
get any more specific than that.


Which would be the whole program.

Just post, put "(long)" in the subject line.
 
M

Martin Eisenberg

OK, thanks for the encouragements. Note that I'm well aware that
I won't solve the Collatz conjecture that way. It is the canvas,
C++ is the picture. Also, I know the standard doesn't speak to
those MSVC artifacts, but if you think they could be better
handled I'm very interested in hearing about that as well.
There are three files; word wrap is off.


Martin

// main.cpp
/* This program searches for a seed value
yielding a Collatz sequence that is long
in relation to the seed's magnitude.
*/

#ifdef _MSC_VER
// "identifier was truncated to '255' characters in the debug information"
#pragma warning(disable: 4786)
#endif

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include "collatztree.h"

using namespace std;

const string KSeqFile = "collatz.txt";
const unsigned KCheckInterval = 10000000;

//-----------------------------------------------

void awaitEnter()
{
char c[2];
cout << "Program finished. Hit <Enter> to close.";
cin.getline(c, 2);
}

bool prompt(const string& message)
{
char c;
cout << message << " (y/n <Enter>) ";
cin >> c;
#ifdef WIN32
cin.ignore(); // Swallow Windows newline's second character.
#endif
cout << '\n';
return c == 'y';
}

bool checkUserBreak()
{ return !prompt("Continue searching?"); }

void writeBest(const CollatzTree::Best& best,
ostream& os, unsigned nodeCount = 0)
{
os << "The best seed found ";
if(nodeCount > 0) os << "among the first " << nodeCount << " nodes ";
os << "is n = " << best.pNode->value
<< "\nwith number of iterations it(n) = " << best.depth
<< "\nand cost c(n) = it(n) / log2(n) = " << best.cost << ".\n";
}

void writeBest(const CollatzTree::Best& best, unsigned nodeCount)
{ writeBest(best, cout, nodeCount); }

void writeSequence(const CollatzTree::Best& best)
{
if(prompt("Write the longest sequence found to " + KSeqFile + "?")) {
ofstream ofs(KSeqFile.c_str());
writeBest(best, ofs);
ofs << '\n';
CollatzTree::pNode pn = best.pNode;
ofs << pn->value;
while(!pn->isRoot()) {
pn = pn->parent;
ofs << ", " << pn->value;
}
ofs << '\n';
}
}

//-----------------------------------------------

int main()
{
CollatzTree ct(writeBest, checkUserBreak, KCheckInterval);
cout << "Building Collatz tree...\n\n" << fixed << setprecision(3);
do ct.addLevel(); while(!ct.isFinished());
writeSequence(ct.getBest());
awaitEnter();
return 0;
}

// end of file ----------------------------------------------------------------

// collatztree.h
#ifndef COLLATZTREE_H
#define COLLATZTREE_H

#include <vector>
#include <list>
#include <utility>
#include <memory>

/* The class CollatzTree creates, level for level, the
tree of terminating Collatz sequences that do not
exceed the range of type unsigned. It also tracks
the seed value n with largest quotient of the number
of iterations from n and log2(n), and periodically
outputs the most expensive seed found so far.
Smaller values are favored in case of equal cost.
The user can interrupt the class' operation.
The possibility of memory exhaustion is ignored.
*/

class CollatzTree {
public:
struct Node;
struct Best;
typedef void (*WriteBest)(const Best&, unsigned nodeCount);
typedef bool (*CheckUserBreak)();
typedef const Node* PNode;

struct Node {
Node(unsigned value, PNode parent);
bool isRoot() const;

unsigned value;
PNode parent;
};

struct Best {
PNode pNode;
unsigned depth;
float cost;
};

//--------------

CollatzTree(WriteBest writeBest,
CheckUserBreak checkUserBreak, unsigned checkInterval);
~CollatzTree();
void addLevel();
const Best& getBest() const;
bool isFinished() const;

private:
typedef std::vector<Node> Level;
typedef std::vector<Level*> Levels;
typedef std::pair<unsigned, unsigned> Children;

PNode addInitial(unsigned value, PNode parent);
bool insert(unsigned value, PNode parent);
Children getChildren(unsigned value);
size_t getNewLevelSize();

const WriteBest writeBest_;
const CheckUserBreak checkUserBreak_;
const unsigned checkInterval_;
Levels levels_;
Best best_;
unsigned nodeCount_;
bool isFinished_;
};

#endif // COLLATZTREE_H
// end of file ----------------------------------------------------------------

// collatztree.cpp
#include "collatztree.h"

#if defined(_MSC_VER) && _MSC_VER == 1200 // MSVC 6
namespace std {
#include <math.h>
}
#else
#include <cmath>
#endif

#include <limits>

// CollatzTree ----------------------------------------------------------------
// public

CollatzTree::Node::Node(unsigned value, CollatzTree::pNode parent)
: value(value), parent(parent)
{}

bool CollatzTree::Node::isRoot() const
{ return parent == 0; }

//-----------------------------------------------

/* The constructor creates the entries up to 8 so that
getChildren() need not handle 1 as a spurious child of 4.
*/

CollatzTree::CollatzTree(CollatzTree::WriteBest writeBest,
CollatzTree::CheckUserBreak checkUserBreak, unsigned checkInterval)
: writeBest_(writeBest), checkUserBreak_(checkUserBreak),
checkInterval_(checkInterval), nodeCount_(4), isFinished_(false)
{
PNode pn = addInitial(1, 0);
pn = addInitial(2, pn);
best_.pNode = pn;
best_.depth = 1;
best_.cost = 1;
addInitial(8, addInitial(4, pn));
}

CollatzTree::~CollatzTree()
{
Levels::iterator pl = levels_.begin();
for(; pl != levels_.end(); ++pl) delete *pl;
}

/* The method addLevel() finds the numbers that a single
Collatz iteration takes to those in the previous top level.
It controls the new level's capacity to avoid memory waste.
*/

void CollatzTree::addLevel()
{
if(isFinished_) return;
const Level& oldTop = *levels_.back();
levels_.push_back(new Level());
levels_.back()->reserve(getNewLevelSize());

Children children;
Level::const_iterator parent = oldTop.begin();
for(; parent != oldTop.end() && !isFinished_; ++parent) {
children = getChildren(parent->value);
if(children.first != 0)
isFinished_ = insert(children.first, parent);
if(children.second != 0 && !isFinished_)
isFinished_ = insert(children.second, parent);
}
}

const CollatzTree::Best& CollatzTree::getBest() const
{ return best_; }

bool CollatzTree::isFinished() const
{ return isFinished_; }

// private --------------------------------------

CollatzTree::pNode
CollatzTree::addInitial(unsigned value, CollatzTree::pNode parent)
{
Level* pl = new Level();
levels_.push_back(pl);
pl->push_back(Node(value, parent));
return &pl->back();
}

bool CollatzTree::insert(unsigned value, CollatzTree::pNode parent)
{
static const float KLn2 = std::log(2.f);
static const unsigned KMaxNodeCount = std::numeric_limits<unsigned>::max();

levels_.back()->push_back(Node(value, parent));
unsigned depth = levels_.size() - 1;
float cost(depth / std::log(float(value)) * KLn2);
if(cost >= best_.cost && (cost > best_.cost || value < best_.pNode->value)) {
best_.pNode = &levels_.back()->back();
best_.depth = depth;
best_.cost = cost;
}
if(++nodeCount_ % checkInterval_ == 0) {
writeBest_(best_, nodeCount_);
if(checkUserBreak_()) return true;
}
return nodeCount_ == KMaxNodeCount;
}

CollatzTree::Children CollatzTree::getChildren(unsigned value)
{
static const unsigned
KHalfMaxValue = std::numeric_limits<unsigned>::max() / 2;

unsigned c1 = value * 2, c2 = (value - 1) / 3;
if(value > KHalfMaxValue) c1 = 0;
if(c2 % 2 == 0 || c2 * 3 + 1 != value) c2 = 0;
return Children(c1, c2);
}

/* Empirically determining the factors in getNewLevelSize() is easy,
but I have not proven that there are no outliers further up.
*/

size_t CollatzTree::getNewLevelSize()
{
static const float KFactors[] = {
0.f, 1.f, 1.f, 1.f, 1.f, 2.f, 1.f, 2.f, 1.f, 1.5f, 1.f,
1.34f, 1.25f, 1.4f, 1.29f, 1.34f, 1.21f, 1.25f, 1.23f, 1.32f
};
static const size_t NFactors(sizeof(KFactors) / sizeof(KFactors[0]));

float factor;
unsigned depth = levels_.size() - 1;
if(depth < NFactors) factor = KFactors[depth]; else factor = 1.28f;
return static_cast<size_t>(std::ceil(levels_[depth - 1]->size() * factor));
}

// end of file ----------------------------------------------------------------
 
M

Martin Eisenberg

<sigh> ...and sorry for forgetting the "(long)" in the subject.


Martin
 
K

Kevin Goodsell

Martin said:
OK, thanks for the encouragements. Note that I'm well aware that
I won't solve the Collatz conjecture that way. It is the canvas,
C++ is the picture. Also, I know the standard doesn't speak to
those MSVC artifacts, but if you think they could be better
handled I'm very interested in hearing about that as well.
There are three files; word wrap is off.

This is just a few quick comments, no time for a complete run-down right
now.
Martin

// main.cpp
/* This program searches for a seed value
yielding a Collatz sequence that is long
in relation to the seed's magnitude.
*/

#ifdef _MSC_VER
// "identifier was truncated to '255' characters in the debug information"
#pragma warning(disable: 4786)
#endif

Sucks, don't it? Not much else you can do about this, though.
#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include "collatztree.h"

using namespace std;

This should be avoided, but is not terrible, particularly for small
programs. Generally, import only what you need where you need it, so you
could use

using std::string;
using std::vector;

and you could even put these inside the functions that use those items.
const string KSeqFile = "collatz.txt";
const unsigned KCheckInterval = 10000000;

This value is much too large to portably put in an unsigned int. The
portable maximum for unsigned int is 65535. Better make it unsigned long.
//-----------------------------------------------

void awaitEnter()
{
char c[2];
cout << "Program finished. Hit <Enter> to close.";
cin.getline(c, 2);
}

bool prompt(const string& message)
{
char c;
cout << message << " (y/n <Enter>) ";
cin >> c;
#ifdef WIN32
cin.ignore(); // Swallow Windows newline's second character.
#endif

Eh? I don't think this is right. This should be the same, Windows or
not. Besides that, you can't control how many characters are actually
entered, so you should probably extract the entire line, and check the
first character.

That's about all I can do right now.

-Kevin
 
J

John Harrison

This should be avoided, but is not terrible, particularly for small
programs. Generally, import only what you need where you need it, so you
could use

using std::string;
using std::vector;

Unfortunately MSVC 6 does not implement this usage correctly.
and you could even put these inside the functions that use those items.

Has E Robert Tisdale got to you? It's not advice I would give.
const string KSeqFile = "collatz.txt";
const unsigned KCheckInterval = 10000000;

This value is much too large to portably put in an unsigned int. The
portable maximum for unsigned int is 65535. Better make it unsigned long.
//-----------------------------------------------

void awaitEnter()
{
char c[2];
cout << "Program finished. Hit <Enter> to close.";
cin.getline(c, 2);
}

bool prompt(const string& message)
{
char c;
cout << message << " (y/n <Enter>) ";
cin >> c;
#ifdef WIN32
cin.ignore(); // Swallow Windows newline's second character.
#endif

Eh? I don't think this is right. This should be the same, Windows or
not. Besides that, you can't control how many characters are actually
entered, so you should probably extract the entire line, and check the
first character.

This is a well known MSVC 6 bug, it was fixed in a service pack (which the
OP should certainly get hold of) but you can also get this fix and more from
here

http://www.dinkumware.com/vc_fixes.html

john
 
I

Ivan Vecerina

....
#ifdef _MSC_VER
// "identifier was truncated to '255' characters in the debug information"
#pragma warning(disable: 4786)
#endif
Note that you can disable such warnings in the command line,
or in the project's settings. This keeps implementation-
specific stuff out of your source code.
void awaitEnter()
{
char c[2]; ....
cin.getline(c, 2);
The following is a more explicit way to do this:
cin.ignore( std::numeric_limits<int>::max(), '\n' );
(first value is a max # of characters: you could
probably use any large integer as well)
bool prompt(const string& message)
{
char c;
cout << message << " (y/n <Enter>) ";
cin >> c;
#ifdef WIN32
cin.ignore(); // Swallow Windows newline's second character.
Needed? The stream library normally strips-off the
windows-specific '\r' (any platform-specific line termination
is converted to '\n' when it text mode).
Maybe you want to use 'getline'. Or ignore all characters
to the end of line (inclusive) -- see the call I posted above.
In any case, no Windows-specific handling should be needed here.

....
typedef void (*WriteBest)(const Best&, unsigned nodeCount);
typedef bool (*CheckUserBreak)();
typedef const Node* PNode;
The first two typedefs are very reasonable, but I would
recommend dropping PNode: It is usually best to be
explicit about pointers and their const-qualifications
-- use Node* or Node const* instead.
typedef std::vector<Node> Level;
typedef std::vector<Level*> Levels;
A container of naked pointers ?
This smells like memory leaks are going to happen
( unless you are using some special type of
memory allocation policy ).
Recommendations:
- use std::vector<Level> unless performance problem
is established. Consider std::list<Level>
or std::deque<Level> also (may address performance
issues you observe).
- use std::vector<boost::shared_ptr<Level>> : will
manage the memory deallocation for you automatically.
shared_ptr can be found on www.boost.org , and is
expected to be part of the next revision of the
C++ standard.
- Use a container such as std::list or std::deque
to store any objects you allocate -- i.e. replace:
Level* p = new Level();
with
levelPool.push_back( Level() );
Level* p = &levelPool.back();
after adding a data member such as:
std::deque<Level> levelPool;
At destruction time, the 'levelPool' container
will safely destroy and deallocate all Level instances.
Other schemes are possible, but these 3 are easy to use
and will improve the safety of your code.
(even though this might not matter for this application).
CollatzTree::CollatzTree(CollatzTree::WriteBest writeBest,
CollatzTree::CheckUserBreak checkUserBreak, unsigned checkInterval)
: writeBest_(writeBest), checkUserBreak_(checkUserBreak),
checkInterval_(checkInterval), nodeCount_(4), isFinished_(false)
{
PNode pn = addInitial(1, 0); ....
}

CollatzTree::~CollatzTree()
{
Levels::iterator pl = levels_.begin();
for(; pl != levels_.end(); ++pl) delete *pl;
}
Infortunately, this is not enough to avoid all memory leaks.
For example, if your constructor (which adds items) fails
with an exception (e.g. during a memory allocation), objects
will be leaked.
/* The method addLevel() finds the numbers that a single
Collatz iteration takes to those in the previous top level.
It controls the new level's capacity to avoid memory waste.
*/

void CollatzTree::addLevel()
{
if(isFinished_) return;
const Level& oldTop = *levels_.back();
levels_.push_back(new Level());
levels_.back()->reserve(getNewLevelSize());

Here too, if the push_back() call fails, you will be
left with a memory leak.

Also a serious problem: levels_.push_back() might
reallocate the contents of the levels_ vector,
and invalidate the reference you stored in oldTop
(unless you make sure to first call 'reserve').
Suggestion: initialize oldTop after the push_back:
const Level& oldTop = levels_[ levels_.size()-2 ];



Well, this is as far as my attention span goes...

I hope this helps,
Ivan
 
D

David Harmon

On Wed, 7 Apr 2004 06:34:42 +0100 in comp.lang.c++, "John Harrison"
This is a well known MSVC 6 bug, it was fixed in a service pack (which the
OP should certainly get hold of) but you can also get this fix and more from
here

http://www.dinkumware.com/vc_fixes.html

I don't think so (I could be wrong.) The problem I remember there was
with getline() and did not affect operator>>.
 
J

John Harrison

David Harmon said:
On Wed, 7 Apr 2004 06:34:42 +0100 in comp.lang.c++, "John Harrison"


I don't think so (I could be wrong.) The problem I remember there was
with getline() and did not affect operator>>.

Yes I think you are right.

It would be interesting to know what the OP thought was the problem with
WIN32 systems in particular. As you say the code shouldn't have to be
different for Windows.

john
 
K

Kevin Goodsell

John said:
Put all using ... at the top of the file, after all the includes.

Sure, that works well enough most of the time. But you /could/ restrict
it to individual functions, if you wanted to.

-Kevin
 
J

John Harrison

Kevin Goodsell said:
Sure, that works well enough most of the time. But you /could/ restrict
it to individual functions, if you wanted to.

I'm not denying it, it's just something I would almost never do.

Since generally speaking using ... refers to names that have been included
from a header file, it makes sense to me to put that declaration/directive
(which is it?) close to the header file it refers to.

john
 
K

Kevin Goodsell

John said:
I'm not denying it, it's just something I would almost never do.

Since generally speaking using ... refers to names that have been included
from a header file, it makes sense to me to put that declaration/directive
(which is it?) close to the header file it refers to.

That seems reasonable. The only problem is if you have similar names in
multiple namespaces, but I've never heard of this being much of a
problem in reality.

-Kevin
 
M

Martin Eisenberg

Thanks, Kevin!

Kevin said:
Martin Eisenberg wrote:

This should be avoided, but is not terrible, particularly for
small programs.

That's what I figured. I did it because of all the console usage; in
collatztree.cpp I qualified explicitly. But I guess it's Good to put
more specific declarations, as you suggest.
This value is much too large to portably put in an unsigned int.
The portable maximum for unsigned int is 65535. Better make it
unsigned long.

Whoa :)
Besides that, you can't control how many
characters are actually entered, so you should probably extract
the entire line, and check the first character.

Good point. I'll do it like that.


Martin
 
M

Martin Eisenberg

I wrote further upthread:
cin >> c;
#ifdef WIN32
// Swallow Windows newline's second character.
cin.ignore();
#endif

I have SP5, so that would be SP6?

I got those, but now realize that I failed to rebuild the library.
It would be interesting to know what the OP thought was the
problem with WIN32 systems in particular. As you say the code
shouldn't have to be different for Windows.

Easy. Premises:
- newline has two characters on Windows and one on Unices.
- Every second call to the function excerpted at
the top does not wait for me to touch anything.
Conclusion using Modus Praematurus:
- I need to special-case for Windows.


Martin
 
M

Martin Eisenberg

Ivan, thanks for the response.

Ivan said:
cin.ignore( std::numeric_limits<int>::max(), '\n' );

I've seen this before, but it didn't stick. To get really picky for a
moment, would size_t or streamsize or something be more fitting than
int?
The first two typedefs are very reasonable, but I would
recommend dropping PNode: It is usually best to be
explicit about pointers and their const-qualifications
-- use Node* or Node const* instead.

What are some pitfalls that this abstraction would promote?
A container of naked pointers ?

First I had a vector of auto_ptr. When I read why that doesn't work I
changed it to pointers.
Recommendations:
- use std::vector<Level> unless performance problem
is established. Consider std::list<Level>
or std::deque<Level> also (may address performance
issues you observe).

Hmm, using a list here right away actually seems very sensible to me.
I don't need random access, it's less typing, and it avoids the
potential reference invalidation you pointed out below.
- Use a container such as std::list or std::deque
to store any objects you allocate -- i.e. replace:
Level* p = new Level();
with
levelPool.push_back( Level() );
Level* p = &levelPool.back();
after adding a data member such as:
std::deque<Level> levelPool;
At destruction time, the 'levelPool' container
will safely destroy and deallocate all Level instances.

Did you intend levelPool to be of a class of my own creation which
takes ownership of inserted pointers? If not, what's the difference
to the other quoted option?
Infortunately, this is not enough to avoid all memory leaks.
For example, if your constructor (which adds items) fails
with an exception (e.g. during a memory allocation), objects
will be leaked.

Would it suffice to wrap the current ctor body in a try block and
then in the handler, put the above dtor code and rethrow? Or maybe
build a local structure and assign that to the member if all is well
in the end?
Here too, if the push_back() call fails, you will be
left with a memory leak.

Level* pl = new Level();
try {
levels_.push_back(pl);
} catch(...) {
delete pl;
throw;
}

Is that reasonable?
Also a serious problem: levels_.push_back() might
reallocate the contents of the levels_ vector,
and invalidate the reference you stored in oldTop

Indeed. Good you caught that one.
Well, this is as far as my attention span goes...

You've already been very helpful, and you're in the
pole position so far ;) Thanks again.


Martin
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,020
Latest member
GenesisGai

Latest Threads

Top