Priority Queue - very slow

C

chenbo09

Hello! I am trying to use priority queue to store my nodes, and I need
to insert and retrieve elements very frequently. The size of the queue
increases dramatically at the beginning and starts to reduce after the
queue reaches its maximum size in the process.

The queue contains objects with three data members, i.e., row number,
column number, elevation. The object with the smallest elevation will
be the first element in the queue.

I am trying to implement an algorithm proposed by a published research
paper, the authored wrote "The algorithm could process a gird with 500
* 1000 cells in 2 seconds", while it seems that it took my code 2
hours to finish the task. I observed the intermediate output and found
that the maximum size of my queue was about 20,000.

The following is my code, please help, thanks a lot!

********************************************************
while( !theQueue.empty())
{
tempNode = theQueue.top();
row = tempNode.row;
col = tempNode.col;
theQueue.pop();

z = DEMFill.get_cellValue(row, col);
for(i = 0; i < 8; i++)
{
irow = Get_rowTo(i, row);
icol = Get_colTo(i, col);
if(DEM.is_inGrid(irow, icol) && DEM.is_InternalCell(irow, icol) &&
DEMFill.is_nodataCell(irow, icol))
{
double iz = DEM.get_cellValue(irow, icol);
if( iz < z) { iz = z; }
DEMFill.set_cellValue(irow, icol, iz);
progress++;

tempNode.row = irow;
tempNode.col = icol;
tempNode.spill = iz;
//theQueue.push( tempNode );
}
}

if ((progress % 5000) == 0)
{
cout << theQueue.size() << " " << progress << " " <<
DEMFill.get_nValidCells() << endl;
}
if (progress == DEMFill.get_nValidCells())
{
cout << "Finished!" << endl;
break;
}
}
 
J

Jorgen Grahn

Hello! I am trying to use priority queue to store my nodes, and I need
to insert and retrieve elements very frequently. The size of the queue
increases dramatically at the beginning and starts to reduce after the
queue reaches its maximum size in the process.

The queue contains objects with three data members, i.e., row number,
column number, elevation. The object with the smallest elevation will
be the first element in the queue.

I am trying to implement an algorithm proposed by a published research
paper, the authored wrote "The algorithm could process a gird with 500
* 1000 cells in 2 seconds", while it seems that it took my code 2
hours to finish the task.

A factor 3600 -- that's a HUGE difference. My gut feeling is that
there's a difference in algorithm, not just overhead from the priority
queue, a badly implemented operator< () etc.

I haven't read the code. Try to run your code through a profiler
(gprof if you're running GCC) and it will tell you where the algorithm
spends its time. The results from profiling often surprise you.

[snip code]

/Jorgen
 
J

Joshua Maurice

Hello! I am trying to use priority queue to store my nodes, and I need
to insert and retrieve elements very frequently. The size of the queue
increases dramatically at the beginning and starts to reduce after the
queue reaches its maximum size in the process.

The queue contains objects with three data members, i.e., row number,
column number, elevation. The object with the smallest elevation will
be the first element in the queue.

I am trying to implement an algorithm proposed by a published research
paper, the authored wrote "The algorithm could process a gird with 500
* 1000 cells in 2 seconds", while it seems that it took my code 2
hours to finish the task. I observed the intermediate output and found
that the maximum size of my queue was about 20,000.

The following is my code, please help, thanks a lot!

********************************************************
        while( !theQueue.empty())
        {
                tempNode = theQueue.top();
                row = tempNode.row;
                col = tempNode.col;
                theQueue.pop();

                z = DEMFill.get_cellValue(row, col);
                for(i = 0; i < 8; i++)
                {
                        irow = Get_rowTo(i, row);
                        icol = Get_colTo(i, col);
                        if(DEM.is_inGrid(irow, icol) && DEM.is_InternalCell(irow, icol) &&
DEMFill.is_nodataCell(irow, icol))
                        {
                                double iz = DEM.get_cellValue(irow, icol);
                                if( iz < z)  { iz = z; }
                                DEMFill.set_cellValue(irow, icol, iz);
                                progress++;

                                tempNode.row = irow;
                                tempNode.col = icol;
                                tempNode.spill = iz;
                                //theQueue.push( tempNode );
                        }
                }

                if ((progress % 5000) == 0)
                {
                        cout << theQueue.size() << "   " << progress << "        " <<
DEMFill.get_nValidCells() << endl;
                }
                if (progress == DEMFill.get_nValidCells())
                {
                        cout << "Finished!" << endl;
                        break;
                }
        }

Compiler options? Did you turn optimization on? Also, always please
post a complete compileable / linkable / executable code sample.
 
C

chenbo09

Hello! I am trying to use priority queue to store my nodes, and I need
to insert and retrieve elements very frequently. The size of the queue
increases dramatically at the beginning and starts to reduce after the
queue reaches its maximum size in the process.
The queue contains objects with three data members, i.e., row number,
column number, elevation. The object with the smallest elevation will
be the first element in the queue.
I am trying to implement an algorithm proposed by a published research
paper, the authored wrote "The algorithm could process a gird with 500
* 1000 cells in 2 seconds", while it seems that it took my code 2
hours to finish the task.

A factor 3600 -- that's a HUGE difference. My gut feeling is that
there's a difference in algorithm, not just overhead from the priority
queue, a badly implemented operator< () etc.

I haven't read the code. Try to run your code through a profiler
(gprof if you're running GCC) and it will tell you where the algorithm
spends its time.  The results from profiling often surprise you.

[snip code]

/Jorgen
Hello Jorgen & Joshua,

Thanks for your help!

I am new to C++, so I don't know what a profiler is! But for your
information, I think I should provide more information about my code:

I defined a Grid class, among the data members, two dynamic 2D
vectors of size nrows * ncols are defined. one is used to contain the
elevation data, and the other is an identifier matrix indicating if a
cell is on the border, is an internal cell, or is no data value. I
think these two might consume a lot of memory. I am wondering if there
are ways to handle huge arrays.

Regarding the implementation of the operator <, here is my code:
class SpillNode
{
public:
unsigned row;
unsigned col;
double spill;

struct Greater : public binary_function <SpillNode, SpillNode, bool>
{
bool operator ()(const SpillNode& lhs, SpillNode& rhs) const
{
return lhs.spill > rhs.spill;
}
};
};


Joshua, can you tell me how to turn on optimization? Yes, I would like
to post my code here, but it is a relatively long multiple-file
project.

BTW, the code finished in a second if I common out the line
"theQueue.push( tempNode );". While if I don't then it is time
consuming, and this is the reason why I think it might be because of
the slow speed a priority queue. Probably my guess is wrong, and I am
not trying to offend C++ lovers.
 
J

Jorgen Grahn

Hello! I am trying to use priority queue to store my nodes, and I need
to insert and retrieve elements very frequently. The size of the queue
increases dramatically at the beginning and starts to reduce after the
queue reaches its maximum size in the process.
The queue contains objects with three data members, i.e., row number,
column number, elevation. The object with the smallest elevation will
be the first element in the queue.
I am trying to implement an algorithm proposed by a published research
paper, the authored wrote "The algorithm could process a gird with 500
* 1000 cells in 2 seconds", while it seems that it took my code 2
hours to finish the task.

A factor 3600 -- that's a HUGE difference. My gut feeling is that
there's a difference in algorithm, not just overhead from the priority
queue, a badly implemented operator< () etc.

I haven't read the code. Try to run your code through a profiler
(gprof if you're running GCC) and it will tell you where the algorithm
spends its time.  The results from profiling often surprise you.

[snip code]
....

I am new to C++, so I don't know what a profiler is!

<http://en.wikipedia.org/wiki/Profiling_(computer_programming)>

Compile your program with extra profiling flags, run it, and get a
listing of how much time was spent in the different functions. That's
how gprof works, but I assume your compiler environment (whatever it
is) works in more or less the same way. The documentation will tell
you the details.

....
Joshua, can you tell me how to turn on optimization? Yes, I would like
to post my code here, but it is a relatively long multiple-file
project.

No need to post the code. Just read the documentation for your compiler.
It will surely describe how to turn on optimization.
BTW, the code finished in a second if I common out the line
"theQueue.push( tempNode );". While if I don't then it is time
consuming, and this is the reason why I think it might be because of
the slow speed a priority queue.

It's probably mostly because your code (which I still haven't read)
does something completely different when you don't populate theQueue.
You can do no work, on an empty queue, really fast ...

/Jorgen
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,012
Latest member
RoxanneDzm

Latest Threads

Top