my code seems slow


Gernot Frisch

Hi,

my code is slow, and I have to rethink it. Can you help?

I'm displaying a logfile from a license server.

I have a vector (m_lines) of
struct LINE
{
    STATUS status; // LOGIN or LOGOUT
    int    hex;    // the "user" id
    time_t time;   // time of login/out
};
which are sorted by time.

Now for each hex's LOGIN there must be a LOGOUT.
If there is none (no idea why), I'll have to add it.

Here's my code so far:



void ForceLogouts()
{
    int num = (int)m_lines.size();
    for(int i=0; i<num; ++i)
    {
        LINE& line1 = m_lines[i];
        if(line1.status==ST_LOGIN)
        {
            for(int j=i+1; j<num; ++j)
            {
                LINE& line2 = m_lines[j];
                // good logout
                if(line2.hex == line1.hex && line2.status==ST_LOGOUT)
                {
                    goto skipme;
                }
            }
            // this one needs a logout badly!
            LINE lt(line1);
            lt.time = line1.time + 10000;
            lt.status = ST_LOGOUT;
            m_lines.push_back(lt);
        }
skipme:;
    }
    // sort(m_lines, ...); // sort in new entries by time
}


What can I do? Sort the data by 'hex' first?
 

Paolo

I think you should sort the vector, so when you find a login line at
line i, you just have to check line i+1 to see if there's the logout;
otherwise you must create it.
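A minimal sketch of that idea (assuming the LINE struct and ST_LOGIN/ST_LOGOUT values from the original post; the comparator name, function signature, and the 10000-second offset copied from the original code are illustrative, not from the thread):

```cpp
#include <algorithm>
#include <cstddef>
#include <ctime>
#include <vector>

enum STATUS { ST_LOGIN, ST_LOGOUT };

struct LINE {
    STATUS status; // LOGIN or LOGOUT
    int    hex;    // the "user" id
    time_t time;   // time of login/out
};

// Sort by user id first, then by time, so a LOGIN's matching
// LOGOUT (if any) ends up directly behind it.
struct ByHexThenTime {
    bool operator()(const LINE& a, const LINE& b) const {
        if (a.hex != b.hex) return a.hex < b.hex;
        return a.time < b.time;
    }
};

void ForceLogouts(std::vector<LINE>& lines) {
    std::sort(lines.begin(), lines.end(), ByHexThenTime());
    std::vector<LINE> added; // don't push_back while scanning
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (lines[i].status != ST_LOGIN) continue;
        bool matched = i + 1 < lines.size()
                    && lines[i + 1].hex == lines[i].hex
                    && lines[i + 1].status == ST_LOGOUT;
        if (matched) { ++i; continue; } // consume the LOGIN/LOGOUT pair
        LINE lt = lines[i];             // unmatched LOGIN: forge a LOGOUT
        lt.status = ST_LOGOUT;
        lt.time += 10000;               // same arbitrary offset as the original
        added.push_back(lt);
    }
    lines.insert(lines.end(), added.begin(), added.end());
    // re-sort by time here for display, as the commented-out sort() suggests
}
```

One O(n log n) sort plus one linear scan, instead of the nested O(n^2) search.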

Hope it helps!

Paolo


Gernot Frisch wrote:
 

Bart

Gernot said:
I have a vector (m_lines) of
struct LINE
{
    STATUS status; // LOGIN or LOGOUT
    int    hex;    // the "user" id
    time_t time;   // time of login/out
};
which are sorted by time.

It would really be better if this was encapsulated.
void ForceLogouts()
{
    int num = (int)m_lines.size();
    for(int i=0; i<num; ++i)

You should use iterators instead, so that you don't have to change your
code when switching to a different container. And switch you should.
Your algorithm is O(n^2) when it shouldn't take more than O(log n) to
do a search and O(log n) to insert. To do that use an associative
container such as std::map instead of std::vector.
goto skipme;

That's really not necessary. Please read about the 'continue'
statement.
// sort(m_lines, ...); // sort in new entries by time

Do you really want to sort everytime? Use std::map instead.
What can I do? Sort the data by 'hex' first?

Figure out your algorithms and data structures before writing code.

Regards,
Bart.
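A goto-free, iterator-based version of the scan Bart describes might look like this (a sketch only; it is still O(n^2) and just addresses the style points, and it also avoids pushing into the vector while references into it are live):

```cpp
#include <ctime>
#include <vector>

enum STATUS { ST_LOGIN, ST_LOGOUT };

struct LINE {
    STATUS status; // LOGIN or LOGOUT
    int    hex;    // the "user" id
    time_t time;   // time of login/out
};

void ForceLogouts(std::vector<LINE>& lines) {
    typedef std::vector<LINE>::iterator It;
    std::vector<LINE> added; // collect new records; append at the end
    for (It i = lines.begin(); i != lines.end(); ++i) {
        if (i->status != ST_LOGIN)
            continue;                   // 'continue' replaces the goto
        bool found = false;
        for (It j = i + 1; j != lines.end(); ++j) {
            if (j->hex == i->hex && j->status == ST_LOGOUT) {
                found = true;           // good logout exists
                break;
            }
        }
        if (found)
            continue;
        LINE lt(*i);                    // forge the missing logout
        lt.status = ST_LOGOUT;
        lt.time += 10000;               // same offset as the original code
        added.push_back(lt);
    }
    lines.insert(lines.end(), added.begin(), added.end());
}
```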
 

Gernot Frisch

Your algorithm is O(n^2) when it shouldn't take more than O(log n) to
do a search and O(log n) to insert. To do that use an associative
container such as std::map instead of std::vector.

I can't use a std::map here, since I _must_ have a vector for
drawing the whole thing. A map would slow my drawing down
ridiculously, and drawing speed is much more important.

Do you really want to sort everytime? Use std::map instead.
Sorting takes 1 second - no problem.
Figure out your algorithms and data structures before writing code.

I know of std::map and such. I know why I had to choose vector.
But making a temporary std::map for the hex codes and storing a
reference to the timestamps might make sense. Thank you,
-Gernot
 

Jerry Coffin

Gernot Frisch wrote:
I'm displaying a logfile from a license server.

I have a vector (m_lines) of
struct LINE
{
    STATUS status; // LOGIN or LOGOUT
    int    hex;    // the "user" id
    time_t time;   // time of login/out
};
which are sorted by time.

I'd build a temporary map with hex as the key, and an int as the value.
When you encounter a LOGIN record, increment the value. When you
encounter a LOGOUT record, decrement the value (and if the value is
zero, you can optionally remove the record). When you're done, each
value in the map tells you how many LOGOUT records need to be added for
that user id.

Right now you're using an O(N*N) algorithm. This is roughly O(N lg N).
If you remove records as they hit zero, it's O(N lg M), where M is the
average number of unmatched LOGIN records at any given time as you
traverse the file.

Whether to remove zero records is mostly a question of how often a
specific user is likely to login/out during a run. If they'll login
relatively soon after they log out, you probably want to keep the record
(to avoid deleting and allocating records many times). If logins are
mostly to unique users, so chances are good you won't see the same user
log back in later, you might as well delete each record as it hits zero --
there's little chance you'll get to reuse it, and removing it from the
map reduces total memory usage, and reduces the number of records to
search through.
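A minimal sketch of this counting approach (the struct and status names are taken from the original post; the function name is mine):

```cpp
#include <cstddef>
#include <ctime>
#include <map>
#include <vector>

enum STATUS { ST_LOGIN, ST_LOGOUT };

struct LINE {
    STATUS status; // LOGIN or LOGOUT
    int    hex;    // the "user" id
    time_t time;   // time of login/out
};

// One pass over the time-sorted log. Maps hex -> number of LOGINs
// that still lack a LOGOUT; each map operation is O(lg M), where M
// is the number of live entries.
std::map<int, int> UnmatchedLogins(const std::vector<LINE>& lines) {
    std::map<int, int> open;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const LINE& l = lines[i];
        if (l.status == ST_LOGIN) {
            ++open[l.hex];
        } else if (--open[l.hex] == 0) {
            open.erase(l.hex); // optional removal once matched
        }
    }
    // Each remaining (hex, count) pair says how many LOGOUT records
    // need to be appended for that user id.
    return open;
}
```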
 

Gernot Frisch

Right now you're using an O(N*N) algorithm. This is roughly O(N lg N).
If you remove records as they hit zero, it's O(N lg M), where M is the
average number of unmatched LOGIN records at any given time as you
traverse the file.

I don't quite understand this O(xy) thing. Where can I read about it?
Whether to remove zero records is mostly a question of how often a
specific user is likely to login/out during a run. If they'll login
relatively soon after they log out, you probably want to keep the record
(to avoid deleting and allocating records many times). If logins are
mostly to unique users, so chances are good you won't see the same user
log back in later, you might as well delete each record as it hits zero --
there's little chance you'll get to reuse it, and removing it from the
map reduces total memory usage, and reduces the number of records to
search through.

The hex numbers usually occur only twice: once per login, once per
logout. Each user gets a unique hex for each login (unless he logs
into the same program again).
Anyway:
I've std::sorted the vector with
struct SRT
{
    bool operator()(const LINE& l1, const LINE& l2) const
    {
        if(l1.hex==l2.hex)
            return l1.time<l2.time;
        return l1.hex < l2.hex;
    }
};

Then I have to look at each element and its neighbour only. The
speedup is amazing: from 10 minutes down to 15 seconds!

Thank you all.
-Gernot
 

Jerry Coffin

Gernot Frisch wrote:
I don't quite understant this O(xy) thing. Where can I read about it?

Just about any decent book on algorithms should cover it to at least
some degree. _The Art of Computer Programming_ (~3.3 volumes out of a
planned 7 volumes are complete, at last count) by Donald Knuth is
probably the best known choice. There's also _Introduction to
Algorithms_ by Cormen, Leiserson, Rivest and (in the latest edition) a
fourth guy whose name I can't remember. Knuth is also one of the co-
authors of _Concrete Mathematics_. This isn't exactly a traditional
algorithms book, but it's certainly interesting and useful, and has a
fair amount about computational complexity and big-O notation.

There are quite a few others around as well and a fair number are quite
good, but I feel obliged to caution against anything by Robert
Sedgewick. While I've little doubt that Robert is a very smart guy (he's
invented some nice algorithms), his books pretty much blow. What he
teaches as an algorithm is sometimes almost completely different from
what the rest of the world recognizes that particular algorithm to be.
His exposition of an algorithm is often thoroughly incomplete, and his
explanations so poor they can actually confuse somebody who already
knows the algorithm in question. I can find nothing about his books to
make them worth recommending at all.
 
