maximum cases in a switch ?

J

Juha Nieminen

Paul said:
From behind which rock have you crawled?

Please explain to us how using a jump table to implement a switch block
of 525 cases could possibly take 1500 clock cycles.
 
N

Nick Keighley

I already have explained it, obviosly this is a very rough estimate

ITYM "wrong estimate"
as not
all CPU's are the same. Can you explain how many clock cycles you think it'd
take?

the number taken is independent of the number of entries in the table

goto table;

we're talking half a dozen instructions and a couple of memory
accesses
 
J

Juha Nieminen

Paul said:
You can't have a switch statement with 525 cases executed in 6 clock cycles
you fucking moronic idiot.

You see, that's why they recommended you to go and do some research on
jump tables.

What you are saying above is completely equivalent to saying "you can't
access an element in an array of 525 elements in a couple of clock cycles".
Of course you can. The amount of elements in the array doesn't matter.
 
L

Leo Equinox Gaspard

Le 14/08/2011 00:16, Victor Bazarov a écrit :
[..]
Some examples of POD and non-POD types :

struct POD { int i; }; // Only POD members
struct NonPOD { int * p; }; // Pointer isn't POD

Really? No, REALLY?!! Any proof (quote from the Standard, a book or some
such)?
[..]

Now, am I mistaking anywhere ?

I don't know. I stopped reading after you called a pointer non-POD...

V

If you read the beginning before the code, you would have seen
" <<< Errors in, see after >>> " in my message, meaning that I knew I
was wrong.

My message was composed of two parts, one written using my memory (and,
while I
was saying a pointer wasn't POD, I was pretty sure of that, but I was wrong,
because, while pointers /often/ causes classes not to be POD, it's indeed
the classic use of them that is in cause : delete ptr_ in the
destructor), which
had been corrupted by time without checking the definition (last time I
heard of
PODs was at least one year ago).

The second part (starting by BTW) was written with the standard (more
precisely
the last freely available version of the C++11 standard), and declared
that I was
wrong. As I don't like the idea of removing something I wrote (maybe is
it because
I tend to act like if writing a letter), I decided to add these two
messages :
" <<< Errors in, see after >>> " and "Am I wrong anywhere ? <<< Yes, see
below >>> "

So, maybe, even not reading the whole message, but at least the
beginning before the
end, would help you to understand the true message I posted.

Leo
 
J

Juha Nieminen

Paul said:
I am not even talking about a jump table, what makes you think a jump table
has any relevance to what I said?

Read your own text above. It says "you can't have a switch statement with
525 cases executed in 6 clock cycles". It doesn't have any qualifiers or
conditions. You are making an absolute assertion: That it's impossible.

Now, after several people have hammered into you the concept of jump
tables, and seemingly you finally understand what that means and how they
can be applied by a compiler to implement a switch block, now you can't
just admit that you didn't understand it first but now you do. Instead,
you try to weasel yourself out of the situation by some vague claims that
you were "not talking about a jump table".

It's ok to say "ok, I had a misunderstanding, it is indeed possible to
execute a large switch clock in just a few clock cycles using a jump table".
Such an admission may hurt your pride for a while, but in the long run you
will be much better off than by stubborningly sticking to your original
claim. Just swallow your pride and learn to become a nicer person. You will
be happier in the long run.
 
V

Victor Bazarov

Le 14/08/2011 00:16, Victor Bazarov a écrit :
[..]
Some examples of POD and non-POD types :

struct POD { int i; }; // Only POD members
struct NonPOD { int * p; }; // Pointer isn't POD

Really? No, REALLY?!! Any proof (quote from the Standard, a book or some
such)?
[..]

Now, am I mistaking anywhere ?

I don't know. I stopped reading after you called a pointer non-POD... ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

V

If you read the beginning before the code, you would have seen
" <<< Errors in, see after >>> " in my message, meaning that I knew I
was wrong.

Yes, if I read. I didn't. And I shouldn't have to. If you knew that
you were wrong, delete the text before posting and type in the correct
part. Better yet, reconsider posting. I understand that you might have
felt that since you spent time typing it, it was somehow worth
publishing (to offset the expense of typing it up). It wasn't. While
you did waste time (yours) typing nonsense, you wasted even more of it
(mine and others') by not removing the nonsense from your message before
posting the whole thing.
My message was composed of two parts, one written using my memory (and,
while I
was saying a pointer wasn't POD, I was pretty sure of that, but I was
wrong,
because, while pointers /often/ causes classes not to be POD,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You keep doing that, don't you? Where do you get this nonsense? And if
you can't recognize it as nonsense, perhaps you should get into habit of
verifying everything that you're about to publish? Yes, we understand
that human memory is not perfect, but, man, if you know that yours is
especially vulnerable, stop relying on it.
it's indeed
the classic use of them that is in cause : delete ptr_ in the
destructor), which
had been corrupted by time without checking the definition (last time I
heard of
PODs was at least one year ago).

The second part (starting by BTW) was written [..]

So, maybe, even not reading the whole message, but at least the
beginning before the
end, would help you to understand the true message I posted.

No, sorry. It would not. Pure noise. Total waste of time.

V
 
J

Juha Nieminen

Paul said:
I made it clear that the estimated number of clock cycles was based on a
worse case scenario, where all conditions were checked.

You made it clear? Where exactly? I went through the conversation again
and nowhere did you make any mention whatsoever that you were talking about
"the worst case scenario", not directly, nor even in context. The only
thing you did was to insult the poster who mentioned jump tables and then
when I asked to explain yourself, you simply said that it's impossible to
execute a large switch block on just 6 clock cycles.

It's rather clear that you didn't know about the jump table trick,
didn't understand it when you were first told, and thought that the
only way to compile a switch block is to effectively compile a conditional
for each case. Clearly you later understood how a jump table works but
now are unable to admit that you made a mistake.
 
L

Leo Equinox Gaspard

Le 16/08/2011 15:04, Victor Bazarov a écrit :
Le 14/08/2011 00:16, Victor Bazarov a écrit : [...]

If you read the beginning before the code, you would have seen
" <<< Errors in, see after >>> " in my message, meaning that I knew I
was wrong.

Yes, if I read. I didn't. And I shouldn't have to.

You shouldn't have read the beginning before the end ? I hope you don't
read books from the end to the beginning, otherwise you would hardly ever
understand its content.
If you knew that you
were wrong, delete the text before posting and type in the correct part.
Better yet, reconsider posting. I understand that you might have felt
that since you spent time typing it, it was somehow worth publishing (to
offset the expense of typing it up). It wasn't.

It was. I think it could be useful for others to know the error I did, for
them not to do the same.
While you did waste time
(yours) typing nonsense, you wasted even more of it (mine and others')
by not removing the nonsense from your message before posting the whole
thing.

I think the only waste of time here is this discussion we are having,
as if you had only limited time you could just have jumped the whole
"nonsense" to go directly to the interesting part.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You keep doing that, don't you? Where do you get this nonsense? And if
you can't recognize it as nonsense, perhaps you should get into habit of
verifying everything that you're about to publish? Yes, we understand
that human memory is not perfect, but, man, if you know that yours is
especially vulnerable, stop relying on it.

Maybe reading the end of the sentence before answering, it would help you
not to write nonsense. Why don't you just read the end of the parenthesis ?
"delete ptr_ in the destructor". And you couldn't say that a class having
"delete ptr_" in the destructor isn't POD, as I state in my message, due to
sections 9.10 and 9.6, stating that a POD is a trivial class, that a trivial
class is trivially copyable, and that a trivially copyable class need a
trivial destructor, which is not the case of a class containing "delete
ptr_"
in its destructor.
it's indeed
the classic use of them that is in cause : delete ptr_ in the
destructor), which
had been corrupted by time without checking the definition (last time I
heard of
PODs was at least one year ago).

The second part (starting by BTW) was written [..]

So, maybe, even not reading the whole message, but at least the
beginning before the
end, would help you to understand the true message I posted.

No, sorry. It would not. Pure noise. Total waste of time.

Leo
 
L

Leo Equinox Gaspard

Le 12/08/2011 22:22, Joe Greer a écrit :
POD = Plain Old Data i.e. a struct or class with no special behavior in
the case.

Only you know what your app needs to do. I am just exploring
possibilities that you can either use or reject as suits your needs.

joe

I already answered, but it looks like there had been a misunderstanding, so,
in order to be nice, I will re-post with removing the useless parts.

AFAIK a POD may have special behavior.

I just checked the latest free paper about C++0x, and I saw this.

(9.10)
A POD struct is a class that is both a trivial class and a standard-layout
class, and has no non-static data members of type non-POD struct, non-POD
union (or arrays of such types). Similarly, a POD union is a union that is
both a trivial class and a standard layout class, and has no non-static
data members of type non-POD struct, non-POD union (or arrays of such
types). A POD class is a class that is either a POD struct or a POD union.

So, as we are speaking of POD classes, we need :
- trivial
- standard-layout
- no non-static data members that are non-POD

Now, trivial classes.

(9.6)
A trivial class is a class that has a trivial default constructor (12.1)
and is trivially copyable.
[Note: In particular, a trivially copyable class does not have virtual
functions or virtual base classes. - end note]

So, we need :
- trivial default constructor
- trivially copyable
- no virtual functions
- no virtual base classes
- standard-layout
- no non-static data members that are non-POD

Now, trivially copyable.

(9.6)
A trivially copyable class is a class that:
- has no non-trivial copy constructors (12.8),
- has no non-trivial move constructors (12.8),
- has no non-trivial copy assignment operators (13.5.3, 12.8),
- has no non-trivial move assignment operators (13.5.3, 12.8),
- has a trivial destructor (12.4).

So, we need :
- trivial default, copy and move constructors
- trivial copy and move assignment operators
- trivial destructor
- no virtual functions, base classes
- standard-layout
- no non-static data members that are non-POD

Now, standard-layout.

(9.7)
A standard-layout class is a class that:
- has no non-static data members of type non-standard-layout class
(or arrays of such types) or reference,
- has no virtual functions (10.3) and no virtual base classes (10.1),
- has the same access control (Clause 11) for all non-static data
members,
- has no non-standard-layout base classes,
- either has no non-static data members in the most derived class and
at most one base class with non-static data members, or has no base
class with non-static data members, and
- has no base classes of the same type as the first non-static data
member.

So, we need :
- trivial default, copy and move constructors
- trivial copy and move assignment operators
- trivial destructor
- no virtual functions, base classes
- no non-static data members that are not standard-layout
- no virtual functions, base classes
- only one access control for all non-static data members
- no non-standard-layout base classes
- only one class in the hierarchy that has non-static data members
- no base classes of the same type as the first non-static data member
- no non-static data members that are non-POD

Restrictions 3 and 5 are the same.
Restriction 4 is a subset of restriction 10.

So, we need :
- trivial default, copy and move constructors
- trivial copy and move assignment operators
- trivial destructor
- no virtual functions, base classes
- only one access control for all non-static data members
- no non-standard-layout base classes
- only one class in the hierarchy that has non-static data members
- no base classes of the same type as the first non-static data member
- no non-static data members that are non-POD

So, we can do anything we want with static members and member functions.

Am I mistaking anywhere ?

Cheers & hth,
Leo
 
I

Ian Collins

Le 16/08/2011 15:04, Victor Bazarov a écrit :

Maybe reading the end of the sentence before answering, it would help you
not to write nonsense. Why don't you just read the end of the parenthesis ?
"delete ptr_ in the destructor". And you couldn't say that a class having
"delete ptr_" in the destructor isn't POD, as I state in my message, due to
sections 9.10 and 9.6, stating that a POD is a trivial class, that a trivial
class is trivially copyable, and that a trivially copyable class need a
trivial destructor, which is not the case of a class containing "delete
ptr_" in its destructor.

A class isn't a POD if it has a destructor, trivial or not (9 pp4).
 
I

Ian Collins

I already answered, but it looks like there had been a misunderstanding, so,
in order to be nice, I will re-post with removing the useless parts.

AFAIK a POD may have special behavior.

I just checked the latest free paper about C++0x, and I saw this.

Check 9 pp4 in the 2003 standard and the third example in 9 pp10 in the
new one.
 
L

Leo Equinox Gaspard

Le 17/08/2011 00:24, Ian Collins a écrit :
On 08/17/11 10:11 AM, Leo Equinox Gaspard wrote: [...]

A class isn't a POD if it has a destructor, trivial or not (9 pp4).

Excuse me, but ... On the version I am using, 9pp4 is about nonzero
size for complete objects and member subobjects.

Or maybe am I misunderstanding the "pp" part ?

Leo
 
L

Leo Equinox Gaspard

Le 17/08/2011 00:29, Ian Collins a écrit :
Check 9 pp4 in the 2003 standard and the third example in 9 pp10 in the
new one.

Oh, I answered your first message without understanding that you were
speaking of the 2003 standard.
However, I think third example of 9pp10 implies the destructor's body,
not shown here, is doing something.

And I apologize for my message saying 9.6 in place of 9pp6.

The standard's text is never mentioning the absence of destructors, but
only the triviality of destructors.

Leo
 
I

Ian Collins

Le 17/08/2011 00:29, Ian Collins a écrit :

Oh, I answered your first message without understanding that you were
speaking of the 2003 standard.
However, I think third example of 9pp10 implies the destructor's body,
not shown here, is doing something.

And I apologize for my message saying 9.6 in place of 9pp6.

The standard's text is never mentioning the absence of destructors, but
only the triviality of destructors.

The wording has changed in the new standard, introducing trivial
destructors. The 2003 version is clearer:

"A POD-struct is an aggregate class that has no non-static data members
of type non-POD-struct, non-POD-union (or array of such types) or
reference, and has no user-defined copy assignment operator and no
user-defined destructor."
 
N

Nick Keighley

NOTE WELL: "jump tables"

--ITYM "wrong estimate"

There is no "right" estimate,

something that is sufficiently non-right can be deemed to be wrong.
You are 2 or 3 /orders of magnitude/ out. When I studied physics 1
order of magnitude (a factor 10) out was generally taken to be wrong.

a quadcore cpu  with precaching and all the
fancy tech they have nowadays will probably do it alot faster than an
ancient AMD cpu.

not if the quad core cpu was running your code
So you are "wrong" to say my estimate was "wrong".


--the number taken is independent of the number of entries in the table

NOTE WELL: "the number [of clock cycles] taken is independent of the
number of entries in the table"

 --   goto table;


think about how the above pseudo-code might be implemented

--we're talking half a dozen instructions and a couple of memory
--accesses

Start Process:
Load reg1 with comparee (Only done once at start)

Load reg2 with case option
Load reg3 with jump address
Jump if reg1and reg2 are equal .

repeat above until the condition is met.

Why do you need to repeat 6 instructions ?

I don't
And what are you talking about memeory access? L1 cache? or pipeline cache
or prefetched queues?

in this case it doesn't really matter. I'm assuming there's no
cacheing.
Modern CPUs have everything all prefetched into caches and its a bit more
complicated than you seem to understand.

it's less complicated than you understand
You can try to pretend you know more than me about it and I don't really
care if you do or don't but when you challenge something I said as wrong and
present "your opinion" as right, I'd suggest you get your facts right first
and provide some kind of evidence which supports your "claim".

I suggest you think about it a bit more (and try being politer)
Even if your opinion that it was 6x525 clock cycles ,

not my claim. Re-read my post.
for any given CPU, was
100% exact then  that would make my estimate close enough to be accurate,
no

because we are talking about cpu speeds in gigahertz now.

hence my "estimate" is in clock cycles (instructions actually)
 
J

Juha Nieminen

Paul said:
I said "even is the last condition is the only one that is true".


No you are twisting the context from that of an unordered switch case, to
what you incorrectly assumed to be that of an ordered switch case that was
implemented as a jump table.

Nowhere did you make any mention whatsoever about "an unordered switch
case". And even if you had done it, it would still be wrong. The order
in which the case blocks are specified doesn't matter, the jump table
technique still works equally well. (The only thing that changes is
the order in which the code is arranged inside the function. Indexing
the jump table still works exactly the same way as if the cases were
ordered in whichever order you want.)

The situation where a jump table cannot be used for an O(1) jump to
the proper case block is when the case values are sparse (in other words
not consecutive or with only very small gaps between them). In that case
the jump table would become too large to be feasible.

However, even in this situation a binary (instead of a linear) search
can be performed on the jump table. This increases the execution time to
O(log n), which is still significantly lower than your proposed O(n).
(In other words, rather than talking about 6 clock cycles we might be
talking about a hundred or so at most. Nowhere near your 5000.)

However, that's not the point. The point is that you insulted the poster
who talked about jump tables. You speak as if you had first suggested your
linear-time execution estimation, then someone had responded with the
jump table solution, and then you had objected that you were not talking
about that. No, that's not what happened. Your very first post was in
response to the jump table post. You were objecting directly to that. You
insinuated that it's impossible and resorted to namecalling.
 
S

SG

Leigh Johnston said:
Epic fail yet again.  Get a clue.

Yes epic fail on your behalf.

How many times in the past have you been proven inferior?

I can't recall a single instance. The only thing he is guilty of is
wasting time with you.
No need to reply I think we all know the correct answer to that question.

Don't say "we". You should really do something about the reality
distsortion field that is surrounding you.


SG
 
L

Larry Evans

--hence my "estimate" is in clock cycles (instructions actually)

If you think you are so smart lets see the C++ and the assembly and tell
me which compiler produced it.
Hi Paul,

With the attached source code, switch.cpp, and makefile, switch.mk, two
assembler files(also attached) were produced:

switch.3.s
switch.10.s

The 1st is for 3 case labels, the 2nd is for 10.
The compiler was:

gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)


The boost preprocessor library:
http://www.boost.org/doc/libs/1_47_0/libs/preprocessor/doc/index.html
was used to generate the tags and switch statements. The attached
switch.3.E.cpp shows output of preprocessor for the case of 3 case
statement. That is attached just to reassure people that the
boost pp macros work as expected.

I'm no assember expert, but it appears the 3.s file does do
a number of compares (I assume that's the cmpl instructions)
that increases with number of cases. OTOH, it appears
the 10.s file may have something like a jump table since
there's only one cmpl instruction there.

Could someone with more assembler experience confirm the above
conclusions?

HTH and TiA..

-regards,
Larry
 
T

Thomas David Rivers

Here's an example of a, what I think is an "unordered" switch()
statement. I'm taking "unordered" to mean that the order of
appearance of the case lables is not related to the order
(from least to greatest) of the integer values in the case labels.

extern int bar(void);

int
func(void)
{
int i;
int retval = 0;

i = bar();
switch(i) {
case 1: retval = 10; break;
case 3: retval = 30; break;
case 5: retval = 50; break;
case 2: retval = 20; break;
case 6: retval = 60; break;
case 7: retval = 70; break;
case 4: retval = 40; break;
}

return retval;
}


when compiled with the Dignus C++ compiler for IBM mainframes v1.95
without any optimization options, it generates this assembly code:

*** func(void)
* *** {
* *** int i;
* *** int retval = 0;
LA 2,0(0,0) ; 0
* ***
* *** i = bar();
L 15,@lit_7_1 ; bar()
BALR 14,15
* *** switch(i) {
B @L1
DS 0D
@FRAMESIZE_7 DC F'96'
@lit_7_1 DC V(_$Z3barv)
@lit_7_9 DC F'1' 0x00000001
@lit_7_10 DC F'7' 0x00000007
* *** case 1: retval = 10; break;
@L2 DS 0H
LA 2,10(0,0) ; 10
B @L3
* *** case 3: retval = 30; break;
@L4 DS 0H
LA 2,30(0,0) ; 30
B @L3
* *** case 5: retval = 50; break;
@L5 DS 0H
LA 2,50(0,0) ; 50
B @L3
* *** case 2: retval = 20; break;
@L6 DS 0H
LA 2,20(0,0) ; 20
B @L3
* *** case 6: retval = 60; break;
@L7 DS 0H
LA 2,60(0,0) ; 60
B @L3
* *** case 7: retval = 70; break;
@L8 DS 0H
LA 2,70(0,0) ; 70
B @L3
* *** case 4: retval = 40; break;
@L9 DS 0H
LA 2,40(0,0) ; 40
B @L3
@L1 DS 0H
C 15,@lit_7_9
BL @L3
C 15,@lit_7_10
BH @L3
S 15,@lit_7_9
SLL 15,3(0)
LA 1,@@gen_label0
B 0(1,15)
@@gen_label0 DS 0H
B @L2
DC 4X'00'
B @L6
DC 4X'00'
B @L4
DC 4X'00'
B @L9
DC 4X'00'
B @L5
DC 4X'00'
B @L7
DC 4X'00'
B @L8
DC 4X'00'
@L3 DS 0H
* *** }
* ***
* *** return retval;
LR 15,2
* *** }

Note that I only included a few case labels in the switch()
statement, so that the entire assembly source would reasonably
fit. You could increase that number 10-fold following the pattern
established in the source and the generated code for the switch
evaluation would not be different.

Now - let's take an example where the return value
from the invocation of bar() was, say, 4. Here is the
instruction sequence that would be executed after
the BALR instruction (the return value from bar() is
in Register #15. I've also added comments for those
unfamiliar with the IBM instruction set.) We pick up
execution right after the BALR instruction:

B @L1 Jump to the switch table evaluation
C 15,@lit_7_9
BL @L3 Is the value less than 1? if so,
branch to @L3
C 15,@lit_7_10
BH @L3 Is the value greater than 7? if so
branch to @L3
S 15,@lit_7_9 Register 15 <- Register 15 - 1
SLL 15,3(0) Register 15 is left shifted 3
(multiply by 8)
LA 1,@@gen_label0 Register 1 is loaded with address of the
branch table
B 0(1,15) Indexed branch to offset in the
branch table
B @L9 Branch to start of case 4:

So, to branch to the appropriate case label for the value 4, it
takes 10 instructions. (If we make the grossly naive assumption
that one instruction is one cycle, this would be 9 "cycles", but
what is a "cycle" on a z/Architecture machine vs. an x86? So I'd
rather talk in terms of instructions.)

Note this would still take 10 instructions, regardless of the "order" of the
case labels in the source code, or the number of case labels in the body
of the switch statement. The compiler can sort the case label values and
generate the branch table in sorted order, so that the table can be
trivially
indexed by the value of the switch().

The table itself is simply a list of branch instructions that branch to the
start of each case label; which can be seen in the assembly code above.

If the value is outside of the range of the case labels, then it's
even fewer instructions... the comparisons look for values outside
of the range of the case labels and will "jump" to the appropriate
location for those (either a default: label, or the code after the
switch() body.)

Using jump tables like this can make for very efficient switch()
evaluation at runtime.

However, if the values in the set of case labels is not dense,
then the table would be filled with entries that jump to the
default: label (or after the switch). For example, if you had
2 case labels with the values 1 and 100 and one default: label,
it might not be prudent to generate a 99-entry jump table with
98 of the entries being a branch to the default:. That's a lot of
wasted space when 2 if-then-else statements would suffice.

Compilers use heuristics to determine between various switch
implementation strategies. "Denseness" of the case labels (defined
as the number of non-default branch targets in a table divided by
the total size of the table) would be one measure used to
decide which strategy is best. However, the order of the case
labels in the source is (typically) not important.

You can actually run this compiler yourself and see the
generated assembly code at http://www.dignus.com.

- Dave Rivers -
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,022
Latest member
MaybelleMa

Latest Threads

Top