Duff's Device

  • Thread starter Hallvard B Furuseth
  • Start date
H

Hallvard B Furuseth

I've been wondering sometimes:
Anyone know why Duff's device is usually written like this:

void duff(const char *str, int len) {
int n = (len + 7) / 8;
switch (len % 8) {
case 0: do{ foo(*str++);
case 7: foo(*str++);
case 6: foo(*str++);
...
case 1: foo(*str++);
} while (--n > 0);
}
}

instead of this?

void duff2(const char *str, int len) {
switch (len % 8) {
case 0: while ((len -= 8) >= 0) {
foo(*str++);
case 7: foo(*str++);
case 6: foo(*str++);
...
case 1: foo(*str++);
}
}
}

The original has an extra '+' and doesn't handle len=0.
Nor does it need the divide by 8, though I realize n-=1
may be cheaper than n-=8 on some architectures.
People have had 18 years to notice now:)
 
H

Hallvard B Furuseth

I said:
I've been wondering sometimes:
Anyone know why Duff's device is usually written like this:
(...)

Er, a bit silly question - "that's the way it's published" of course.
I meant, is there a good reason to write it that way - produces
better code on some machine?
 
L

Laurent Deniau

Hallvard said:
I've been wondering sometimes:
Anyone know why Duff's device is usually written like this:

void duff(const char *str, int len) {
int n = (len + 7) / 8;
switch (len % 8) {
case 0: do{ foo(*str++);
case 7: foo(*str++);
case 6: foo(*str++);
...
case 1: foo(*str++);
} while (--n > 0);
}
}

instead of this?

void duff2(const char *str, int len) {
switch (len % 8) {
case 0: while ((len -= 8) >= 0) {
foo(*str++);
case 7: foo(*str++);
case 6: foo(*str++);
...
case 1: foo(*str++);
}
}
}

The original has an extra '+' and doesn't handle len=0.
Nor does it need the divide by 8, though I realize n-=1
may be cheaper than n-=8 on some architectures.
People have had 18 years to notice now:)

I did some speed test some time ago with gcc on x86 and the fastest
version I was able to find was:

static inline void
ooc_memchr_copy( char *restrict dst,
const char *restrict src, size_t cnt)
{
size_t rem = cnt % 8;
cnt = (cnt / 8) + 1;

switch (rem)
do { *dst++ = *src++;
case 7: *dst++ = *src++;
case 6: *dst++ = *src++;
case 5: *dst++ = *src++;
case 4: *dst++ = *src++;
case 3: *dst++ = *src++;
case 2: *dst++ = *src++;
case 1: *dst++ = *src++;
case 0: ;
} while(--cnt);
}

which is not the one published and works for cnt==0 as well as for
cnt==(size_t)-1. Why do you think that the orignal version is always the
one used?

a+, ld.
 
R

Rod Pemberton

Hallvard B Furuseth said:
Anyone know why Duff's device is usually written
like this (snip) instead of this? (snip)

Yes.

This is his original post:
http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en

This is another post from him with clarifications to various questions from
individuals on c.l.c:
http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en

From the original post, he (indirectly) states that the design of Duff's
Device in C was the direct result of his understanding of how to generate
efficient unrolled loops in assembly language for the VAX. At least, that
is the one thing other than Duff's Device that you should get from his
message...


FYI, others have pointed out Simon Tatham's "Coroutines in C":
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Protothreads is also based on a similar mechanism:
http://www.sics.se/~adam/pt/


Rod Pemberton
 
K

Keith Thompson

Rod Pemberton said:
Yes.

This is his original post:
http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en

This is another post from him with clarifications to various questions from
individuals on c.l.c:
http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en
[...]

And here's the scary part:

| It amazes me that after 10 years of writing C there are still little
| corners that I haven't explored fully. (Actually, I have another
| revolting way to use switches to implement interrupt driven state
| machines but it's too horrid to go into.)

Does anyone know the details? If the orginal Duff's Device wasn't
"too horrid to go into" ... (*shudder*).
 
C

Christopher Benson-Manica

Keith Thompson said:
Does anyone know the details? If the orginal Duff's Device wasn't
"too horrid to go into" ... (*shudder*).

One might even call it "Duff's Last Theorem" :)
 
G

Guest

Keith said:
And here's the scary part:

| It amazes me that after 10 years of writing C there are still little
| corners that I haven't explored fully. (Actually, I have another
| revolting way to use switches to implement interrupt driven state
| machines but it's too horrid to go into.)

Does anyone know the details? If the orginal Duff's Device wasn't
"too horrid to go into" ... (*shudder*).

http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
 
H

Hallvard B Furuseth

Laurent said:
I did some speed test some time ago with gcc on x86 and the fastest
version I was able to find was:

static inline void
ooc_memchr_copy( char *restrict dst,
const char *restrict src, size_t cnt)
{
size_t rem = cnt % 8;
cnt = (cnt / 8) + 1;
(...)

Heh. Strange, keeps an add but still speeds it up.
which is not the one published and works for cnt==0 as well as for
cnt==(size_t)-1. Why do you think that the orignal version is always
the one used?

It isn't; it's just the one I've _usually_ seen. (Not that I've seen
 
H

Hallvard B Furuseth

Rod said:
FYI, others have pointed out Simon Tatham's "Coroutines in C":
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Cool. Why didn't anyone tell me about that __LINE__ trick before?:)
I wouldn't call them coroutines though. Too limited.

Personally I don't see anything ugly about it, BTW. Hiding ugly stuff
is one of the things macros are _for_. Except the crFinish macro, but
that one is not necessary:

#define AutoSwitch(state) /* state=integer state variable */\
switch (state) case 0:
#define AScontrol(state, stmt) /* stmt=return/break/continue/goto */\
if (1) { (state) = __LINE__; stmt; case __LINE__:; } else (void)(0)

Now it's just a normal switch in that break/continue/etc work as
normally - it's just that the case statements look nicer this way.

int foo(void)
{
static int state, cur;
AutoSwitch(state)
for (cur = 1; cur < 10; cur++)
AScontrol(state, return cur*cur);
return 0;
}
 
H

Hallvard B Furuseth

Christopher said:
Go look at the Putty code and report back to us.

OK, now that has way too many magic macros to keep track of.
(Which is one reason I made 'return' etc arguments to my variants BTW,
that way if there is a return statement somewhere it is in the body.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,016
Latest member
TatianaCha

Latest Threads

Top