Duff's Device

Hallvard B Furuseth · Sep 28, 2006

I've been wondering sometimes:
Anyone know why Duff's device is usually written like this:

void duff(const char *str, int len) {
    int n = (len + 7) / 8;
    switch (len % 8) {
    case 0: do { foo(*str++);
    case 7:      foo(*str++);
    case 6:      foo(*str++);
    ...
    case 1:      foo(*str++);
            } while (--n > 0);
    }
}

instead of this?

void duff2(const char *str, int len) {
    switch (len % 8) {
    case 0: while ((len -= 8) >= 0) {
                 foo(*str++);
    case 7:      foo(*str++);
    case 6:      foo(*str++);
    ...
    case 1:      foo(*str++);
            }
    }
}

The original has an extra '+' and doesn't handle len=0.
Nor does the second version need the divide by 8, though I realize
n-=1 may be cheaper than n-=8 on some architectures.
People have had 18 years to notice by now.
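
A minimal harness for the len=0 claim, as a sketch only: foo() here is a hypothetical stand-in that just counts its calls, count_calls() is a helper invented for this check, and the cases elided with "..." above are assumed to be the obvious 5 through 2.

```c
#include <assert.h>

/* Hypothetical stand-in for foo(): it only counts how often it is called. */
static int calls;
static void foo(char c) { (void)c; calls++; }

/* The usual form; the do-while body runs once even when len == 0. */
static void duff(const char *str, int len) {
    int n = (len + 7) / 8;
    switch (len % 8) {
    case 0: do { foo(*str++);
    case 7:      foo(*str++);
    case 6:      foo(*str++);
    case 5:      foo(*str++);
    case 4:      foo(*str++);
    case 3:      foo(*str++);
    case 2:      foo(*str++);
    case 1:      foo(*str++);
            } while (--n > 0);
    }
}

/* The proposed form; the while test is evaluated before any full pass. */
static void duff2(const char *str, int len) {
    switch (len % 8) {
    case 0: while ((len -= 8) >= 0) {
                 foo(*str++);
    case 7:      foo(*str++);
    case 6:      foo(*str++);
    case 5:      foo(*str++);
    case 4:      foo(*str++);
    case 3:      foo(*str++);
    case 2:      foo(*str++);
    case 1:      foo(*str++);
            }
    }
}

/* Invented helper: run one variant over a 64-byte buffer, return the call count. */
static int count_calls(void (*fn)(const char *, int), int len) {
    static const char buf[64] = "";
    assert(len >= 0 && len <= (int)sizeof buf);
    calls = 0;
    fn(buf, len);
    return calls;
}
```

Under these assumptions count_calls(duff, 0) reports 8 calls, while count_calls(duff2, 0) reports 0; for positive lengths both variants make exactly len calls.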

Hallvard B Furuseth · Sep 28, 2006

I said:
I've been wondering sometimes:
Anyone know why Duff's device is usually written like this:
(...)

Er, a bit of a silly question: "that's the way it's published", of course.
I meant, is there a good reason to write it that way? Does it produce
better code on some machine?

Laurent Deniau · Sep 28, 2006

Hallvard said:
I've been wondering sometimes:
Anyone know why Duff's device is usually written like this:

void duff(const char *str, int len) {
    int n = (len + 7) / 8;
    switch (len % 8) {
    case 0: do { foo(*str++);
    case 7:      foo(*str++);
    case 6:      foo(*str++);
    ...
    case 1:      foo(*str++);
            } while (--n > 0);
    }
}

instead of this?

void duff2(const char *str, int len) {
    switch (len % 8) {
    case 0: while ((len -= 8) >= 0) {
                 foo(*str++);
    case 7:      foo(*str++);
    case 6:      foo(*str++);
    ...
    case 1:      foo(*str++);
            }
    }
}

The original has an extra '+' and doesn't handle len=0.
Nor does the second version need the divide by 8, though I realize
n-=1 may be cheaper than n-=8 on some architectures.
People have had 18 years to notice by now.

I ran some speed tests some time ago with gcc on x86, and the fastest
version I was able to find was:

static inline void
ooc_memchr_copy(char *restrict dst,
                const char *restrict src, size_t cnt)
{
    size_t rem = cnt % 8;
    cnt = (cnt / 8) + 1;

    switch (rem)
        do {    *dst++ = *src++;
        case 7: *dst++ = *src++;
        case 6: *dst++ = *src++;
        case 5: *dst++ = *src++;
        case 4: *dst++ = *src++;
        case 3: *dst++ = *src++;
        case 2: *dst++ = *src++;
        case 1: *dst++ = *src++;
        case 0: ;
        } while (--cnt);
}

which is not the one published and works for cnt==0 as well as for
cnt==(size_t)-1. Why do you think that the original version is always the
one used?
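
A rough way to check the cnt==0 claim, as a sketch: copy8() below repeats the posted body under a hypothetical name, and copy_is_exact() is an invented helper that copies into a buffer pre-filled with '#' and verifies that exactly cnt bytes changed. The (size_t)-1 case cannot be run against a real buffer, but the arithmetic can be followed by hand: rem is 7 and cnt/8 + 1 does not wrap, whereas (cnt + 7) / 8 would.

```c
#include <stddef.h>
#include <string.h>

/* Same body as the function posted above, under a hypothetical name. */
static void copy8(char *restrict dst, const char *restrict src, size_t cnt)
{
    size_t rem = cnt % 8;
    cnt = (cnt / 8) + 1;

    switch (rem)
        do {    *dst++ = *src++;
        case 7: *dst++ = *src++;
        case 6: *dst++ = *src++;
        case 5: *dst++ = *src++;
        case 4: *dst++ = *src++;
        case 3: *dst++ = *src++;
        case 2: *dst++ = *src++;
        case 1: *dst++ = *src++;
        case 0: ;
        } while (--cnt);
}

/* Invented helper: returns 1 if exactly cnt bytes were copied and every
   byte after them still holds the '#' fill, 0 otherwise. */
static int copy_is_exact(size_t cnt)
{
    char src[64], dst[72];
    size_t i;

    if (cnt > sizeof src)
        return 0;
    for (i = 0; i < sizeof src; i++)
        src[i] = (char)('A' + i % 26);   /* never '#' */
    memset(dst, '#', sizeof dst);

    copy8(dst, src, cnt);

    if (memcmp(dst, src, cnt) != 0)
        return 0;
    for (i = cnt; i < sizeof dst; i++)
        if (dst[i] != '#')
            return 0;
    return 1;
}
```

With this helper, copy_is_exact(0) holds because the switch jumps straight to case 0 and the single --cnt ends the loop before any byte is written.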

a+, ld.

Rod Pemberton · Sep 28, 2006

Hallvard B Furuseth said:
Anyone know why Duff's device is usually written
like this (snip) instead of this? (snip)

Yes.

This is his original post:
http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en

This is another post from him with clarifications to various questions from
individuals on c.l.c:
http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en

From the original post, he (indirectly) states that the design of Duff's
Device in C was the direct result of his understanding of how to generate
efficient unrolled loops in assembly language for the VAX. At least, that
is the one thing other than Duff's Device that you should get from his
message...
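
For readers who do not follow the links, here is a sketch of the routine as it is commonly reproduced from that post, rewritten with a modern prototype (the original used K&R-style declarations). The destination is a single output location that is never incremented. The volatile qualifier, the test data and the last_sent() helper are assumptions added for this illustration.

```c
/* Write count shorts to one output location, eight per loop pass.
   Like the published routine, this assumes count > 0. */
static void send(volatile short *to, const short *from, int count)
{
    int n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}

/* Invented helper: send the values 1..count to a stand-in "register"
   and return the last value that arrived there (count must be 1..32). */
static short last_sent(int count)
{
    short data[32];
    volatile short reg = -1;
    int i;

    for (i = 0; i < 32; i++)
        data[i] = (short)(i + 1);
    send(&reg, data, count);
    return reg;
}
```

The switch jumps into the middle of the unrolled loop to handle the count % 8 leftover writes first, which is the same shape as a hand-unrolled assembly loop entered through a computed jump.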

FYI, others have pointed out Simon Tatham's "Coroutines in C":
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Protothreads is also based on a similar mechanism:
http://www.sics.se/~adam/pt/

Rod Pemberton

Keith Thompson · Sep 29, 2006

Rod Pemberton said:
Yes.

This is his original post:
http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en

This is another post from him with clarifications to various questions from
individuals on c.l.c:
http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en

[...]

And here's the scary part:

| It amazes me that after 10 years of writing C there are still little
| corners that I haven't explored fully. (Actually, I have another
| revolting way to use switches to implement interrupt driven state
| machines but it's too horrid to go into.)

Does anyone know the details? If the original Duff's Device wasn't
"too horrid to go into" ... (*shudder*).

Christopher Benson-Manica · Sep 29, 2006

Keith Thompson said:
Does anyone know the details? If the original Duff's Device wasn't
"too horrid to go into" ... (*shudder*).

One might even call it "Duff's Last Theorem"

Guest · Sep 29, 2006

Keith said:
And here's the scary part:

| It amazes me that after 10 years of writing C there are still little
| corners that I haven't explored fully. (Actually, I have another
| revolting way to use switches to implement interrupt driven state
| machines but it's too horrid to go into.)

Does anyone know the details? If the original Duff's Device wasn't
"too horrid to go into" ... (*shudder*).

http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Mabden · Oct 2, 2006

Harald van Dijk said:
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

The best part is it ends with "Share and Enjoy"*!

*"Share and Enjoy" is, of course, the company motto of the hugely successful
Sirius Cybernetics Corporation Complaints division, which now covers the
major land masses of three medium sized planets and is the only part of the
Corporation to have shown a consistent profit in recent years.

Hallvard B Furuseth · Oct 3, 2006

Laurent said:
I ran some speed tests some time ago with gcc on x86, and the fastest
version I was able to find was:

static inline void
ooc_memchr_copy(char *restrict dst,
                const char *restrict src, size_t cnt)
{
    size_t rem = cnt % 8;
    cnt = (cnt / 8) + 1;
    (...)

Heh. Strange, keeps an add but still speeds it up.

which is not the one published and works for cnt==0 as well as for
cnt==(size_t)-1. Why do you think that the original version is always
the one used?

It isn't; it's just the one I've _usually_ seen. (Not that I've seen

Hallvard B Furuseth · Oct 3, 2006

Rod said:
FYI, others have pointed out Simon Tatham's "Coroutines in C":
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Cool. Why didn't anyone tell me about that __LINE__ trick before?

I wouldn't call them coroutines though. Too limited.

Personally I don't see anything ugly about it, BTW. Hiding ugly stuff
is one of the things macros are _for_. Except the crFinish macro, but
that one is not necessary:

#define AutoSwitch(state)      /* state=integer state variable */ \
    switch (state) case 0:
#define AScontrol(state, stmt) /* stmt=return/break/continue/goto */ \
    if (1) { (state) = __LINE__; stmt; case __LINE__:; } else (void)(0)

Now it's just a normal switch, in that break/continue/etc. work as they
normally do - it's just that the case statements look nicer this way.

int foo(void)
{
    static int state, cur;
    AutoSwitch(state)
        for (cur = 1; cur < 10; cur++)
            AScontrol(state, return cur*cur);
    return 0;
}
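
To see what successive calls produce, here is a self-contained copy of the macros and foo() from this post. Nothing new is introduced. The only assumption worth stating is that each AScontrol use sits on its own source line, since __LINE__ doubles as the case label.

```c
#define AutoSwitch(state)      /* state=integer state variable */ \
    switch (state) case 0:
#define AScontrol(state, stmt) /* stmt=return/break/continue/goto */ \
    if (1) { (state) = __LINE__; stmt; case __LINE__:; } else (void)(0)

/* Each call resumes just after the previous return, inside the for loop,
   because the switch jumps to the case label recorded in state. */
int foo(void)
{
    static int state, cur;
    AutoSwitch(state)
        for (cur = 1; cur < 10; cur++)
            AScontrol(state, return cur*cur);
    return 0;
}
```

The first nine calls return 1, 4, 9, 16, 25, 36, 49, 64 and 81, and the tenth returns 0. Since state is never reset, later calls re-enter the finished loop and keep returning 0.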

Christopher Layne · Oct 3, 2006

Hallvard said:
Personally I don't see anything ugly about it, BTW. Hiding ugly stuff

Go look at the Putty code and report back to us.

Hallvard B Furuseth · Oct 3, 2006

Christopher said:
Go look at the Putty code and report back to us.

OK, now that has way too many magic macros to keep track of.
(Which is one reason I made 'return' etc arguments to my variants BTW,
that way if there is a return statement somewhere it is in the body.)
