But if you feel like posting a better, faster version, fabulous! Do
it! That's precisely the kind of thing this group is for.
and in a follow up he wrote
Nevertheless! Thanks for playing. Seriously. Just because /I/ found
your code hard to read, that doesn't mean others will, and in any
case I found your willingness to share to be constructive and
helpful.
Like Richard, I would like to see examples of *good* code
in this ng. Don't know if my effort was "good" or "bad",
but in the latter case *someone* should post an improved version.
Recently we saw that c.l.c'ers have wide agreement on
one issue: Don't bother with bitfields!
But there *is* a need for "hand-rolled" bitfielding
macros or functions, i.e. getbits() and setbits().
These *should* be standard library functions.
In data compression, *many* statistical coding methods,
including Huffman, quasi-arithmetic, Golomb- and Elias-based
coding (but *not* arithmetic coding) use variable-length bit
strings, *cannot* use C bitfields in any straightforward
way even if they wanted to, and *must* use something like
getbits()/setbits().
As Mr. Sosman points out, bitfields are needed for certain
large memory-efficient tables, but even then "hand-rolled
bitfields" can be used, especially if one agrees that
tp[ix]->u_fb.ufb.foo_bar = FB_GREEN;
isn't really much more readable than
setfoobar(tp, ix, FB_GREEN);
(Here, setfoobar() might be a header-defined macro
which simply invokes the library setbits() routine.)
There are many expert programmers posting and lurking in
this newsgroup. I propose that this ng collectively
fashion model implementations of getbits() and setbits().
(Surely these routines are simple enough that any worry
of invention confidentiality needn't apply.)
Goals should include portability, standard compliance, speed,
and readability, probably in that order. The definition of
these routines is *very* straightforward, for example:
unsigned long getbits(unsigned long *p, int bcnt, int boff)
getbits returns, with zero-padding on the left, the
bcnt bits beginning boff bits from *p. bcnt can equal,
but not exceed the number of bits in 'unsigned long'.
Write your own implementation, or start with mine, identifying
its compliance and speed flaws. "Brownie points" to all who
contribute, and gold stars to all whose source-code fragment,
however small, ends up in the model implementation !
After my implementation (slightly improved compared to last
week's version) I'll make some brief comments on its (lack of)
"readability."
* * * * * * * * * * * * * *
/* Next two normally arrive via #includes */
#define CHAR_BIT 8
typedef unsigned long Bgroup;
#define BITS_BG (CHAR_BIT * sizeof(Bgroup))
#define B_One ((Bgroup)1)
#define SET_LBITS(p, k, v) \
(*(p) = *(p) & ((B_One << BITS_BG - (k)) - 1) \
| (v) << BITS_BG - (k))
#define SET_GBITS(p, k, n, v) \
(*(p) = *(p) & (~((B_One << BITS_BG - (n)) \
- (B_One << BITS_BG - (n) - (k)))) \
| (v) << BITS_BG - (n) - (k))
/* set |bcnt| bits to |datum|, offset |boff| bits from *|p| */
void setbits(Bgroup *p, int bcnt, int boff, Bgroup datum)
{
int rbit;
p += boff / BITS_BG;
boff %= BITS_BG;
rbit = bcnt + boff;
if (rbit > BITS_BG) {
/* Set p[0] rightside and p[1] leftside */
SET_GBITS(p, BITS_BG - boff, boff,
datum >> rbit - BITS_BG);
SET_LBITS(p + 1, rbit - BITS_BG, datum);
} else if (boff) {
/* Set p[0] */
SET_GBITS(p, bcnt, boff, datum);
} else {
/* Set p[0] leftside */
SET_LBITS(p, bcnt, datum);
}
}
/* return the |bcnt| bits at offset |boff| bits from *|p| */
Bgroup getbits(Bgroup *p, int bcnt, int boff)
{
int rbit;
Bgroup x;
p += boff / BITS_BG;
rbit = bcnt + boff % BITS_BG;
if (rbit > BITS_BG) {
x = p[0] << rbit - BITS_BG
| p[1] >> 2 * BITS_BG - rbit;
} else {
x = p[0] >> BITS_BG - rbit;
}
if (bcnt < BITS_BG) {
x &= (B_One << bcnt) - 1;
}
return x;
}
* * * * * * * * * * * * * * *
I found your code very hard going. No comments to speak of, cryptic
identifiers, overly-complex expressions.
I agree the code seems poor for readability, and *specific
changes* to address that are certainly of interest; however,
I doubt the code is really atypical for macros of this ilk.
I will certainly concede that *readability* is
*not* a general virtue of James Allen coding!
I won't bore you by making excuses, e.g.
I *did* make some effort to write getbits()/setbits()
"properly." I *hope* the posted code works with
Big-Endian, 1's-complement, etc.
but I tested it *only* on one Intel-based laptop.
(I do assume that the number 2*BITS_BG fits in 'int',
so the code may fail if, e.g., Bgroup is 64 bits and
int just 7 bits+sign.)
BTW, why don't good compilers offer, to aid
verification of portability, command-line options to
change sizeof(int), Endianness, etc.?
Isn't this asking for trouble if unsigned long contains padding
bits?
Well, now you've got me wondering! If it fails
for me, I'll venture it fails for many others
who have used similar idioms.
Call me simple-minded but I learn best by starting
with specific examples. (See quote from the great
George Boole in .sig below.) Can you give a specific
example of the possible problem in that #define ?
Perhaps there's a good
idea in there somewhere - I hope and trust that there is - but I
couldn't find it.
Don't know about any "good idea." The code is nothing more nor
less than what it claims to be, and I'd venture that many of
the programmers posting or lurking here would code up something
very similar. The functions need shifts, masks, bitwise booleans,
along with simple arithmetic to construct shift counts, and my
code does that (and, I hope, does it in a proper order
.
I considered, but quickly rejected, a faster but unportable
version of these routines: The bits which are get or set
may cross a 32-bit boundary, but that time-consuming
inconvenience could be avoided on many architectures by
replacing
Uint32 *p;
p += boff / BITS_BG;
boff %= BITS_BG;
with
Uint32 *p;
/* next is noncompliant and nonportable */
p = (Uint16 *)p + boff / (BITS_BG / 2);
boff %= BITS_BG / 2;
* * * * * * * * * * * * * *
Although readability is generally a weak point in
James Allen code, I will argue that my code is
sometimes misjudged:
A 10-line routine which takes as long to understand
as a 40-line routine is *not* a readability loss, if
the 10-line routine replaces the 40-line routine, but
often rough readability assessments are made without
making the relevant comparison.
I must try to avoid blowing my own horn, but consider
http://james.fabpedigree.com/wnim.htm
as a specific example of this last point. That program
may help show why I find programming to be fun.
I threw it together in less than two hours in response
to a rec.puzzles puzzle. c.l.c'ers will denounce this
program gleefully, but "proof of the pudding" is that
no one else solved the rec.puzzles puzzle, and no one
has ever bothered to improve on my 2-hour programming
effort.
James Dow Allen