Critique my assignment please

Eric Sosman · Aug 24, 2007

Thanks everyone for your replies. I'll give it a second try:
[...]

unsigned NumberFromUpperLetter(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
assert(isupper(x));
return strchr(letters, x) - letters + 1;
}

Shaky. First, assert() is not a good mechanism for
checking the validity of input: not only can it be turned
off, but if it detects something there's no recovery and
little likelihood of a diagnostic message that has meaning
for anyone but the programmer himself.

Second, there may well be upper-case letters that are not
among the twenty-six you have listed: Â, Ñ, Ģ, Ø, Ç and so
on. In theory, all C programs begin execution in the "C"
locale where only the listed twenty-six are upper-case, and
you're safe. But in practice, as a "convenience," some C
implementations start in some other, non-standard locale
that agrees better with local customs than with the Standard.
You're safe in theory, but in practice you may find that
isupper(x) is no guarantee that strchr(letters, x) will
return a non-NULL result.
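
One hedged way to see whether this matters on a given implementation is to count the character values that isupper() accepts but the 26-letter table does not contain. The sketch below is an illustration only; count_upper_outside_table is a hypothetical name that does not appear in the thread.

```c
#include <ctype.h>   /* isupper */
#include <limits.h>  /* UCHAR_MAX */
#include <locale.h>  /* setlocale, for callers that want to pin the locale */
#include <stddef.h>  /* NULL */
#include <string.h>  /* strchr */

/* Hypothetical helper: counts the character values for which isupper()
   is true in the current locale but which are absent from the table. */
int count_upper_outside_table(void)
{
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    int c;
    int n = 0;

    /* Start at 1: strchr would "find" 0 at the terminating null. */
    for (c = 1; c <= UCHAR_MAX; c++)
        if (isupper(c) && strchr(letters, c) == NULL)
            n++;
    return n;
}
```

After an explicit setlocale(LC_ALL, "C") the count should be zero; a non-zero count at program start-up would indicate the kind of non-standard default locale described above.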

Recommendations: (1) Use something solider than assert()
for input validation. (2) Don't try to predict whether the
strchr() will succeed or fail, but inspect its returned value
to discover what happened. (Thought experiment: What if it
turned out that the letter O was forbidden at the start of
a serial number, because of its resemblance to 0? Would you
still use isupper() to check for validity? What if I and L
and Y were also forbidden because they look like 1, £, and ¥?
In short: Why make two tests when one will do?)
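
Recommendation (2) can be sketched as follows: make the strchr() result the single test. This is a minimal illustration, not the original poster's code; letter_value is a hypothetical name, and returning 0 for "not a listed letter" is an assumed convention.

```c
#include <stddef.h>  /* NULL */
#include <string.h>  /* strchr */

/* Hypothetical helper: returns 1..26 for 'A'..'Z',
   or 0 if c is not one of the listed letters. */
unsigned letter_value(char c)
{
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;

    if (c == '\0')          /* strchr would match the terminating null */
        return 0;
    p = strchr(letters, c);
    return p != NULL ? (unsigned)(p - letters) + 1 : 0;
}
```

Forbidding a letter then only requires editing the table, although the returned positions would shift unless a separate table of allowed letters is kept.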

warint · Aug 24, 2007

Eric Sosman:

unsigned NumberFromUpperLetter(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
assert(isupper(x));
return strchr(letters, x) - letters + 1;
}


Shaky. First, assert() is not a good mechanism for
checking the validity of input: not only can it be turned
off, but if it detects something there's no recovery and
little likelihood of a diagnostic message that has meaning
for anyone but the programmer himself.

I realise this. The function is almost intentionally unsafe so as to
aid in optimisation. One of the conditions upon which the function is
used is that the input is valid, and in order not to de-optimise or
complicate the program unnecessarily, I "delegate" the error checking
elsewhere.

The purpose of the "assert" is to help when debugging. No more, no
less.

Second, there may well be upper-case letters that are not
among the twenty-six you have listed: Â, Ñ, Ģ, Ø, Ç and so
on. In theory, all C programs begin execution in the "C"
locale where only the listed twenty-six are upper-case, and
you're safe. But in practice, as a "convenience," some C
implementations start in some other, non-standard locale
that agrees better with local customs than with the Standard.
You're safe in theory, but in practice you may find that
isupper(x) is no guarantee that strchr(letters, x) will
return a non-NULL result.

I'll give this a thought.

Recommendations: (1) Use something solider than assert()
for input validation. (2) Don't try to predict whether the
strchr() will succeed or fail, but inspect its returned value
to discover what happened. (Thought experiment: What if it
turned out that the letter O was forbidden at the start of
a serial number, because of its resemblance to 0? Would you
still use isupper() to check for validity? What if I and L
and Y were also forbidden because they look like 1, £, and ¥?
In short: Why make two tests when one will do?)

In answer to the first recommendation, I have delegated the error
checking elsewhere so that it's simply not possible for an error to
occur in the function. As regards predicting strchr, I had thought it
to be both safe and efficient. As for the thought experiment, it
simply doesn't apply in this case.

warint · Aug 24, 2007

foo.c:24: warning: no previous prototype for `DigitFromChar'
foo.c:44: warning: no previous prototype for `NumberFromUpperLetter'
foo.c:66: warning: no previous prototype for `IsValidEuroSerial'

I realise your compiler is set to warn about this, but I'm happy with
their omission in this program.

foo.c: In function `IsValidEuroSerial':
foo.c:67: warning: unused variable `dummy'

Again I'm happy with this.

foo.c: In function `main':
foo.c:103: warning: implicit declaration of function
`Is_ValidEuroSerial'

Wups a daisy. I'll rewrite the code and post below.

foo.c:95: warning: `output_string' might be used uninitialized in this
function

The surrounding functions prevent that. I'm happy with the setup.

foo.o: In function `main':
foo.c:100: the `gets' function is dangerous and should not be used.

fgets instead, I presume?

foo.c:103: undefined reference to `Is_ValidEuroSerial'
collect2: ld returned 1 exit status
make: *** [foo] Error 1

Here we go:

#include <assert.h> /* For assert */
#include <ctype.h> /* For stuff like isupper */
#include <stdio.h> /* For puts and fgets */
#include <string.h> /* For strchr */

#define SERIAL_LEN 12 /* Serial number = one letter followed by
eleven digits */

/* Function: DigitFromChar

Converts '0' to 0, '5' to 5, '3' to 3, etc.

Exploits C89 feature that '0' through '9' must be consecutive.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/

unsigned DigitFromChar(char const x)
{
assert( x >= '0' && x <= '9' );

return x - '0';

}

/* Function: NumberFromUpperLetter

Converts 'A' to 1, 'B' to 2, 'C' to 3... and so on.

Uses "ABCDEF..." and strchr.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/

unsigned NumberFromUpperLetter(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

assert(isupper(x));

return strchr(letters, x) - letters + 1;

}

/* Function: Is_ValidEuroSerial

Returns 1 if valid, 0 if invalid, or -1 if syntax error.

Loops through the characters, summing with each iteration.

Release Mode: UNSAFE because behaviour is undefined if pointer is
invalid
Debug Mode: SAFE because assertion fails if pointer is invalid
*/

int Is_ValidEuroSerial(char const *p)
{
int const dummy = ( assert(p), 0 );

char const *const pend = p + SERIAL_LEN;

unsigned sum;

if(!isupper(*p)) return -1;

sum = NumberFromUpperLetter(*p++);

do
{
if(!isdigit(*p)) return -1;
sum += DigitFromChar(*p++);
} while (pend != p);

if (*pend) return -1;

if (8 == sum%9) return 1;
else return 0;

}

int main(void)
{
char input[SERIAL_LEN + 1];

char const *output_string;

puts("Enter Euro banknote serial number: ");

fgets(input,sizeof input / sizeof *input, stdin);

switch (Is_ValidEuroSerial(input))
{
case -1: output_string = "\n\nInvalid Input. Input must consist "
"of an uppercase letter followed by "
"eleven digits only.\n";
break;

case 0: output_string = "\n\nINVALID\n";

break;

case 1: output_string = "\n\nValid\n";

break;
}

puts(output_string);

return 0;

}

Richard Heathfield · Aug 24, 2007

(e-mail address removed) said:

unsigned DigitFromChar(char const x)
{
assert( x >= '0' && x <= '9' );

Better: assert(isdigit(x));

int Is_ValidEuroSerial(char const *p)
{
int const dummy = ( assert(p), 0 );

Poor style. You don't need this value. If the assertion fails, the
program terminates without using the value. If the assertion is absent
or doesn't fire, dummy gets the value 0, always, and then you don't use
it. What's the point?

char const *const pend = p + SERIAL_LEN;

unsigned sum;

if(!isupper(*p)) return -1;

sum = NumberFromUpperLetter(*p++);

do
{
if(!isdigit(*p)) return -1;
sum += DigitFromChar(*p++);
} while (pend != p);

I'm not saying you should, necessarily, but were you aware that you can,
if you wish, cast out 9s as you go?

sum += DigitFromChar(*p++);
sum %= 9;

Anyway - yes, that looks reasonable. You might want to check the return
value of fgets, though - always a good idea to ensure that a requested
resource was in fact made available, rather than just to assume it.
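
Casting out nines as you go can be sketched like this. It is an illustration under assumed names (digits_mod9 is hypothetical): because (a + b) % 9 equals ((a % 9) + b) % 9, reducing after every digit gives the same remainder as reducing once at the end, while keeping the running sum small.

```c
#include <ctype.h>  /* isdigit */

/* Hypothetical helper: sum of the decimal digits in s, reduced modulo 9
   after every step. Returns -1 if s contains a non-digit character. */
int digits_mod9(const char *s)
{
    unsigned sum = 0;

    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s))
            return -1;
        sum += (unsigned)(*s - '0');
        sum %= 9;   /* cast out nines as we go */
    }
    return (int)sum;
}
```

For the digits 12345678901 the plain sum is 46, and both approaches leave a remainder of 1.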

warint · Aug 24, 2007

Richard Heathfield:

Poor style. You don't need this value. If the assertion fails, the
program terminates without using the value. If the assertion is absent
or doesn't fire, dummy gets the value 0, always, and then you don't use
it. What's the point?

The aim is to circumvent C's requirement that "All definitions of
variables must take place in a block before there are any statements".
An alternative would have been something like:

int Func(void)
{
assert(whatever);

{
char k;
}
}

Given the different choices, I think the "dummy variable" one is quite
preferable.

Martin

Keith Thompson · Aug 24, 2007

Richard Heathfield:

The aim is to circumvent C's requirement that "All definitions of
variables must take place in a block before there are any statements".
An alternative would have been something like:

int Func(void)
{
assert(whatever);

{
char k;
}
}

Given the different choices, I think the "dummy variable" one is quite
preferable.

Personally, I strongly prefer the extra block. Your
int const dummy = ( assert(p), 0 );
is IMHO too obscure; the block form is much clearer.

If Richard Heathfield didn't figure out what you're doing, it's a good
sign that your code is unclear (or that Richard hasn't had his coffee
yet).

Another style point: I think you have way too much vertical
whitespace. Most of your code is double-spaced or triple-spaced.
Blank lines between functions, and between major chunks of code, are
fine, but it's helpful to see more of the code at once.
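
For comparison, the extra-block form might look like the sketch below in C89. The function checked_length is a made-up example, not code from the thread: the assertion runs first, and the declarations that depend on the checked pointer live in an inner block.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* size_t, NULL */
#include <string.h>  /* strlen */

/* Hypothetical example function. C89 requires declarations to precede
   statements within a block, so the dependent declaration is moved
   into a nested block that follows the assertion. */
size_t checked_length(const char *p)
{
    assert(p != NULL);
    {
        size_t const len = strlen(p);
        return len;
    }
}
```

The cost is one extra level of nesting; the benefit is that no unused variable is introduced just to host the assertion.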

user923005 · Aug 24, 2007

(e-mail address removed) wrote:
[...]
assert(isupper(x));
switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;
Gyuuggh ...
A naive programmer who hasn't yet learned that all the
world isn't ASCII might write
if ('A' <= x && x <= 'Z')
return x - 'A' + 1;
else
vomit();
A slightly more sophisticated programmer might use a
look-up table. But a programmer who knows what's in his
library -- and assiduous use of the library was, I believe,
part of the assignment -- would write something more like
static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();


/* Since I wanted to do some math directly from the switch, I used a
switch anyway like this: */


#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>


#define VALID_SERIAL_NUMBER (0)
#define INVALID_LENGTH (-1)
#define INVALID_FORMAT (-2)
#define INVALID_CHECKSUM (-3)


static unsigned letter_to_sum(int letter, int *err)
{
int sum = 0;
assert(isalpha(letter));
switch (letter) {
case 'A':
sum = 2;
break;
case 'B':
sum = 3;
break; [snip]
case 'H':
sum = 9;


What would be wrong with sum = 0 here, then?

It is neither better nor worse.

break;
[snip]

default:
*err = 1;
break;
}
return sum;
}


static unsigned digit_to_sum(int digit, int *err)
{
int sum = 0;
assert(isdigit(digit));
switch (digit) {


Whoa. How is it better than sum = digit - '0'?

Neither better nor worse.

case '0':
sum = 0;
break;
case '1':
sum = 1;
break;
case '2':
sum = 2;
break;
case '3':
sum = 3;
break;
case '4':
sum = 4;
break;
case '5':
sum = 5;
break;
case '6':
sum = 6;
break;
case '7':
sum = 7;
break;
case '8':
sum = 8;
break;
case '9':
sum = 9;
break;
default:
*err = 1;
break;
}
return sum;
}


static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);


I don't think the OP's teacher means *this* by "the code must be
efficient". How many nanoseconds is it going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

It's O(1). A loop of 12 iterations is also O(1). There is no
difference.

if (err != 0)
return 1;
return sum % 9;
} [snip]
char string[32767];


"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.

512 MB of RAM is $42:
http://www.archmemory.com/index.asp?PageAction=VIEWCATS&Category=42478
so 32K is $ 0.002625, but point taken.

Also, identifiers starting with str followed by a lowercase letter
are reserved.

I realize that function names of that format are reserved:
7.26.10 General utilities <stdlib.h>
1 Function names that begin with str and a lowercase letter may be
added to the declarations in the <stdlib.h> header.
7.26.11 String handling <string.h>
1 Function names that begin with str, mem, or wcs and a lowercase
letter may be added to the declarations in the <string.h> header.

Can you give me a citation that proves variable names are likewise
reserved to the implementation?

Richard Heathfield · Aug 24, 2007

Keith Thompson said:

Richard Heathfield:

int Is_ValidEuroSerial(char const *p)
{
int const dummy = ( assert(p), 0 );

[...] What's the point?


The aim is to circumvent C's requirement that "All definitions of
variables must take place in a block before there are any
statements". [...] Given the different choices, I think the
"dummy variable" one is quite preferable.


Personally, I strongly prefer the extra block. Your
int const dummy = ( assert(p), 0 );
is IMHO too obscure; the block form is much clearer.

If Richard Heathfield didn't figure out what you're doing, it's a good
sign that your code is unclear (or that Richard hasn't had his coffee
yet).

The former. I'm way past my coffee quota. I'm *always* way past my
coffee quota.

Another style point: I think you have way too much vertical
whitespace. Most of your code is double-spaced or triple-spaced.
Blank lines between functions, and between major chunks of code, are
fine, but it's helpful to see more of the code at once.

Agreed. I'm not parsimonious about vertical space - quite the reverse,
if anything - but one can have too much of a good thing.

Army1987 · Aug 24, 2007

<topicality level="dubious">

O(27) means bounded time (if we're writing about time,
which in this case we are). So do O(1) and O(n) and O(n*n)
and O(exp(n)) and O(f(n)) for arbitrary f.

Well, I was using 'bounded' in a stricter sense.
Yes, O(exp(exp(n))) and O(busy_beaver(n)) both imply that the
algorithm will eventually terminate, no matter how large n is.

But O(1) and O(27) mean that the limsup as n approaches infinity
of runtime(n) / 1 (or runtime(n) / 27) is finite. IOW there is an
upper bound M such that for any n, the runtime is less than M.

(We both agree that O(27) and O(1) are synonymous... So what was
the statement about strchr being O(27) about?)

Army1987 · Aug 24, 2007

Neither better nor worse.

For me, it isn't. I won't take more time to copy and paste that
than to write sum = digit - '0'. For you I'd expect a difference,
but if you don't mind bothering to type that, that's your
business. Also, why waste more than one screenful when one line
suffices?

case '0':
sum = 0;
break; [snip]
case '9':
sum = 9;
break;
default:
*err = 1;
break;
}
return sum;
}


static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);


I don't think the OP's teacher means *this* by "the code must be
efficient". How many nanoseconds is it going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?


It's O(1). A loop of 12 iterations is also O(1). There is no
difference.

So why did you bother typing *that*?

if (err != 0)
return 1;
return sum % 9;
} [snip]
char string[32767];


"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.


512 MB of RAM is $42:
http://www.archmemory.com/index.asp?PageAction=VIEWCATS&Category=42478
so 32K is $ 0.002625, but point taken.

C89 does not require an implementation to allow more than 32 KB
of auto variables. And anyway you just need 13 bytes for that...
(If the twelfth isn't '\n' the string is too long.)

I realize that function names of that format are reserved:
7.26.10 General utilities <stdlib.h>
1 Function names that begin with str and a lowercase letter may be
added to the declarations in the <stdlib.h> header.
7.26.11 String handling <string.h>
1 Function names that begin with str, mem, or wcs and a lowercase
letter may be added to the declarations in the <string.h> header.

Can you give me a citation that proves variable names are likewise
reserved to the implementation?

My mistake. They are reserved only if they have external linkage (7.1.3).

CBFalconer · Aug 24, 2007

Eric Sosman:

unsigned NumberFromUpperLetter(char const x) {
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
assert(isupper(x));
return strchr(letters, x) - letters + 1;
}


Shaky. First, assert() is not a good mechanism for checking the
validity of input: not only can it be turned off, but if it
detects something there's no recovery and little likelihood of a
diagnostic message that has meaning for anyone but the programmer
himself.


I realise this. The function is almost intentionally unsafe so as
to aid in optimisation. One of the conditions upon which the
function is used is that the input is valid, and in order not to
de-optimise or complicate the program unnecessarily, I "delegate"
the error checking elsewhere.

However the similar routine I published here a day or two ago had
no assert, could never fail, returned zero for a non-letter input,
etc. IIRC it was:

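unsigned int NumberFromUpperLetter(char const x) {
static const char UC[] = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *p;

/* strchr takes the string first, then the character; '\0' is
   rejected because it would match the string's terminator */
if (x != '\0' && (p = strchr(UC, x)) != NULL)
return (unsigned int)(p - &UC[0]);
else return 0;
} /* untested */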

Eric Sosman · Aug 24, 2007

Army1987 wrote On 08/24/07 15:45,:

Well, I was using 'bounded' in a stricter sense.
Yes, O(exp(exp(n))) and O(busy_beaver(n)) both imply that the
algorithm will eventually terminate, no matter how large n is.

But O(1) and O(27) mean that the limsup as n approaches infinity
of runtime(n) / 1 (or runtime(n) / 27) is finite. IOW there is an
upper bound M such that for any n, the runtime is less than M.

(We both agree that O(27) and O(1) are synonymous... So what was
the statement about strchr being O(27) about?)

A deliberate abuse to draw attention to the fact that
the first argument to strchr() is an unvarying, constant
string. The time strchr() requires for a search is bounded
by the worst-case time to search that constant string, and
this worst-case time is a constant[*]. To put it another
way, there is "no n" that increases and drives some value
towards a limiting case. Suggesting a `switch' because it
is "most likely [...] O(1)" is an abuse of big-O[**].

... to which I responded with a deliberate abuse of
a different sort, intending to draw attention to Keith's
misstatement. (I just got through reading "Plato and a
Platypus Walk Into a Bar," and it's made me too subtle
for my own good ...)

[*] Well, no, but the analyses we can carry out in
C-land are not able to deal with pipeline stalls, cache
effects, TLB misses, and all the rest.

[**] Please note that I do not claim `switch' is
slower than strchr() or vice versa, but something rather
different: It's the reason given for using `switch' that
is wrong, not necessarily the conclusion itself.

user923005 · Aug 24, 2007

On Aug 24 said:
static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);
I don't think the OP's teacher means *this* by "the code must be
efficient". How many nanoseconds is it going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?


It's O(1). A loop of 12 iterations is also O(1). There is no
difference.


So why did you bother typing *that*?

The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.

if (err != 0)
return 1;
return sum % 9;
}
[snip]
char string[32767];
"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.


512 MB of RAM is $42:
http://www.archmemory.com/index.asp?PageAction=VIEWCATS&Category=42478
so 32K is $ 0.002625, but point taken.


C89 does not require an implementation to allow more than 32 KB
of auto variables.

It is not an auto variable.

And anyway you just need 13 bytes for that...
(If the twelfth isn't '\n' the string is too long.)

Making it the exact length is a mistake. When someone inputs the
string, they should be able to type something too long and correct it.
Of course, 32767 is utter overkill, but it is a habit I have for
simple demo programs.
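
Both points (check what fgets returns, and leave room to detect an over-long line) can be combined in a small reader. The sketch below is an assumption-laden illustration, not anyone's posted code: read_serial_line and its return codes are hypothetical, and SERIAL_LEN is taken from the earlier program.

```c
#include <stdio.h>   /* FILE, fgets, getc, EOF */
#include <string.h>  /* strchr */

#define SERIAL_LEN 12  /* a buffer of SERIAL_LEN + 2 holds serial, '\n', '\0' */

/* Hypothetical helper: reads one line from fp into buf (size bytes) and
   strips the newline. Returns 0 on success, -1 on end-of-file or read
   error, and -2 if no newline was found within the buffer (the line was
   too long, or the final line was unterminated); the rest of such a
   line is discarded. */
int read_serial_line(FILE *fp, char *buf, size_t size)
{
    char *nl;
    int c;

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;
    nl = strchr(buf, '\n');
    if (nl == NULL) {
        while ((c = getc(fp)) != EOF && c != '\n')
            ;   /* skip the remainder of the over-long line */
        return -2;
    }
    *nl = '\0';
    return 0;
}
```

A caller would typically declare char input[SERIAL_LEN + 2]; and pass sizeof input, so that a well-formed serial number plus its newline just fits.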

[snip]

cpedant · Aug 24, 2007

Army1987 wrote On 08/24/07 15:45,:

No, this is an abuse of the word "bounded". Plain "bounded" is
conventionally used to mean "bounded by a constant". "bounded by f(n)"
is used to indicate boundedness by a specific function. If plain
"bounded" were used in the way that you have used it, then it would be
a meaningless tautology because every function f(n) would then
trivially be "bounded": f(n) < f(n) + 1.

Well, I was using 'bounded' in a stricter sense.
Yes, O(exp(exp(n))) and O(busy_beaver(n)) both imply that the
algorithm will eventually terminate, no matter how large n is.


But O(1) and O(27) mean that the limsup as n approaches infinity
of runtime(n) / 1 (or runtime(n) / 27) is finite. IOW there is an
upper bound M such that for any n, the runtime is less than M.


(We both agree that O(27) and O(1) are synonymous... So what was
the statement about strchr being O(27) about?)


A deliberate abuse to draw attention to the fact that
the first argument to strchr() is an unvarying, constant
string. The time strchr() requires for a search is bounded
by the worst-case time to search that constant string, and
this worst-case time is a constant[*]. To put it another
way, there is "no n" that increases and drives some value
towards a limiting case. Suggesting a `switch' because it
is "most likely [...] O(1)" is an abuse of big-O[**].

... to which I responded with a deliberate abuse of
a different sort, intending to draw attention to Keith's
misstatement. (I just got through reading "Plato and a
Platypus Walk Into a Bar," and it's made me too subtle
for my own good ...)

[*] Well, no, but the analyses we can carry out in
C-land are not able to deal with pipeline stalls, cache
effects, TLB misses, and all the rest.

[**] Please note that I do not claim `switch' is
slower than strchr() or vice versa, but something rather
different: It's the reason given for using `switch' that
is wrong, not necessarily the conclusion itself.

Yes, the point is that the "runtime" of the function in question is
bounded by a constant regardless of whether or not strchr is used
(assuming strchr is not written in some absurd way).

Keith's suggestion to use a switch because it is "most likely [...]
O(1)" is correct. The misnomer is his implication that our function is
not "O(1)" if strchr is used. strchr itself is likely "O(n)", but that
does not make our function "O(n)" for the reasons you have stated.

I find such informal (ab)use of big-O notation often leads to nothing
but confusion when its semantics start being discussed, and downright
misleading for people who are not very familiar with big-O. Formally
speaking, the "runtime" of our function actually lies in each of O(1),
O(n), O(n^2), O(2^n), O(n^n), and (infinitely) many more distinct
classes of functions. Informal usage of "O(f(n))" is almost always
meant to correspond to the lesser-known Theta(f(n)) function class.
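
For reference, the standard definitions behind this distinction can be written out as follows (stated for positive-valued functions; T and M are illustrative symbols, with T(n) the runtime and M a constant bound):

```latex
\[
f \in O(g) \iff \exists\, c > 0,\ \exists\, n_0 \ \text{such that}\ 
  \forall n \ge n_0:\ f(n) \le c \, g(n)
\]
\[
f \in \Theta(g) \iff f \in O(g) \ \text{and}\ g \in O(f)
\]
\[
1 \le T(n) \le M \ \text{for all } n
  \;\Longrightarrow\; T \in O(1) \subset O(n) \subset O(n^2),
  \qquad T \in \Theta(1), \qquad T \notin \Theta(n)
\]
```

So a runtime bounded by a constant belongs to every one of the O classes listed above, but to only one Theta class, which is why Theta is usually what informal "O(...)" statements intend.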

Keith Thompson · Aug 25, 2007

user923005 said:
On Aug 24 said:

static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);


I don't think the OP's teacher means *this* by "the code must be
efficient". How many nanoseconds is it going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?


It's O(1). A loop of 12 iterations is also O(1). There is no
difference.


So why did you bother typing *that*?


The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.

[snip]

But it's much harder to read, and any chunk of code will be read many
more times than it's written. The run time of your code is not the
only consideration.

What if there's a typo on one of the lines? It's likely that you'll
never notice it, whereas a single digit_to_sum() call in a loop is
easier to maintain.

I changed one of the quoted lines of code. Did you notice? Didn't
think so.
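
For comparison, a loop-based version might look like the sketch below, in which the index expression appears exactly once. It is an illustration, not user923005's code: serial_is_valid is a hypothetical name, and the conventions (A counts as 1, valid when the sum leaves remainder 8 modulo 9, -1 for malformed input) are borrowed from the original poster's program earlier in the thread.

```c
#include <ctype.h>   /* isdigit */
#include <stddef.h>  /* size_t, NULL */
#include <string.h>  /* strchr, strlen */

#define SERIAL_LEN 12  /* one letter followed by eleven digits */

/* Hypothetical helper: returns 1 if the checksum holds, 0 if it does
   not, and -1 if the string is not a letter plus eleven digits. */
int serial_is_valid(const char *serial)
{
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;
    unsigned sum;
    size_t i;

    if (strlen(serial) != SERIAL_LEN)
        return -1;                      /* also excludes serial[0] == '\0' */
    p = strchr(letters, serial[0]);
    if (p == NULL)
        return -1;
    sum = (unsigned)(p - letters) + 1;  /* 'A' -> 1 ... 'Z' -> 26 */

    for (i = 1; i < SERIAL_LEN; i++) {
        if (!isdigit((unsigned char)serial[i]))
            return -1;
        sum += (unsigned)(serial[i] - '0');
    }
    return sum % 9 == 8;
}
```

With this shape, a mistake of the kind planted in the quoted code would have to be made in the single loop body, where it affects every digit and is therefore more likely to show up in testing.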

user923005 · Aug 25, 2007

user923005 said:
user923005 said:

On Aug 24 said:

static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);
I don't think the OP's teacher means *this* by "the code must be
efficient". How many nanoseconds is it going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?
It's O(1). A loop of 12 iterations is also O(1). There is no
difference.
So why did you bother typing *that*?


The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.


[snip]

But it's much harder to read, and any chunk of code will be read many
more times than it's written. The run time of your code is not the
only consideration.

What if there's a typo on one of the lines? It's likely that you'll
never notice it, whereas a single digit_to_sum() call in a loop is
easier to maintain.

Are you saying that seeing a one character typo in the above is harder
than seeing a one character typo in a loop?

I changed one of the quoted lines of code. Did you notice? Didn't
think so.

It's quite easy to spot. Certainly no more difficult than spotting a
typo in a loop. Loops are also prone to off-by-one fencepost errors,
where the inline version is less likely to suffer that defect.
At any rate, I do not consider a loop in any way superior. At some
point (e.g. 25 lines or so) a loop is probably better just because it
is a bit less tedious. I have a tendency to unroll loops that comes
from programming way, way, back in the day when compilers did a poor
job of it. So maybe it is a poor habit on my part, since others seem
to have an objection to it. But I firmly do not believe that it will
cause a greater defect rate or difficulty in maintenance.

Keith Thompson · Aug 25, 2007

user923005 said:
user923005 said:

[snip]
static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);


I don't think the OP's teacher means *this* by "the code must be
efficient". How many nanoseconds is it going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?


It's O(1). A loop of 12 iterations is also O(1). There is no
difference.


So why did you bother typing *that*?


The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.

[snip]

But it's much harder to read, and any chunk of code will be read many
more times than it's written. The run time of your code is not the
only consideration.

What if there's a typo on one of the lines? It's likely that you'll
never notice it, whereas a single digit_to_sum() call in a loop is
easier to maintain.

Are you saying that seeing a one character typo in the above is harder
than seeing a one character typo in a loop?

Yes, but that's not my main point. It's easier to *make* a one
character typo, simply because the unnecessary multiple lines
introduce more opportunities for error. And it's harder to maintain,
since there are N places that have to be updated consistently.

It's quite easy to spot. Certainly no more difficult than spotting a
typo in a loop. Loops are also prone to off-by-one fencepost errors,
where the inline version is less likely to suffer that defect.
At any rate, I do not consider a loop in any way superior. At some
point (e.g. 25 lines or so) a loop is probably better just because it
is a bit less tedious. I have a tendency to unroll loops that comes
from programming way, way, back in the day when compilers did a poor
job of it. So maybe it is a poor habit on my part, since others seem
to have an objection to it. But I firmly do not believe that it will
cause a greater defect rate or difficulty in maintenance.

Yes, I think it's a bad habit. You're doing manual
micro-optimization; the compiler is likely to be able to do a better
job of it than you are (particularly if it knows it can get faster
code *without* unrolling the loop).
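
For comparison, here is a rough sketch of what the loop form being argued for might look like. The bodies of letter_to_sum() and digit_to_sum() were snipped from the quoted post, so the versions below are hypothetical stand-ins (A..Z map to 1..26, digits to their face value), and the -1 error return is likewise an assumption. This is not the real banknote checksum rule, only an illustration of the shape.

```c
#include <string.h>

#define SERIAL_LEN 12   /* one letter followed by eleven digits */

/* Hypothetical stand-in: 'A' -> 1 ... 'Z' -> 26; sets *err otherwise. */
static unsigned letter_to_sum(char c, int *err)
{
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    /* strchr() would also match the terminating '\0', so exclude it. */
    const char *p = (c != '\0') ? strchr(letters, c) : NULL;

    if (p == NULL) {
        *err = 1;
        return 0;
    }
    return (unsigned)(p - letters) + 1;
}

/* Hypothetical stand-in: '0' -> 0 ... '9' -> 9; sets *err otherwise. */
static unsigned digit_to_sum(char c, int *err)
{
    if (c < '0' || c > '9') {
        *err = 1;
        return 0;
    }
    return (unsigned)(c - '0');
}

/* Returns the sum of all twelve positions, or -1 on malformed input. */
static int compute_checksum(const char *serial)
{
    unsigned sum = 0;
    int err = 0;
    int i;

    sum += letter_to_sum(serial[0], &err);

    /* Stop at the first error so a short string is never read past
       its terminating '\0'. */
    for (i = 1; i < SERIAL_LEN && !err; i++)
        sum += digit_to_sum(serial[i], &err);

    return err ? -1 : (int)sum;
}
```

With the index supplied by the loop there is only one digit_to_sum() call to get right, so a repeated index such as the serial[8] that appears twice in the quoted listing has nowhere to hide.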

Army1987 · Aug 25, 2007

char string[32767];

"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.

512 MB of RAM is $42:
http://www.archmemory.com/index.asp?PageAction=VIEWCATS&Category=42478
so 32K is $ 0.002625, but point taken.

C89 does not require an implementation to allow more than 32 KB
of auto variables.

It is not an auto variable.

Whoops... snipped too much. Sorry for that. (I'd better check in
the *original* code before making such comments, next time. I was
mis-remembering having seen it declared in main.)

Making it the exact length is a mistake. When someone inputs the
string, they should be able to type something too long and correct it.

A serial number on a euro banknote is 12 characters. If you can
read 13 characters from stdin without hitting the end-of-file, and
the thirteenth character isn't a newline (or whitespace), you know
that the serial number input is not valid because it is too long.
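
A minimal sketch of that idea, using only C89 library calls: read at most thirteen characters with fgets() and see whether the thirteenth is the newline. The name read_serial() and its return convention are made up for illustration, and for simplicity this version treats trailing whitespace as "too long" rather than tolerating it.

```c
#include <stdio.h>
#include <string.h>

#define SERIAL_LEN 12

/*
 * Reads one line from fp. On success copies the SERIAL_LEN characters
 * plus '\0' into out (which must hold SERIAL_LEN + 1 chars) and
 * returns 1. Returns 0 on end-of-file, a read error, or a line of the
 * wrong length. The rest of an over-long line is discarded.
 */
static int read_serial(FILE *fp, char *out)
{
    char buf[SERIAL_LEN + 2];   /* 12 chars + a thirteenth + '\0' */
    size_t len;
    int c;

    if (fgets(buf, (int)sizeof buf, fp) == NULL)
        return 0;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';      /* complete line: strip the newline */
    } else if (len == SERIAL_LEN + 1) {
        /* Thirteen characters and no newline yet: input is too long.
           Drain the remainder so the next read starts on a new line. */
        while ((c = getc(fp)) != EOF && c != '\n')
            ;
        return 0;
    }

    if (len != SERIAL_LEN)
        return 0;

    memcpy(out, buf, SERIAL_LEN + 1);
    return 1;
}
```

A fourteen-byte buffer is all this needs, which sits comfortably inside any C89 object-size limit.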

Of course, 32767 is utter overkill, but it is a habit I have for
simple demo programs.

All right, I understand that (though I tend to use less
ridiculously large numbers for those).

warint · Aug 25, 2007

CBFalconer:

unsigned int NumberFromUpperLetter(char const x) {
static const char UC[] = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *p;

/* strchr() takes the string first, then the character to find */
if ((p = strchr(UC, x)) != NULL) return p - &UC[0];
else return 0;
} /* untested */

Before I go to write a function, or any code really, I decide how
"safe" it should be. I decide whether all the input to a function
should be non-erroneous, or whether it's the function's duty to check
for errors. I do this for two reasons:

1: Efficiency (i.e. fast execution time)
2: Eradicate redundant code before it's even written

Notwithstanding this though, debug-mode code such as "assert" can be
put anywhere so long as it doesn't hinder the release-mode executable.
(That's *my* religion, anyway.)

In the case of converting an uppercase letter to a number, I decided
that it was the CALLING function's duty to make sure there was no
erroneous data, leaving me with a function something like:

unsigned UpperLetterToNumber(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

assert(x); /* Redundant, but OK because it only works in debug
mode */

return strchr(letters,x) - letters + 1;
}

Martin
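
To make the "caller validates" split concrete, here is one possible sketch. The helper is_serial_letter() is a made-up name, not something from the posts above; the point is only that the real check lives in the caller, while the assert in the callee disappears whenever NDEBUG is defined.

```c
#include <assert.h>
#include <string.h>

static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Hypothetical caller-side check: nonzero only for the 26 listed
   letters. '\0' is excluded explicitly because strchr() would
   otherwise match the string's terminator. */
static int is_serial_letter(char x)
{
    return x != '\0' && strchr(letters, x) != NULL;
}

/* Callee: trusts its input. The assert documents the precondition in
   debug builds and compiles to nothing under NDEBUG. */
static unsigned UpperLetterToNumber(char x)
{
    const char *p = strchr(letters, x);

    assert(x != '\0' && p != NULL);
    return (unsigned)(p - letters) + 1;
}
```

A caller would test is_serial_letter(c) first and report bad input itself, calling UpperLetterToNumber(c) only when the test passes.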

Martin Wells · Aug 25, 2007

assert(x); /* Redundant, but OK because it only works in debug

assert(isupper(x));
