"directory order" - K and R 2 exercise 5-16?

G Fernandes · Mar 19, 2005

Can someone explain what is meant by "directory order" in the question
for K and R 2 exercise 5-16?

I can't seem to find a solution for this exercise on the main site
where clc goers have posted solutions, so I'm guessing this phrase
might be ambiguous.

In any case, I'm wondering if anyone knows what might be a suitable
definition for this phrase. Thank you.

Tor Rustad · Mar 19, 2005

G Fernandes said:
Can someone explain what is meant by "directory order" in the
question for K and R 2 exercise 5-16?

What it says is: when sorting, ignore all characters other than
letters, numbers and blanks.

Just like the UNIX sort; see the -d option in "man sort".

Keith Thompson · Mar 19, 2005

Tor Rustad said:
What it says is: when sorting, ignore all characters other than
letters, numbers and blanks.

Just like the UNIX sort; see the -d option in "man sort".

My copy of K&R2 is several thousand miles away at the moment, but it
sounds like "dictionary order" rather than "directory order".

G Fernandes · Mar 19, 2005

Tor said:
What it says is: when sorting, ignore all characters other than
letters, numbers and blanks.

Just like the UNIX sort; see the -d option in "man sort".

Yes. I understand how this could work if all the input lines were the
same format, like
abcd@$abcd
efgh#&lkjs
ueid!-slkj

but what if you have two input lines where one has an alphanumeric or
blank in the same position where the other line has a character that
is neither alphanumeric nor blank?

For example

ab@aaa
abd#gh
a*shdj

How would someone sort that?

Joe Wright · Mar 19, 2005

G said:
Yes. I understand how this could work if all the input lines were the
same format, like
abcd@$abcd
efgh#&lkjs
ueid!-slkj

but what if you have two input lines where one has an alphanumeric or
blank in the same position where the other line has a character that
is neither alphanumeric nor blank?

For example

ab@aaa
abd#gh
a*shdj

How would someone sort that?

Assuming they are ASCII strings, I would use strcmp() to order them. All
of '@', '#' and '*' have values less than any letter. I suppose they
would sort as:

a*shdj
ab@aaa
abd#gh
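
A minimal sketch of that plain strcmp ordering, assuming an ASCII execution character set. The names cmp_plain and sort_plain are made up for illustration and do not come from K&R or from the thread.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, so each argument
   is really a pointer to a (const char *). */
static int cmp_plain(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Sort an array of n string pointers in plain strcmp order
   (hypothetical helper, not part of any library). */
void sort_plain(const char **lines, size_t n)
{
    qsort(lines, n, sizeof *lines, cmp_plain);
}
```

Calling sort_plain on the three example lines should leave them in the order listed above, because '*' and '@' both compare less than any ASCII letter. Note that this is ordinary byte order, not the -d behaviour the exercise asks for.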

Luke Wu · Mar 19, 2005

G said:
Yes. I understand how this could work if all the input lines were the
same format, like
abcd@$abcd
efgh#&lkjs
ueid!-slkj

but what if you have two input lines where one has an alphanumeric or
blank in the same position where the other line has a character that
is neither alphanumeric nor blank?

For example

ab@aaa
abd#gh
a*shdj

How would someone sort that?

Wrap strcmp with something that tests for a flag and acts differently
if d-order is required. Something like this:

#include <string.h>
#include <ctype.h>

int d_strcmp(char *s1, char *s2)
{
    if (dorder) {
        int i = 0;
        while (1) {
            if (s1[i] != s2[i] &&
                ( isalpha(s1[i]) || isspace(s1[i]) || !s1[i] ) &&
                ( isalpha(s2[i]) || isspace(s2[i]) || !s2[i] )
               )
                return s1[i] - s2[i];
            else if (s1[i] == '\0' || s2[i] == '\0')
                return 0;
            i++;
        }
    }
    else return strcmp(s1, s2);
}

dorder can be an external variable (as would be the case in the
function I've shown above) or it can be passed in as an argument

Some people suggest casting arguments of ctype function to unsigned
char, but I don't think you need to worry about that unless your
implementation has weird differences (padding bits) between signed and
unsigned char [these implementations break the standard, AFAIK]

Barry Schwarz · Mar 19, 2005

Yes. I understand how this could work if all the input lines were the
same format, like
abcd@$abcd
efgh#&lkjs
ueid!-slkj

but what if you have two input lines where one has an alphanumeric or
blank in the same position where the other line has a character that
is neither alphanumeric nor blank?

For example

ab@aaa
abd#gh
a*shdj

How would someone sort that?

Unless you are trying to be extra fancy (as in a phone book where you
want O'Connel to come between Occam and Odum), ignore the differences.
If the character appears in the execution set, then by definition it
fits in a char. A char is an integer type. Integer types can be
compared using if or, for arrays of char, strcmp and memcmp. The
results of all three are well defined, even if implementation
dependent. (For example, on an ASCII system, 'A' < 'a'. The opposite
is true on an EBCDIC system.)


Eric Sosman · Mar 19, 2005

Luke said:
[...]
Some people suggest casting arguments of ctype function to unsigned
char, but I don't think you need to worry about that unless your
implementation has weird differences (padding bits) between signed and
unsigned char [these implementations break the standard, AFAIK]

The reason for the cast has nothing to do with padding
bits, unusual CHAR_BIT values, exotic representations, or
broken implementations. It's because `char' can be a signed
type, and thus can have negative values. Pass a negative
value to a <ctype.h> function and you get undefined behavior
(unless the value just happens to equal EOF, in which case
you get the small consolation of an answer that's well-defined
but quite possibly wrong).

If you like U.B. and/or wrong answers, omit the cast.
Otherwise, ...
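
As an illustration of the cast described above, here is a minimal sketch. The helper name is_dir_char is hypothetical, and treating only space and tab as "blanks" is an assumption about what the exercise means.

```c
#include <ctype.h>

/* Returns nonzero if c would take part in a directory-order
   comparison: a letter, a digit, or a blank (space or tab assumed). */
int is_dir_char(char c)
{
    /* Convert first: plain char may be signed, and a negative
       argument other than EOF is undefined for <ctype.h> functions. */
    unsigned char uc = (unsigned char)c;

    return isalnum(uc) || uc == ' ' || uc == '\t';
}
```

The conversion costs nothing at run time on typical implementations, and it keeps the call well-defined for every possible char value.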

Mark McIntyre · Mar 19, 2005

Can someone explain what is meant by "directory order" in the question
for K and R 2 exercise 5-16?

The order in which entries appear in a phone directory, probably. Hence
Mc appears in amongst Ma and before Mb....

Luke Wu · Mar 19, 2005

Luke said:
Wrap strcmp with something that tests for a flag and acts differently
if d-order is required. Something like this:

#include <string.h>
#include <ctype.h>

int d_strcmp(char *s1, char *s2)
{
    if (dorder) {
        int i = 0;
        while (1) {
            if (s1[i] != s2[i] &&
                ( isalpha(s1[i]) || isspace(s1[i]) || !s1[i] ) &&
                ( isalpha(s2[i]) || isspace(s2[i]) || !s2[i] )
                  ^^
those should be isalnum (instead of isalpha)

               )
                return s1[i] - s2[i];
            else if (s1[i] == '\0' || s2[i] == '\0')
                return 0;
            i++;
        }
    }
    else return strcmp(s1, s2);
}

dorder can be an external variable (as would be the case in the
function I've shown above) or it can be passed in as an argument

Some people suggest casting arguments of ctype function to unsigned
char, but I don't think you need to worry about that unless your
implementation has weird differences (padding bits) between signed and
unsigned char [these implementations break the standard, AFAIK]


CBFalconer · Mar 19, 2005

Luke said:
.... snip ...

Some people suggest casting arguments of ctype function to unsigned
char, but I don't think you need to worry about that unless your
implementation has weird differences (padding bits) between signed
and unsigned char [these implementations break the standard, AFAIK]

Nothing weird needed, just that the native version of char is
signed. Passing any negative value (other than EOF) to the ctype
functions results in undefined behaviour.

Arthur J. O'Dwyer · Mar 19, 2005

Exactly the way you put above: ABAAA before ABDGH before ASHDJ, and
ignore the funny characters in the middles of words. (This also would
sort O'Connel between Occam and Odoul, as mentioned by another poster.)

#include <string.h>
#include <ctype.h>

int d_strcmp(char *s1, char *s2)
{
    if (dorder) {
        int i = 0;
        while (1) {
            if (s1[i] != s2[i] &&
                ( isalpha(s1[i]) || isspace(s1[i]) || !s1[i] ) &&
                ( isalpha(s2[i]) || isspace(s2[i]) || !s2[i] )
               )
                return s1[i] - s2[i];
            else if (s1[i] == '\0' || s2[i] == '\0')
                return 0;
            i++;
        }
    }
    else return strcmp(s1, s2);
}

This looks really weird; it certainly doesn't seem to do what I inferred
the OP wanted to do, and I'm not sure it does anything reasonable. It
would produce d_strcmp("a", "%")==0, d_strcmp("O'Con","Occam") < 0, and
so on. I think the OP (and K&R) would be happier with

int dict_strcmp(const char *s, const char *t)
{
    int i, j, si, tj;
    for (i=j=0; s[i] && t[j]; ++i, ++j) {
        while (s[i] && !isalpha(s[i])) ++i;
        while (t[j] && !isalpha(t[j])) ++j;
        if (toupper(s[i]) != toupper(t[j])) break;
    }
    si = toupper(s[i]);
    tj = toupper(t[j]);
    return si < tj? -1: si > tj;
}

It's a little messier due to the extra 'toupper's and my insistence on
returning -1, 0, or +1 instead of just negative, 0, or positive. An
exercise for the interested reader: Extend this function to deal more
reasonably with strings containing no alphabetic characters at all;
e.g. to sort "6" before "777". How difficult is it to sort numeric
strings by their decimal values (e.g., "100" after "99")? How difficult
is it to sort "A1 Steak Sauce" as equal to "A-One Steak Sauce," between
"AOL" and "Aorta"? (Interface design problem: In each case, where would
we sort the string "A4 Paper"? Which result is more reasonable? Why?)
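
One possible variation, sketched under stated assumptions: it follows the exercise wording quoted earlier in the thread (letters, numbers and blanks all count), skips everything else, and applies the unsigned char conversion. The names significant and dir_strcmp, the fold parameter, and the choice of space and tab as "blanks" are all invented here for illustration.

```c
#include <ctype.h>

/* Hypothetical helper: does c take part in the comparison? */
static int significant(unsigned char c)
{
    return isalnum(c) || c == ' ' || c == '\t';
}

/* Compare s and t while skipping every character that is not a
   letter, digit, or blank. If fold is nonzero, letters compare
   without regard to case. Returns <0, 0, or >0 like strcmp. */
int dir_strcmp(const char *s, const char *t, int fold)
{
    int a, b;

    for (;;) {
        /* Advance each string to its next significant character
           (or to the terminating null). */
        while (*s != '\0' && !significant((unsigned char)*s))
            ++s;
        while (*t != '\0' && !significant((unsigned char)*t))
            ++t;

        a = (unsigned char)*s;
        b = (unsigned char)*t;
        if (fold) {
            a = toupper(a);
            b = toupper(b);
        }
        if (a != b || a == '\0')
            break;
        ++s;
        ++t;
    }
    return a - b;
}
```

With fold set, this sketch places "O'Connel" after "Occam" and before "Odum". It does not attempt the numeric or "A-One" orderings posed in the exercise above; those remain open.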

Some people suggest casting arguments of ctype function to unsigned
char, but I don't think you need to worry about that unless your
implementation has weird differences (padding bits) between signed and
unsigned char [these implementations break the standard, AFAIK]


No, padding bits aren't it. You need to worry only if you're planning
to process data containing negative 'char' values. Since both your and
my implementations basically assumed ASCII, I don't think it's worth the
extra opacity in this case. But certainly a line like

k = toupper(getchar());

would be way out of line, as I understand it; we have no guarantee that
the user won't enter negative character values. Whereas we can make
the "no negative values" requirement a precondition of the 'd_strcmp'
function, and put the burden on the client programmer, if we want.

-Arthur

CBFalconer · Mar 19, 2005

Arthur J. O'Dwyer said:
.... snip ...
extra opacity in this case. But certainly a line like

k = toupper(getchar());

would be way out of line, as I understand it; we have no guarantee
that the user won't enter negative character values. Whereas we

Yes we do. getchar returns, in an int, the unsigned value of an
input char. The only negative value it ever returns is EOF.
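
A small sketch of why that guarantee matters. The function name count_letters is hypothetical, and getc on a FILE pointer is used in place of getchar only so the function can be exercised on any stream; both return values in the same way.

```c
#include <ctype.h>
#include <stdio.h>

/* Count the alphabetic characters remaining on stream fp. */
long count_letters(FILE *fp)
{
    long n = 0;
    int c;  /* int, not char: it must hold every unsigned char value plus EOF */

    while ((c = getc(fp)) != EOF) {
        /* Here c is an unsigned char value converted to int, so it
           can be handed to <ctype.h> functions without a cast. */
        if (isalpha(c))
            ++n;
    }
    return n;
}
```

Storing the result in a plain char before the test would discard this property, which is where the cast becomes necessary again.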

Arthur J. O'Dwyer · Mar 20, 2005

Yes we do. getchar returns, in an int, the unsigned value of an
input char. The only negative value it ever returns is EOF.

Whoops. You're right. Make that

scanf("%c", &k);
k = toupper(k);

then. I think there is no guarantee that 'scanf' will yield only
positive values for 'char'.

-Arthur

Kenneth Bull · Mar 21, 2005

Arthur said:
Whoops. You're right. Make that

scanf("%c", &k);
k = toupper(k);

then. I think there is no guarantee that 'scanf' will yield only
positive values for 'char'.

Now you're writing about two different things (further evidenced by the
fact that they appear in two separate statements in your code), and
trying needlessly to relate the two to make a point.

The point you are trying to make has very little to do with scanf, and
a lot to do with the type of the variable 'k' (which you have not shown
a declaration for). If 'k' is of type char, and the implementation
makes 'char' equivalent to signed char, then yes, there is no guarantee
that the value you're passing to toupper is non-negative for every
valid character. This has 'very little' to do with scanf
(or getchar as you previously claimed).

So if anything, your code -somewhat- reverts back to the exact same
point Eric Sosman was making, without adding any special caveat for
scanf whatsoever.

Peter Nilsson · Mar 21, 2005

Arthur said:
Whoops. You're right. Make that

scanf("%c", &k);
k = toupper(k);

then. I think there is no guarantee that 'scanf' will yield only
positive values for 'char'.

scanf reads each (byte) character as if by calling fgetc. If k
is a signed or plain char, then you are interpreting that read byte
through an lvalue of that type. You are better off interpreting
the byte through an unsigned char lvalue.

Arthur J. O'Dwyer · Mar 22, 2005

Now you're writing about two different things (further evidenced by the
fact that they appear in two separate statements in your code), and
trying needlessly to relate the two to make a point.

The point you are trying to make has very little to do with scanf, and
a lot to do with the type of the variable 'k' (which you have not shown
a declaration for).

Nope. I surmise that you have not understood the point I'm trying
to make. My point is that you need to verify user input (such as input
that comes from 'scanf'[1]), as opposed to the kind of input a library
function might get from a client program (such as C-style strings being
passed to a sorting function, the original context of my remark).

So if anything, your code -somewhat- reverts back to the exact same
point Eric Sosman was making, without adding any special caveat for
scanf whatsoever.

Huh? Eric basically said, "Not casting results in UB." I disagree;
the cast is /only/ necessary when you're dealing with potentially
unsafe input, and the only way to get unsafe input is from the user,
via 'getchar', 'scanf', or any other <stdio.h> input function.

There's nothing special about 'scanf' that makes it dangerous in
this respect; but, as CBFalconer pointed out, there is something special
about 'getchar' that makes it innocuous in this respect. That's why I
corrected my "dangerous" code --- it hadn't been as dangerous as I had
thought.

-Arthur

[1] - but not, technically speaking, 'getchar', which was what CBFalconer
pointed out, and which was why I corrected my example to use the 'scanf'
input function instead, which AFAIK provides no guarantee of its results'
<ctype.h>-friendliness.

Eric Sosman · Mar 22, 2005

Arthur said:
Huh? Eric basically said, "Not casting results in UB." I disagree;
the cast is /only/ necessary when you're dealing with potentially
unsafe input, and the only way to get unsafe input is from the user,
via 'getchar', 'scanf', or any other <stdio.h> input function.

char rebuttal[] = "Haben Sie alle Möglichkeiten betrachtet?";

Granted: This cannot appear in a strictly conforming program,
because it uses a character not found in the basic source or
execution sets. "Strictly conforming" programs, though, seem
to be a tiny minority; if you want to write robust code you
should consider the possibility that it might be used outside
the germ-proof bubble.
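
To make the rebuttal concrete, here is a rough sketch that checks a string for elements that are negative when read as plain char. Both function names are hypothetical, and the usage note assumes a single-byte Latin-1 encoding in which the o-umlaut is the byte 0xF6.

```c
#include <limits.h>
#include <stddef.h>

/* Nonzero if plain char is a signed type on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Count the elements of s whose value, read as plain char, is
   negative. Always 0 where plain char is unsigned. */
size_t count_negative_chars(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s)
        if (*s < 0)
            ++n;
    return n;
}
```

On an implementation with signed plain char, count_negative_chars("M\xF6glich") reports one such element even though no input function was involved, so a string literal alone is enough to reach the undefined case when the cast is omitted.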
