"directory order" - K and R 2 exercise 5-16?

Discussion in 'C Programming' started by G Fernandes, Mar 19, 2005.

  1. G Fernandes

    G Fernandes Guest

    Can someone explain what is meant by "directory order" in the question
    for K and R 2 exercise 5-16?

    I can't seem to find a solution for this exercise on the main site
    where clc goers have posted solutions, so I'm guessing this phrase
    might be ambiguous.

    In any case, I'm wondering if anyone knows what might be a suitable
    definition for this phrase. Thank you.
     
    G Fernandes, Mar 19, 2005
    #1

  2. Tor Rustad

    Tor Rustad Guest

    "G Fernandes" <> wrote in message

    > Can someone explain what is meant by "directory order" in the
    > questoin for K and R 2 exercise 5-16?


    What it says is: ignore characters other than letters, numbers
    and blanks when sorting.

    Just like the -d option of the UNIX sort; see "man sort".
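
    A minimal sketch of that rule, assuming "blank" means space or tab.
    The names is_dir_char and dir_key are hypothetical and come from
    neither K&R nor sort(1); the key function only makes visible which
    characters a directory-order comparison would look at.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns nonzero if c takes part in a directory-order comparison:
   a letter, a digit, or a blank (space or tab, by assumption). */
int is_dir_char(int c)
{
    unsigned char uc = (unsigned char)c;
    return isalnum(uc) || uc == ' ' || uc == '\t';
}

/* Copies only the significant characters of src into dst (capacity n),
   always terminating dst when n > 0. Returns the length of the key. */
size_t dir_key(char *dst, size_t n, const char *src)
{
    size_t len = 0;
    if (n == 0)
        return 0;
    for (; *src != '\0' && len + 1 < n; ++src)
        if (is_dir_char(*src))
            dst[len++] = *src;
    dst[len] = '\0';
    return len;
}
```

    With a large enough buffer, dir_key turns "new-line" into "newline"
    and "O'Connel, Jr." into "OConnel Jr".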

    --
    Tor <torust AT online DOT no>
     
    Tor Rustad, Mar 19, 2005
    #2

  3. "Tor Rustad" <> writes:
    > "G Fernandes" <> wrote in message
    >
    >> Can someone explain what is meant by "directory order" in the
    >> questoin for K and R 2 exercise 5-16?

    >
    > What it say is, ignore other characters than letters, numbers
    > and blanks, when sorting.
    >
    > Just like the UNIX sort, see "man sort -d"


    My copy of K&R2 is several thousand miles away at the moment, but it
    sounds like "dictionary order" rather than "directory order".

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Mar 19, 2005
    #3
  4. G Fernandes

    G Fernandes Guest

    Tor Rustad wrote:
    > "G Fernandes" <> wrote in message
    >
    > > Can someone explain what is meant by "directory order" in the
    > > questoin for K and R 2 exercise 5-16?

    >
    > What it say is, ignore other characters than letters, numbers
    > and blanks, when sorting.
    >
    > Just like the UNIX sort, see "man sort -d"
    >


    Yes. I understand how this could work if all the input lines were the
    same format, like
    abcd@$abcd
    efgh#&lkjs
    ueid!-slkj

    but what if you have two input lines where one has an alphanumeric or
    blank in the same position where the other line has a character that is
    neither alphanumeric nor blank?

    For example

    ab@aaa
    abd#gh
    a*shdj

    How would someone sort that?
     
    G Fernandes, Mar 19, 2005
    #4
  5. Joe Wright

    Joe Wright Guest

    G Fernandes wrote:
    > Tor Rustad wrote:
    >
    >>"G Fernandes" <> wrote in message
    >>
    >>
    >>>Can someone explain what is meant by "directory order" in the
    >>>questoin for K and R 2 exercise 5-16?

    >>
    >>What it say is, ignore other characters than letters, numbers
    >>and blanks, when sorting.
    >>
    >>Just like the UNIX sort, see "man sort -d"
    >>

    >
    >
    > Yes. I understand how this could work if all the input lines were the
    > same format, like
    > abcd@$abcd
    > efgh#&lkjs
    > ueid!-slkj
    >
    > but whatif you have two input lines where one has an alphanumeric or
    > blank where in the same position the other line as a non-alphanmeric
    > nor blank?
    >
    > For example
    >
    > ab@aaa
    > abd#gh
    > a*shdj
    >
    > How would someone sort that?
    >


    Assuming they are ASCII strings, I would use strcmp() to order them. All
    of '@', '#' and '*' have values less than any letter (though '@' is
    greater than the digits). I suppose they would sort as:

    a*shdj
    ab@aaa
    abd#gh
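
    A short sketch of that approach, assuming the lines are held as an
    array of string pointers. The names cmp_lines and sort_lines are
    hypothetical; qsort and strcmp are the standard library functions.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, and the elements here
   are themselves pointers to strings, hence the double indirection. */
static int cmp_lines(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp(*pa, *pb);
}

/* Sorts n line pointers in place in plain strcmp (byte value) order. */
void sort_lines(const char **lines, size_t n)
{
    qsort(lines, n, sizeof *lines, cmp_lines);
}
```

    On an ASCII system, sorting the three lines from the question this
    way gives the order listed above, because '*' and '@' both compare
    lower than 'b' and 'd'.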
    --
    Joe Wright
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Mar 19, 2005
    #5
  6. Luke Wu

    Luke Wu Guest

    G Fernandes wrote:
    > Tor Rustad wrote:
    > > "G Fernandes" <> wrote in message
    > >
    > > > Can someone explain what is meant by "directory order" in the
    > > > questoin for K and R 2 exercise 5-16?

    > >
    > > What it say is, ignore other characters than letters, numbers
    > > and blanks, when sorting.
    > >
    > > Just like the UNIX sort, see "man sort -d"
    > >

    >
    > Yes. I understand how this could work if all the input lines were the
    > same format, like
    > abcd@$abcd
    > efgh#&lkjs
    > ueid!-slkj
    >
    > but whatif you have two input lines where one has an alphanumeric or
    > blank where in the same position the other line as a non-alphanmeric
    > nor blank?
    >
    > For example
    >
    > ab@aaa
    > abd#gh
    > a*shdj
    >
    > How would someone sort that?



    Wrap strcmp with something that tests for a flag and acts differently
    if d-order is required. Something like this:

    #include <string.h>
    #include <ctype.h>

    int d_strcmp(char *s1, char *s2)
    {
        if (dorder) {
            int i = 0;
            while (1) {
                if (s1[i] != s2[i] &&
                    ( isalpha(s1[i]) || isspace(s1[i]) || !s1[i] ) &&
                    ( isalpha(s2[i]) || isspace(s2[i]) || !s2[i] )
                   )
                    return s1[i] - s2[i];
                else if (s1[i] == '\0' || s2[i] == '\0')
                    return 0;
                i++;
            }
        }
        else return strcmp(s1, s2);
    }


    dorder can be an external variable (as would be the case in the
    function I've shown above) or it can be passed in as an argument

    Some people suggest casting arguments of ctype function to unsigned
    char, but I don't think you need to worry about that unless your
    implementation has weird differences (padding bits) between signed and
    unsigned char [these implementations break the standard, AFAIK]
     
    Luke Wu, Mar 19, 2005
    #6
  7. On 19 Mar 2005 09:50:04 -0800, "G Fernandes" <>
    wrote:

    >Tor Rustad wrote:
    >> "G Fernandes" <> wrote in message
    >>
    >> > Can someone explain what is meant by "directory order" in the
    >> > questoin for K and R 2 exercise 5-16?

    >>
    >> What it say is, ignore other characters than letters, numbers
    >> and blanks, when sorting.
    >>
    >> Just like the UNIX sort, see "man sort -d"
    >>

    >
    >Yes. I understand how this could work if all the input lines were the
    >same format, like
    >abcd@$abcd
    >efgh#&lkjs
    >ueid!-slkj
    >
    >but whatif you have two input lines where one has an alphanumeric or
    >blank where in the same position the other line as a non-alphanmeric
    >nor blank?
    >
    >For example
    >
    >ab@aaa
    >abd#gh
    >a*shdj
    >
    >How would someone sort that?


    Unless you are trying to be extra fancy (as in a phone book where you
    want O'Connel to come between Occam and Odum), ignore the differences.
    If the character appears in the execution set, then by definition it
    fits in a char. A char is an integer type. Integer types can be
    compared with the relational operators or, for arrays of char, with
    strcmp and memcmp. The
    results of all three are well defined, even if implementation
    dependent. (For example, on an ASCII system, 'A' < 'a'. The opposite
    is true on an EBCDIC system.)


    <<Remove the del for email>>
     
    Barry Schwarz, Mar 19, 2005
    #7
  8. Eric Sosman

    Eric Sosman Guest

    Luke Wu wrote:
    > [...]
    > Some people suggest casting arguments of ctype function to unsigned
    > char, but I don't think you need to worry about that unless your
    > implementation has weird differences (padding bits) between signed and
    > unsigned char [these implementations break the standard, AFAIK]


    The reason for the cast has nothing to do with padding
    bits, unusual CHAR_BIT values, exotic representations, or
    broken implementations. It's because `char' can be a signed
    type, and thus can have negative values. Pass a negative
    value to a <ctype.h> function and you get undefined behavior
    (unless the value just happens to equal EOF, in which case
    you get the small consolation of an answer that's well-defined
    but quite possibly wrong).

    If you like U.B. and/or wrong answers, omit the cast.
    Otherwise, ...
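
    For illustration, a minimal sketch of the cast in use. The wrapper
    names safe_isalnum and safe_toupper are hypothetical; the point is
    only that the value is converted to unsigned char before it reaches
    a <ctype.h> function.

```c
#include <ctype.h>

/* Converting through unsigned char keeps the argument inside the range
   the <ctype.h> functions accept, even where plain char is signed. */
int safe_isalnum(char c)
{
    return isalnum((unsigned char)c) != 0;
}

/* Same idea for case conversion; the result is returned as an int. */
int safe_toupper(char c)
{
    return toupper((unsigned char)c);
}
```

    Without the conversion, a char holding a byte above 127 would reach
    isalnum as a negative int on implementations where char is signed.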

    --
    Eric Sosman
     
    Eric Sosman, Mar 19, 2005
    #8
  9. On 18 Mar 2005 21:16:21 -0800, in comp.lang.c , "G Fernandes"
    <> wrote:

    >Can someone explain what is meant by "directory order" in the questoin
    >for K and R 2 exercise 5-16?


    the order it appears in a phone directory, probably. Hence Mc appears in amongst
    Ma and before Mb....

    --
    Mark McIntyre
    CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
    CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>
     
    Mark McIntyre, Mar 19, 2005
    #9
  10. Luke Wu

    Luke Wu Guest

    Luke Wu wrote:

    >
    >
    > Wrap strcmp with something that tests for a flag and acts differently
    > if d-order is required. Something like this:
    >
    > #include <string.h>
    > #include <ctype.h>
    >
    > int d_strcmp(char *s1, char *s2)
    > {
    >     if (dorder) {
    >         int i = 0;
    >         while (1) {
    >             if (s1[i] != s2[i] &&
    >                 ( isalpha(s1[i]) || isspace(s1[i]) || !s1[i] ) &&
    >                 ( isalpha(s2[i]) || isspace(s2[i]) || !s2[i] )

    ^^
    those should be isalnum (instead of isalpha)

    >                )
    >                 return s1[i] - s2[i];
    >             else if (s1[i] == '\0' || s2[i] == '\0')
    >                 return 0;
    >             i++;
    >         }
    >     }
    >     else return strcmp(s1, s2);
    > }
    >
    >
    > dorder can be an external variable (as would be the case in the
    > function I've shown above) or it can be passed in as an argument
    >
    > Some people suggest casting arguments of ctype function to unsigned
    > char, but I don't think you need to worry about that unless your
    > implementation has weird differences (padding bits) between signed and
    > unsigned char [these implementations break the standard, AFAIK]
     
    Luke Wu, Mar 19, 2005
    #10
  11. CBFalconer

    CBFalconer Guest

    Luke Wu wrote:
    >

    .... snip ...
    >
    > Some people suggest casting arguments of ctype function to unsigned
    > char, but I don't think you need to worry about that unless your
    > implementation has weird differences (padding bits) between signed
    > and unsigned char [these implementations break the standard, AFAIK]


    Nothing weird needed, just that the native version of char is
    signed. Passing any negative value (other than EOF) to the ctype
    functions results in undefined behaviour.

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
     
    CBFalconer, Mar 19, 2005
    #11
  12. On Sat, 19 Mar 2005, Luke Wu wrote:
    >
    > G Fernandes wrote:
    >> Tor Rustad wrote:
    >>> "G Fernandes" <> wrote in message
    >>>
    >>>> Can someone explain what is meant by "directory order" in the
    >>>> questoin for K and R 2 exercise 5-16?
    >>>
    >>> What it say is, ignore other characters than letters, numbers
    >>> and blanks, when sorting.

    >>
    >> but whatif you have two input lines where one has an alphanumeric or
    >> blank where in the same position the other line as a non-alphanmeric
    >> nor blank?
    >>
    >> For example
    >>
    >> ab@aaa
    >> abd#gh
    >> a*shdj
    >>
    >> How would someone sort that?


    Exactly the way you put above: ABAAA before ABDGH before ASHDJ, and
    ignore the funny characters in the middles of words. (This also would
    sort O'Connel between Occam and Odoul, as mentioned by another poster.)


    > #include <string.h>
    > #include <ctype.h>
    >
    > int d_strcmp(char *s1, char *s2)
    > {
    >     if (dorder) {
    >         int i = 0;
    >         while (1) {
    >             if (s1[i] != s2[i] &&
    >                 ( isalpha(s1[i]) || isspace(s1[i]) || !s1[i] ) &&
    >                 ( isalpha(s2[i]) || isspace(s2[i]) || !s2[i] )
    >                )
    >                 return s1[i] - s2[i];
    >             else if (s1[i] == '\0' || s2[i] == '\0')
    >                 return 0;
    >             i++;
    >         }
    >     }
    >     else return strcmp(s1, s2);
    > }


    This looks really weird; it certainly doesn't seem to do what I inferred
    the OP wanted to do, and I'm not sure it does anything reasonable. It
    would produce d_strcmp("a", "%")==0, d_strcmp("O'Con","Occam") < 0, and
    so on. I think the OP (and K&R) would be happier with

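    #include <ctype.h>

    int dict_strcmp(const char *s, const char *t)
    {
        int i, j, si, tj;
        for (i=j=0; s[i] && t[j]; ++i, ++j) {
            /* skip non-letters on both sides */
            while (s[i] && !isalpha(s[i])) ++i;
            while (t[j] && !isalpha(t[j])) ++j;
            /* stop at the end of s too, so the loop's ++i cannot step
               past the terminator when both strings end in punctuation */
            if (!s[i] || toupper(s[i]) != toupper(t[j])) break;
        }
        si = toupper(s[i]);
        tj = toupper(t[j]);
        return si < tj? -1: si > tj;
    }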

    It's a little messier due to the extra 'toupper's and my insistence on
    returning -1, 0, or +1 instead of just negative, 0, or positive. An
    exercise for the interested reader: Extend this function to deal more
    reasonably with strings containing no alphabetic characters at all;
    e.g. to sort "6" before "777". How difficult is it to sort numeric
    strings by their decimal values (e.g., "100" after "99")? How difficult
    is it to sort "A1 Steak Sauce" as equal to "A-One Steak Sauce," between
    "AOL" and "Aorta"? (Interface design problem: In each case, where would
    we sort the string "A4 Paper"? Which result is more reasonable? Why?)
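
    One possible take on the first of those exercises, offered as a
    sketch rather than a reference solution: compare on letters, digits
    and blanks (space or tab, by assumption), fold case, and skip ignored
    characters at the end of a string as well. The names significant and
    dir_strcmp are hypothetical.

```c
#include <ctype.h>

/* Nonzero for the characters a directory-order comparison looks at. */
static int significant(unsigned char c)
{
    return isalnum(c) || c == ' ' || c == '\t';
}

/* Compares s and t on letters, digits and blanks only, ignoring case.
   Returns -1, 0, or +1. */
int dir_strcmp(const char *s, const char *t)
{
    const unsigned char *p = (const unsigned char *)s;
    const unsigned char *q = (const unsigned char *)t;

    for (;;) {
        int a, b;
        /* Skip ignored characters on both sides, trailing ones included. */
        while (*p != '\0' && !significant(*p)) ++p;
        while (*q != '\0' && !significant(*q)) ++q;
        a = toupper(*p);
        b = toupper(*q);
        if (a != b)
            return a < b ? -1 : 1;
        if (a == '\0')      /* both strings ran out at the same time */
            return 0;
        ++p;
        ++q;
    }
}
```

    With this version "6" sorts before "777" and "O'Connel" lands between
    "Occam" and "Odum". Digits are still compared character by character,
    so "100" sorts before "99"; ordering by decimal value is left open,
    as in the exercise.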

    > Some people suggest casting arguments of ctype function to unsigned
    > char, but I don't think you need to worry about that unless your
    > implementation has weird differences (padding bits) between signed and
    > unsigned char [these implementations break the standard, AFAIK]


    No, padding bits aren't it. You need to worry only if you're planning
    to process data containing negative 'char' values. Since both your and
    my implementations basically assumed ASCII, I don't think it's worth the
    extra opacity in this case. But certainly a line like

    k = toupper(getchar());

    would be way out of line, as I understand it; we have no guarantee that
    the user won't enter negative character values. Whereas we can make
    the "no negative values" requirement a precondition of the 'd_strcmp'
    function, and put the burden on the client programmer, if we want.

    -Arthur
     
    Arthur J. O'Dwyer, Mar 20, 2005
    #12
  13. CBFalconer

    CBFalconer Guest

    "Arthur J. O'Dwyer" wrote:
    >

    .... snip ...
    > extra opacity in this case. But certainly a line like
    >
    > k = toupper(getchar());
    >
    > would be way out of line, as I understand it; we have no guarantee
    > that the user won't enter negative character values. Whereas we


    Yes we do. getchar returns, in an int, the unsigned value of an
    input char. The only negative value it ever returns is EOF.
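
    A small sketch of the pattern this makes safe. The function name
    upcase_stream is hypothetical, and getc(in) is used so the loop works
    on any stream; getchar() is the same call applied to stdin.

```c
#include <ctype.h>
#include <stdio.h>

/* Copies in to out, upper-casing as it goes. Returns the byte count. */
long upcase_stream(FILE *in, FILE *out)
{
    long n = 0;
    int c;      /* int, not char, so EOF stays distinct from every byte */

    while ((c = getc(in)) != EOF) {
        /* c already holds an unsigned char value, so no cast is needed. */
        putc(toupper(c), out);
        ++n;
    }
    return n;
}
```

    Storing the result of getc in a char before the test against EOF
    would lose that guarantee, which is why c is declared as int.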

    --
    "I conclude that there are two ways of constructing a software
    design: One way is to make it so simple that there are obviously
    no deficiencies and the other way is to make it so complicated
    that there are no obvious deficiencies." -- C. A. R. Hoare
     
    CBFalconer, Mar 20, 2005
    #13
  14. On Sun, 20 Mar 2005, CBFalconer wrote:
    >
    > "Arthur J. O'Dwyer" wrote:
    >> extra opacity in this case. But certainly a line like
    >>
    >> k = toupper(getchar());
    >>
    >> would be way out of line, as I understand it; we have no guarantee
    >> that the user won't enter negative character values. Whereas we

    >
    > Yes we do. getchar returns, in an int, the unsigned value of an
    > input char. The only negative value it ever returns is EOF.


    Whoops. You're right. Make that

    scanf("%c", &k);
    k = toupper(k);

    then. I think there is no guarantee that 'scanf' will yield only
    positive values for 'char'.

    -Arthur
     
    Arthur J. O'Dwyer, Mar 20, 2005
    #14
  15. Kenneth Bull

    Kenneth Bull Guest

    Arthur J. O'Dwyer wrote:
    > On Sun, 20 Mar 2005, CBFalconer wrote:
    > >
    > > "Arthur J. O'Dwyer" wrote:
    > >> extra opacity in this case. But certainly a line like
    > >>
    > >> k = toupper(getchar());
    > >>
    > >> would be way out of line, as I understand it; we have no guarantee
    > >> that the user won't enter negative character values. Whereas we

    > >
    > > Yes we do. getchar returns, in an int, the unsigned value of an
    > > input char. The only negative value it ever returns is EOF.

    >
    > Whoops. You're right. Make that
    >
    > scanf("%c", &k);
    > k = toupper(k);
    >
    > then. I think there is no guarantee that 'scanf' will yield only
    > positive values for 'char'.
    >


    Now you're writing about two different things (further evidenced by the
    fact that they appear in two separate statements in your code), and
    trying needlessly to relate the two to make a point.

    The point you are trying to make has very little to do with scanf, and
    a lot to do with the type of the variable 'k' (which you have not shown
    a declaration for). If 'k' is of type char, and the implementation
    makes 'char' behave like signed char, then yes, there is no guarantee
    that the value you pass to toupper is non-negative for every valid
    character. This has 'very little' to do with scanf
    (or getchar as you previously claimed).

    So if anything, your code -somewhat- reverts back to the exact same
    point Eric Sosman was making, without adding any special caveat for
    scanf whatsoever.
     
    Kenneth Bull, Mar 21, 2005
    #15
  16. Arthur J. O'Dwyer wrote:
    > On Sun, 20 Mar 2005, CBFalconer wrote:
    > >
    > > "Arthur J. O'Dwyer" wrote:
    > >> extra opacity in this case. But certainly a line like
    > >>
    > >> k = toupper(getchar());
    > >>
    > >> would be way out of line, as I understand it; we have no guarantee
    > >> that the user won't enter negative character values. Whereas we

    > >
    > > Yes we do. getchar returns, in an int, the unsigned value of an
    > > input char. The only negative value it ever returns is EOF.

    >
    > Whoops. You're right. Make that
    >
    > scanf("%c", &k);
    > k = toupper(k);
    >
    > then. I think there is no guarantee that 'scanf' will yield only
    > positive values for 'char'.


    scanf reads its input as if by successive calls to fgetc, which
    delivers each byte as an unsigned char converted to int. If k
    is a signed or plain char, then you are interpreting that byte
    through an lvalue of that type. You are better off interpreting
    the byte through an unsigned char lvalue.
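
    A sketch of that suggestion, using sscanf on a string so the example
    does not depend on terminal input. The function name first_upper is
    hypothetical, and it assumes %c may store through a pointer to
    unsigned char, which is a character type.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads the first character of text with %c and returns it upper-cased,
   or EOF if nothing could be read. */
int first_upper(const char *text)
{
    unsigned char k;    /* unsigned, so the stored value is never negative */

    if (sscanf(text, "%c", &k) != 1)
        return EOF;
    return toupper(k);
}
```

    Declaring k as plain char and writing toupper((unsigned char)k)
    would serve the same purpose.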

    --
    Peter
     
    Peter Nilsson, Mar 21, 2005
    #16
  17. On Mon, 21 Mar 2005, Kenneth Bull wrote:
    > Arthur J. O'Dwyer wrote:
    >> On Sun, 20 Mar 2005, CBFalconer wrote:
    >>> "Arthur J. O'Dwyer" wrote:
    >>>> extra opacity in this case. But certainly a line like
    >>>>
    >>>> k = toupper(getchar());
    >>>>
    >>>> would be way out of line, as I understand it; we have no guarantee
    >>>> that the user won't enter negative character values. Whereas we
    >>>
    >>> Yes we do. getchar returns, in an int, the unsigned value of an
    >>> input char. The only negative value it ever returns is EOF.

    >>
    >> Whoops. You're right. Make that
    >>
    >> scanf("%c", &k);
    >> k = toupper(k);

    >
    > Now you're writing about two different things (further evidenced by the
    > fact that they appear in two separate statements in your code), and
    > trying needlessly to relate the two to make a point.
    >
    > The point you are trying to make has very little to do with scanf, and
    > a lot of do with the type of the variable 'k' (which you have not shown
    > a declaration for).


    Nope. I surmise that you have not understood the point I'm trying
    to make. My point is that you need to verify user input (such as input
    that comes from 'scanf'[1]), as opposed to the kind of input a library
    function might get from a client program (such as C-style strings being
    passed to a sorting function, the original context of my remark).

    > So if anything, your code -somewhat- reverts back to the exact same
    > point Eric Sosman was making, without adding any special caveat for
    > scanf whatsoever.


    Huh? Eric basically said, "Not casting results in UB." I disagree;
    the cast is /only/ necessary when you're dealing with potentially
    unsafe input, and the only way to get unsafe input is from the user,
    via 'getchar', 'scanf', or any other <stdio.h> input function.

    There's nothing special about 'scanf' that makes it dangerous in
    this respect; but, as CBFalconer pointed out, there is something special
    about 'getchar' that makes it innocuous in this respect. That's why I
    corrected my "dangerous" code --- it hadn't been as dangerous as I had
    thought.

    -Arthur

    [1] - but not, technically speaking, 'getchar', which was what CBFalconer
    pointed out, and which was why I corrected my example to use the 'scanf'
    input function instead, which AFAIK provides no guarantee of its results'
    <ctype.h>-friendliness.
     
    Arthur J. O'Dwyer, Mar 22, 2005
    #17
  18. Eric Sosman

    Eric Sosman Guest

    Arthur J. O'Dwyer wrote:
    >
    > Huh? Eric basically said, "Not casting results in UB." I disagree;
    > the cast is /only/ necessary when you're dealing with potentially
    > unsafe input, and the only way to get unsafe input is from the user,
    > via 'getchar', 'scanf', or any other <stdio.h> input function.


    /* "Have you considered all the possibilities?" */
    char rebuttal[] = "Haben Sie alle Möglichkeiten betrachtet?";

    Granted: This cannot appear in a strictly conforming program,
    because it uses a character not found in the basic source or
    execution sets. "Strictly conforming" programs, though, seem
    to be a tiny minority; if you want to write robust code you
    should consider the possibility that it might be used outside
    the germ-proof bubble.

    --
    Eric Sosman
     
    Eric Sosman, Mar 22, 2005
    #18
