Degenerate strcmp

Discussion in 'C Programming' started by fishpond@invalid.com, Aug 17, 2007.

  1. Guest

    One way I've seen strcmp(char *s1, char *s2) implemented is: return
    immediately if s1==s2 (equality of pointers); otherwise do the usual
    thing of searching through the memory at s1 and s2.

    Of course the reason for doing this is to save time in case equal
    pointers are passed to strcmp. But it seems to me that this could create
    an inconsistency in the degenerate case when s1 points to memory that is
    not null-terminated, i.e. by some freak chance, all of the memory from
    s1 till the computer reaches the end of all its memory pages (however
    that works) don't contain a single null byte. In this case, strcmp
    should not say that s1 and s2 are "equal strings" since neither is
    actually a string (because not null terminated).

    Is my thinking correct?

    --
    Q: "What is the burning question on the mind of every dyslexic
    existentialist?"
    A: "Is there a dog?"
    , Aug 17, 2007
    #1
    1. Advertising

  2. pete Guest

    wrote:
    >
    > One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > immediately if s1==s2 (equality of pointers); otherwise do the usual
    > thing of searching through the memory at s1 and s2.
    >
    > Of course the reason for doing this is to save time in case equal
    > pointers are passed to strcmp.
    > But it seems to me that this could create
    > an inconsistency in the degenerate case when s1 points
    > to memory that is
    > not null-terminated, i.e. by some freak chance, all of the memory from
    > s1 till the computer reaches the end of all its memory pages (however
    > that works) don't contain a single null byte. In this case, strcmp
    > should not say that s1 and s2 are "equal strings" since neither is
    > actually a string (because not null terminated).
    >
    > Is my thinking correct?


    No.
    The behavior of strcmp is only defined for
    cases when s1 and s2 both point to strings.
    If it's not null terminated, it's not a string.

    In cases where the behavior of the code is not defined,
    the running program can do whatever it wants.
    That's the rules of the C programming language.

    --
    pete
    pete, Aug 18, 2007
    #2
    1. Advertising

  3. Guest

    On 17 Aug 2007 at 23:01, pete wrote:
    > wrote:
    >>
    >> One way I've seen strcmp(char *s1, char *s2) implemented is: return
    >> immediately if s1==s2 (equality of pointers); otherwise do the usual
    >> thing of searching through the memory at s1 and s2.
    >>
    >> Of course the reason for doing this is to save time in case equal
    >> pointers are passed to strcmp.
    >> But it seems to me that this could create
    >> an inconsistency in the degenerate case when s1 points
    >> to memory that is
    >> not null-terminated, i.e. by some freak chance, all of the memory from
    >> s1 till the computer reaches the end of all its memory pages (however
    >> that works) don't contain a single null byte. In this case, strcmp
    >> should not say that s1 and s2 are "equal strings" since neither is
    >> actually a string (because not null terminated).
    >>
    >> Is my thinking correct?

    >
    > No.
    > The behavior of strcmp is only defined for
    > cases when s1 and s2 both point to strings.
    > If it's not null terminated, it's not a string.


    Your right that a string has to be null terminated, but for random
    memory maybe by chance there just isn't any null byte.

    For example, for the program

    main() { printf("%d\n",strlen(malloc(0))); }

    this will print out a random number, but in principal the strlen call
    might never terminate, if the memory at the pointer returned by malloc
    doesn't have any null bytes. (OK, it's very unlikely, but it could
    happen in theory...)

    >
    > In cases where the behavior of the code is not defined,
    > the running program can do whatever it wants.
    > That's the rules of the C programming language.
    >
    > --
    > pete


    --
    Hlade's Law:
    If you have a difficult task, give it to a lazy person --
    they will find an easier way to do it.
    , Aug 18, 2007
    #3
  4. pete Guest

    wrote:
    >
    > On 17 Aug 2007 at 23:01, pete wrote:


    > but for random
    > memory maybe by chance there just isn't any null byte.


    > > In cases where the behavior of the code is not defined,
    > > the running program can do whatever it wants.
    > > That's the rules of the C programming language.


    Do you understand what I've quoted of myself here?

    --
    pete
    pete, Aug 18, 2007
    #4
  5. user923005 Guest

    On Aug 17, 3:47 pm, wrote:
    > One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > immediately if s1==s2 (equality of pointers); otherwise do the usual
    > thing of searching through the memory at s1 and s2.


    Adding an if() test for that is not (in general) a good idea.
    A missed branch prediction is expensive.
    How often are you really going to do this:
    if (strcmp(p,p)==0) call_captain_obvious();
    A library function with a quirk like that would make me worry about
    the quality of the implementation.

    > Of course the reason for doing this is to save time in case equal
    > pointers are passed to strcmp. But it seems to me that this could create
    > an inconsistency in the degenerate case when s1 points to memory that is
    > not null-terminated, i.e. by some freak chance, all of the memory from
    > s1 till the computer reaches the end of all its memory pages (however
    > that works) don't contain a single null byte. In this case, strcmp
    > should not say that s1 and s2 are "equal strings" since neither is
    > actually a string (because not null terminated).
    >
    > Is my thinking correct?


    It is undefined behavior in any case to call strcmp() with addresses
    that are not null terminated strings.

    > --
    > Q: "What is the burning question on the mind of every dyslexic
    > existentialist?"
    > A: "Is there a dog?"


    The actual joke goes:
    Q: What does an agnostic, insomniac, dyslexic person do?
    A: He lays awake at night, wondering if there is a dog.
    user923005, Aug 18, 2007
    #5
  6. pete Guest

    user923005 wrote:
    >
    > On Aug 17, 3:47 pm, wrote:
    > > One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > > immediately if s1==s2 (equality of pointers); otherwise do the usual
    > > thing of searching through the memory at s1 and s2.

    >
    > Adding an if() test for that is not (in general) a good idea.
    > A missed branch prediction is expensive.
    > How often are you really going to do this:
    > if (strcmp(p,p)==0) call_captain_obvious();
    > A library function with a quirk like that would make me worry about
    > the quality of the implementation.


    Anybody who writes code to compare string p with string p,
    isn't in a rush.
    And that's one of the reasons why I think that it's usually bad
    to optimize the degenerate special case
    at any cost at all to the general case.

    --
    pete
    pete, Aug 18, 2007
    #6
  7. Eric Sosman Guest

    wrote:
    > One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > immediately if s1==s2 (equality of pointers); otherwise do the usual
    > thing of searching through the memory at s1 and s2.
    >
    > Of course the reason for doing this is to save time in case equal
    > pointers are passed to strcmp. But it seems to me that this could create
    > an inconsistency in the degenerate case when s1 points to memory that is
    > not null-terminated, i.e. by some freak chance, all of the memory from
    > s1 till the computer reaches the end of all its memory pages (however
    > that works) don't contain a single null byte. In this case, strcmp
    > should not say that s1 and s2 are "equal strings" since neither is
    > actually a string (because not null terminated).
    >
    > Is my thinking correct?


    What you seem to have missed is that there is no "correct"
    behavior in the case you describe: The behavior is undefined
    because the arguments are not strings. Returning zero is one
    possible behavior, a SIGSEGV is another, a graphic of a nasal
    demon whistling "Dixie" while riding backwards on a bicycle
    is yet another.

    --
    Eric Sosman
    lid
    Eric Sosman, Aug 18, 2007
    #7
  8. Eric Sosman Guest

    user923005 wrote:
    > On Aug 17, 3:47 pm, wrote:
    >> One way I've seen strcmp(char *s1, char *s2) implemented is: return
    >> immediately if s1==s2 (equality of pointers); otherwise do the usual
    >> thing of searching through the memory at s1 and s2.

    >
    > Adding an if() test for that is not (in general) a good idea.
    > A missed branch prediction is expensive.
    > How often are you really going to do this:
    > if (strcmp(p,p)==0) call_captain_obvious();
    > A library function with a quirk like that would make me worry about
    > the quality of the implementation.


    I tried to use strcmp(p, p) as a deliberate time-waster
    once. I wanted to study the behavior of a sorting function
    with "fast" and "slow" user-supplied comparators: the "slow"
    one was strcmp(p, p) followed by a call to the "fast" one.
    My program adjusted the length of the string at p until I got
    a ten-to-one speed ratio.

    This worked fine on several systems, but alas! I ran
    across one where my setup code kept making p longer and longer
    without slowing anything down -- and it turned out that strcmp
    was returning zero immediately, as described.

    BUT: Was this a stupid test? I don't think so. The
    strcmp implementation made a bunch of (highly non-portable)
    tests to decide whether it could replace a byte-by-byte loop
    with a loop that took bigger, er, bites: two, four, or even
    eight at a time. The decision was based on the alignments
    of the two operands -- and the "both operands equal" case just
    fell out of the alignment tests.

    Eventually, I arranged for p to consist entirely of 'X'
    and called strcmp(p, p+1), proceeding to call the "fast" method
    if ("if") strcmp didn't return zero. Works like a charm.

    --
    Eric Sosman
    lid
    Eric Sosman, Aug 18, 2007
    #8
  9. CBFalconer Guest

    user923005 wrote:
    >

    .... snip ...
    >
    > It is undefined behavior in any case to call strcmp() with addresses
    > that are not null terminated strings.


    That is a bar to relying on any particular behaviour in such a
    circumstance, not a bar to defining the result. Thus the following
    should be a satisfactory system implementation of strcmp():

    int strcmp(const char *_s1, const char *_s2) {
    if (_s1 == _s2) return 0;
    else
    while (*_s1 && (*_s1 == *_s2)) {
    _s1++; _s2++;
    }
    return (*_s1 > *_s2) - (*_s1 < *_s2);
    } /* strcmp */

    --
    Chuck F (cbfalconer at maineline dot net)
    Available for consulting/temporary embedded and systems.
    <http://cbfalconer.home.att.net>



    --
    Posted via a free Usenet account from http://www.teranews.com
    CBFalconer, Aug 18, 2007
    #9
  10. user923005 said:

    > On Aug 17, 3:47 pm, wrote:
    >
    >> --
    >> Q: "What is the burning question on the mind of every dyslexic
    >> existentialist?"
    >> A: "Is there a dog?"

    >
    > The actual joke goes:
    > Q: What does an agnostic, insomniac, dyslexic person do?
    > A: He lays awake at night, wondering if there is a dog.


    Can we lay off the dyslexia jokes, please? They're not clever, and
    they're not furry.

    --
    Richard Heathfield <http://www.cpax.org.uk>
    Email: -www. +rjh@
    Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
    "Usenet is a strange place" - dmr 29 July 1999
    Richard Heathfield, Aug 18, 2007
    #10
  11. user923005 Guest

    On Aug 17, 11:27 pm, Richard Heathfield <> wrote:
    > user923005 said:
    >
    > > On Aug 17, 3:47 pm, wrote:

    >
    > >> --
    > >> Q: "What is the burning question on the mind of every dyslexic
    > >> existentialist?"
    > >> A: "Is there a dog?"

    >
    > > The actual joke goes:
    > > Q: What does an agnostic, insomniac, dyslexic person do?
    > > A: He lays awake at night, wondering if there is a dog.

    >
    > Can we lay off the dyslexia jokes, please? They're not clever, and
    > they're not furry.


    Being dyslexic myself, I can tell you that permutations are likely
    errors, but substitution of r's for n's is improbable.
    There are some substitutions that might occur, but they are likely to
    be b, p, d, q which are (after all) the same letter.
    In my case it has the strange effect that mirrored writing is very
    easy for me to read -- in fact too easy.

    I have had to train myself about glass doors to stores. If it says
    'PULL' on it but it is written on the other side of the door, anyone
    else would instantly see that it is reversed and would push instead of
    pull. If I am not thinking hard about it, I am quite likely to read
    "PULL" and give it a good yank.
    user923005, Aug 18, 2007
    #11
  12. user923005 Guest

    On Aug 17, 6:56 pm, Eric Sosman <> wrote:
    > user923005 wrote:
    > > On Aug 17, 3:47 pm, wrote:
    > >> One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > >> immediately if s1==s2 (equality of pointers); otherwise do the usual
    > >> thing of searching through the memory at s1 and s2.

    >
    > > Adding an if() test for that is not (in general) a good idea.
    > > A missed branch prediction is expensive.
    > > How often are you really going to do this:
    > > if (strcmp(p,p)==0) call_captain_obvious();
    > > A library function with a quirk like that would make me worry about
    > > the quality of the implementation.

    >
    > I tried to use strcmp(p, p) as a deliberate time-waster
    > once. I wanted to study the behavior of a sorting function
    > with "fast" and "slow" user-supplied comparators: the "slow"
    > one was strcmp(p, p) followed by a call to the "fast" one.
    > My program adjusted the length of the string at p until I got
    > a ten-to-one speed ratio.
    >
    > This worked fine on several systems, but alas! I ran
    > across one where my setup code kept making p longer and longer
    > without slowing anything down -- and it turned out that strcmp
    > was returning zero immediately, as described.
    >
    > BUT: Was this a stupid test? I don't think so. The
    > strcmp implementation made a bunch of (highly non-portable)
    > tests to decide whether it could replace a byte-by-byte loop
    > with a loop that took bigger, er, bites: two, four, or even
    > eight at a time. The decision was based on the alignments
    > of the two operands -- and the "both operands equal" case just
    > fell out of the alignment tests.
    >
    > Eventually, I arranged for p to consist entirely of 'X'
    > and called strcmp(p, p+1), proceeding to call the "fast" method
    > if ("if") strcmp didn't return zero. Works like a charm.


    Turns out that the possible missed branch prediction is
    inconsequential (after all, it happens only once):

    int testing_strcmp(const char *_s1, const char *_s2)
    {
    if (_s1 == _s2)
    return 0;
    else
    while (*_s1 && (*_s1 == *_s2)) {
    _s1++;
    _s2++;
    }
    return (*_s1 > *_s2) - (*_s1 < *_s2);
    } /* strcmp */

    int non_testing_strcmp(const char *_s1, const char *_s2)
    {
    while (*_s1 && (*_s1 == *_s2)) {
    _s1++;
    _s2++;
    }
    return (*_s1 > *_s2) - (*_s1 < *_s2);
    } /* strcmp */

    #include <time.h>
    #include <stdio.h>
    #include <stdlib.h>

    char s1[32767];
    char s2[32767];
    void cmptest1(FILE * f, int (*cmp) (const char *, const
    char *))
    {
    while (fgets(s1, sizeof s1, f)) {
    if (fgets(s2, sizeof s2, f)) {
    (void) cmp(s1, s2);
    } else
    break;
    }
    }

    int main(int argc, char **argv)
    {
    clock_t end, start;
    if (argc <= 1) {
    puts("USAGE: strcmptest <filename>");
    exit(EXIT_FAILURE);
    }
    {
    FILE *f = fopen(argv[1], "rt");
    if (f == NULL) {
    printf("ERROR: Failed to open file %s\n", argv[1]);
    exit(EXIT_FAILURE);
    }
    start = clock();
    cmptest1(f, testing_strcmp);
    end = clock();
    printf("With testing strcmp time is %f seconds\n", (end -
    start)*1.0/CLOCKS_PER_SEC);
    rewind(f);
    start = clock();
    cmptest1(f, non_testing_strcmp);
    end = clock();
    printf("With non-testing strcmp time is %f seconds\n", (end -
    start)*1.0/CLOCKS_PER_SEC);
    }
    return 0;
    }

    /*
    C:\tmp>dir b.txt
    Volume in drive C has no label.
    Volume Serial Number is 0890-87CA

    Directory of C:\tmp

    08/18/2007 12:34 AM 65,969,707 b.txt
    1 File(s) 65,969,707 bytes
    0 Dir(s) 1,602,658,304 bytes free

    C:\tmp>strcmptest b.txt
    With testing strcmp time is 0.859000 seconds
    With non-testing strcmp time is 0.859000 seconds

    C:\tmp>strcmptest b.txt
    With testing strcmp time is 0.859000 seconds
    With non-testing strcmp time is 0.875000 seconds

    C:\tmp>strcmptest b.txt
    With testing strcmp time is 0.859000 seconds
    With non-testing strcmp time is 0.859000 seconds

    C:\tmp>strcmptest b.txt
    With testing strcmp time is 0.859000 seconds
    With non-testing strcmp time is 0.859000 seconds

    C:\tmp>strcmptest b.txt
    With testing strcmp time is 0.875000 seconds
    With non-testing strcmp time is 0.843000 seconds
    */
    user923005, Aug 18, 2007
    #12
  13. On Aug 17, 3:47 pm, wrote:
    > One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > immediately if s1==s2 (equality of pointers); otherwise do the usual
    > thing of searching through the memory at s1 and s2.
    >
    > Of course the reason for doing this is to save time in case equal
    > pointers are passed to strcmp. But it seems to me that this could create
    > an inconsistency in the degenerate case when s1 points to memory that is
    > not null-terminated, i.e. by some freak chance, all of the memory from
    > s1 till the computer reaches the end of all its memory pages (however
    > that works) don't contain a single null byte. In this case, strcmp
    > should not say that s1 and s2 are "equal strings" since neither is
    > actually a string (because not null terminated).
    >
    > Is my thinking correct?


    Indefinitely searching for '\0' will do no good. In the simplest case
    it may either cause some sort of memory protection exception or hang
    the system if the address wrap around is permitted. If the access goes
    where memory mapped devices are, it may be worse and may even damage
    the hardware. In these cases strcmp() may not return at all and so I'm
    not sure if talking about consistency is any meaningful.

    Alex
    Alexei A. Frounze, Aug 18, 2007
    #13
  14. On 18 Aug 2007 at 1:40, Eric Sosman wrote:
    > wrote:
    >> One way I've seen strcmp(char *s1, char *s2) implemented is: return
    >> immediately if s1==s2 (equality of pointers); otherwise do the usual
    >> thing of searching through the memory at s1 and s2.
    >>
    >> Of course the reason for doing this is to save time in case equal
    >> pointers are passed to strcmp. But it seems to me that this could create
    >> an inconsistency in the degenerate case when s1 points to memory that is
    >> not null-terminated, i.e. by some freak chance, all of the memory from
    >> s1 till the computer reaches the end of all its memory pages (however
    >> that works) don't contain a single null byte. In this case, strcmp
    >> should not say that s1 and s2 are "equal strings" since neither is
    >> actually a string (because not null terminated).
    >>
    >> Is my thinking correct?

    >
    > What you seem to have missed is that there is no "correct"
    > behavior in the case you describe: The behavior is undefined
    > because the arguments are not strings. Returning zero is one
    > possible behavior, a SIGSEGV is another, a graphic of a nasal
    > demon whistling "Dixie" while riding backwards on a bicycle
    > is yet another.


    I think the subtle point is the following: a char * isn't actually the
    same thing as a string. A char * is a pointer to some bytes of memory,
    but is s is a char * then for s to be a string, we need the sequence
    *s, *(s+1), *(s+2), ..., *(s+i), ... to actually contain a 0 byte for
    some i. In practice memory will have 0 bytes all over the place, but
    there's still a theoretical possibility that there won't be zero byte
    for any i until the memory space is completely exhausted.

    Maybe the program I put in the other thread

    main() { printf("%d\n",strlen(malloc(0))); }

    illustrates this more simply than strcmp: malloc(0) returns a pointer to
    some random place in memory, and there's no absolute guarantee that a
    0-byte will occur later in memory, so what gets printed could be a
    random number or in theory the program could just never terminate.

    Part of the confusion seems to be the names: for example, strlen takes a
    char * and returns an int. If the parameter is a string, then the
    integer is the length of the string and that makes perfect sense. But
    what strlen actually takes is a general char *, not necessarily a
    string, and if you pass strlen a char * that isn't a string then you
    need to think more carefully about how to interpret the return value of
    strlen (or strlen might not terminate at all).

    >
    > --
    > Eric Sosman
    > lid


    --
    The difference between a career and a job is about 20 hours a week.
    Antoninus Twink, Aug 18, 2007
    #14
  15. On Aug 18, 10:19 am, Antoninus Twink <> wrote:
    > On 18 Aug 2007 at 1:40, Eric Sosman wrote:
    > > wrote:
    > >> One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > >> immediately if s1==s2 (equality of pointers); otherwise do the usual
    > >> thing of searching through the memory at s1 and s2.

    >
    > >> Of course the reason for doing this is to save time in case equal
    > >> pointers are passed to strcmp. But it seems to me that this could create
    > >> an inconsistency in the degenerate case when s1 points to memory that is
    > >> not null-terminated, i.e. by some freak chance, all of the memory from
    > >> s1 till the computer reaches the end of all its memory pages (however
    > >> that works) don't contain a single null byte. In this case, strcmp
    > >> should not say that s1 and s2 are "equal strings" since neither is
    > >> actually a string (because not null terminated).

    >
    > >> Is my thinking correct?

    >
    > > What you seem to have missed is that there is no "correct"
    > > behavior in the case you describe: The behavior is undefined
    > > because the arguments are not strings. Returning zero is one
    > > possible behavior, a SIGSEGV is another, a graphic of a nasal
    > > demon whistling "Dixie" while riding backwards on a bicycle
    > > is yet another.

    >
    > I think the subtle point is the following: a char * isn't actually the
    > same thing as a string. A char * is a pointer to some bytes of memory,
    > but is s is a char * then for s to be a string, we need the sequence
    > *s, *(s+1), *(s+2), ..., *(s+i), ... to actually contain a 0 byte for
    > some i. In practice memory will have 0 bytes all over the place, but
    > there's still a theoretical possibility that there won't be zero byte
    > for any i until the memory space is completely exhausted.


    What is "subtle" about this? It's just the definition of a string, and
    it's very simple.

    > Maybe the program I put in the other thread
    >
    > main() { printf("%d\n",strlen(malloc(0))); }
    >
    > illustrates this more simply than strcmp: malloc(0) returns a pointer to
    > some random place in memory, and there's no absolute guarantee that a
    > 0-byte will occur later in memory, so what gets printed could be a
    > random number or in theory the program could just never terminate.


    I don't understand your point. You seem to be working hard to tell us
    that a string is an array of chars up to and including the first zero-
    valued character. Of course it is, since that's what it's defined to
    be. If there is no zero-valued character in the array, then the array
    doesn't contain a string.

    > Part of the confusion seems to be the names: for example, strlen takes a
    > char * and returns an int. If the parameter is a string, then the
    > integer is the length of the string and that makes perfect sense. But
    > what strlen actually takes is a general char *, not necessarily a
    > string, and if you pass strlen a char * that isn't a string then you
    > need to think more carefully about how to interpret the return value of
    > strlen (or strlen might not terminate at all).


    You don't need to be careful about anything if you do this, since you
    cannot predict how the system will behave. What you need to be careful
    about is not doing this in the first place. strlen() takes a string;
    it is your responsibility to ensure that you only ever give it a
    string; if you give it anything else, there's no saying what might
    happen.
    J. J. Farrell, Aug 18, 2007
    #15
  16. Guest

    On Aug 18, 12:47 am, wrote:
    > One way I've seen strcmp(char *s1, char *s2) implemented is: return
    > immediately if s1==s2 (equality of pointers); otherwise do the usual
    > thing of searching through the memory at s1 and s2.
    >
    > Of course the reason for doing this is to save time in case equal
    > pointers are passed to strcmp.


    It would be interesting to see how many times (passing two equal
    pointers) to strcmp() happens.
    , Aug 18, 2007
    #16
  17. pete Guest

    Antoninus Twink wrote:

    > malloc(0) returns a pointer to
    > some random place in memory,


    malloc(0) may also return a null pointer instead.

    --
    pete
    pete, Aug 18, 2007
    #17
  18. Flash Gordon Guest

    user923005 wrote, On 18/08/07 08:02:
    > On Aug 17, 11:27 pm, Richard Heathfield <> wrote:
    >> user923005 said:
    >>
    >>> On Aug 17, 3:47 pm, wrote:
    >>>> --
    >>>> Q: "What is the burning question on the mind of every dyslexic
    >>>> existentialist?"
    >>>> A: "Is there a dog?"
    >>> The actual joke goes:
    >>> Q: What does an agnostic, insomniac, dyslexic person do?
    >>> A: He lays awake at night, wondering if there is a dog.

    >> Can we lay off the dyslexia jokes, please? They're not clever, and
    >> they're not furry.


    Except when maid by a Dyslexic. ;-)

    > Being dyslexic myself, I can tell you that permutations are likely
    > errors, but substitution of r's for n's is improbable.
    > There are some substitutions that might occur, but they are likely to
    > be b, p, d, q which are (after all) the same letter.


    Agreed. Getting mixed up between 5 and 2 was causing me a lot of grief
    yesterday, made worse by the fact I was dealing with IP addresses
    starting both 85. and 82.

    I could never get the hang of the fact the writing was meant to go from
    left to right until I started to learn to read music. It was also a
    problem for others when it was my turn to lay the table, whether the
    knife was on the left or the right was entirely random, and I was use
    them whichever way they were without even noticing.

    > In my case it has the strange effect that mirrored writing is very
    > easy for me to read -- in fact too easy.
    >
    > I have had to train myself about glass doors to stores. If it says
    > 'PULL' on it but it is written on the other side of the door, anyone
    > else would instantly see that it is reversed and would push instead of
    > pull. If I am not thinking hard about it, I am quite likely to read
    > "PULL" and give it a good yank.


    I've done that on several occasions as well. With me I am most likely to
    be caught by short words what I recognise as a single chunk.
    --
    Flash Gordon
    Flash Gordon, Aug 18, 2007
    #18
  19. Eric Sosman Guest

    CBFalconer wrote:
    > user923005 wrote:
    > ... snip ...
    >> It is undefined behavior in any case to call strcmp() with addresses
    >> that are not null terminated strings.

    >
    > That is a bar to relying on any particular behaviour in such a
    > circumstance, not a bar to defining the result. Thus the following
    > should be a satisfactory system implementation of strcmp():
    >
    > int strcmp(const char *_s1, const char *_s2) {
    > if (_s1 == _s2) return 0;
    > else
    > while (*_s1 && (*_s1 == *_s2)) {
    > _s1++; _s2++;
    > }
    > return (*_s1 > *_s2) - (*_s1 < *_s2);
    > } /* strcmp */


    A digression from the main point (with which I agree):
    There's a potential bug in the `return' statement on systems
    where `char' is signed. 7.2.1.4p1:

    "The sign of a nonzero value returned by the
    comparison functions [...] is determined by
    the sign of the difference between the values of
    the first pair of characters (both interpreted
    as unsigned char) that differ [...]"

    That is, "abc\200" must compare greater than "abc\177",
    even when the final `char' value in the first string is
    negative.

    Suggested fix:

    return ((unsigned char) *_s1 > (unsigned char) *_s2)
    - ((unsigned char) *_s1 < (unsigned char) *_s2);

    --
    Eric Sosman
    lid
    Eric Sosman, Aug 18, 2007
    #19
  20. pete Guest

    Eric Sosman wrote:
    >
    > CBFalconer wrote:
    > > user923005 wrote:
    > > ... snip ...
    > >> It is undefined behavior in any case to call strcmp() with addresses
    > >> that are not null terminated strings.

    > >
    > > That is a bar to relying on any particular behaviour in such a
    > > circumstance, not a bar to defining the result. Thus the following
    > > should be a satisfactory system implementation of strcmp():
    > >
    > > int strcmp(const char *_s1, const char *_s2) {
    > > if (_s1 == _s2) return 0;
    > > else
    > > while (*_s1 && (*_s1 == *_s2)) {
    > > _s1++; _s2++;
    > > }
    > > return (*_s1 > *_s2) - (*_s1 < *_s2);
    > > } /* strcmp */

    >
    > A digression from the main point (with which I agree):
    > There's a potential bug in the `return' statement on systems
    > where `char' is signed. 7.2.1.4p1:
    >
    > "The sign of a nonzero value returned by the
    > comparison functions [...] is determined by
    > the sign of the difference between the values of
    > the first pair of characters (both interpreted
    > as unsigned char) that differ [...]"
    >
    > That is, "abc\200" must compare greater than "abc\177",
    > even when the final `char' value in the first string is
    > negative.
    >
    > Suggested fix:
    >
    > return ((unsigned char) *_s1 > (unsigned char) *_s2)
    > - ((unsigned char) *_s1 < (unsigned char) *_s2);


    My Turn!
    /*
    ** Could not care less about the s1 == s2 case
    */
    int str_cmp(const char *s1, const char *s2)
    {
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    for (;;) {
    if (*p1 != *p2) {
    return *p2 > *p1 ? -1 : 1;
    }
    if (*p1 == '\0') {
    return 0;
    }
    ++p1;
    ++p2;
    }
    }

    --
    pete
    pete, Aug 18, 2007
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Shane Peck

    strcmp problem

    Shane Peck, Sep 19, 2003, in forum: C++
    Replies:
    6
    Views:
    2,340
    Kevin Goodsell
    Sep 22, 2003
  2. muser

    strcmp

    muser, Oct 3, 2003, in forum: C++
    Replies:
    6
    Views:
    1,094
    Frank Schmitt
    Oct 9, 2003
  3. Andrej Hocevar

    please help with strcmp()

    Andrej Hocevar, Jul 19, 2003, in forum: C Programming
    Replies:
    3
    Views:
    320
    Gordon Burditt
    Jul 19, 2003
  4. Allan Bruce

    strcmp but with '\n' as the terrminator

    Allan Bruce, Jul 19, 2003, in forum: C Programming
    Replies:
    53
    Views:
    1,367
    Arthur J. O'Dwyer
    Jul 30, 2003
  5. Robert Mens

    segfault when strcmp

    Robert Mens, Oct 22, 2003, in forum: C Programming
    Replies:
    6
    Views:
    565
    Peter Nilsson
    Oct 23, 2003
Loading...

Share This Page