Why does long w = 'word' fail?

Discussion in 'C Programming' started by petertwocakes, Nov 24, 2009.

  1. Hi,

    I'm trying to test a sequence of 4 characters from a ptr buffer
    against a long, but the test fails even though I think they should
    have the same value. e.g :


    char word[4] = {'w', 'o', 'r', 'd'};
    char *wordPtr = (char*)wordArray;
    long wordLong = 'word';
    long *longPtr = (long*)wordArrayPtr;
    long longPtrVal = *longPtr;

    Yet, in the end wordLong = 2003792484, but longPtrVal = 1685221239
    Shouldn't they be same?

    If not, how do I test 4 character sequences in chunks like this?
    (without laboriously testing each char individually)
    I'm in a very large text buffer incrementing the current ptr until I
    hit "word"

    Thanks
    petertwocakes, Nov 24, 2009
    #1

  2. Well, this could be another troll, but this time I'll give it
    the benefit of the doubt.

    In article <>,
    petertwocakes <> wrote:

    >char word[4] = {'w', 'o', 'r', 'd'};
    >char *wordPtr = (char*)wordArray;
    >long wordLong = 'word';
    >long *longPtr = (long*)wordArrayPtr;
    >long longPtrVal = *longPtr;


    >Yet, in the end wordLong = 2003792484, but longPtrVal = 1685221239
    >Shouldn't they be same?


    There's no requirement for multi-character character literals to
    work in any particular way. If you look at the hex values, you
    will see that it's put the characters in the opposite order from
    what you're expecting. It's not portable; don't do it without
    a very good reason.

    >If not, how do I test 4 character sequences in chunks like this?


    Don't.

    >(without laboriously testing each char individually)
    >I'm in a very large text buffer incrementing the current ptr until I
    >hit "word"


    Why not use the strstr() function? It's quite likely to be implemented
    efficiently.

    Incidentally, if you're intending to increment wordArrayPtr and
    dereference it to get a long at each position, you've got another
    problem: it won't work on machines where longs have to be aligned
    properly.
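
    A minimal sketch of the strstr() approach, assuming the buffer is
    NUL-terminated. The helper name find_offset is hypothetical and does
    not come from the original post. Because strstr() works on char
    pointers, no long is ever read from an odd address, so the alignment
    problem does not arise.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of needle in the
   NUL-terminated buffer buf, or -1 if needle does not occur.
   find_offset is a hypothetical helper name used for illustration. */
long find_offset(const char *buf, const char *needle)
{
    const char *hit = strstr(buf, needle);
    return hit ? (long)(hit - buf) : -1L;
}
```

    Calling find_offset(buf, "word") gives the position of the first
    match, and the search can be resumed from just past that position to
    find later ones.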

    -- Richard
    --
    Please remember to mention me / in tapes you leave behind.
    Richard Tobin, Nov 24, 2009
    #2

  3. In article <>,
    petertwocakes <> wrote:
    >char word[4] = {'w', 'o', 'r', 'd'};
    >char *wordPtr = (char*)wordArray;
    >long wordLong = 'word';
    >long *longPtr = (long*)wordArrayPtr;
    >long longPtrVal = *longPtr;
    >Yet, in the end wordLong = 2003792484, but longPtrVal = 1685221239


    It's an endianness issue; apparently you're running the code on a
    little-endian machine,
    where the bytes 'w', 'o', 'r', 'd' (hex 0x77, 0x6f, 0x72, 0x64) in
    a ``long'' variable are interpreted as 0x64726f77 (decimal 1685221239).
    On a big-endian machine, the bytes would be interpreted as 0x776f7264
    (decimal 2003792484).
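
    The two interpretations can be reproduced without any pointer casts.
    This is only a sketch, assuming an ASCII character set; pack_le and
    pack_be are made-up names for illustration.

```c
#include <stdint.h>

/* Combine four bytes with the FIRST byte as the least significant,
   which is how a little-endian machine reads them from memory. */
uint32_t pack_le(const unsigned char *b)
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Combine four bytes with the FIRST byte as the most significant,
   which is how a big-endian machine reads them from memory. */
uint32_t pack_be(const unsigned char *b)
{
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)
         | (uint32_t)b[3];
}
```

    With the bytes 'w', 'o', 'r', 'd', pack_le gives 0x64726f77 and
    pack_be gives 0x776f7264, matching the two values reported.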

    >If not, how do I test 4 character sequences in chunks like this?
    >(without laboriously testing each char individually)


    Use ``memcmp(&wordLong, &longPtrVal, 4)'', if that isn't too laborious
    for your taste.
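
    A related sketch, under the assumption that the goal is to find
    "word" in a buffer of known length that need not be NUL-terminated:
    comparing the buffer bytes directly against the string "word" with
    memcmp() sidesteps both byte order and alignment. scan_for_word is a
    hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of the first "word" in buf[0..len), or -1 if the
   four bytes 'w','o','r','d' do not occur. Only bytes inside the
   buffer are read, and no long is ever formed from them. */
long scan_for_word(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "word", 4) == 0)
            return (long)i;
    }
    return -1L;
}
```
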
    --

    SDF Public Access UNIX System - http://sdf.lonestar.org
    Ike Naar, Nov 24, 2009
    #3
  4. Re: Why does long w = 'word' fail?

    On 24 Nov, 11:46, (Ike Naar) wrote:
    > In article <..com>,
    >
    > petertwocakes  <> wrote:
    > >char word[4] = {'w', 'o', 'r', 'd'};
    > >char *wordPtr = (char*)wordArray;
    > >long wordLong = 'word';
    > >long *longPtr = (long*)wordArrayPtr;
    > >long longPtrVal = *longPtr;
    > >Yet, in the end wordLong = 2003792484, but longPtrVal  = 1685221239

    >
    > It's an endianness issue; apparently you're running the code on a
    > little-endian machine,
    > where the bytes 'w', 'o', 'r', 'd' (hex 0x77, 0x6f, 0x72, 0x64) in
    > a ``long'' variable are interpreted as 0x64726f77 (decimal 1685221239).
    > On a big-endian machine, the bytes would be interpreted as 0x776f7264
    > (decimal 2003792484).
    >
    > >If not, how do I test 4 character sequences in chunks like this?
    > >(without laboriously testing each char individually)

    >
    > Use ``memcmp(&wordLong, &longPtrVal, 4)'', if that isn't too laborious
    > for your taste.
    > --
    >
    > SDF Public Access UNIX System -http://sdf.lonestar.org


    Thanks Richard and Ike. Of course it was an endianness issue; I learnt C
    on an old Mac, in the pre-Intel days, when this worked.
    Which bears out your advice not to trust it either way.

    Ike, for "laborious" read run-time fast.

    Richard, I appreciate your help, but why on earth would you suspect
    this is a troll?
    I'm happy to admit to not being skilled in C, but the message was
    neither argumentative nor off-topic.
    petertwocakes, Nov 24, 2009
    #4
  5. Re: Why does long w = 'word' fail?

    In article <>,
    petertwocakes <> wrote:

    >Richard, I appreciate your help, but why on earth would you suspect
    >this is a troll?


    It had certain characteristics common to recent trolls in this group,
    viz:

    - choose one of the well-known unportable features of C (in
    this case, multi-character character literals);
    - add in a less obvious error (in this case, the implication
    of unaligned access) in the hope that the experts will miss it
    in their rush to give the obvious answer.

    I'm glad to see that it wasn't one.

    -- Richard
    --
    Please remember to mention me / in tapes you leave behind.
    Richard Tobin, Nov 24, 2009
    #5
  6. Re: Why does long w = 'word' fail?

    On 24 Nov, 12:38, (Richard Tobin) wrote:
    > In article <..com>,
    >
    > petertwocakes  <> wrote:
    > >Richard, I appreciate your help, but why on earth would you suspect
    > >this is a troll?

    >
    > It had certain characteristics common to recent trolls in this group,
    > viz:
    >
    >  - choose one of the well-known unportable features of C (in
    >    this case, multi-character character literals);
    >  - add in a less obvious error (in this case, the implication
    >    of unaligned access) in the hope that the experts will miss it
    >    in their rush to give the obvious answer.
    >
    > I'm glad to see that it wasn't one.
    >
    > -- Richard
    > --
    > Please remember to mention me / in tapes you leave behind.


    Ah! :)
    petertwocakes, Nov 24, 2009
    #6
  7. petertwocakes wrote:
    > I'm trying to test a sequence of 4 characters from a ptr buffer
    > against a long, but the test fails even though I think they should
    > have the same value. e.g :
    >
    >
    > char word[4] = {'w', 'o', 'r', 'd'};
    > char *wordPtr = (char*)wordArray;
    > long wordLong = 'word';
    > long *longPtr = (long*)wordArrayPtr;
    > long longPtrVal = *longPtr;
    >
    > Yet, in the end wordLong = 2003792484, but longPtrVal = 1685221239
    > Shouldn't they be same?
    >
    > If not, how do I test 4 character sequences in chunks like this?
    > (without laboriously testing each char individually)
    > I'm in a very large text buffer incrementing the current ptr until I
    > hit "word"


    In a typical non-malicious implementation, the only thing you can test
    the value of a multi-character character sequence of reasonable length
    (one that fits in an 'int') against is another multi-character character
    sequence of reasonable length. I.e. the implementation guarantees that 'word' is
    equal to 'word' and different from 'abcd'. This is the only meaningful
    use of multi-character character sequences. Trying to compare a
    multi-character character sequence to some other value formed in some
    other way (like re-interpretation of a character array, as in your
    example) is asking for trouble. It is not guaranteed to work. And it
    won't work. Stop wasting your time.

    In a malicious implementation all multi-character character sequences
    are actually allowed to evaluate to, say, zero, meaning that formally
    they can be completely useless. Fortunately, this is not normally the
    case in practice.

    --
    Best regards
    Andrey Tarasevich
    Andrey Tarasevich, Nov 24, 2009
    #7
  8. On 2009-11-24, petertwocakes <> wrote:
    > I'm trying to test a sequence of 4 characters from a ptr buffer
    > against a long, but the test fails even though I think they should
    > have the same value. e.g :


    Because the value of a multi-character character constant is
    implementation-defined.

    > char word[4] = {'w', 'o', 'r', 'd'};
    > char *wordPtr = (char*)wordArray;
    > long wordLong = 'word';
    > long *longPtr = (long*)wordArrayPtr;
    > long longPtrVal = *longPtr;


    > Yet, in the end wordLong = 2003792484, but longPtrVal = 1685221239
    > Shouldn't they be same?


    Not necessarily.

    > If not, how do I test 4 character sequences in chunks like this?
    > (without laboriously testing each char individually)
    > I'm in a very large text buffer incrementing the current ptr until I
    > hit "word"


    First off, learn a bit more about your implementation. On a whole lot
    of modern hardware, what you're doing is going to be MUCH more expensive
    than testing individual characters, because you're going to be making
    unaligned accesses -- which can kill you completely or merely be slow.
    If it's running at all, it's probably slow.

    Secondly, there is no intrinsic right answer to the question of what order
    the bytes in a long are stored. x86 systems store the lowest-order byte
    first. You might find it more rewarding to look at the bytes, in memory
    order, of 0x11223344UL. If they're 44, 33, 22, 11, then you would need to
    take that into account.
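
    One way to do that check, as a sketch: copy out the byte at the
    lowest address of the constant and see whether it is 0x44.
    low_byte_first is a hypothetical name; the code assumes only that
    unsigned long is at least 32 bits wide, which the C standard
    guarantees.

```c
#include <string.h>

/* Returns 1 if the byte at the lowest address of an unsigned long is
   its low-order byte (0x44 here, i.e. a little-endian layout), and 0
   otherwise. memcpy is used so no pointer cast is needed. */
int low_byte_first(void)
{
    unsigned long probe = 0x11223344UL;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0x44;
}
```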

    But in practice: strstr(buf, "word") is quite likely to be faster than
    whatever you write.

    -s
    --
    Copyright 2009, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    Seebs, Nov 24, 2009
    #8
  9. petertwocakes <> writes:
    > I'm trying to test a sequence of 4 characters from a ptr buffer
    > against a long, but the test fails even though I think they should
    > have the same value. e.g :
    >
    >
    > char word[4] = {'w', 'o', 'r', 'd'};
    > char *wordPtr = (char*)wordArray;
    > long wordLong = 'word';
    > long *longPtr = (long*)wordArrayPtr;
    > long longPtrVal = *longPtr;
    >
    > Yet, in the end wordLong = 2003792484, but longPtrVal = 1685221239
    > Shouldn't they be same?


    I'm a little surprised nobody else pointed out that you never declared
    wordArray or wordArrayPtr. I think what you meant was:

    char wordArray[4] = {'w', 'o', 'r', 'd'};
    char *wordArrayPtr = (char*)wordArray;
    long wordLong = 'word';
    long *longPtr = (long*)wordArrayPtr;
    long longPtrVal = *longPtr;

    Note that the cast on the second line is unnecessary.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Nov 24, 2009
    #9
  10. petertwocakes <> writes:
    > I'm trying to test a sequence of 4 characters from a ptr buffer
    > against a long, but the test fails even though I think they should
    > have the same value. e.g :
    >
    >
    > char word[4] = {'w', 'o', 'r', 'd'};
    > char *wordPtr = (char*)wordArray;
    > long wordLong = 'word';
    > long *longPtr = (long*)wordArrayPtr;
    > long longPtrVal = *longPtr;
    >
    > Yet, in the end wordLong = 2003792484, but longPtrVal = 1685221239
    > Shouldn't they be same?


    Print them out in hex, and you'll see the problem.
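
    A small sketch of that, assuming the values fit in 32 bits as they do
    here; format_hex32 is a hypothetical helper, not something from the
    original post.

```c
#include <stdio.h>

/* Writes the low 32 bits of v into out as "0x" followed by 8 hex
   digits. out should have room for at least 11 chars. Returns the
   value returned by snprintf (10 when nothing was truncated). */
int format_hex32(unsigned long v, char *out, size_t outsz)
{
    return snprintf(out, outsz, "0x%08lx", v & 0xffffffffUL);
}
```

    Formatting 2003792484 gives 0x776f7264 ('w', 'o', 'r', 'd' from the
    most significant byte down), and 1685221239 gives 0x64726f77, the
    same four bytes in the reverse order.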

    Hint: I guessed the answer before I tried it myself.
    Hint2: I guessed (correctly) you were on an x86 system.

    --
    -Ed Falk,
    http://thespamdiaries.blogspot.com/
    Edward A. Falk, Nov 25, 2009
    #10