tokens

Discussion in 'C Programming' started by mdh, Aug 11, 2008.

  1. mdh

    mdh Guest

    Hi all,
    Page 125 of K&R2 gives rise to this issue for me.

    Is it true that a "token" in C (philosophically) is the least amount
    of digits/chars/underscores/*s (and other non-blank characters that I
    have not thought of) that the compiler uses to derive useable
    information? So, this would be a token

    " ( ) "

    but this

    "( "

    by itself would not?

    Thanks as usual.
     
    mdh, Aug 11, 2008
    #1

  2. On Sun, 10 Aug 2008 16:20:02 -0700, mdh wrote:
    > Hi all,
    > Page 125 of K&R2 gives rise to this issue for me.
    >
    > Is it true that a "token" in C (philosophically) is the least amount
    > of digits/chars/underscores/*s (and other non-blank characters that I have
    > not thought of) that the compiler uses to derive useable information?


    In a way.

    > So, this would be a token
    >
    > " ( ) "


    If you mean including the quotation marks, then yes. Otherwise, no.

    > but this
    >
    > "( "
    >
    > by itself would not?


    So, no. A token is a word. A token is the largest string of characters
    that cannot have whitespace inserted without breaking it into smaller
    tokens, changing its meaning. () is not a token. It is two tokens, which
    you can tell from the fact that you can separate the two by writing ( ).
    If it were a single token, this would not be allowed. "" is a token,
    because you cannot separate the quotation marks by writing " " -- at least
    not without changing its meaning.
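
    A minimal sketch of that test, assuming nothing beyond standard C (the
    function names are made up for illustration): whitespace between two
    tokens changes nothing, whitespace inside one token does.

```c
#include <string.h>

static int answer(void) { return 42; }

/* "()" and "( )" are the same two tokens, so these two calls are identical. */
int call_without_space(void) { return answer(); }
int call_with_space(void)    { return answer ( ) ; }

/* A string literal is a single token: a space between the quotation
   marks does not split it, it changes its contents. */
size_t empty_length(void) { return strlen(""); }   /* empty string */
size_t space_length(void) { return strlen(" "); }  /* one-character string */
```

    Both call functions return the same value, while the two strlen()
    results differ, because the space became part of the literal.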

    > Thanks as usual.
     
    Harald van Dijk, Aug 11, 2008
    #2

  3. mdh

    mdh Guest

    On Aug 10, 4:40 pm, Harald van Dijk <> wrote:
    > On Sun, 10 Aug 2008 16:20:02 -0700, mdh wrote:


    > >...... a "token" in C (philosophically) is the least amount
    > > of digits/chars/underscores/*s (and other non-blank characters that I have
    > > not thought of) that the compiler uses to derive useable information.

    >
    >
    >
    > ...... A token is a word. A token is the largest string of characters..........
    > changing its meaning.


    So,

    (

    is a token as it has some meaning to the compiler.... "get some more
    characters, see if they are followed by a ) and if not, there is an
    error" etc? Or "found the closing ) so the comma-delimited list is the
    arguments" etc.


    Is that why my example of ( ) is not a single token as the first "("
    could be the start of a lot of different things?
     
    mdh, Aug 11, 2008
    #3
  4. On Sun, 10 Aug 2008 17:01:18 -0700, mdh wrote:
    > Is that why my example of ( ) is not a single token as the first "("
    > could be the start of a lot of different things?


    No, your example of ( ) is not a single token because the two can be
    separated.
     
    Harald van Dijk, Aug 11, 2008
    #4
  5. mdh

    mdh Guest

    On Aug 10, 5:11 pm, Harald van Dijk <> wrote:
    > On Sun, 10 Aug 2008 17:01:18 -0700, mdh wrote:
    > > Is that why my example of ( ) is not a single token as the first "("
    > > could be the start of a lot of different things?

    >
    > No, your example of ( ) is not a single token because the two can be
    > separated.


    Sorry...then I did not make myself clear. I am agreeing with you. ()
    are 2 tokens, as each ( has a meaning to the compiler.

    The reason this is somewhat confusing is that on p125 of K&R2, they
    define "tokens" ( and the quotation marks are theirs, not mine) as a
    pair of parentheses, a pair of brackets perhaps including a number. I
    assume that their definition is for the sake of the example, then.
     
    mdh, Aug 11, 2008
    #5
  6. On Sun, 10 Aug 2008 17:16:15 -0700, mdh wrote:
    > On Aug 10, 5:11 pm, Harald van Dijk <> wrote:
    >> On Sun, 10 Aug 2008 17:01:18 -0700, mdh wrote:
    >> > Is that why my example of ( ) is not a single token as the first "("
    >> > could be the start of a lot of different things?

    >>
    >> No, your example of ( ) is not a single token because the two can be
    >> separated.

    >
    > Sorry...then I did not make myself clear. I am agreeing with you. () are
    > 2 tokens, as each ( has a meaning to the compiler.


    I understood that you agreed that ( ) are two tokens, but I did not agree
    with my understanding of your reasoning. It is possible I misunderstood
    you, so I will give a different example. - can also be the start of a lot
    of different things, such as -= or -- or ->, but those three are three
    single tokens, each consisting of two characters. You cannot decrement a
    variable using a- - or a- =b. You cannot dereference a pointer to a
    structure using a- >b. In other words, you cannot separate the - from the
    second character. At the same time, - in 3-2 is a token by itself.
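
    As a rough sketch of the above (struct point and demo_minus are made-up
    names, used only for illustration):

```c
struct point { int x; };

int demo_minus(void)
{
    struct point pt = { 10 };
    struct point *p = &pt;
    int a = 5, b = 2;

    a -= b;          /* "-=" is one token: a is now 3 */
    a--;             /* "--" is one token: a is now 2 */
    a = a + p->x;    /* "->" is one token: a is now 12 */
    a = a - 3-2;     /* here each "-" is a token by itself: (12 - 3) - 2 */

    /* Writing "a - = b", "a- -" or "p- >x" instead would not compile,
       because the space splits each operator into two separate tokens. */
    return a;
}
```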

    > The reason this is somewhat confusing is that on p125 of K&R2, they
    > define "tokens" ( and the quotation marks are theirs, not mine) as a
    > pair of parentheses, a pair of brackets perhaps including a number.


    This seems strange, but...

    > I assume that their definition is for the sake of the example, then.


    without knowing the context, I cannot be sure. I don't have K&R, so
    hopefully someone else will comment.
     
    Harald van Dijk, Aug 11, 2008
    #6
  7. mdh

    mdh Guest

    On Aug 10, 5:44 pm, Harald van Dijk <> wrote:
    > On Sun, 10 Aug 2008 17:16:15 -0700, mdh wrote:
    > > I am agreeing with you. () are


    >
    > I understood that you agreed that ( ) are two tokens, but I did not agree
    > with my understanding of your reasoning. It is possible I misunderstood
    > you, so I will give a different example.


    ok... I see what you mean.


    >
    > This seems strange, but...
    >
    > > I assume that their definition is for the sake of the example, then.

    >
    > without knowing the context, I cannot be sure. I don't have K&R, so
    > hopefully someone else will comment.


    Thank you Harald.
     
    mdh, Aug 11, 2008
    #7
  8. On Sun, 10 Aug 2008 17:16:15 -0700 (PDT), mdh <>
    wrote:

    >On Aug 10, 5:11 pm, Harald van Dijk <> wrote:
    >> On Sun, 10 Aug 2008 17:01:18 -0700, mdh wrote:
    >> > Is that why my example of ( ) is not a single token as the first "("
    >> > could be the start of a lot of different things?

    >>
    >> No, your example of ( ) is not a single token because the two can be
    >> separated.

    >
    >Sorry...then I did not make myself clear. I am agreeing with you. ()
    >are 2 tokens, as each ( has a meaning to the compiler.
    >
    >The reason this is somewhat confusing is that on p125 of K&R2, they
    >define "tokens" ( and the quotation marks are theirs, not mine) as a
    >pair of parentheses, a pair of brackets perhaps including a number. I
    >assume that their definition is for the sake of the example, then.


    On page 125 K&R use the word token to describe the unique processing
    of a function they have provided. They enclosed it in quotes to
    indicate the word is being used with a meaning other than its normal
    one. In fact, it has at least two normal meanings within C.

    One is its use to describe the processing of strtok(). In
    this context, any string between the specified delimiters is a token.
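
    A rough sketch of that first meaning; split_tokens and the delimiter
    set " \t," are made up for illustration and are not taken from K&R:

```c
#include <string.h>

/* Splits line in place on spaces, tabs and commas. Stores up to max
   token pointers in out and returns how many were stored. */
size_t split_tokens(char *line, char *out[], size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, " \t,");

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, " \t,");   /* NULL: continue with the same string */
    }
    return n;
}
```

    Called on a writable buffer holding "add 12, 30", this yields the three
    tokens add, 12 and 30. Choosing a different delimiter string gives a
    different idea of what a token is.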

    The other is the meaning compiler writers use when describing
    parse algorithms. In C, the expression a+++b is guaranteed to be
    evaluated as
    a++ + b
    and not
    a + ++b
    because C uses a maximum munch rule for identifying tokens.
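
    A small sketch of that rule in action; munch_sum is a made-up helper,
    and the two out-parameters exist only so the effect on a and b can be
    inspected afterwards:

```c
/* Computes a+++b and reports the values of a and b afterwards. */
int munch_sum(int a, int b, int *a_after, int *b_after)
{
    int sum = a+++b;   /* tokenised as a ++ + b, i.e. (a++) + b */

    *a_after = a;      /* a was incremented */
    *b_after = b;      /* b was left alone */
    return sum;
}
```

    With a = 1 and b = 2 the sum is 3 and a ends up as 2. Had the
    expression been read as a + ++b, the sum would be 4 and b would have
    changed instead.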

    --
    Remove del for email
     
    Barry Schwarz, Aug 11, 2008
    #8
  9. Bartc

    Bartc Guest

    "Harald van Dijk" <> wrote in message
    news:9a55e$489f83bb$541dfcd3$1.nb.home.nl...
    > On Sun, 10 Aug 2008 17:01:18 -0700, mdh wrote:
    >> Is that why my example of ( ) is not a single token as the first "("
    >> could be the start of a lot of different things?

    >
    > No, your example of ( ) is not a single token because the two can be
    > separated.


    += is a single token, but it can be separated into two tokens + =

    --
    Bartc
     
    Bartc, Aug 11, 2008
    #9
  10. On 2008-08-11, Bartc <> wrote:
    >
    > "Harald van Dijk" <> wrote in message
    > news:9a55e$489f83bb$541dfcd3$1.nb.home.nl...
    >> On Sun, 10 Aug 2008 17:01:18 -0700, mdh wrote:
    >>> Is that why my example of ( ) is not a single token as the first "("
    >>> could be the start of a lot of different things?

    >>
    >> No, your example of ( ) is not a single token because the two can be
    >> separated.

    >
    > += is a single token, but it can be separated into two tokens + =
    >


    Yes, but that changes its meaning from a single "add and assign" to the
    individual tokens "plus or positive" and "assignment", which is what
    Harald wrote earlier in this thread.

    --
    Andrew Poelstra
    To email me, use the above email address with .com set to .net
     
    Andrew Poelstra, Aug 11, 2008
    #10
  11. mdh

    mdh Guest

    On Aug 10, 9:12 pm, Barry Schwarz <> wrote:
    > On Sun, 10 Aug 2008 17:16:15 -0700 (PDT), mdh <>
    > wrote:
    >
    >
    > >.............. on p125 of K&R2, they
    > >define "tokens"   ( and the quotation marks are theirs, not mine) as a
    > >pair of parentheses, a pair of brackets perhaps including a number. I
    > >assume that their definition is for the sake of the example, then.

    >
    > On page 125 K&R use the word token to describe the unique processing
    > of a function they have provided.  They enclosed it in quotes to
    > indicate the word is being used with a meaning other than its normal
    > one.  In fact, it has at least two normal meanings within C.
    >
    >         One is its use to describe the processing of strtok().  In
    > this context, any string between the specified delimiters is a token.
    >
    >         The other is the meaning compiler writers use............




    So you are saying that in the example on p125, K&R are defining their
    meaning of a token (in their function "gettoken") , in the same way
    one can apparently decide how to define a token (via the
    delimiters) with strtok in <string.h>

    And...

    what some posters are using to mean "token" is the usual way compiler
    writers use the word "token"?
     
    mdh, Aug 11, 2008
    #11
  12. Bartc

    Bartc Guest

    mdh wrote:
    > On Aug 10, 9:12 pm, Barry Schwarz <> wrote:
    >> On Sun, 10 Aug 2008 17:16:15 -0700 (PDT), mdh <>
    >> wrote:
    >>
    >>
    >>> .............. on p125 of K&R2, they
    >>> define "tokens" ( and the quotation marks are theirs, not mine) as a
    >>> pair of parentheses, a pair of brackets perhaps including a number.
    >>> I assume that their definition is for the sake of the example, then.

    >>
    >> On page 125 K&R use the word token to describe the unique processing
    >> of a function they have provided. They enclosed it in quotes to
    >> indicate the word is being used with a meaning other than its normal
    >> one. In fact, it has at least two normal meanings within C.
    >>
    >> One is its use to describe the processing of strtok(). In
    >> this context, any string between the specified delimiters is a token.
    >>
    >> The other is the meaning compiler writers use............


    > So you are saying that in the example on p125, K&R are defining their
    > meaning of a token (in their function "gettoken") , in the same way
    > one can apparently decide how to define a token (via the
    > delimiters) with strtok in <string.h>


    The K&R function applies its own meaning to 'token' (I'm not familiar with
    strtok()).

    When you write token-parsing code, you can make up your own rules. In this
    function, () /is/ a single token.

    > what some posters are using to mean "token" is the usual way compiler
    > writers use the word "token"?


    The C language has its own set of symbols considered tokens (and its own
    rules for forming them and deciding where one starts and another ends).
    Other languages will have their own symbols and rules.

    The strtok() function seems designed to provide a crude way of parsing
    text, for example to separate out numbers and words from user input.
    Compiler parsers are more sophisticated and targeted at a specific language.

    --
    Bartc
     
    Bartc, Aug 11, 2008
    #12
  13. mdh

    mdh Guest

    On Aug 11, 2:02 pm, "Bartc" <> wrote:

    > >

    > The K&R function applies its own meaning to 'token' (I'm not familiar with
    > strtok()).
    >
    > When you write token-parsing code, you can make up your own rules. In this
    > function, () /is/ a single token.


    .....

    snip

    ......


    > The C language has its own set of symbols considered tokens (and its own
    > rules for forming them and deciding where one starts and another ends).
    > Other languages will have their own symbols and rules:
    >



    Thanks...that's what I finally realized, viz. a token depends upon many
    things, but the bottom line is that a token is a string with a
    meaning, its length, meaning etc. being dependent upon whoever decides
    what it is for that particular system.
    Thanks again for your help.
     
    mdh, Aug 11, 2008
    #13
  14. On 11 Aug, 05:12, Barry Schwarz <> wrote:

    <snip>

    > On page 125 K&R use the word token to describe the unique processing
    > of a function they have provided.  They enclosed it in quotes to
    > indicate the word is being used with a meaning other than its normal
    > one.  In fact, it has at least two normal meanings within C.
    >
    >         One is its use to describe the processing of strtok().  In
    > this context, any string between the specified delimiters is a token.
    >
    >         The other is the meaning compiler writers use when describing
    > parse algorithms.  In C, the expression a+++b is guaranteed to be
    > evaluated as
    >    a++ + b
    > and not
    >    a + ++b
    > because C uses a maximum munch rule for identifying tokens.


    The C programming language definition also specifies
    "preprocessor tokens" which (I think) are subtly different
    from "tokens".
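
    One place where the difference shows up, as a hedged sketch (PASTE and
    the two functions are made-up names): the preprocessor works on
    preprocessing tokens, and only afterwards are those converted into the
    tokens the compiler proper sees. So two preprocessing tokens can be
    glued into one before that conversion happens.

```c
#define PASTE(a, b) a##b   /* join two preprocessing tokens into one */

int paste_number(void)
{
    return PASTE(12, 34);        /* becomes the single number 1234 */
}

int paste_keyword(void)
{
    /* To the preprocessor "in" and "t" are just identifiers. Pasted
       together they spell int, which only later becomes a keyword. */
    PASTE(in, t) value = 5;
    return value;
}
```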


    --
    Nick Keighley
     
    Nick Keighley, Aug 12, 2008
    #14