Parse variable names from string

Discussion in 'C Programming' started by Victor Lagerkvist, Jul 3, 2007.

  1. Hello, I need to parse variable names from a string and save them
    somewhere safe for future use. Here's my first attempt (I don't have any
    rules for valid names yet), but I have a feeling that it's unnecessarily
    complex. Any input would be greatly appreciated.

    #include <stdio.h>
    #include <stdlib.h>

    int get_ops(char *sen, char ***atom, char limit);

    int main(void)
    {
        char *test = "a, b, c, d"; /* The real input is stripped of spaces */
        char **atom;
        get_ops(test, &atom, ',');
        return 0;
    }

    int get_ops(char *sen, char ***atom, char limit)
    {
        int i, j, k;
        char **tmp1, *tmp2;
        *atom = malloc(sizeof (char *));
        if (*atom == NULL)
            return -2;
        **atom = malloc(1);
        if (**atom == NULL)
            return -2;
        for (i = j = k = 0; sen[i] != '\0'; ++i) {
            if (sen[i] != limit) {
                /* grow the current name by one char, keeping room for '\0' */
                tmp2 = realloc((*atom)[k], j+2);
                if (tmp2 == NULL)
                    return -2;
                (*atom)[k] = tmp2;
                tmp2 = NULL;
                (*atom)[k][j++] = sen[i];
            }
            else {
                /* delimiter: terminate this name and start the next one */
                (*atom)[k++][j] = '\0';
                tmp1 = realloc(*atom, (k +1)*sizeof(char *));
                if (tmp1 == NULL)
                    return -2;
                *atom = tmp1;
                tmp1 = NULL;
                (*atom)[k] = malloc(1);
                if ((*atom)[k] == NULL)
                    return -2;
                j = 0;
            }
        }
        (*atom)[k][j] = '\0'; /* terminate the last name */
        return 0;
    }
     
    Victor Lagerkvist, Jul 3, 2007
    #1

  2. user923005 Guest

    On Jul 3, 12:53 pm, Victor Lagerkvist <> wrote:
    > Hello, I have the need to parse variable names from a string and save them
    > somewhere safe for future usage. Here's my first attempt (I don't have any
    > rules for valid names yet) - but I have a feeling that it's unnecessary
    > complex? Any input would be greatly appreciated.
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    >
    > int get_ops(char *sen, char ***atom, char limit);
    >
    > int main(void)
    > {
    > char *test = "a, b, c, d"; /* The real input is stripped of spaces */
    > char **atom;
    > get_ops(test, &atom, ',');
    > return 0;
    >
    > }
    >
    > int get_ops(char *sen, char ***atom, char limit)
    > {
    > int i, j, k;
    > char **tmp1, *tmp2;
    > *atom = malloc(sizeof (char *));
    > **atom = malloc(1);
    > for (i = j = k = 0; sen[i] != '\0'; ++i) {
    > if (sen[i] != limit) {
    > tmp2 = realloc((*atom)[k], j+2);
    > if (tmp2 == NULL)
    > return -2;
    > (*atom)[k] = tmp2;
    > tmp2 = NULL;
    > (*atom)[k][j++] = sen[i];
    > }
    > else if (sen[i] == limit) {
    > (*atom)[k++][j] = '\0';
    > tmp1 = realloc(*atom, (k +1)*sizeof(char *));
    >
    > if (tmp1 == NULL)
    > return -2;
    > *atom = tmp1;
    > tmp1 = NULL;
    > (*atom)[k] = malloc(1);
    > j = 0;
    > }
    > }
    > return 0;
    >
    > }


    I guess that it is not nearly complex enough.
    If you are gathering variable names from {presumably} C source code,
    it will have to be fully grammar aware.
    Normally, parsers put variable names into a hash table.
    I suggest that you get an existing C parser, and just read the
    variable list it creates when it scans a source file.
    Here is a place to find a C grammar:
    http://www.devincook.com/goldparser/grammars/index.htm
    It works with the Gold Parser.
    There are C grammars all over the place, so I am sure you can find one
    for YACC or Antlr or whatever.
     
    user923005, Jul 3, 2007
    #2

  3. user923005 wrote:

    > On Jul 3, 12:53 pm, Victor Lagerkvist <> wrote:
    >> Hello, I have the need to parse variable names from a string and save
    >> them somewhere safe for future usage. Here's my first attempt (I don't
    >> have any rules for valid names yet) - but I have a feeling that it's
    >> unnecessary complex? Any input would be greatly appreciated.
    >>
    >> #include <stdio.h>
    >> #include <stdlib.h>
    >>
    >> int get_ops(char *sen, char ***atom, char limit);

    <snip>
    > I guess that it is not nearly complex enough.
    > If you are gathering variable names from {presumably} C source code,
    > it will have to be fully grammar aware.
    > Normally, parsers put variable names into a hash table.
    > I suggest that you get an existing C parser, and just read the
    > variable list it creates when it scans a source file.
    > Here is a place to find a C grammar:
    > http://www.devincook.com/goldparser/grammars/index.htm
    > It works with the Gold Parser.
    > There are C grammars all over the place, so I am sure you can find one
    > for YACC or Antlr or whatever.

    Actually, the only functionality I truly need is the names of the variables
    (there's only one "type") and the number of them - everything else is a
    bonus! They are given by the user from standard input, line by line, such
    as:
    << build a, b, c

    And that's all there is (more or less any types of names should be allowed),
    and for some reason I usually become a sad panda when the code "runs away".
     
    Victor Lagerkvist, Jul 3, 2007
    #3
  4. user923005 Guest

    On Jul 3, 2:47 pm, Victor Lagerkvist <> wrote:
    > user923005 wrote:
    > > On Jul 3, 12:53 pm, Victor Lagerkvist <> wrote:
    > >> Hello, I have the need to parse variable names from a string and save
    > >> them somewhere safe for future usage. Here's my first attempt (I don't
    > >> have any rules for valid names yet) - but I have a feeling that it's
    > >> unnecessary complex? Any input would be greatly appreciated.

    >
    > >> #include <stdio.h>
    > >> #include <stdlib.h>

    >
    > >> int get_ops(char *sen, char ***atom, char limit);

    > <snip>
    > > I guess that it is not nearly complex enough.
    > > If you are gathering variable names from {presumably} C source code,
    > > it will have to be fully grammar aware.
    > > Normally, parsers put variable names into a hash table.
    > > I suggest that you get an existing C parser, and just read the
    > > variable list it creates when it scans a source file.
    > > Here is a place to find a C grammar:
    > >http://www.devincook.com/goldparser/grammars/index.htm
    > > It works with the Gold Parser.
    > > There are C grammars all over the place, so I am sure you can find one
    > > for YACC or Antlr or whatever.

    >
    > Actually, the only functionality I truly need is the names of the variables
    > (there's only one "type") and the number of them - everything else is a
    > bonus! They are given by the user from standard input, line by line, such
    > as:
    > << build a, b, c
    >
    > And that's all there is (more or less any types of names should be allowed),
    > and for some reason I usually become a sad panda when the code "runs away".
    >


    In that case, why not just store them in a hash table to ensure you do
    not have duplicates?
     
    user923005, Jul 3, 2007
    #4
  5. user923005 wrote:

    > On Jul 3, 2:47 pm, Victor Lagerkvist <> wrote:
    >> user923005 wrote:
    >> > On Jul 3, 12:53 pm, Victor Lagerkvist <> wrote:
    >> >> Hello, I have the need to parse variable names from a string and save
    >> >> them somewhere safe for future usage. Here's my first attempt (I don't
    >> >> have any rules for valid names yet) - but I have a feeling that it's
    >> >> unnecessary complex? Any input would be greatly appreciated.

    >>
    >> >> #include <stdio.h>
    >> >> #include <stdlib.h>

    >>
    >> >> int get_ops(char *sen, char ***atom, char limit);

    >> <snip>
    >> > I guess that it is not nearly complex enough.
    >> > If you are gathering variable names from {presumably} C source code,
    >> > it will have to be fully grammar aware.
    >> > Normally, parsers put variable names into a hash table.
    >> > I suggest that you get an existing C parser, and just read the
    >> > variable list it creates when it scans a source file.
    >> > Here is a place to find a C grammar:
    >> >http://www.devincook.com/goldparser/grammars/index.htm
    >> > It works with the Gold Parser.
    >> > There are C grammars all over the place, so I am sure you can find one
    >> > for YACC or Antlr or whatever.

    >>
    >> Actually, the only functionality I truly need is the names of the
    >> variables (there's only one "type") and the number of them - everything
    >> else is a bonus! They are given by the user from standard input, line by
    >> line, such as:
    >> << build a, b, c
    >>
    >> And that's all there is (more or less any types of names should be
    >> allowed), and for some reason I usually become a sad panda when the code
    >> "runs away".
    >>

    >
    > In that case, why not just store them in a hash table to ensure you do
    > not have duplicates.

    Ah, that would no doubt be sleek in this case. I thank thee for thine input.
     
    Victor Lagerkvist, Jul 4, 2007
    #5
  6. "Victor Lagerkvist" <> wrote in message
    news:WOxii.3374$...
    > Hello, I have the need to parse variable names from a string and save them
    > somewhere safe for future usage. Here's my first attempt (I don't have any
    > rules for valid names yet) - but I have a feeling that it's unnecessary
    > complex? Any input would be greatly appreciated.
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    >
    > int get_ops(char *sen, char ***atom, char limit);
    >
    > int main(void)
    > {
    > char *test = "a, b, c, d"; /* The real input is stripped of spaces */
    > char **atom;
    > get_ops(test, &atom, ',');
    > return 0;
    > }
    >
    > int get_ops(char *sen, char ***atom, char limit)
    > {
    > int i, j, k;
    > char **tmp1, *tmp2;
    > *atom = malloc(sizeof (char *));
    > **atom = malloc(1);
    > for (i = j = k = 0; sen[i] != '\0'; ++i) {
    > if (sen[i] != limit) {
    > tmp2 = realloc((*atom)[k], j+2);
    > if (tmp2 == NULL)
    > return -2;
    > (*atom)[k] = tmp2;
    > tmp2 = NULL;
    > (*atom)[k][j++] = sen[i];
    > }
    > else if (sen[i] == limit) {
    > (*atom)[k++][j] = '\0';
    > tmp1 = realloc(*atom, (k +1)*sizeof(char *));
    >
    > if (tmp1 == NULL)
    > return -2;
    > *atom = tmp1;
    > tmp1 = NULL;
    > (*atom)[k] = malloc(1);
    > j = 0;
    > }
    > }
    > return 0;
    > }
    >

    If you impose an order on the variables in the string, sscanf() will read
    them in one go.
    If you allow variables in any order, it is a bit more complicated. You need
    getword() and possibly getvalue() functions, which gobble up an input
    string.

    You might want to take a look at the option parser from my website. It
    solves a very similar problem to yours, except that the variables are
    contained in the program's arguments rather than concatenated together
    into a string.
    --
    Free games and programming goodies.
    http://www.personal.leeds.ac.uk/~bgy1mm
     
    Malcolm McLean, Jul 5, 2007
    #6
