Substrings and so on

Discussion in 'C Programming' started by Vicent, Jan 26, 2010.

  1. Vicent

    Vicent Guest

    I posted this also in comp.lang.c++, so sorry for multiposting.

    I would like to ask you about standard or usual ways to work with
    files or strings, especially when getting input data and writing output
    from an algorithm.

    I mean: which structures, data types or classes? Which standard ways
    to read from and write to files?

    I've read some tutorials that deal with the standard C I/O and string
    (string.h) libraries, but especially when managing strings, I am a bit
    lost: Are there methods or functions to get substrings from a string,
    or to take "spaces" ("blanks") away (a typical "wrap" function)??

    About reading data from a text file, I think this is called "parsing".
    Is there any "parsing" library???

    Sorry if my questions are too naive, but I am a beginner.

    Thank you very much in advance!

    --
    Vicent
    Vicent, Jan 26, 2010
    #1

  2. Vicent

    Tom St Denis Guest

    On Jan 26, 6:40 am, Vicent <> wrote:
    > I posted this also in comp.lang.c++, so sorry for multiposting.
    >
    > I would like to ask you about standard or usual ways to manage with
    > files or strings, specially when getting input data and writing output
    > from an algorithm.
    >
    > I mean: which structures, data types or classes? Which standard ways
    > to read/write from/on files?
    >
    > I've read some tutorials that deal with the standard C I/O and string
    > (string.h) libraries, but specially when managing strings, I am a bit
    > lost: Are there methods or functions to get substrings from a string,
    > or to take "spaces" ("blanks") away (a typical "wrap" function)??
    >
    > About reading data from a text file, I think this is called "parsing".
    > Is there any "parsing" library???
    >
    > Sorry if my questions are too naive, but I am a beginner.
    >
    > Thank you very much in advance!


    Sounds more like you have a comp.sci problem than a C problem, as in:
    learn how to manipulate data first, then pick a language to express it.

    Also, pick a single language and go with it. There is no C/C++ or
    whatever. If you want to learn how to manipulate strings in C++
    that's fine, but at what I'm guessing is your level I'd stick to one or
    the other, especially since they're not related.

    Tom
    Tom St Denis, Jan 26, 2010
    #2

  3. Vicent

    Flash Gordon Guest

    Vicent wrote:
    > I posted this also in comp.lang.c++, so sorry for multiposting.


    If you are using C++ then I believe C++ provides a lot more
    string facilities than C.

    > I would like to ask you about standard or usual ways to manage with
    > files or strings, specially when getting input data and writing output
    > from an algorithm.
    >
    > I mean: which structures, data types or classes? Which standard ways
    > to read/write from/on files?


    C only has one string data structure (which is not a type), and that is
    the nul terminated string. For input/output you have the functions in
    stdio.h

    A number of people have string libraries which they have written
    themselves, which use more complex structures. However, these are not
    standard.

    > I've read some tutorials that deal with the standard C I/O and string
    > (string.h) libraries,


    That's what you get.

    > but specially when managing strings, I am a bit
    > lost: Are there methods or functions to get substrings from a string,
    > or to take "spaces" ("blanks") away (a typical "wrap" function)??


    Not in C. You either have to write your own or get a non-standard
    library that someone else wrote.

    > About reading data from a text file, I think this is called "parsing".
    > Is there any "parsing" library???


    Reading the text and parsing it are different tasks. Reading is getting
    it into memory, parsing is breaking it apart into useful chunks. Some
    functions, e.g. fscanf, do both tasks. Generally, in my opinion, the best
    way is often to read a line at a time into memory and then parse the
    line entirely in memory.
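
    A minimal sketch of the "read a line, then parse it in memory" idea in
    standard C. The helper name read_line and its return codes are made up
    for illustration; it treats a line that does not fit in the buffer
    together with its newline as an error instead of silently splitting it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf and strip the newline.
 * Returns 1 on success, 0 at end of file, -1 if the line did not fit
 * (the rest of that line is discarded so the next call starts cleanly). */
int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int c;

    if (fgets(buf, (int)size, fp) == NULL)
        return 0;                      /* EOF or read error */

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';           /* complete line: drop the newline */
        return 1;
    }
    if (feof(fp))
        return 1;                      /* last line, no trailing newline */

    while ((c = getc(fp)) != EOF && c != '\n')
        ;                              /* too long: skip the remainder */
    return -1;
}
```

    The caller loops while read_line() returns non-zero and hands each
    buffer to whatever parsing code it likes; reading and parsing stay
    separate.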

    > Sorry if my questions are too naive, but I am a beginner.


    The first thing you need to do is decide which language you are trying
    to learn. If it is C++ then the best answers are likely to be very
    different and could include classes or templates or something else which
    C does not have.
    --
    Flash Gordon
    Flash Gordon, Jan 26, 2010
    #3
  4. In article <>, Vicent <> writes:
    > I posted this also in comp.lang.c++, so sorry for multiposting.
    >
    > I would like to ask you about standard or usual ways to manage with
    > files or strings, specially when getting input data and writing output
    > from an algorithm.
    >
    > I mean: which structures, data types or classes? Which standard ways
    > to read/write from/on files?


    For C++, I guess you'd look first at std::string and std::iostream.
    Google them, or obtain a draft or an actual edition of the ISO C++
    standard (ISO/IEC 14882). A draft might be available at the C++
    Standards Committee's site:

    http://www.open-std.org/jtc1/sc22/wg21/

    Chapter 21, Strings library
    Chapter 27, Input/output library

    The Qt or Boost libraries may prove helpful as well.

    http://doc.trolltech.com/4.6-snapshot/qstring.html
    http://www.boost.org/doc/libs/1_41_0/libs/libraries.htm#String


    For C: don't start with it. Low-level string manipulation is one of the
    most error-prone tasks in general, leading to countless security
    vulnerabilities.


    > I've read some tutorials that deal with the standard C I/O and string
    > (string.h) libraries, but specially when managing strings, I am a bit
    > lost: Are there methods or functions to get substrings from a string,
    > or to take "spaces" ("blanks") away (a typical "wrap" function)??


    (That would be a typical "trim" function I guess.) In my opinion, the
    "string interface" provided by standard C (or by versions of the Single
    Unix Specification) is much lower-level than you'd need; definitely
    not for a beginner with higher abstraction needs. I suggest you switch
    to another language supporting high-level string manipulation (Perl,
    Python, Ruby etc) or grab a strings library. A discussion on them
    occurred on Reddit some time ago; several libraries were mentioned:

    http://www.reddit.com/r/programming/comments/abh76/what_string_type_should_i_use_for_a_c_project

    When posting to that topic, I stumbled upon the following comparison
    page:

    http://www.and.org/vstr/comparison


    > About reading data from a text file, I think this is called "parsing".
    > Is there any "parsing" library???


    Especially in relation to parsing: don't start writing parsers in C. If
    you must, stick to whole-line input (with bounded length) and regular
    expressions. One such regex library is PCRE:

    http://www.pcre.org/

    But the Single Unix Specification defines a regex facility too.

    http://www.opengroup.org/onlinepubs/007908775/xsh/regex.h.html

    If you insist on consuming lines of arbitrary length, consider the
    getline() GNU libc extension.

    http://www.gnu.org/s/libc/manual/html_node/Line-Input.html#Line-Input

    Localized low-level text processing (put very crudely: anything
    non-ASCII) requires even more caution, so don't start with that either.
    Some C and C++ libraries should support it transparently, though.

    HTH,
    lacos
    Ersek, Laszlo, Jan 26, 2010
    #4
  5. Vicent

    Vicent Guest


    > Sounds more like you have a comp.sci problem than a C problem, as in
    > learn how to manipulate data first then pick a language to express it.
    >
    > Also, pick a single language and go with it.  There is no C/C++ or
    > whatever.  If you want to learn how to manipulate strings in C++
    > that's fine but at what I'm guessing is your level I'd stick to one or
    > another, specially since they're not related.
    >
    > Tom


    Tom,

    Thank you for your answer.

    I've chosen C++, because I need to program some algorithms and I think
    it is a good choice for that purpose.

    So, my problem is about how to read files in C++, I think.
    Vicent, Jan 26, 2010
    #5
  6. Vicent

    Vicent Guest

    On 26 ene, 14:07, Flash Gordon <> wrote:

    > If you are using C++ then I believe C++ provides a lot more
    > string facilities than C.


    Yes, I realize that...


    > A number of people have string libraries which they have written
    > themselves, which use more complex structures. However, these are not
    > standard.


    OK. I didn't know that, although I was suspecting it.


    > Reading the text and parsing it are different tasks. Reading is getting
    > it into memory, parsing is breaking it apart into useful chunks. Some
    > functions, e.g. fscanf, do both tasks. Generally, in my opinion, the best
    > way is often to read a line at a time into memory and then parse the
    > line entirely in memory.


    Yes, that was my idea, in fact --First, I read a line. Then, I try to
    get the information from that line into some variables in my
    algorithm.

    > The first thing you need to do is decide which language you are trying
    > to learn. If it is C++ then the best answers are likely to be very
    > different and could include classes or templates or something else which
    > C does not have.


    I guess I'll stay with C++.

    Thank you!
    Vicent, Jan 26, 2010
    #6
  7. Vicent

    Eric Sosman Guest

    On 1/26/2010 11:14 AM, Vicent wrote:
    > [...]
    > I've chosen C++, because I need to program some algorithms and I think
    > it is a good choice for that purpose.
    >
    > So, my problem is about how to read files in C++, I think.


    Perhaps the kind people on the comp.lang.c++ forum
    would be better able to assist you with that language.
    Since the I/O features of C++ differ quite a lot from
    those of C, and since I/O is what you're interested in ...

    --
    Eric Sosman
    Eric Sosman, Jan 26, 2010
    #7
  8. Vicent

    Vicent Guest

    On 26 ene, 15:20, (Ersek, Laszlo) wrote:
    > In article <>, Vicent <> writes:


    > For C++, I guess you'd look first at std::string and std::iostream.


    Thank you! That's a point to start.


    > Google them, or obtain a draft or an actual edition of the ISO C++
    > standard (ISO/IEC 14882). A draft might be available at the C++
    > Standards Committee's site:
    >
    > http://www.open-std.org/jtc1/sc22/wg21/
    >
    > Chapter 21, Strings library
    > Chapter 27, Input/output library


    That's a great link!! Thanks.


    > The Qt or Boost libraries may prove helpful as well.
    >
    > http://doc.trolltech.com/4.6-snapsh...org/doc/libs/1_41_0/libs/libraries.htm#String
    >


    Good links, also. :)


    > For C: don't start with it. Low-level string manipulation is one of the
    > most error-prone tasks in general, leading to countless security
    > vulnerabilities.


    OK. Everyone tells me to avoid using C-strings, so...

    >
    > > I've read some tutorials that deal with the standard C I/O and string
    > > (string.h) libraries, but specially when managing strings, I am a bit
    > > lost: Are there methods or functions to get substrings from a string,
    > > or to take "spaces" ("blanks") away (a typical "wrap" function)??

    >
    > (That would be a typical "trim" function I guess.)


    Yes, yes, sorry, I meant "trim", not "wrap". I miss those simple
    "trim" functions at Visual Basic and PL/SQL Oracle...

    > In my opinion, the
    > "string interface" provided by standard C (or by versions of the Single
    > Unix Specification) is much lower-level than you'd need; definitely
    > not for a beginner with higher abstraction needs. I suggest you switch
    > to another language supporting high-level string manipulation (Perl,
    > Python, Ruby etc) or grab a strings library. A discussion on them
    > occurred on Reddit some time ago; several libraries were mentioned:
    >
    > http://www.reddit.com/r/programming/comments/abh76/what_string_type_s...
    >
    > When posting to that topic, I stumbled upon the following comparison
    > page:
    >
    > http://www.and.org/vstr/comparison
    >


    OK, that's all very interesting. I see that other people had the same
    problem before me!


    > > About reading data from a text file, I think this is called "parsing".
    > > Is there any "parsing" library???

    >
    > Especially in relation to parsing: don't start writing parsers in C. If
    > you must, stick to whole-line input (with bounded length) and regular
    > expressions. One such regex library is PCRE:
    >
    > http://www.pcre.org/
    >
    > But the Single Unix Specification defines a regex facility too.
    >
    > http://www.opengroup.org/onlinepubs/007908775/xsh/regex.h.html
    >
    > If you insist on consuming lines of arbitrary length, consider the
    > getline() GNU libc extension.
    >
    > http://www.gnu.org/s/libc/manual/html_node/Line-Input.html#Line-Input
    >
    > Localized low-level text processing (put very crudely: anything
    > non-ASCII) requires even more caution, so don't start with that either.
    > Some C and C++ libraries should support it transparently, though.



    What I exactly need to do is the following:

    While there are still new lines:
    (1) Get one line from a given text file.
    (2) In that line, detect a "first" part and a "second part", which are
    separated by a "=" symbol.
    (3) Take away the possible "blanks" (like a "trim" function would do)
    from those parts.
    (4) Detect which variable in my program is being referred by the
    "first part".
    (5) Translate the second part (it is still a "string") into a number.

    - About #1 : It can be done by means of standard I/O C libraries. I
    guess that there are also ways to do it with C++ libraries.

    - About #2 : It would be as simple as detecting the position of "="
    and then getting two substrings. I don't understand why this step is so
    difficult to perform in C!!!! I mean: there IS a C standard function
    for getting the position of a character (it is "strchr"), but not a
    function for substrings (unless it is a substring that starts at
    position 1, which can be done with "strncpy_s"). Is it easier in C++??

    - About #3 : I would only need an equivalent of VB's "trim"
    function... Is there anything like that in C++?

    - About #4 : I can do this by using a "case" or an "if" statement. No
    problem at all with this step, provided that "first part" has been
    successfully extracted and trimmed.

    - About #5 : I hope that a proper casting statement will be enough.


    So, do you think that the C++ std::string and std::iostream classes are
    the right choice for me??

    Thank you in advance for your feed-back!!!

    --
    Vicent
    Vicent, Jan 26, 2010
    #8
  9. In article <20100126182310.7d14676c@kubuntu>, Lorenzo Villari <> writes:
    > On 26 Jan 2010 15:20:07 +0100
    > (Ersek, Laszlo) wrote:
    >
    >>
    >> For C: don't start with it. Low-level string manipulation is one of
    >> the most error-prone tasks in general, leading to countless security
    >> vulnerabilities.
    >>

    >
    > I thought that was comp.lang.c...


    Please elaborate.

    Thanks,
    lacos
    Ersek, Laszlo, Jan 26, 2010
    #9
  10. Vicent

    santosh Guest

    Vicent wrote:
    > On 26 ene, 15:20, (Ersek, Laszlo) wrote:
    > > In article <>, Vicent <> writes:

    >
    > > For C++, I guess you'd look first at std::string and std::iostream.

    >
    > Thank you! That's a point to start.
    >
    >
    > > Google them, or obtain a draft or an actual edition of the ISO C++
    > > standard (ISO/IEC 14882). A draft might be available at the C++
    > > Standards Committee's site:
    > >
    > > http://www.open-std.org/jtc1/sc22/wg21/
    > >
    > > Chapter 21, Strings library
    > > Chapter 27, Input/output library

    >
    > That's a great link!! Thanks.
    >
    >
    > > The Qt or Boost libraries may prove helpful as well.
    > >
    > > http://doc.trolltech.com/4.6-snapsh...org/doc/libs/1_41_0/libs/libraries.htm#String
    > >

    >
    > Good links, also. :)
    >
    >
    > > For C: don't start with it. Low-level string manipulation is one of the
    > > most error-prone tasks in general, leading to countless security
    > > vulnerabilities.

    >
    > OK. Everyone tells me to avoid using C-strings, so...
    >
    > >
    > > > I've read some tutorials that deal with the standard C I/O and string
    > > > (string.h) libraries, but specially when managing strings, I am a bit
    > > > lost: Are there methods or functions to get substrings from a string,
    > > > or to take "spaces" ("blanks") away (a typical "wrap" function)??

    > >
    > > (That would be a typical "trim" function I guess.)

    >
    > Yes, yes, sorry, I meant "trim", not "wrap". I miss those simple
    > "trim" functions at Visual Basic and PL/SQL Oracle...
    >
    > > In my opinion, the
    > > "string interface" provided by standard C (or by versions of the Single
    > > Unix Specification) is much lower-level than you'd need; definitely
    > > not for a beginner with higher abstraction needs. I suggest you switch
    > > to another language supporting high-level string manipulation (Perl,
    > > Python, Ruby etc) or grab a strings library. A discussion on them
    > > occurred on Reddit some time ago; several libraries were mentioned:
    > >
    > > http://www.reddit.com/r/programming/comments/abh76/what_string_type_s...
    > >
    > > When posting to that topic, I stumbled upon the following comparison
    > > page:
    > >
    > > http://www.and.org/vstr/comparison
    > >

    >
    > OK, that's all very interesting. I see that other people had the same
    > problem before me!
    >
    >
    > > > About reading data from a text file, I think this is called "parsing".
    > > > Is there any "parsing" library???

    > >
    > > Especially in relation to parsing: don't start writing parsers in C. If
    > > you must, stick to whole-line input (with bounded length) and regular
    > > expressions. One such regex library is PCRE:
    > >
    > > http://www.pcre.org/
    > >
    > > But the Single Unix Specification defines a regex facility too.


    > What I exactly need to do is the following:
    >
    > While there are still new lines:
    > (1) Get one line from a given text file.
    > (2) In that line, detect a "first" part and a "second part", which are
    > separated by a "=" symbol.
    > (3) Take away the possible "blanks" (like a "trim" function would do)
    > from those parts.
    > (4) Detect which variable in my program is being referred by the
    > "first part".
    > (5) Translate the second part (it is still a "string") into a number.
    >
    > - About #1 : It can be done by means of standard I/O C libraries. I
    > guess that there are also ways to do it with C++ libraries.
    >
    > - About #2 : It would be as simple as: detecting the position of "="
    > and then get two substrings. I don't understand why this step is so
    > difficult to perform in C!!!! I mean: there IS a C standard function
    > for getting the position of a character (it is "strchr"), but not a
    > function for substring (unless it is a substring that starts at
    > position 1, which can be done with "strncpy_s"). Is it easier at C++??
    >
    > - About #3 : I would only need an equivalent of VB's "trim"
    > function... Is there anything like that at C++?
    >
    > - About #4 : I can do this by using a "case" or an "if" statement. No
    > problem at all with this step, provided that "first part" has been
    > successfully extracted and trimmed.
    >
    > - About #5 : I hope that a proper casting statement will be enough.
    >
    >
    > So, do you think that the C++ std::string and std::iostream classes are
    > the right choice for me??
    >
    > Thank you in advance for your feed-back!!!
    >
    > --
    > Vicent
    santosh, Jan 26, 2010
    #10
  11. In article <>, Vicent <> writes:

    > (5) Translate the second part (it is still a "string") into a number.
    >
    > - About #5 : I hope that a proper casting statement will be enough.


    Please read an introductory book or tutorial on C, preferably one not
    contradicting the ISO C standard(s). I hope others will name such works.
    Reddit had a similar discussion recently. I obviously can't vouch for
    the pieces of advice given there.

    http://www.reddit.com/r/programming/comments/au1fg/dear_proggit_what_book_would_you_recommend_for_a/


    > So, do you think that the C++ std::string and std::iostream classes are
    > the right choice for me??


    I don't know. For the stated purpose, in (not standard) C I'd likely use
    fgets() with a 32,767 byte buffer, then call regexec() in order to
    identify the trimmed parts via parenthesized subexpressions, then call
    strtol() to convert the decimal sequence to a long int.
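
    A rough sketch of the regexec() + strtol() part of that idea, using the
    POSIX <regex.h> interface (not ISO C). The function name
    parse_assignment, the exact pattern and the restriction to decimal
    integers are assumptions made for illustration; the line itself would
    come from fgets().

```c
#include <errno.h>
#include <regex.h>    /* POSIX, not ISO C */
#include <stdlib.h>
#include <string.h>

/* Parse "name = integer". On success copy the trimmed name into name
 * (at most size-1 characters), store the number in *value, return 1.
 * Return 0 if the line does not match, the name is too long, or the
 * number does not fit in a long. */
int parse_assignment(const char *line, char *name, size_t size, long *value)
{
    /* group 1: name without surrounding blanks; group 3: the number */
    static const char pattern[] =
        "^[[:space:]]*([^=[:space:]]([^=]*[^=[:space:]])?)[[:space:]]*="
        "[[:space:]]*([-+]?[0-9]+)[[:space:]]*$";
    regex_t re;
    regmatch_t m[4];
    int ok = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, line, 4, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        long n;

        errno = 0;
        n = strtol(line + m[3].rm_so, NULL, 10);
        if (len < size && errno != ERANGE) {
            memcpy(name, line + m[1].rm_so, len);
            name[len] = '\0';
            *value = n;
            ok = 1;
        }
    }
    regfree(&re);
    return ok;
}
```

    Compiling the pattern on every call keeps the sketch short; a real
    program would call regcomp() once and reuse the compiled regex_t.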

    Cheers,
    lacos
    Ersek, Laszlo, Jan 26, 2010
    #11
  12. In article <20100126191040.35310081@kubuntu>, Lorenzo Villari <> writes:
    > On 26 Jan 2010 19:04:22 +0100
    > (Ersek, Laszlo) wrote:
    >
    >>
    >> Please elaborate.
    >>
    >> Thanks,
    >> lacos

    >
    > C is not perfect I know that, but saying "C: don't start with it" in a
    > newsgroup with this name, it sounds a bit strange to me. That's all.


    Thanks for answering.

    I love C (even though most of the time this love is unrequited). I
    didn't intend to point out C's perceived "shortcomings" -- I hope not to
    have an ego that big. I tried to signal that C (and especially
    manipulation of character arrays for parsing purposes) might not be the
    best choice for the *original poster*, following completely from what I
    perceived to be the OP's understanding of C.

    Someone advising against me operating a sawbench would be completely
    justified. A sawbench is a wonderful tool. It's not the sawbench, it's
    me. I should start with introductory woodworking lessons first.

    (Yes, I just compared C to a sawbench, please forgive me. And for the
    record, I can "operate" a hand saw.)

    Cheers,
    lacos
    Ersek, Laszlo, Jan 26, 2010
    #12
  13. Vicent

    santosh Guest

    Vicent wrote:
    [...]

    > What I exactly need to do is the following:
    >
    > While there are still new lines:
    > (1) Get one line from a given text file.
    > (2) In that line, detect a "first" part and a "second part", which are
    > separated by a "=" symbol.
    > (3) Take away the possible "blanks" (like a "trim" function would do)
    > from those parts.
    > (4) Detect which variable in my program is being referred by the
    > "first part".
    > (5) Translate the second part (it is still a "string") into a number.
    >
    > - About #1 : It can be done by means of standard I/O C libraries. I
    > guess that there are also ways to do it with C++ libraries.


    Yes. For C, fgets() is the obvious choice, but if you want to read in
    lines of arbitrary length, then you might have to write your own
    function which uses dynamically allocated memory.
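
    One possible shape for such a function, as a sketch only: it grows a
    malloc()ed buffer by doubling while fgets() keeps returning partial
    lines. The name read_long_line and the initial capacity of 64 bytes
    are arbitrary choices for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp.
 * Returns a malloc()ed string without the trailing newline, or NULL at
 * end of file or on allocation failure. The caller must free() it. */
char *read_long_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *line = malloc(cap);

    if (line == NULL)
        return NULL;

    while (fgets(line + len, (int)(cap - len), fp) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';      /* complete line read */
            return line;
        }
        if (cap - len < 2) {           /* buffer full: double it */
            char *bigger = realloc(line, cap * 2);
            if (bigger == NULL) {
                free(line);
                return NULL;
            }
            line = bigger;
            cap *= 2;
        }
    }
    if (len == 0) {                    /* nothing read: EOF or error */
        free(line);
        return NULL;
    }
    return line;                       /* last line had no newline */
}
```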

    > - About #2 : It would be as simple as: detecting the position of "="
    > and then get two substrings. I don't understand why this step is so
    > difficult to perform in C!!!! I mean: there IS a C standard function
    > for getting the position of a character (it is "strchr"), but not a
    > function for substring (unless it is a substring that starts at
    > position 1, which can be done with "strncpy_s"). Is it easier at C++??


    Your point #2 is not clear. Do you simply need to locate the first
    occurrence of a '=' character? For that purpose strchr() would be fine.

    [...]

    > - About #5 : I hope that a proper casting statement will be enough.


    At least for C, no. Casting is not appropriate. Depending on what type
    of number the "string" represents (i.e., integer or real), you'll want
    to use one of the strto*() family of functions, like strtol(),
    strtoul() and strtod(), to name three.
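
    A small sketch of how strtol() is typically wrapped so that bad input
    is actually detected. The wrapper name parse_long is made up; note
    that strtol() itself skips leading white space.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert a decimal string to long.
 * Returns 1 and stores the value in *out on success, 0 if the text is
 * not a number, has trailing junk, or does not fit in a long. */
int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text)
        return 0;                      /* no digits at all */
    if (*end != '\0')
        return 0;                      /* junk after the number */
    if (errno == ERANGE)
        return 0;                      /* out of range for long */
    *out = value;
    return 1;
}
```

    A cast such as (long)"100" would only convert the pointer, not the
    digits it points to, which is why a conversion function is needed.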

    Here's a good online reference to Standard C library functions (among
    others):

    <http://www.dinkumware.com/manuals/>

    [...]
    santosh, Jan 26, 2010
    #13
  14. Vicent

    santosh Guest

    Ersek, Laszlo wrote:
    > In article <>, Vicent <> writes:
    >
    > > (5) Translate the second part (it is still a "string") into a number.
    > >
    > > - About #5 : I hope that a proper casting statement will be enough.

    >
    > Please read an introductory book or tutorial on C, preferably one not
    > contradicting the ISO C standard(s). I hope others will name such works.

    [...]

    One online tutorial for complete beginners might be the one by Steve
    Summit:

    <http://www.eskimo.com/~scs/cclass/cclass.html>

    Since Mr. Summit was apparently involved in the standardisation
    process of C90, one might trust his tutorial not to contradict
    Standard C. :)

    [...]
    santosh, Jan 26, 2010
    #14
  15. Vicent wrote:
    > On 26 ene, 15:20, (Ersek, Laszlo) wrote:
    >> For C: don't start with it. Low-level string manipulation is one of the
    >> most error-prone tasks in general, leading to countless security
    >> vulnerabilities.


    Depends. It can be fun and educative. See below.

    >
    >
    > What I exactly need to do is the following:
    >
    > While there are still new lines:
    > (1) Get one line from a given text file.


    Use fgets() in a while loop.


    > (2) In that line, detect a "first" part and a "second part", which are
    > separated by a "=" symbol.
    > (3) Take away the possible "blanks" (like a "trim" function would do)
    > from those parts.


    That's the fun part. You need a few simple loops to do this. I used to
    do a lot of those string-walking exercises, so I just typed this into
    the newsreader untested. It really helps you develop a sense of what
    goes on behind the curtain.

    for (p = buffer; *p && isspace((unsigned char)*p); ++p) ;  /* skip initial WS */
    first_part = p;                                /* save pointer */
    for (; *p && *p != '='; ++p) ;                 /* find '=' (must be present) */
    for (q = p+1; *q && isspace((unsigned char)*q); ++q) ;  /* skip more WS */
    second_part = q;                               /* save pointer */
    for (--p; isspace((unsigned char)*p); --p) ;   /* skip trailing WS */
    *(p+1) = 0;                                    /* mark end of 1st */
    for (p = second_part; *p; ++p) ;               /* find \0 char */
    for (--p; isspace((unsigned char)*p); --p) ;   /* skip trailing WS */
    *(p+1) = 0;                                    /* mark end of 2nd */

    now first_part and second_part should be nicely trimmed, NUL-terminated
    C strings. This thing will probably segfault when fed invalid strings,
    so some input validity checks are in order. This method can be driven to
    the extreme; the nice thing is that everything happens in a single chunk
    of memory ('buffer') which gets pointed into and peppered with zeroes.
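
    For comparison, a sketch of the same in-place splitting with the
    validity checks added. The function names trim and split_key_value
    are invented for this example; strchr() finds the '=' and the buffer
    is still just pointed into and sprinkled with zeroes.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: trim white space at both ends of s, in place.
 * Returns a pointer to the first non-space character. */
char *trim(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))
        ++s;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        --end;
    *end = '\0';
    return s;
}

/* Split "key = value" in place. Returns 1 on success and 0 if the line
 * has no '=' or either side is empty after trimming. */
int split_key_value(char *line, char **key, char **value)
{
    char *eq = strchr(line, '=');

    if (eq == NULL)
        return 0;
    *eq = '\0';                 /* cut the line in two at the '=' */
    *key = trim(line);
    *value = trim(eq + 1);
    return **key != '\0' && **value != '\0';
}
```

    The casts to unsigned char matter: passing a negative char value to
    isspace() is undefined behavior.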

    If your first and second part can't contain whitespace, it boils down to
    a sscanf() one-liner:

    #include <stdio.h>
    int main(void)
    {
        char *str = "abcd = 100 "; /* test string */
        char first[20];
        int second;
        int r;

        r = sscanf(str, " %[^ =] = %d", first, &second);
        if (r == 2) {
            printf("%s=%d\n", first, second);
        } else {
            fprintf(stderr, "Couldn't parse string (r=%d)\n", r);
        }
        return 0;
    }

    It would be wise to check for the position of the '=' sign first to make
    sure that the buffer 'first' doesn't overflow.
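
    Another way to guard the buffer, sketched here as an assumption rather
    than as part of the program above, is a maximum field width in the
    scanset conversion. The wrapper name parse_pair is made up; the
    caller's name buffer must hold at least 20 bytes.

```c
#include <stdio.h>

/* Parse "name = number" where the name contains no blanks and no '='.
 * Returns 1 on success, 0 otherwise. name must have room for 20 bytes. */
int parse_pair(const char *line, char *name, int *value)
{
    /* %19[^ =] stores at most 19 characters plus the terminating NUL,
     * so a 20-byte buffer cannot overflow; an overlong name simply
     * makes the following '=' fail to match. */
    return sscanf(line, " %19[^ =] = %d", name, value) == 2;
}
```

    This bounds only the name; it does nothing about numbers too large
    for an int.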

    > (4) Detect which variable in my program is being referred by the
    > "first part".


    A bsearch()-based solution comes to mind
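
    A sketch of what that could look like: a table sorted by name that maps
    each name to the address of a program variable. All names in the table
    (opt_max_iter and so on) are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative program variables; the names are made up. */
long opt_max_iter, opt_population, opt_seed;

struct setting {
    const char *name;
    long *target;
};

/* Must stay sorted by name in strcmp() order: bsearch() requires it. */
static const struct setting settings[] = {
    { "max_iter",   &opt_max_iter   },
    { "population", &opt_population },
    { "seed",       &opt_seed       },
};

/* bsearch() passes the search key first and a table element second. */
static int compare_setting(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct setting *)elem)->name);
}

/* Store value in the variable called name. Returns 1 if found, else 0. */
int assign_setting(const char *name, long value)
{
    const struct setting *s = bsearch(name, settings,
                                      sizeof settings / sizeof settings[0],
                                      sizeof settings[0], compare_setting);
    if (s == NULL)
        return 0;
    *s->target = value;
    return 1;
}
```

    For a handful of names a chain of strcmp() calls is just as good; the
    table pays off once the list of settings grows.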

    > (5) Translate the second part (it is still a "string") into a number.


    strtol(), or automatically done by sscanf()

    > - About #2 : It would be as simple as: detecting the position of "="
    > and then get two substrings. I don't understand why this step is so
    > difficult to perform in C!!!!


    > I mean: there IS a C standard function
    > for getting the position of a character (it is "strchr"), but not a
    > function for substring


    strtok() can also be your friend. For index-based substrings, use
    strdup() and pointer arithmetic. All one-liners.
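
    A sketch of an index-based substring function. strdup() is POSIX
    rather than ISO C (before C23), so this version uses malloc() and
    memcpy() instead; the name substring and the clipping behavior are
    choices made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of len characters of s starting at
 * index start (0-based), clipped to the end of s. Returns NULL on
 * allocation failure. The caller must free() the result. */
char *substring(const char *s, size_t start, size_t len)
{
    size_t total = strlen(s);
    char *copy;

    if (start > total)
        start = total;
    if (len > total - start)
        len = total - start;

    copy = malloc(len + 1);
    if (copy != NULL) {
        memcpy(copy, s + start, len);  /* pointer arithmetic: s + start */
        copy[len] = '\0';
    }
    return copy;
}
```

    With strchr() giving a pointer to the '=', the two halves of a line
    are substring(line, 0, eq - line) and substring(line, eq - line + 1,
    strlen(line)).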

    > So, do you think that the C++ std::string and std::iostream classes are
    > the right choice for me??


    It really depends on what the rest of your application does. If breaking
    up a string into two parts overwhelms you complexity-wise, it probably
    does very little.

    That said, I nowadays greatly prefer Python over C for many things,
    although I enjoy coding in C more. Especially when dealing with
    undefined input, the necessary overhead of error-checking and -handling
    in C (and C++) can be bothersome.

    robert
    Robert Latest, Jan 26, 2010
    #15
  16. Vicent

    Stefan Ram Guest

    Vicent <> writes:
    >About reading data from a text file, I think this is called "parsing".


    I am just teaching about binary trees in C. So I started with:

    struct tree { struct tree * left; int value; struct tree * right; };

    To print a tree:

    void print( struct tree const * const tree )
    { if( tree ){ putchar( '(' ); print( tree->left );
    putchar( '0' + tree->value ); print( tree->right ); putchar( ')' ); }}

    (The code is simplified insofar as it assumes one-digit
    numbers only.)

    An example output is:

    (((0)1(2))3(4))

    For the tree

          3
         / \
        /   \
       1     4
      / \
     /   \
    0     2

    Now, how do we parse this in again?

    Two steps:

    1.) Write a grammar:

    <number> ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'.

    <entry> ::= '(' <tree> <number> <tree> ')'.

    <tree> ::= [<entry>].

    2.) Write the parser in analogy with the grammar:

    int number( void ){ return get( 1 )- '0'; }

    struct tree * entry( void )
    { TREE left, right; int value;
    get( '(' );
    left = tree();
    value = number();
    right = tree();
    get( ')' );
    return newtree( left, value, right ); }

    struct tree * tree( void )
    { return get( 0 )== '(' ? entry() : 0; }

    This assumes a »get« function that returns the current
    character from the source and advances to the next character
    when called with any non-zero argument. The code is
    simplified insofar as it does not handle any run-time errors.

    Thus, we are now able to round-trip serialize (write) and
    de-serialize (read) binary trees with essentially 14 lines
    of C code.
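
    To try the round trip, the two helpers the parser takes for granted
    have to exist. The versions of get() and newtree() below are one
    possible guess at them, reading from a string; parse() and print_to()
    are names invented for this sketch, and there is still no error
    handling for malformed input.

```c
#include <stdio.h>
#include <stdlib.h>

struct tree { struct tree *left; int value; struct tree *right; };

static const char *source;            /* text being parsed (assumed input) */

/* Return the current character; advance past it if advance is non-zero. */
static int get(int advance)
{
    int c = (unsigned char)*source;
    if (advance && c != '\0')
        ++source;
    return c;
}

static struct tree *newtree(struct tree *left, int value, struct tree *right)
{
    struct tree *t = malloc(sizeof *t);
    if (t == NULL)
        abort();                       /* keeps the sketch short */
    t->left = left; t->value = value; t->right = right;
    return t;
}

static struct tree *tree(void);       /* forward declaration: mutual recursion */

static int number(void) { return get(1) - '0'; }

static struct tree *entry(void)
{
    struct tree *left, *right; int value;
    get('(');
    left = tree();
    value = number();
    right = tree();
    get(')');
    return newtree(left, value, right);
}

static struct tree *tree(void)
{
    return get(0) == '(' ? entry() : 0;
}

/* Parse a whole tree from text. */
struct tree *parse(const char *text) { source = text; return tree(); }

/* Serialize into out (caller provides enough room); returns the end. */
char *print_to(const struct tree *t, char *out)
{
    if (t) {
        *out++ = '(';
        out = print_to(t->left, out);
        *out++ = (char)('0' + t->value);
        out = print_to(t->right, out);
        *out++ = ')';
    }
    *out = '\0';
    return out;
}
```

    Parsing "(((0)1(2))3(4))" and printing the result into a buffer gives
    back the same text.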

    .------------------------------------------------------.
    | Now, observe that during the whole serialization and |
    | de-serialization we create and process strings of    |
    | symbols, but never actually build a 0-terminated     |
    | C-string in memory!                                  |
    '------------------------------------------------------'

    I thought it would be nice if a tree in the source code
    also would look like a tree. So the above tree

          3
         / \
        /   \
       1     4
      / \
     /   \
    0     2

    is being defined using:

    extern struct tree t1, t0, t2, t4; struct tree t3 =
                          { &t1, 3, &t4 },

             t1 ={ &t0, 1, &t2 },      t4 ={ 0, 4, 0 },


    t0 ={ 0, 0, 0 },        t2 ={ 0, 2, 0 };
    Stefan Ram, Jan 26, 2010
    #16
  17. Vicent

    Stefan Ram Guest

    -berlin.de (Stefan Ram) writes:
    >{ TREE left, right; int value;


    Oops, this should read:

    { struct tree *left, *right; int value;
    Stefan Ram, Jan 26, 2010
    #17
  18. (Not to contradict, but to complement.)

    In article <-berlin.de>, Robert Latest <> writes:

    > If your first and second part can't contain whitespace, it boils down to
    > a sscanf() one-liner:
    >
    > #include <stdio.h>
    > int main(void)
    > {
    > char *str = "abcd = 100 "; /* test string */
    > char first[20];
    > int second;
    > int r;
    >
    > r = sscanf(str, " %[^ =] = %d", first, &second);
    > if (r == 2) {
    > printf("%s=%d\n", first, second);
    > } else {
    > fprintf(stderr, "Couldn't parse string (r=%d)\n", r);
    > }
    > return 0;
    > }
    >
    > It would be wise to check for the position of the '=' sign first to make
    > sure that the buffer 'first' doesn't overflow.
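
    One way to bound the write into 'first' without locating the
    '=' beforehand is a maximum field width on the scanset
    conversion. A minimal sketch, where `parse_pair` is a
    hypothetical wrapper name and the 20-byte buffer size is taken
    from the quoted example; it does nothing about out-of-range
    numbers.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "name = number". Returns 1 on success, 0 otherwise.
   `first` must point to at least 20 bytes. */
int parse_pair(const char *str, char first[20], int *second)
{
    assert(str != NULL && first != NULL && second != NULL);
    /* %19[^ =] stores at most 19 characters plus the terminating 0.
       A longer name leaves a letter where '=' is expected, so the
       match fails and sscanf returns 1 instead of 2. */
    return sscanf(str, " %19[^ =] = %d", first, second) == 2;
}
```

    With this width, "abcd = 100 " still parses as before, while a
    26-letter name is refused instead of overflowing the buffer.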


    [...]

    >> (5) Translate the second part (it is still a "string") into a number.

    >
    > strtol(), or automatically done by sscanf()


    abcd=99999999999999999999999999999999999999999999999999999999

    %d -> implementation-defined behavior ("signed overflow")

    %u -> silent truncation ("unsigned overflow")

    %*d -> assignment suppressed, not applicable here

    %9ld -> file position indicator will advance until after the ninth nine
    (I think), the stored long int value (999,999,999) won't reflect the
    actual decimal string, a matching failure will follow only in the next
    cycle. Full range of long int not available to decimal strings.
    Magnitude of smallest negative value is about one tenth of the greatest
    positive value.

    strtol() is better.
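
    A minimal sketch of what a checked conversion with strtol()
    could look like. `parse_long` is a hypothetical helper name;
    the decision to refuse trailing characters is an assumption
    about what the caller wants. Note that strtol() itself skips
    leading white space and accepts an optional sign.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Store the value of the decimal string `s` in *out and return 1,
   or return 0 if `s` has no digits, does not fit in a long, or is
   followed by other characters. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;
    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s) return 0;         /* no conversion was performed */
    if (errno == ERANGE) return 0;  /* value outside LONG_MIN..LONG_MAX */
    if (*end != '\0') return 0;     /* trailing characters */
    *out = v;
    return 1;
}
```

    Unlike %d, the out-of-range case here is fully defined:
    strtol() returns LONG_MAX or LONG_MIN and sets errno to ERANGE,
    so the 56-digit example above is simply refused.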

    When writing my previous post in the thread, I tried to create a
    scanf() format string that (a) relies only on completely defined
    behavior, (b) is correct: it parses what the OP needs (pre-set limits
    on the lengths of the trimmed parts are allowed), and (c) is complete:
    it refuses anything else. I gave up after a while and decided to wait
    for other submissions and try to break them or, if I can't, learn from
    them.

    Cheers,
    lacos
    Ersek, Laszlo, Jan 26, 2010
    #18
  19. In article <9b0HOlwZ28YG@ludens>, (Ersek, Laszlo) writes:

    > abcd=99999999999999999999999999999999999999999999999999999999
    >
    > %d -> implementation-defined behavior ("signed overflow")


    I apologize, that would hold for a conversion from e.g. unsigned int;
    the fscanf() spec says (C99 7.19.6.2 The fscanf function, p10):

    ----v----
    Unless assignment suppression was indicated by a *, the result of the
    conversion is placed in the object pointed to by the first argument
    following the format argument that has not already received a conversion
    result. If this object does not have an appropriate type, or if the
    result of the conversion cannot be represented in the object, the
    behavior is undefined.
    ----^----

    See also

    http://groups.google.com/group/comp.lang.c.moderated/msg/700a797a716cf74a

    Cheers,
    lacos
    Ersek, Laszlo, Jan 26, 2010
    #19
  20. In article <>, santosh <> writes:

    > One online tutorial for complete beginners might be the one by Steve
    > Summit:
    >
    > <http://www.eskimo.com/~scs/cclass/cclass.html>
    >
    > Since Mr. Summit was apparently involved in the standardisation
    > process of C90, one might trust his tutorial not to contradict
    > Standard C. :)


    Bookmarked, thank you!
    lacos
    Ersek, Laszlo, Jan 26, 2010
    #20