susbtring function

Discussion in 'C Programming' started by HSeganfredo@gmail.com, Sep 14, 2007.

  1. Guest

    Hello again,

    I´ve made a small substring() function that shoudl give back the a
    susbtring from a string, given its start and end positions.

    Is there any chance of the code below give an output containing data
    from another memory area that is completely unrelated to the input
    string? This is happening and the start/end parameters are acceptable
    (within bounds) for the input string...any idea?


    char *substring(char *string, int start, int end){
    printf("Input %s:", string);

    char *result = (char *)malloc((end - start + 1)*sizeof(char));
    if(result == NULL){
    return(NULL);
    }

    string = string + start;
    while(start < end){
    *result = *string;
    result++;
    string++;
    start++;
    }
    printf("Output %s:", result);
    return(result);
    }
     
    , Sep 14, 2007
    #1
    1. Advertising

  2. Ian Collins Guest

    wrote:
    > Hello again,
    >
    > I´ve made a small substring() function that shoudl give back the a
    > susbtring from a string, given its start and end positions.
    >
    > Is there any chance of the code below give an output containing data
    > from another memory area that is completely unrelated to the input
    > string? This is happening and the start/end parameters are acceptable
    > (within bounds) for the input string...any idea?
    >

    A few issues:
    >
    > char *substring(char *string, int start, int end){


    You don't change the input string, so make this

    char *substring(const char *string, int start, int end);

    > printf("Input %s:", string);
    >


    You forgot the '\n':

    printf("Input %s:", string);

    > char *result = (char *)malloc((end - start + 1)*sizeof(char));


    Don't cast the return of malloc and sizeof(char) == 1 by definition.

    char *result = malloc(end - start + 1);
    > if(result == NULL){
    > return(NULL);


    return isn't a function

    return NULL;
    > }
    >

    You must preserve the start of the result in order to return it:

    char* temp = result;

    > string = string + start;
    > while(start < end){
    > *result = *string;


    More idiomatic:

    *result++ = *string++;

    > start++;
    > }


    You should add the closing '\0' to the result.

    > printf("Output %s:", result);
    > return(result);


    printf("Output %s:\n", temp);
    return temp;

    > }
    >



    --
    Ian Collins.
     
    Ian Collins, Sep 15, 2007
    #2
    1. Advertising

  3. CBFalconer Guest

    wrote:
    >
    > I´ve made a small substring() function that shoudl give back the a
    > susbtring from a string, given its start and end positions.
    >
    > Is there any chance of the code below give an output containing data
    > from another memory area that is completely unrelated to the input
    > string? This is happening and the start/end parameters are acceptable
    > (within bounds) for the input string...any idea?
    >
    > char *substring(char *string, int start, int end){


    start and end are indices into *string, so should be type size_t

    > printf("Input %s:", string);
    >
    > char *result = (char *)malloc((end - start + 1)*sizeof(char));


    You can't put this after code. Declare result before the printf
    statement. Delete the cast. Eliminate "*sizeof(char)", because
    that is 1 by definition.

    > if(result == NULL){
    > return(NULL);
    > }
    >
    > string = string + start;


    Where have you checked that start is a valid index for this
    string? Maybe "if (start < strlen(string)) ... " would do.

    > while(start < end){


    Where have you checked that end is a valid index for this string?
    Where have you checked that end is larger than start? See above.

    > *result = *string;


    Maybe this expression should include start?

    > result++;


    Maybe this action should disappear?

    > string++;
    > start++;


    Indenting code controlled by a conditional is considered cool.

    > }
    > printf("Output %s:", result);
    > return(result);


    This isn't the value malloc returned. You have a problem when you
    free, or you have a memory leak.

    > }


    Enjoy.

    --
    Chuck F (cbfalconer at maineline dot net)
    Available for consulting/temporary embedded and systems.
    <http://cbfalconer.home.att.net>


    --
    Posted via a free Usenet account from http://www.teranews.com
     
    CBFalconer, Sep 15, 2007
    #3
  4. Ian Collins <> writes:
    > wrote:

    [...]
    >> printf("Input %s:", string);
    >>

    >
    > You forgot the '\n':
    >
    > printf("Input %s:", string);


    So did you. Just as a matter of esthetics, putting the ':' at the end
    seems odd. I'd write this as:

    printf("Input: \"%s\"\n", string);

    [...]

    >> return(NULL);

    >
    > return isn't a function
    >
    > return NULL;


    To expand on that a bit: the syntax of a return statement is
    return ;
    or
    return <expression> ;

    The parentheses are not necessary, but they are permitted. If you use

    return(NULL);

    then the parentheses are part of the expression, not part of the
    syntax of the return statement itself.

    I prefer *not* to use extraneous parentheses; a return statement isn't
    a function call, so it shouldn't look like one.

    But using parentheses isn't wrong. Even the examples in K&R1 use
    parentheses on return statements (though K&R2 changed this).

    [...]

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 15, 2007
    #4
  5. "Ian Collins" <> a écrit dans le message de news:
    ...
    > wrote:
    >> Hello again,
    >>
    >> I´ve made a small substring() function that shoudl give back the a
    >> susbtring from a string, given its start and end positions.
    >>
    >> Is there any chance of the code below give an output containing data
    >> from another memory area that is completely unrelated to the input
    >> string? This is happening and the start/end parameters are acceptable
    >> (within bounds) for the input string...any idea?
    >>

    > A few issues:
    >>
    >> char *substring(char *string, int start, int end){

    >
    > You don't change the input string, so make this
    >
    > char *substring(const char *string, int start, int end);
    >
    >> printf("Input %s:", string);
    >>

    >
    > You forgot the '\n':
    >
    > printf("Input %s:", string);
    >
    >> char *result = (char *)malloc((end - start + 1)*sizeof(char));

    >
    > Don't cast the return of malloc and sizeof(char) == 1 by definition.
    >
    > char *result = malloc(end - start + 1);
    >> if(result == NULL){
    >> return(NULL);

    >
    > return isn't a function
    >
    > return NULL;
    >> }
    >>

    > You must preserve the start of the result in order to return it:
    >
    > char* temp = result;
    >
    >> string = string + start;
    >> while(start < end){
    >> *result = *string;

    >
    > More idiomatic:
    >
    > *result++ = *string++;
    >
    >> start++;
    >> }

    >
    > You should add the closing '\0' to the result.
    >
    >> printf("Output %s:", result);
    >> return(result);

    >
    > printf("Output %s:\n", temp);
    > return temp;
    >
    >> }


    All these corrections are fine and dandy... to complete the answer for the
    OP, there are 2 reasons you can get data in the output unrelated to in
    input:
    - you forgot the final '\0' after the copy loop. Insert *result = '\0';
    after the }.
    - you don't check that start and end fall within the boundaries of the
    string. If not, you are attempting to read from beyond the end of the
    string or before its beginning or indeed from no particular location... This
    invokes undefined behaviour and does not necessarily cause your program to
    crash.

    --
    Chqrlie.
     
    Charlie Gordon, Sep 15, 2007
    #5
  6. Ian Collins Guest

    Charlie Gordon wrote:
    >
    > All these corrections are fine and dandy...


    I hope the were more than that.

    > to complete the answer for the
    > OP, there are 2 reasons you can get data in the output unrelated to in
    > input:
    > - you forgot the final '\0' after the copy loop. Insert *result = '\0';
    > after the }.
    > - you don't check that start and end fall within the boundaries of the
    > string. If not, you are attempting to read from beyond the end of the
    > string or before its beginning or indeed from no particular location... This
    > invokes undefined behaviour and does not necessarily cause your program to
    > crash.
    >

    There were at least three, as I pointed out, he was returning one past
    the end of the result.

    --
    Ian Collins.
     
    Ian Collins, Sep 15, 2007
    #6
  7. CBFalconer <> writes:
    > wrote:

    [...]
    >> printf("Input %s:", string);
    >>
    >> char *result = (char *)malloc((end - start + 1)*sizeof(char));

    >
    > You can't put this after code. Declare result before the printf
    > statement.

    [...]

    To be precise, C90 does not allow declarations to follow statements
    within a block, but C99 does. Most compilers do not (yet?) fully
    implement C99, so if you want maximal portability for your code, all
    declarations within a block should precede all statements within that
    same block. If you don't mind limiting the portability of your code
    to compilers than support C99, or at least that support that
    particular feature, then you're ok.

    [...]
    > that is 1 by definition.
    >
    >> if(result == NULL){
    >> return(NULL);
    >> }
    >>
    >> string = string + start;

    >
    > Where have you checked that start is a valid index for this
    > string? Maybe "if (start < strlen(string)) ... " would do.
    >
    >> while(start < end){

    >
    > Where have you checked that end is a valid index for this string?
    > Where have you checked that end is larger than start? See above.


    Possibly that check needs to be done by the caller. It's ok for a
    function's behavior to be undefined if you give it invalid arguments;
    many functions in the standard library work this way.

    If you do want to do the check in the function, you need to decide
    what to do if it fails. For a substring function, if the 'start' and
    'end' functions are outside the bounds of the string, it's probably
    reasonable to return an empty string; you might not even consider this
    to be an error.

    But the most important thing is to *document the requirements*. If I
    want to use your function, I should be able to look at the function's
    declaration (not body) and any commentary, an for any possible
    arguments either determine exactly what the function will do, or be
    warned that the behavior is undefined and I shouldn't call it with
    those arguments.

    [...]

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 15, 2007
    #7
  8. writes:
    > I´ve made a small substring() function that shoudl give back the a
    > susbtring from a string, given its start and end positions.
    >
    > Is there any chance of the code below give an output containing data
    > from another memory area that is completely unrelated to the input
    > string?


    As others described it is. :)

    > This is happening and the start/end parameters are acceptable
    > (within bounds) for the input string...any idea?
    >
    > char *substring(char *string, int start, int end){
    > printf("Input %s:", string);
    >
    > char *result = (char *)malloc((end - start + 1)*sizeof(char));
    > if(result == NULL){
    > return(NULL);
    > }
    >
    > string = string + start;
    > while(start < end){
    > *result = *string;
    > result++;
    > string++;
    > start++;
    > }
    > printf("Output %s:", result);
    > return(result);
    > }


    You may use strncpy() but remember to put 0 byte at the end of the
    string:

    #v+
    char *substring(const char *string, size_t start, size_t end) {
    return strcnpy(calloc(end-start+1), string+start, end-start);
    }
    #v-

    Exercise: make the code readable, check if memory allocation succeed,
    check if start is valid string's index and [start, end] is
    non-empty set.

    --
    Best regards, _ _
    .o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
    ..o | Computer Science, Michal "mina86" Nazarewicz (o o)
    ooo +--<mina86*tlen.pl>---<jid:mina86*chrome.pl>--ooO--(_)--Ooo--
     
    Michal Nazarewicz, Sep 15, 2007
    #8
  9. "Michal Nazarewicz" <> a écrit dans le message de news:
    86.com...
    > writes:
    >> I´ve made a small substring() function that shoudl give back the a
    >> susbtring from a string, given its start and end positions.
    >>
    >> Is there any chance of the code below give an output containing data
    >> from another memory area that is completely unrelated to the input
    >> string?

    >
    > As others described it is. :)
    >
    >> This is happening and the start/end parameters are acceptable
    >> (within bounds) for the input string...any idea?
    >>
    >> char *substring(char *string, int start, int end){
    >> printf("Input %s:", string);
    >>
    >> char *result = (char *)malloc((end - start + 1)*sizeof(char));
    >> if(result == NULL){
    >> return(NULL);
    >> }
    >>
    >> string = string + start;
    >> while(start < end){
    >> *result = *string;
    >> result++;
    >> string++;
    >> start++;
    >> }
    >> printf("Output %s:", result);
    >> return(result);
    >> }

    >
    > You may use strncpy() but remember to put 0 byte at the end of the
    > string:


    Please refrein from giving the OP ill advice.
    strncpy is not appropriate as an alternative for the above code, memcpy is.
    The semantics of strncpy are cumbersome, inefficient and error prone. It
    should be deprecated and no longer used. Viable alternatives exist such as
    BSD's strlcpy.

    > #v+
    > char *substring(const char *string, size_t start, size_t end) {
    > return strcnpy(calloc(end-start+1), string+start, end-start);
    > }
    > #v-


    Assuming strcncpy was a typo for the infamous strncpy, this solution is
    inefficient: the destination array is written twice, and each byte copied
    from the source string must be checked for '\0'. bytes beyond an eventual
    '\0' are not copied, unlike the previous code, but instead are set to '\0'
    in the destination buffer, unlike popular belief.

    A much simpler and more efficient solution:

    char *substring(const char *string, size_t start, size_t end) {
    char *dest = malloc(end - start + 1);
    if (dest) {
    memcpy(dest, string + start, end - start);
    dest[end - start] = '\0';
    }
    return dest;
    }

    > Exercise: make the code readable, check if memory allocation succeed,
    > check if start is valid string's index and [start, end] is
    > non-empty set.


    Of course we agree on these important considerations.

    --
    Chqrlie.
     
    Charlie Gordon, Sep 15, 2007
    #9
  10. >> writes:
    >>> I´ve made a small substring() function that shoudl give back the a
    >>> susbtring from a string, given its start and end positions.
    >>>
    >>> This is happening and the start/end parameters are acceptable
    >>> (within bounds) for the input string...any idea?
    >>>
    >>> char *substring(char *string, int start, int end){
    >>> printf("Input %s:", string);
    >>>
    >>> char *result = (char *)malloc((end - start + 1)*sizeof(char));
    >>> if(result == NULL){
    >>> return(NULL);
    >>> }
    >>>
    >>> string = string + start;
    >>> while(start < end){
    >>> *result = *string;
    >>> result++;
    >>> string++;
    >>> start++;
    >>> }
    >>> printf("Output %s:", result);
    >>> return(result);
    >>> }


    > "Michal Nazarewicz" <> a écrit dans le message de news:
    > 86.com...
    >> You may use strncpy() but remember to put 0 byte at the end of the
    >> string:


    "Charlie Gordon" <> writes:
    > Please refrein from giving the OP ill advice.
    > strncpy is not appropriate as an alternative for the above code, memcpy is.
    > The semantics of strncpy are cumbersome, inefficient and error prone. It
    > should be deprecated and no longer used. Viable alternatives exist such as
    > BSD's strlcpy.


    It seems OP is learning C and therefore it's IMO important to try out
    different solutions. True however, that I should have said that the
    code is inefficient.

    >> #v+
    >> char *substring(const char *string, size_t start, size_t end) {
    >> return strcnpy(calloc(end-start+1), string+start, end-start);
    >> }
    >> #v-


    > Assuming strcncpy was a typo for the infamous strncpy,


    Ah yes, sorry...

    > this solution is inefficient: the destination array is written twice,
    > and each byte copied from the source string must be checked for '\0'.


    No matter if one uses strncpy or something else source string should be
    checked for terminating NUL character.

    > A much simpler and more efficient solution:
    >
    > char *substring(const char *string, size_t start, size_t end) {
    > char *dest = malloc(end - start + 1);
    > if (dest) {
    > memcpy(dest, string + start, end - start);
    > dest[end - start] = '\0';
    > }
    > return dest;
    > }


    This adds another point to exercise list below: check if end is valid
    string's index.

    >> Exercise: make the code readable, check if memory allocation succeed,
    >> check if start is valid string's index and [start, end] is
    >> non-empty set.

    >
    > Of course we agree on these important considerations.


    --
    Best regards, _ _
    .o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
    ..o | Computer Science, Michal "mina86" Nazarewicz (o o)
    ooo +--<mina86*tlen.pl>---<jid:mina86*chrome.pl>--ooO--(_)--Ooo--
     
    Michal Nazarewicz, Sep 15, 2007
    #10
  11. santosh Guest

    Charlie Gordon wrote:

    > Michal Nazarewicz wrote:


    <snip>

    >> #v+
    >> char *substring(const char *string, size_t start, size_t end) {
    >> return strcnpy(calloc(end-start+1), string+start, end-start);
    >> }
    >> #v-

    >
    > Assuming strcncpy was a typo for the infamous strncpy,


    There is a typo in the correction to the previous typo :)

    <snip>
     
    santosh, Sep 15, 2007
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. James Vanns
    Replies:
    7
    Views:
    7,088
    Evan Carew
    Jan 21, 2004
  2. komal
    Replies:
    6
    Views:
    1,445
    msalters
    Jan 25, 2005
  3. Replies:
    2
    Views:
    944
    Bengt Richter
    Aug 1, 2005
  4. Giannis Papadopoulos

    Function pointer to void function and int function

    Giannis Papadopoulos, Sep 5, 2005, in forum: C Programming
    Replies:
    5
    Views:
    1,261
    Barry Schwarz
    Sep 5, 2005
  5. weafon
    Replies:
    1
    Views:
    323
    Diez B. Roggisch
    Jul 14, 2009
Loading...

Share This Page