C++ Compiler with a -Wwarn-use-of-strcpy or similar option??

Discussion in 'C++' started by Paul Sheer, Sep 3, 2004.

  1. Paul Sheer

    Paul Sheer Guest

    I need to automatically search and replace all fixed size
    buffer strcpy's with strncpy's (or better yet, strlcpy's)
    as a security and stability audit. The code base is large
    and it is not feasable to manually perform these changes.

    I would like perhaps a C++ parser that can automatically
    detect use of a strcpy to a buffer of fixed size. For instance,

    struct x {
    char member[128];
    }
    ...
    struct x X;
    ...
    strcpy (X.member, p); /* <-- should generate a warning here */

    but

    struct x {
    char *member;
    }
    ...
    struct x X;
    ...
    strcpy (X.member, p); /* <-- should NOT generate a warning */

    (The second case is too complex to fix at this point.)



    Is there any way of doing this? Our code is C++ (not C) and I
    have, for example, looked at

    http://codeworker.free.fr/ScriptsRepository.html

    but this does not seem to provide an easy solution.

    I am anticipating writing a script that can search and replace
    "strcpy (x.member, p);" with "strlcpy (x.member, p, sizeof(x.member));"
    provided the script can be guaranteed that the replacement is valid
    (and I suppose only a full C++ parser would know if it is valid).

    Can GCC be modified to give such a warning?

    thanks

    -paul
    Paul Sheer, Sep 3, 2004
    #1
    1. Advertising

  2. Paul Sheer

    Andre Kostur Guest

    (Paul Sheer) wrote in news:9ec0d967.0409031355.1f5bb1
    @posting.google.com:

    > I need to automatically search and replace all fixed size
    > buffer strcpy's with strncpy's (or better yet, strlcpy's)
    > as a security and stability audit. The code base is large
    > and it is not feasable to manually perform these changes.


    Err... why not grep for them? (OK, granted I'm assuming you're on a
    unixish platform....)
    Andre Kostur, Sep 3, 2004
    #2
    1. Advertising

  3. Paul Sheer

    David Guest

    On Fri, 3 Sep 2004 21:55:54 UTC, (Paul Sheer) wrote:

    > I need to automatically search and replace all fixed size
    > buffer strcpy's with strncpy's (or better yet, strlcpy's)
    > as a security and stability audit. The code base is large
    > and it is not feasable to manually perform these changes.
    >
    > I would like perhaps a C++ parser that can automatically
    > detect use of a strcpy to a buffer of fixed size. For instance,
    >
    > struct x {
    > char member[128];
    > }
    > ...
    > struct x X;
    > ...
    > strcpy (X.member, p); /* <-- should generate a warning here */
    >
    > but
    >
    > struct x {
    > char *member;
    > }
    > ...
    > struct x X;
    > ...
    > strcpy (X.member, p); /* <-- should NOT generate a warning */
    >
    > (The second case is too complex to fix at this point.)
    >
    >
    >
    > Is there any way of doing this? Our code is C++ (not C) and I
    > have, for example, looked at
    >
    > http://codeworker.free.fr/ScriptsRepository.html
    >
    > but this does not seem to provide an easy solution.
    >
    > I am anticipating writing a script that can search and replace
    > "strcpy (x.member, p);" with "strlcpy (x.member, p, sizeof(x.member));"
    > provided the script can be guaranteed that the replacement is valid
    > (and I suppose only a full C++ parser would know if it is valid).
    >
    > Can GCC be modified to give such a warning?
    >
    > thanks
    >
    > -paul


    Paul,

    I've worked for several companies that have generated their own
    macros to watch the string and memory routines and inform at
    compiler time if there was going to be a problem. There are
    many times though when the code won't give you clues and even at
    runtime you can't catch everything. However, it is generally
    useful to use such macros occasionally, or as a rule, to test that
    your code adheres to some reasonable practice. I've also seen
    similar macros used to track memory leaks and similar offenses
    that can sometime be checked this way. There are some public
    domain versions of these as well. I don't have any to offer,
    but I'll describe the general idea for you.

    Say that you want to insure that strcpy tries to protect itself
    against copying a large space into a smaller space. NULLs could
    be detected at compile time. Certain run time conditinos could
    also be checked for... such as NULL values.

    The general method is to declare macros of the same name as
    the function to check. The macro does compile time checking
    and generates a call to an equivalent function that you may
    fee is more suitable to your needs. A compile time switch is
    sometimes used to differentiate when the real or enhanced code
    is called. The macros must be defined before they are used
    and the original functions reliably removed from view. #ifdefs
    and such are generally used.

    Consider the strcpy function. It is defined as

    char *strcpy (char *destination, const char * const source);

    or equivalent. The replacement function could be something
    like the following.

    char *StrCpy (char *destination, const char * const source);

    In this case StrCpy might handle improper parameters a bit
    better or provide some form of useful side effect such as
    writing to a log file. For memory leak tracers, this is where
    you keep a table of allocated memory and watch it get freed.
    Upon a final, special call, your code would list memory not
    freed and perhaps the module, line, and function that allocated
    it.

    The macro for strcpy would include comparing the pointers
    to the size of a pointer on your system and if they aren't
    that length, then they point to a measurable space defined
    by sizeof(). You can then write an error/warning at compile
    time that the sizeof a source is larger than the sizeof a
    destination. There are likely many instances in your code
    where it isn't possible to tell though.

    As a general rule you should compile with the maximum warning
    level and treat all warnings as errors. If the code is correct,
    fix it so as not to issue the warning. Otherwise fix the code.
    For groups that don't already do this, slowly increase the error
    level and correct errors as you go. For very dirty code this may
    take a long time. However, if you've seen one mistake, like
    using an assignment in a condition, chances are it was also done
    elsewhere. Look for all of that specific warning and fix the
    code or remove the warning.

    David
    David, Sep 3, 2004
    #3
  4. (Paul Sheer) wrote in message news:<>...
    > I need to automatically search and replace all fixed size
    > buffer strcpy's with strncpy's (or better yet, strlcpy's)
    > as a security and stability audit. The code base is large
    > and it is not feasable to manually perform these changes.
    >
    > I would like perhaps a C++ parser that can automatically
    > detect use of a strcpy to a buffer of fixed size. For instance,


    Paul,

    All that seems to be a bit of overkill if you source code base is
    large. Do you also plan on replacing the entire str* and mem* family?

    There a many tools on the market to help diagnose buffer overruns.
    Take a look at mpatrol and Valgrind for example for Unix platforms.
    Boundscheck and Purify work on Windows. My company markets Dynamic
    Leak Check (DLC) for this purpose. The DLC checks for heap overruns,
    leaks and many other errors. The big advantage of the DLC is that no
    recompile is required.

    To answer your question directly, I don't know of any tools that do
    what you are looking for. I am a bit of an expert in the area. You
    may check for tools that do 'static analysis'.

    If you are motivated, it should be easy to modify gcc to give a
    warning if strcpy is called. You could also use the LD_PRELOAD on many
    platform to overload strcpy and spit out a runtime error.

    Matthew
    Dynamic Memory Solutions
    www.dynamic-memory.com
    Matthew Fisher, Sep 4, 2004
    #4
  5. Paul Sheer

    Paul Sheer Guest

    [[I'll just reply to all three responses together...]]

    > From: Matthew Fisher ()
    >
    > > I need to automatically search and replace all fixed size [...]

    >
    > All that seems to be a bit of overkill if you source code base is
    > large. Do you also plan on replacing the entire str* and mem* family?


    we are going to start with strcpy which is very widely
    used in our source code.

    i personally believe it is currently the biggest problem with
    this code base.

    >
    > There a many tools on the market to help diagnose buffer overruns.


    i do use valgrind on Linux.

    > Take a look at mpatrol and Valgrind for example for Unix platforms.
    > Boundscheck and Purify work on Windows. My company markets Dynamic
    > Leak Check (DLC) for this purpose. The DLC checks for heap overruns,
    > leaks and many other errors. The big advantage of the DLC is that no
    > recompile is required.


    yes, but what if no overruns are detected with the test
    data that you run the program with?

    i really need to solve potential problems for the future
    stability of the code.

    >
    > To answer your question directly, I don't know of any tools that do
    > what you are looking for. I am a bit of an expert in the area. You
    > may check for tools that do 'static analysis'.
    >
    > If you are motivated, it should be easy to modify gcc to give a
    > warning if strcpy is called. You could also use the LD_PRELOAD on many
    > platform to overload strcpy and spit out a runtime error.


    i started looked at gcc. i got expand_builtin_strcpy to give the
    warning i want. however, it is taking much time to understand how
    gcc's parse tree is organized and i can't seem to get it working
    for the general case.

    any help would be appreciated.

    -------------------
    > From: David ()
    > > I need to automatically search and replace all fixed size

    >
    > I've worked for several companies that have generated their own
    > macros to watch the string and memory [...]
    >


    yes, me too...

    off the top of my head, something like,

    #define strcpy(a,b) \
    do { \
    char *dummy; \
    fprintf (stderr, "%s:%d: Warning, unbounded strcpy(%s,%s)\n", \
    __FILE__, __LINE__, #a, #b); \
    if (sizeof(dummy) == sizeof(a)) real_strcpy (a, b); \
    else strlcpy (a, b, sizeof(a)); \
    } while (0)

    is only a partial solution. it can still break many things.

    -------------------
    > From: Andre Kostur ()
    >
    > > I need to automatically search and replace all fixed size

    >
    > Err... why not grep for them? (OK, granted I'm assuming you're on a
    > unixish platform....)
    >


    a blind search and replace across a gig of code WILL get me fired.

    i have to GUARANTEE that my changes will not break anything.

    -paul
    Paul Sheer, Sep 4, 2004
    #5
  6. Paul Sheer

    Paul Sheer Guest

    ok, managed to patch gcc to do this

    diff follows...

    -paul

    ---------------

    diff -u -r gcc-ORIG/builtins.c gcc/builtins.c
    --- gcc-ORIG/builtins.c 2003-05-05 18:59:13.000000000 +0200
    +++ gcc/builtins.c 2004-09-05 16:00:47.000000000 +0200
    @@ -74,7 +74,6 @@
    tree built_in_decls[(int) END_BUILTINS];

    static int get_pointer_alignment PARAMS ((tree, unsigned int));
    -static tree c_strlen PARAMS ((tree));
    static const char *c_getstr PARAMS ((tree));
    static rtx c_readstr PARAMS ((const char *,
    enum machine_mode));
    @@ -232,7 +231,7 @@
    Unfortunately, string_constant can't access the values of const
    char
    arrays with initializers, so neither can we do so here. */

    -static tree
    +tree
    c_strlen (src)
    tree src;
    {
    diff -u -r gcc-ORIG/c-tree.h gcc/c-tree.h
    --- gcc-ORIG/c-tree.h 2002-09-16 20:33:18.000000000 +0200
    +++ gcc/c-tree.h 2004-09-05 16:06:49.000000000 +0200
    @@ -323,4 +323,7 @@
    extern GTY(()) tree static_ctors;
    extern GTY(()) tree static_dtors;

    +/* in builtins.c */
    +extern tree c_strlen PARAMS ((tree));
    +
    #endif /* ! GCC_C_TREE_H */
    diff -u -r gcc-ORIG/c-typeck.c gcc/c-typeck.c
    --- gcc-ORIG/c-typeck.c 2003-04-18 08:50:45.000000000 +0200
    +++ gcc/c-typeck.c 2004-09-05 16:19:36.000000000 +0200
    @@ -57,6 +57,7 @@
    static tree decl_constant_value_for_broken_optimization PARAMS
    ((tree));
    static tree default_function_array_conversion PARAMS ((tree));
    static tree lookup_field PARAMS ((tree, tree));
    +static void warn_bad_usage PARAMS ((tree, tree, tree, tree));
    static tree convert_arguments PARAMS ((tree, tree, tree, tree));
    static tree pointer_diff PARAMS ((tree, tree));
    static tree unary_complex_lvalue PARAMS ((enum tree_code, tree,
    int));
    @@ -1525,6 +1526,10 @@
    /* fntype now gets the type of function pointed to. */
    fntype = TREE_TYPE (fntype);

    + /* Warn about use of strcpy to a buffer of fixed sized and
    + other common programming errors. */
    + warn_bad_usage (TYPE_ARG_TYPES (fntype), params, name, fundecl);
    +
    /* Convert the parameters to the types declared in the
    function prototype, or apply default promotions. */

    @@ -1559,6 +1564,61 @@
    return require_complete_type (result);
    }

    +/* Check the function call and arguments against certain good
    + programming ethics. TYPELIST, VALUES, NAME, FUNDECL have the
    + same sense as in convert_arguments below */
    +
    +static void
    +warn_bad_usage (typelist, values, name, fundecl)
    + tree typelist, values, name, fundecl;
    +{
    + tree typetail, valtail;
    + tree parm[3];
    + int parmnum;
    +
    + for (valtail = values, typetail = typelist, parmnum = 0;
    + valtail; valtail = TREE_CHAIN (valtail), parmnum++)
    + {
    + tree type = typetail ? TREE_VALUE (typetail) : 0;
    + tree val = TREE_VALUE (valtail);
    + if (type == void_type_node)
    + return;
    +
    + /* See convert_arguments below */
    + if (TREE_CODE (val) == NON_LVALUE_EXPR)
    + val = TREE_OPERAND (val, 0);
    + parm[parmnum] = val;
    + }
    +
    + switch (DECL_FUNCTION_CODE (fundecl))
    + {
    + case BUILT_IN_STRCPY:
    + if (ARRAY_TYPE == TREE_CODE (TREE_TYPE (parm[0])))
    + {
    + tree dlen = c_sizeof (TREE_TYPE (parm[0]));
    + tree slen = c_strlen (parm[1]);
    + if (slen == 0)
    + {
    + warning
    + ("strcpy to an array of fixed size, use strncpy instead, or better
    yet, use strlcpy");
    + }
    + else
    + {
    + slen = size_binop (PLUS_EXPR, slen, ssize_int (1));
    + if (tree_int_cst_lt (dlen, slen))
    + {
    + error
    + ("trying to strcpy a constant string to a fixed size array of
    insufficient size",
    + IDENTIFIER_POINTER (name));
    + }
    + }
    + }
    + break;
    + default:
    + return;
    + }
    +}
    +
    /* Convert the argument expressions in the list VALUES
    to the types in the list TYPELIST. The result is a list of
    converted
    argument expressions.
    diff -u -r gcc-ORIG/expr.c gcc/expr.c
    --- gcc-ORIG/expr.c 2003-04-23 01:08:15.000000000 +0200
    +++ gcc/expr.c 2004-09-05 15:49:44.000000000 +0200
    @@ -9471,7 +9471,12 @@
    {
    STRIP_NOPS (arg);

    - if (TREE_CODE (arg) == ADDR_EXPR
    + if (TREE_CODE (arg) == STRING_CST)
    + {
    + *ptr_offset = size_zero_node;
    + return arg;
    + }
    + else if (TREE_CODE (arg) == ADDR_EXPR
    && TREE_CODE (TREE_OPERAND (arg, 0)) == STRING_CST)
    {
    *ptr_offset = size_zero_node;
    diff -u -r gcc-ORIG/tree.h gcc/tree.h
    --- gcc-ORIG/tree.h 2003-03-24 19:59:37.000000000 +0200
    +++ gcc/tree.h 2004-09-04 20:05:32.000000000 +0200
    @@ -155,6 +155,8 @@
    unsigned lang_flag_5 : 1;
    unsigned lang_flag_6 : 1;
    unsigned unused_1 : 1;
    +
    + tree param_before_array_conversion;
    };

    /* The following table lists the uses of each of the above flags and
    Paul Sheer, Sep 5, 2004
    #6
  7. > I need to automatically search and replace all fixed size
    > buffer strcpy's with strncpy's (or better yet, strlcpy's)
    > as a security and stability audit. The code base is large
    > and it is not feasable to manually perform these changes.
    >
    > I would like perhaps a C++ parser that can automatically
    > detect use of a strcpy to a buffer of fixed size. For instance,
    >
    > struct x {
    > char member[128];
    > }
    > ...
    > struct x X;
    > ...
    > strcpy (X.member, p); /* <-- should generate a warning here */
    >
    > but
    >
    > struct x {
    > char *member;
    > }
    > ...
    > struct x X;
    > ...
    > strcpy (X.member, p); /* <-- should NOT generate a warning */
    >
    > (The second case is too complex to fix at this point.)
    >


    Well you need no any additional parser or whatever, just C++.
    Write simple header file strcpy2.h :

    ------------------------------------------
    class decorated_char_p
    {
    private: char* p_;
    public: decorated_char_p( char* p ) : p_(p) { }
    operator char* () { return p_; }
    };

    void strcpy2( decorated_char_p p1, const char* p2 ) { strcpy(p1,p2); }
    template< int N > void strcpy2( char (&arg)[N] ) {
    int x[0]; // force syntax error at instantiation
    }
    #define strcpy strcpy2
    ------------------------------------------

    Then include it to all .cpp files where you have references to strcpy.
    You need to include it AFTER inclusion of string.h, just put it after
    last #include in the beginning of the file.

    You should have syntax error like "cannot allocate an array of
    constant size 0"
    whenever you tend to use strcpy with an array.

    We need class decorated_char_p because if we overload template
    function with no conversion of argument, compiler always choses
    nontemplate function. Thus we have to introduce additional "barrier"
    with implicit construction of decorated_char_p object.

    -- Mikhail Kupchik
    University of KPI, Kiev, Ukraine
    Mikhail N. Kupchik, Sep 10, 2004
    #7
  8. Paul Sheer

    Paul Sheer Guest

    >
    > Well you need no any additional parser or whatever, just C++.
    > Write simple header file strcpy2.h :
    >
    > ------------------------------------------
    > class decorated_char_p
    > {
    > private: char* p_;
    > public: decorated_char_p( char* p ) : p_(p) { }
    > operator char* () { return p_; }
    > };
    >
    > void strcpy2( decorated_char_p p1, const char* p2 ) { strcpy(p1,p2); }
    > template< int N > void strcpy2( char (&arg)[N] ) {
    > int x[0]; // force syntax error at instantiation
    > }
    > #define strcpy strcpy2
    > ------------------------------------------
    >
    > Then include it to all .cpp files where you have references to strcpy.
    > You need to include it AFTER inclusion of string.h, just put it after
    > last #include in the beginning of the file.
    >
    > You should have syntax error like "cannot allocate an array of
    > constant size 0"
    > whenever you tend to use strcpy with an array.
    >
    > We need class decorated_char_p because if we overload template
    > function with no conversion of argument, compiler always choses
    > nontemplate function. Thus we have to introduce additional "barrier"
    > with implicit construction of decorated_char_p object.
    >
    > -- Mikhail Kupchik
    >


    I made a minor change, but YES, this actually works.
    It's quite brilliant actually. See the first strcpy gives no warning,
    the second does. (My updated gcc patch can be gotten from the
    gcc-patches mailing list BTW.)

    #include <string>

    class decorated_char_p
    {
    private: char* p_;
    public: decorated_char_p( char* p ) : p_(p) { }
    operator char* () { return p_; }
    };

    void strcpy2( decorated_char_p p1, const char* p2 ) { strcpy(p1,p2); }

    template< int N > void strcpy2( char (&arg)[N], const char* p2 ) {
    strcpy_to_fixed_array_warning();
    }

    #define strcpy strcpy2

    char p[256];

    void
    foo (void)
    {
    char *q;
    q = p;
    strcpy (q, "hello");
    strcpy (p, "hello");
    }

    -paul
    Paul Sheer, Sep 10, 2004
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Paul Sheer
    Replies:
    4
    Views:
    637
    Paul Sheer
    Sep 14, 2004
  2. Julien ROUZIERES

    g++ -pg option and -shared option

    Julien ROUZIERES, Dec 21, 2004, in forum: C++
    Replies:
    1
    Views:
    711
    GianGuz
    Dec 21, 2004
  3. Cas
    Replies:
    5
    Views:
    789
    Kevin Jones
    Aug 28, 2006
  4. Kevin Blount

    page.aspx?option - how to detect "option"

    Kevin Blount, Nov 28, 2006, in forum: ASP .Net
    Replies:
    6
    Views:
    603
    =?Utf-8?B?RWVyYWo=?=
    Nov 28, 2006
  5. John

    Regex for <option> ... </option>

    John, Jan 23, 2009, in forum: Perl Misc
    Replies:
    10
    Views:
    235
    Eric Pozharski
    Jan 29, 2009
Loading...

Share This Page