Pointer validity

Discussion in 'C Programming' started by jacob navia, Dec 1, 2003.

  1. jacob navia

    jacob navia Guest

    A valid pointer has two possible states: either empty (NULL), or
    holding an address that must lie within valid memory.

    Valid addresses are:

    1) The current global context: from the first byte of the program's
    data to the last byte. Here we find static tables, global context
    pointers, etc. These are the global variables of the program.

    2) The current scope and all nested scopes. The current scope is given
    by the address of the local variables and the arguments. A conservative
    estimate of this area is the address of argc in main() or the address
    of the first local variable in main(). Normally, a procedure should
    never access memory outside its scope, but it can receive pointers to
    areas in higher scopes, so the comparison is not easy if done
    thoroughly.

    3) The heap. To this area belong all addresses allocated with malloc()
    and not passed to free().

    A fast procedure to determine the validity of a pointer could be:

    1) Check if the address is in the data area. It would be nice if the
    standard specified a name for those addresses, but this is tricky in
    environments where those addresses aren't contiguous. Here we suppose
    that the compiler supplies __first_data__ and __last_data__.

    2) To check if the address is within the valid stack we again need two
    memory comparisons: one against the current stack position and one
    against the stored value of the top of the stack. We suppose the
    compiler provides __top_of_stack__.

    3) The heap. We suppose there is a procedure to verify a memory block.

    All this would cost a couple of memory reads in most cases, or a
    procedure call in the case of a malloc'ed block.
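
    The three checks listed above can be sketched in C roughly as follows.
    This is only an illustration under stated assumptions: the names
    region, in_region, region_of and pointer_looks_valid are hypothetical,
    the bounds are passed in explicitly because no compiler is known to
    supply __first_data__, __last_data__ or __top_of_stack__, and the heap
    test is a callback the caller must provide. Comparing addresses
    through uintptr_t also assumes a flat address space, which the
    standard does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [lo, hi). Hypothetical: a real implementation
   would have to obtain these bounds from the compiler or linker. */
struct region {
    uintptr_t lo;
    uintptr_t hi;
};

/* Two comparisons, as in the proposal. Assumes a flat address space. */
static bool in_region(const void *p, struct region r)
{
    uintptr_t a = (uintptr_t)p;
    return a >= r.lo && a < r.hi;
}

/* Build a region covering one object of `size` bytes starting at `start`. */
static struct region region_of(const void *start, size_t size)
{
    struct region r;
    r.lo = (uintptr_t)start;
    r.hi = r.lo + size;
    return r;
}

/* NULL counts as valid ("empty"); otherwise the address must fall in the
   data region, the stack region, or be accepted by the heap checker. */
static bool pointer_looks_valid(const void *p,
                                struct region data,
                                struct region stack,
                                bool (*heap_check)(const void *))
{
    if (p == NULL)
        return true;
    if (in_region(p, data) || in_region(p, stack))
        return true;
    return heap_check != NULL && heap_check(p);
}
```

    Note that a check of this kind only says that an address falls inside
    a region, not that it designates the object the programmer intended.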

    What about making those tests automatically for all pointers passed
    to all functions?

    That would make pointer bugs surface immediately. The checks could be
    switched off later. But in the first phases of development, speed is
    not as important as implementing the algorithm correctly.

    Pointer bugs are likely to surface in the first phases of development,
    and we now have the means to make the machine check those pointers.

    A run-of-the-mill processor now runs at several GHz. A few memory
    comparisons would slow the program down so little as to be completely
    transparent on PC architectures.

    Of course, in embedded systems the situation is different, but for C
    developers on a PC this would be a good improvement.

    Just some thoughts

    jacob
     
    jacob navia, Dec 1, 2003
    #1

  2. jacob navia

    xarax Guest

    "jacob navia" <> wrote in message
    news:bqgc1a$3kd$...
    > Valid pointers have two states. Either empty (NULL), or filled with an
    > address that must be at a valid address.
    >
    > Valid addresses are:
    >
    > 1) The current global context. The first byte of the data of the
    > program
    > till the last byte. Here we find static tables, global context
    > pointers, etc.
    > This are the global variables of the program.
    >
    > 2) The current scope and all nested scopes. The current scope is given
    > by the address of the local variables and the arguments. A
    > conservative
    > estimate of this area is the address of argc in main() or the
    > address of the
    > first local variable in main. Normally, a procedure should never
    > access
    > memory outside its scope, but it can receive pointers to areas in
    > higher
    > scopes, so the comparison is not easier if done throughly.
    >
    > 3) The heap. To this area belong all addresses allocated with malloc()
    > and not passed to free().


    All of the above is implementation specific and therefore OFF
    TOPIC. There is no requirement for a heap or stack frame as we
    know and love them. Implementations are allowed to do whatever
    they want as long as the behavior appears to conform to the standard.

    Pointers to valid memory locations can come from an external
    source (e.g., generated by an agent other than the currently
    running C program), and be usable by the currently running
    C program.

    > A fast procedure tyo determine the validity of a pointer could be:
    >
    > 1) Check if the address is in the data area. It would be nice if the
    > standard
    > specified a name for those addresses, but this is tricky in
    > environments where those addresses aren't contiguous. Here we
    > suppose
    > that the compiler supplies __first_data__ and __last_data__.
    >
    > 2) To check if the address is within the valid stack we need two
    > memory
    > comparisons again. The current stack and the stored value of the top
    > of it.
    > We suppose the compiler provides __top_of_stack__


    A stack frame, if implemented, can contain more data locations
    than what would appear to be needed by just looking at the
    automatic variable declarations in the source code. A corrupted
    pointer that just happens to fall within the stackframe boundaries
    would appear to be valid according to your description.

    > 3) The heap. We suppose there is a procedure to verify a memory block.


    A single heap is not required by the standard, and certainly
    its implementation would be opaque and subject to change.

    > All this would cost a couple of memory reads in most cases, or a call
    > to a
    > procedure, in case of malloced block.
    >
    > What about making those tests automatically to do that with all
    > pointers
    > passed to all functions?
    >
    > That would lead to pointer bugs surfacing immediately. This could be
    > disconnected later. But in the first phases of development, speed is
    > not so
    > important as correctly implementing the algorithm.
    >
    > Pointer bugs are likely to surface in the first phases of development,
    > and we have the means now to put the machine to check those pointers.
    >
    > A run of the mill processor now runs at several GHZ. Some memory
    > comparisons would slow down the program so little as to be completely
    > transparent in PC architectures.
    >
    > Of course, in embedded systems the situation is different, but for C
    > developers in a PC this would be a good improvement.
    >
    > Just some thoughts
    >
    > jacob


    Your premise is flawed, therefore your conclusions are meaningless.


    There are plenty of memory management tools out there that are
    replacements for the common implementations of malloc() and
    friends, for locating heap corruption, dangling references, etc.
    It is all dependent upon implementation details and is something
    that would require a compiler to have intimate knowledge of the
    heap implementation.
     
    xarax, Dec 1, 2003
    #2

  3. jacob navia

    jacob navia Guest

    "xarax" <> wrote in message
    news:78Pyb.23836$...
    > All of the above is implementation specific and therefore OFF
    > TOPIC. There is no requirement for a heap or stackframe as we
    > know and love them. Implementations are allowed to do whatever
    > they want as if the behavior appears to conform to the standard.
    >


    Sorry but I gather from the standard that the storage allocated by
    local variables is valid only during the execution of a function.

    Since functions return and are called, this implies a stack structure
    one way or the other. The thing gets started with main() that
    can call other functions.

    The scope of a global is indefinite, as long as the program runs.
    This means that C surely assumes that this storage is distinct
    conceptually from the local storage.

    malloc/free are part of the standard.

    > Pointers to valid memory locations can come from an external
    > source (e.g., generated by an agent other than the currently
    > running C program), and be usable by the currently running
    > C program.
    >


    Yes, we could hypothetically assume that the operating system
    returns valid pointers to applications but this is very uncommon,
    outside the obvious call to malloc/free.

    This is very rare and can be safely forgotten.

    > Your premise is flawed, therefore your conclusions are meaningless.
    >


    There is nothing flawed here.

    > There are plenty of memory management tools out there that are
    > replacements for the common implementations of malloc() and
    > friends, for locating heap corruption, dangling references, etc.


    And they probably do something very similar to what I described.

    > It is all dependent upon implementation details and is something
    > that would require a compiler to have intimate knowledge of the
    > heap implementation.


    Yes. And so what?

    My question is: would it be interesting to add to the language itself?

    C has been widely criticized, and with reason, for its ample
    opportunities for pointer errors. Giving thought to this is not off
    topic here. It is one of the most common errors in any program when
    it is being developed.

    Two conceptions of the C language underlie our differences. For you,
    any reflection about some basic tenets of the language is "off topic".
    I think too little discussion is going on about how we could improve
    things.
     
    jacob navia, Dec 1, 2003
    #3
  4. jacob navia

    Mike Wahler Guest

    "jacob navia" <> wrote in message
    news:bqghf7$bvd$...
    >
    > "xarax" <> wrote in message
    > news:78Pyb.23836$...
    > > All of the above is implementation specific and therefore OFF
    > > TOPIC. There is no requirement for a heap or stackframe as we
    > > know and love them. Implementations are allowed to do whatever
    > > they want as if the behavior appears to conform to the standard.
    > >

    >
    > Sorry but I gather from the standard that the storage allocated by
    > local variables is valid only during the execution of a function.
    >
    > Since functions return and are called, this implies a stack structure


    .... can be used, but not that one is required.

    > one way or the other.


    Or some other non-stack method may be used.

    >The thing gets started with main() that
    > can call other functions.


    The ability for functions to call functions is not required
    to be implemented with a stack.


    >
    > The scope of a global is indefinite, as long as the program runs.


    It is definite. The duration of the program's execution.

    > This means that C surely assumes that this storage is distinct
    > conceptually from the local storage.


    "Local" vs "nonlocal" is not a lifetime issue, but one of scope.
    But yes, 'local' vs. 'global' scopes are considered distinct.
    That's what 'scope' means, after all. :)

    But there's no requirement that a compiler internally store e.g
    'all globals here and all locals there'. This is often done,
    but this is an implementation detail.

    >
    > malloc/free are part of the standard.


    Yes they are. What's your point?

    >
    > > Pointers to valid memory locations can come from an external
    > > source (e.g., generated by an agent other than the currently
    > > running C program), and be usable by the currently running
    > > C program.
    > >

    >
    > Yes, we could hypothetically assume that the operating system
    > returns valid pointers to applications but this is very uncommon,


    I find it very common. E.g. Microsoft Windows is very
    widespread. So are embedded devices with interfaces that
    use pointers.

    > outside the obvious call to malloc/free.
    >
    > This is very rare and can be safely forgotten.


    IMO not rare at all.

    Also, in my experience, the 'rare' problems are the most difficult
    to rectify (or even locate).

    >
    > > Your premise is flawed, therefore your conclusions are meaningless.
    > >

    >
    > There is nothing flawed here.
    >
    > > There are plenty of memory management tools out there that are
    > > replacements for the common implementations of malloc() and
    > > friends, for locating heap corruption, dangling references, etc.

    >
    > And they do probably a very similar thing to what I described.


    In necessarily platform specific ways.

    >
    > > It is all dependent upon implementation details and is something
    > > that would require a compiler to have intimate knowledge of the
    > > heap implementation.

    >
    > Yes. And so what?


    So here, we discuss standard C, not implementation details.

    >
    > My question is: would it be interesting to add to the language itself?


    Perhaps it might be to some, but not to others.

    > C has been widely critized, and with reason, for the ample
    > opportunities of
    > pointer errors.



    The C language has no need to justify its existence to any
    critics. I think it stands upon its own success. AFAIK,
    besides COBOL, it's the oldest high-level language still in
    widespread use (I welcome any corrections from the historians).

    >Giving thought to this is not off topic here.


    Actually it is. Here we discuss the C language as it is.


    >It is
    > one
    > of the most common errors in any program when it is being developed.


    Yes it is. Which is why many/most of the 'critics' you cite
    often try to blame their tools for their mistakes.

    >
    > Two conceptions of the C language underlie our differences. For you,
    > any
    > reflection about some basic tenets of the language is "off topic".


    Explanations of 'basic tenets' of C as part of helping someone
    with C are indeed topical. Speculations/suggestions about the
    'how and why', 'can/should something be different', etc. are not.

    > I
    > think
    > too little discussion is going on about how we could improve things.


    Much such discussion is occurring, you just apparently aren't
    aware of it. The result of many such discussions was C99.
    Ten-plus years of discussion. Too little? :)

    I suggest you visit comp.std.c if you want to share ideas
    about changes to the language.

    -Mike
     
    Mike Wahler, Dec 1, 2003
    #4
  5. jacob navia

    Eric Sosman Guest

    jacob navia wrote:
    >
    > "xarax" <> wrote in message
    > news:78Pyb.23836$...
    > > All of the above is implementation specific and therefore OFF
    > > TOPIC. There is no requirement for a heap or stackframe as we
    > > know and love them. Implementations are allowed to do whatever
    > > they want as if the behavior appears to conform to the standard.
    > >

    >
    > Sorry but I gather from the standard that the storage allocated by
    > local variables is valid only during the execution of a function.
    >
    > Since functions return and are called, this implies a stack structure
    > one way or the other. The thing gets started with main() that
    > can call other functions.


    Yes, a stack of some kind is implied. But it would be too
    much of a leap to assume the stack is represented as a simple
    contiguous array of memory! For example, a linked list of
    "frames" would serve the needs of C just fine but would make it
    impossible to classify a pointer value as stack or non-stack
    with just two comparisons, as you suggest.
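
    To make the linked-list point concrete, here is a rough sketch of such
    a scheme. Everything in it (struct frame, push_frame, pop_frame,
    in_active_frames) is hypothetical and not taken from any real
    implementation; frames are simply obtained from malloc() so that they
    need not be adjacent in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical activation record: allocated separately for each call and
   linked to the caller's frame instead of living in one contiguous array. */
struct frame {
    struct frame *caller;   /* frame of the calling function, NULL for main */
    size_t size;            /* bytes of local storage */
    unsigned char locals[]; /* the function's automatic variables */
};

/* "Call": allocate a frame and link it to the caller's frame. */
static struct frame *push_frame(struct frame *caller, size_t size)
{
    struct frame *f = malloc(sizeof *f + size);
    if (f != NULL) {
        f->caller = caller;
        f->size = size;
    }
    return f;
}

/* "Return": release the frame and hand back the caller's frame. */
static struct frame *pop_frame(struct frame *f)
{
    struct frame *caller = f->caller;
    free(f);
    return caller;
}

/* Does p point into the locals of any active frame? This needs a walk of
   the whole chain; no single pair of comparisons can answer it. */
static bool in_active_frames(const struct frame *top, const void *p)
{
    uintptr_t a = (uintptr_t)p;
    for (const struct frame *f = top; f != NULL; f = f->caller) {
        uintptr_t lo = (uintptr_t)f->locals;
        if (a >= lo && a < lo + f->size)
            return true;
    }
    return false;
}
```

    With this layout the cost of the check grows with the call depth, and
    the frames are interleaved with whatever else malloc() hands out.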

    > The scope of a global is indefinite, as long as the program runs.
    > This means that C surely assumes that this storage is distinct
    > conceptually from the local storage.
    >
    > malloc/free are part of the standard.
    >
    > > Pointers to valid memory locations can come from an external
    > > source (e.g., generated by an agent other than the currently
    > > running C program), and be usable by the currently running
    > > C program.

    >
    > Yes, we could hypothetically assume that the operating system
    > returns valid pointers to applications but this is very uncommon,
    > outside the obvious call to malloc/free.
    >
    > This is very rare and can be safely forgotten.


    Actually, there are two *very* common examples of "out of the
    blue" memory, provided to and used by a large fraction of all C
    programs. The second argument to main() comes from -- well, from
    who knows where, and so do the strings to which its elements
    point. A possibly less common example is getenv(), and perhaps
    some thought might suggest others. In any event, memory supplied
    from extra-program sources can't be "safely forgotten."

    > > There are plenty of memory management tools out there that are
    > > replacements for the common implementations of malloc() and
    > > friends, for locating heap corruption, dangling references, etc.

    >
    > And they do probably a very similar thing to what I described.


    There's a fairly extensive literature on checking pointer
    validity, but most of what I've seen addresses a more important
    problem than you're tackling. For example, simply knowing that
    a pointer addresses a valid object isn't enough:

    int a[10][10];
    int *p = &a[0][9];
    *++p = 0; // valid pointer, invalid access
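
    One way to catch the overrun above is to carry the bounds of the
    originating array along with the pointer. The following is a minimal
    sketch of that idea, not a description of any particular tool; the
    names checked_ptr, checked_from, checked_inc and checked_store are
    made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the current position plus the bounds of the
   one array it was derived from. */
struct checked_ptr {
    int *base;    /* first element of the underlying array */
    size_t len;   /* number of elements in that array */
    size_t idx;   /* current index; idx == len means one past the end */
};

static struct checked_ptr checked_from(int *array, size_t len, size_t idx)
{
    struct checked_ptr cp = { array, len, idx };
    return cp;
}

/* Advance by one element; refuse to move beyond one past the end. */
static bool checked_inc(struct checked_ptr *cp)
{
    if (cp->idx >= cp->len)
        return false;
    cp->idx++;
    return true;
}

/* Store through the pointer only if it designates a real element. */
static bool checked_store(struct checked_ptr cp, int value)
{
    if (cp.idx >= cp.len)
        return false;
    cp.base[cp.idx] = value;
    return true;
}
```

    Built from a[0] with a length of 10 and an index of 9, such a pointer
    can be incremented once, but the store that follows is refused even
    though the address itself lies inside the object a.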

    --
     
    Eric Sosman, Dec 1, 2003
    #5
  6. Mike Wahler wrote:
    > "jacob navia" <> wrote in message
    > news:bqghf7$bvd$...
    >>
    >> "xarax" <> wrote in message
    >> news:78Pyb.23836$...
    >> > All of the above is implementation specific and therefore OFF
    >> > TOPIC. There is no requirement for a heap or stackframe as we
    >> > know and love them. Implementations are allowed to do whatever
    >> > they want as if the behavior appears to conform to the standard.
    >> >

    >>
    >> Sorry but I gather from the standard that the storage allocated by
    >> local variables is valid only during the execution of a function.
    >>
    >> Since functions return and are called, this implies a stack structure

    >
    > ... can be used, but not that one is required.


    That's not what Jacob meant. At any point during the execution of a C
    program, the currently active functions together with their local
    storage form a stack structure. Calling a function is equivalent to
    pushing an item onto the stack; returning from a function pops it off
    the (top of the) stack. However this is implemented, the basic
    operations (call/return) correspond to those which can be performed on
    a stack (push/pop), so it's entirely accurate, and natural, to talk
    about "the call stack".

    > The C language has no need to justify its existence to any
    > critics. I think it stands upon its own success. AFAIK,
    > besides COBOL, it's the oldest high-level language still in
    > widespread use (I welcome any corrections from the historians).


    That depends on what you mean by "widespread". There are several
    languages still in use that are considerably older than C, e.g. (in
    descending order of popularity) Fortran, Lisp, BCPL, etc.

    Jeremy.
     
    Jeremy Yallop, Dec 1, 2003
    #6
  7. Mike Wahler wrote:

    > Here we discuss the C language as it is.


    Are we discussing C89, C99 or some combination of the two?
     
    E. Robert Tisdale, Dec 2, 2003
    #7
  8. jacob navia

    nobody Guest

    "jacob navia" <> wrote in message
    news:bqghf7$bvd$...
    >
    > "xarax" <> wrote in message
    > news:78Pyb.23836$...
    > > All of the above is implementation specific and therefore OFF
    > > TOPIC. There is no requirement for a heap or stackframe as we
    > > know and love them. Implementations are allowed to do whatever
    > > they want as if the behavior appears to conform to the standard.
    > >

    >
    > Sorry but I gather from the standard that the storage allocated by
    > local variables is valid only during the execution of a function.
    >

    Actually, it's "Storage for the object is no longer guaranteed
    to be reserved when execution of the block ends in any way."
    The standard doesn't mention "validity of storage" in this context.

    > Since functions return and are called, this implies a stack structure
    > one way or the other.


    It does not. That it is common on some platforms doesn't mean
    that the standard implies any such thing.

    > The thing gets started with main() that


    Not necessarily in freestanding environments.

    > can call other functions.
    >
    > The scope of a global is indefinite, as long as the program runs.


    You seem to be confusing scope (of identifiers) and (storage)
    duration. The standard lists "indefinite" for neither of them.

    > This means that C surely assumes that this storage is distinct
    > conceptually from the local storage.
    >

    How did you arrive at this conclusion ("C surely assumes")?

    > malloc/free are part of the standard.
    >
    > > Pointers to valid memory locations can come from an external
    > > source (e.g., generated by an agent other than the currently
    > > running C program), and be usable by the currently running
    > > C program.
    > >

    >
    > Yes, we could hypothetically assume that the operating system
    > returns valid pointers to applications but this is very uncommon,
    > outside the obvious call to malloc/free.
    >

    Maybe uncommon for programs you are writing? BTW, the standard doesn't
    say that the pointer malloc() returns comes from the OS.

    > This is very rare and can be safely forgotten.
    >

    Sure. Given your experience and the problems you are facing (which
    undoubtedly spawned this thread) ...

    > > Your premise is flawed, therefore your conclusions are meaningless.
    > >

    >
    > There is nothing flawed here.
    >

    Well, so far you've got "malloc/free are part of the standard"
    right.

    > > There are plenty of memory management tools out there that are
    > > replacements for the common implementations of malloc() and
    > > friends, for locating heap corruption, dangling references, etc.

    >
    > And they do probably a very similar thing to what I described.
    >

    So the sensible thing would be to get them and use them. No need
    for a fundamental language change.

    > > It is all dependent upon implementation details and is something
    > > that would require a compiler to have intimate knowledge of the
    > > heap implementation.

    >
    > Yes. And so what?
    >
    > My question is: would it be interesting to add to the language itself?
    >

    No.

    > C has been widely critized, and with reason, for the ample
    > opportunities of pointer errors.


    So were chainsaws by idiots sawing their fingers off. Blame
    the tools, eh?

    > Giving thought to this is not off topic here. It is
    > one
    > of the most common errors in any program when it is being developed.
    >

    You mean in any program developed by you? Or by a newbie? C is not
    a tool that can be mastered in "21 days" or so (for the majority
    of people, anyway, IMHO).

    > Two conceptions of the C language

    ------^^^^^^^^^^^
    There was only one - by Dennis M. Ritchie, AFAIK. If you want
    to talk concepts, in clc there is still only one, as I gather -
    that of the standard. Maybe comp.std.c would be a better place?

    > underlie our differences. For you, any reflection
    > about some basic tenets of the language is "off topic".


    Again, unwarranted conclusion.

    > I think
    > too little discussion is going on about how we could improve things.


    The first step would be to fully understand how things *are*.
    Then, *why* they are as they are. (Not that I know it all,
    but what you are asking is against the "spirit" of C, as *I* see
    it. But who am I, anyway?:)
     
    nobody, Dec 2, 2003
    #8
  9. jacob navia

    Jack Klein Guest

    On Mon, 1 Dec 2003 22:32:26 +0100, "jacob navia"
    <> wrote in comp.lang.c:

    > Valid pointers have two states. Either empty (NULL), or filled with an
    > address that must be at a valid address.
    >
    > Valid addresses are:
    >
    > 1) The current global context. The first byte of the data of the
    > program
    > till the last byte. Here we find static tables, global context
    > pointers, etc.
    > This are the global variables of the program.
    >
    > 2) The current scope and all nested scopes. The current scope is given
    > by the address of the local variables and the arguments. A
    > conservative
    > estimate of this area is the address of argc in main() or the
    > address of the
    > first local variable in main. Normally, a procedure should never
    > access
    > memory outside its scope, but it can receive pointers to areas in
    > higher
    > scopes, so the comparison is not easier if done throughly.
    >
    > 3) The heap. To this area belong all addresses allocated with malloc()
    > and not passed to free().
    >
    > A fast procedure tyo determine the validity of a pointer could be:
    >
    > 1) Check if the address is in the data area. It would be nice if the
    > standard
    > specified a name for those addresses, but this is tricky in
    > environments where those addresses aren't contiguous. Here we
    > suppose
    > that the compiler supplies __first_data__ and __last_data__.
    >
    > 2) To check if the address is within the valid stack we need two
    > memory
    > comparisons again. The current stack and the stored value of the top
    > of it.
    > We suppose the compiler provides __top_of_stack__
    >
    > 3) The heap. We suppose there is a procedure to verify a memory block.
    >
    > All this would cost a couple of memory reads in most cases, or a call
    > to a
    > procedure, in case of malloced block.
    >
    > What about making those tests automatically to do that with all
    > pointers
    > passed to all functions?
    >
    > That would lead to pointer bugs surfacing immediately. This could be
    > disconnected later. But in the first phases of development, speed is
    > not so
    > important as correctly implementing the algorithm.
    >
    > Pointer bugs are likely to surface in the first phases of development,
    > and we have the means now to put the machine to check those pointers.
    >
    > A run of the mill processor now runs at several GHZ. Some memory
    > comparisons would slow down the program so little as to be completely
    > transparent in PC architectures.
    >
    > Of course, in embedded systems the situation is different, but for C
    > developers in a PC this would be a good improvement.
    >
    > Just some thoughts
    >
    > jacob


    Speaking as an lcc-win32 user, I like the idea but would suggest a
    somewhat different implementation, at least from what I think you are
    suggesting.

    It sounds to me like you are thinking of adding a compiler option that
    would silently generate runtime code to test the validity of a pointer
    every time it was used in the code in certain situations.
    Dereferencing, certainly. Also assigning to pointers, passing as
    function arguments, returning from functions?

    That sounds like too much overhead even in early testing.

    I would suggest something like the assert macro. A macro like:

    POINTER_TEST(ptr_name);

    ....that could be put in explicitly where wanted, returning 0 if the
    pointer is invalid, non-zero if it passes the test, so that in fact
    the POINTER_TEST macro could be used inside an assert macro.

    For example consider a function that receives a pointer to a
    structure. Ideally, that pointer should be validated only once, like
    it might be checked for NULL once at the beginning of the function,
    rather than for each of the many times the code uses the pointer to
    access a structure member.

    And, of course, like the assert macro, the pointer test macro should
    expand to a no-op (such as "((void)0)"), depending on whether some
    other macro is defined.

    Example:

    #ifdef TEST_POINTERS
    #define POINTER_TEST(p) pointer_test(p)
    #else
    /* Checks disabled: expand to a no-op expression, as assert() does. */
    #define POINTER_TEST(p) ((void)0)
    #endif
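
    A slightly fuller sketch of how such a macro might be used, under some
    assumptions: pointer_test() here is a hypothetical stand-in that only
    rejects NULL (a real test would be implementation-specific), struct
    point and point_sum() are invented for the example, and the disabled
    form expands to 1 rather than to a void expression so that
    assert(POINTER_TEST(p)) still compiles when the checks are off.

```c
#include <assert.h>
#include <stddef.h>

#define TEST_POINTERS 1   /* remove this line to compile the checks out */

/* Hypothetical stand-in for the real validity test: it only rejects NULL.
   A real version would be implementation-specific. */
static int pointer_test(const void *p)
{
    return p != NULL;
}

#ifdef TEST_POINTERS
#define POINTER_TEST(p) pointer_test(p)
#else
#define POINTER_TEST(p) 1   /* always "valid"; keeps assert() usable */
#endif

struct point {
    int x;
    int y;
};

/* Validate the pointer once on entry, then use it freely. */
static int point_sum(const struct point *pt)
{
    assert(POINTER_TEST(pt));
    return pt->x + pt->y;
}
```

    The pointer is tested once at the top of point_sum() instead of at
    each member access, which is the usage pattern described above.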

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
     
    Jack Klein, Dec 2, 2003
    #9
  10. >Sorry but I gather from the standard that the storage allocated by
    >local variables is valid only during the execution of a function.
    >
    >Since functions return and are called, this implies a stack structure
    >one way or the other. The thing gets started with main() that
    >can call other functions.


    This does not disallow a "stack structure" that involves OS/360-style
    save areas ("stack frames", if you insist) that are GETMAIN'd on
    function entry and FREEMAIN'd on function exit. The so-called "stack"
    would then be intermixed with malloc() memory as malloc() would almost
    certainly call GETMAIN also.

    The possibility of having multiple threads also tends to blow away
    the idea that active local variables can be found in a contiguous
    area between something like the address of a local variable in main()
    and the address of a local variable in the current function.


    >The scope of a global is indefinite, as long as the program runs.
    >This means that C surely assumes that this storage is distinct
    >conceptually from the local storage.


    But it does not mean that any global is contiguous with any other
    global in another translation unit, without having non-globals
    (e.g. read-only code) in between them.

    >malloc/free are part of the standard.
    >
    >> Pointers to valid memory locations can come from an external
    >> source (e.g., generated by an agent other than the currently
    >> running C program), and be usable by the currently running
    >> C program.


    Nobody ever uses functions like mmap() or dlopen(). Well, in ANSI
    C, they really don't.

    >> There are plenty of memory management tools out there that are
    >> replacements for the common implementations of malloc() and
    >> friends, for locating heap corruption, dangling references, etc.

    >
    >And they do probably a very similar thing to what I described.


    For memory management debugging purposes, I'd like to see
    check_malloc_arena(void) which checks for memory overruns
    in an unspecified way but *if* it's a linked list of some sort,
    makes sure it's in the correct order and not broken. I could
    also use is_from_malloc(void *pointer) which checks whether it
    is a valid pointer *FROM malloc()* .
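
    A rough idea of what is_from_malloc() could look like when built on
    top of malloc() rather than inside it. This is only a sketch:
    tracked_malloc(), tracked_free(), the live table and MAX_LIVE are
    invented for the example, the table is a plain fixed-size array, and
    nothing here inspects the allocator's own data structures the way a
    check_malloc_arena() would have to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 1024   /* arbitrary capacity for this sketch */

static void *live[MAX_LIVE];   /* pointers handed out and not yet freed */
static size_t live_count;

/* malloc() wrapper that remembers each block it hands out. */
static void *tracked_malloc(size_t size)
{
    if (live_count == MAX_LIVE)
        return NULL;              /* table full: treat as allocation failure */
    void *p = malloc(size);
    if (p != NULL)
        live[live_count++] = p;
    return p;
}

/* True only for pointers to the start of a live tracked block. */
static bool is_from_malloc(const void *pointer)
{
    if (pointer == NULL)
        return false;
    for (size_t i = 0; i < live_count; i++)
        if (live[i] == pointer)
            return true;
    return false;
}

/* free() wrapper; returns false (and frees nothing) for pointers that
   were not handed out by tracked_malloc(). */
static bool tracked_free(void *p)
{
    for (size_t i = 0; i < live_count; i++) {
        if (live[i] == p) {
            live[i] = live[--live_count];   /* fill the hole with the last entry */
            free(p);
            return true;
        }
    }
    return false;
}
```

    The linear search keeps the sketch short; a pointer into the middle of
    a block is reported as not being from malloc(), since malloc() never
    returned that value.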

    >My question is: would it be interesting to add to the language itself?


    It tends to encourage programmers to do horrible hacks of
    making more variants of NULL that test as invalid pointers for
    use with more special cases. For example:

    struct symtab *lookupsymbol(char *name)

    Returns: pointer to symbol table entry if name is found or created.
    NULL if name is an invalid symbol
    (void *)3 if memory could not be allocated to create a new
    symbol table entry.
    (void *)7 if an attempt is made to create an entry for two symbols
    identical in the first 64 characters.
    (void *)11 if name is an officially registered obscene word.

    Oh, yes, on the author's machine, 3, 7, and 11 are considered invalid
    pointers, and the author is depending on that being true elsewhere.
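
    For contrast, the same outcomes can be reported without relying on
    any particular address being invalid. The sketch below is only one
    possible shape: enum lookup_status, lookup_symbol() and the tiny
    fixed-size table are invented for the example and cover just two of
    the failure cases listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum lookup_status {
    LOOKUP_OK,            /* *out points to the entry */
    LOOKUP_INVALID_NAME,  /* name is not a valid symbol */
    LOOKUP_NO_MEMORY      /* no room to create a new entry */
};

struct symtab {
    char name[65];
};

#define TABLE_SIZE 4      /* tiny fixed table, just for the sketch */
static struct symtab table[TABLE_SIZE];
static size_t table_used;

/* Find or create the entry for `name`. The outcome is reported through the
   return value; *out is a real pointer on success and NULL otherwise. */
static enum lookup_status lookup_symbol(const char *name, struct symtab **out)
{
    *out = NULL;
    if (name == NULL || name[0] == '\0' || strlen(name) >= sizeof table[0].name)
        return LOOKUP_INVALID_NAME;
    for (size_t i = 0; i < table_used; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *out = &table[i];
            return LOOKUP_OK;
        }
    }
    if (table_used == TABLE_SIZE)
        return LOOKUP_NO_MEMORY;
    strcpy(table[table_used].name, name);
    *out = &table[table_used++];
    return LOOKUP_OK;
}
```

    The pointer returned through out is then either NULL or a real entry,
    and the reason for a failure travels separately in the status code.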


    Gordon L. Burditt
     
    Gordon Burditt, Dec 2, 2003
    #10
  11. On Mon, 1 Dec 2003 22:32:26 +0100, "jacob navia"
    <> wrote:

    >Valid pointers have two states. Either empty (NULL), or filled with an
    >address that must be at a valid address.
    >
    >Valid addresses are:
    >
    >1) The current global context. The first byte of the data of the
    >program
    > till the last byte. Here we find static tables, global context
    >pointers, etc.
    > This are the global variables of the program.
    >
    >2) The current scope and all nested scopes. The current scope is given
    > by the address of the local variables and the arguments. A
    >conservative
    > estimate of this area is the address of argc in main() or the
    >address of the
    > first local variable in main. Normally, a procedure should never
    >access
    > memory outside its scope, but it can receive pointers to areas in
    >higher
    > scopes, so the comparison is not easier if done throughly.
    >
    >3) The heap. To this area belong all addresses allocated with malloc()
    > and not passed to free().


    What about those of us whose hardware does not have stack or a heap?

    Why do you think argc and argv are in that order? Why do you think
    they are in any way related to the location of local variables?

    What about function pointers?

    There is at least one more state, uninitialized.


    <<Remove the del for email>>
     
    Barry Schwarz, Dec 2, 2003
    #11
  12. jacob navia

    Richard Bos Guest

    "jacob navia" <> wrote:

    > Valid pointers have two states. Either empty (NULL), or filled with an
    > address that must be at a valid address.
    >
    > Valid addresses are:
    >
    > 1) The current global context.


    There is no "current" global context. The point of global is that it's
    global, not current.

    > The first byte of the data of the program
    > till the last byte. Here we find static tables, global context
    > pointers, etc.


    That is assuming that these are to be found in one place only. Not
    necessarily true; these two:

    char *str1 ="Of Man's first disobedience, and the fruit";
    char str2[]="Through Eden took their Solitary Way";

    may well reside in widely separate areas of memory.

    > 2) The current scope and all nested scopes.


    Again, not necessarily together.

    > The current scope is given by the address of the local variables and
    > the arguments. A conservative estimate of this area is the address
    > of argc in main() or the address of the first local variable in main.


    Up, or down?

    > 3) The heap. To this area belong all addresses allocated with malloc()


    Or realloc(), or calloc().

    > and not passed to free().


    Or realloc().

    > A fast procedure tyo determine the validity of a pointer could be:


    Fast, but completely unportable.

    > 1) Check if the address is in the data area.


    How? Relational comparison (<, >) is not defined for pointers into
    different objects, let alone between a valid and an invalid pointer.

    > 2) To check if the address is within the valid stack we need two
    > memory comparisons again.


    Ditto.

    > 3) The heap. We suppose there is a procedure to verify a memory block.


    You might as well suppose an implementation-supplied function called
    validate_pointer(void *ptr), which would do all the work for you. It
    would, of course, not work for function pointers (which you never even
    mention), and wouldn't be able to tell if the pointer were properly
    aligned.

    Richard
     
    Richard Bos, Dec 2, 2003
    #12
  13. jacob navia

    Dan Pop Guest

    In <bqgc1a$3kd$> "jacob navia" <> writes:

    >Valid pointers have two states. Either empty (NULL), or filled with an
    >address that must be at a valid address.


    Null pointers are valid only in certain contexts: they can be assigned to
    other pointers, type converted and used as operands for the equality
    operators. In any other context involving the pointer value, they're
    invalid.

    >Valid addresses are:
    >
    >1) The current global context. The first byte of the data of the
    >program
    > till the last byte. Here we find static tables, global context
    >pointers, etc.
    > This are the global variables of the program.


    There is no requirement/guarantee that this is a compact address space.
    It may have "holes", whose addresses are invalid or each item may be put
    into its own memory segment on a segmented memory architecture.

    >2) The current scope and all nested scopes. The current scope is given
    > by the address of the local variables and the arguments. A
    >conservative
    > estimate of this area is the address of argc in main() or the
    >address of the
    > first local variable in main. Normally, a procedure should never
    >access
    > memory outside its scope, but it can receive pointers to areas in
    >higher
    > scopes, so the comparison is not easier if done throughly.


    See my comment above.

    >3) The heap. To this area belong all addresses allocated with malloc()
    > and not passed to free().


    Ditto. Also, there are malloc implementations that deliberately don't
    use a heap, to assist in the immediate detection of buffer overruns.
    See Electric Fence for an example.

    As far as the C standard is concerned, *each* top level (outermost)
    object exists in an address space of its own. This is clearly indicated
    by the fact that relational comparison (<, <=, >, >=) of pointers that
    don't point into the same object (or one past its end) is undefined.
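
    The equality operators, by contrast, may be applied to any two valid
    pointers. A sketch of a strictly portable (but linear-time) membership
    test built only on ==, where points_into() is a hypothetical name:

```c
#include <stddef.h>

/* Returns 1 if p compares equal to the address of one of the n bytes of
   the object starting at base, 0 otherwise. Only == is used, which is
   defined even when p points into a different object. */
int points_into(const void *p, const void *base, size_t n)
{
    const unsigned char *b = base;
    for (size_t i = 0; i < n; i++)
        if (p == (const void *)(b + i))
            return 1;
    return 0;
}
```

    p must itself be a valid or null pointer, and a one-past-the-end
    pointer of some adjacent object may happen to compare equal to base,
    so this is a best-effort check rather than a proof of validity.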

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email:
     
    Dan Pop, Dec 2, 2003
    #13
  14. jacob navia

    Paul Hsieh Guest

    "jacob navia" <> wrote:
    > Valid pointers have two states. Either empty (NULL), or filled with an
    > address that must be at a valid address.
    >
    > Valid addresses are: [... pointer classification description deleted ...]


    Indeed this is very similar to an idea I've had about extending the C
    language. If you think about it, tools like Purify, etc. must already
    do things like this.

    There are, of course, some problems with the approach you propose:

    1. C is commonly extended to be used for hardware device drivers. In
    such situations memory mapped pointers which are outside of any of
    your classifications will exist.

    2. C/C++ is commonly extended to include multithreading. In such
    environments there are *many* stacks. This would require that you
    retain a list of all such stacks at any time.

    3. Some operating systems may expose a *shared memory* region from
    which different applications may share access to pointers coming from
    the OS. But as has been pointed out by another poster, the OS can, in
    general, give you pointers that come from who knows where.

    So rather than making the hardline assertion about whether or not a
    pointer is valid, why don't you instead try to determine the nature of
    the pointer as best you can determine it? For example:

    enum PTR_CLASSIFICATION {
    PTRCL_NULL = 0, /* NULL */
    PTRCL_UNKNOWN = 1, /* Unknown classification */
    PTRCL_ERR = 2, /* Pointers we *know* are wrong */
    PTRCL_STATIC_DATA = 3, /* In your data or program areas */
    PTRCL_AUTO = 4, /* A local variable (live stack) */
    PTRCL_HEAP = 5, /* In the heap */
    PTRCL_MAXIMUM = 5
    };

    enum PTR_CLASSIFICATION getPtrClassification (void *p);

    The point being that any compiler could extend this by adding in more
    classifications after PTRCL_HEAP, but not make unfounded assertions
    (there may be pointer types it doesn't know about, but others that are
    known for sure to be wrong.) The function is only required to make a
    best effort range check -- the pointer may be invalid for other
    reasons such as alignment which cannot be determined since the type is
    not provided.

    This then leaves it to the application to try to use this to test the
    validity of a pointer. In this way, even if the application has
    access to pointers outside of classifications that the compiler is
    aware of, one can still somehow try to account for these by hook or by
    crook in the application itself without being misled about the true
    validity of the pointer.
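
    A rough user-level sketch of such a best-effort classifier, assuming
    a flat address space in which pointers converted to uintptr_t order
    the same way as the addresses themselves (implementation-defined,
    and uintptr_t is an optional type). registerPtrRegion(), struct
    region and MAX_REGIONS are hypothetical; a compiler-supplied version
    would know the regions without being told:

```c
#include <stddef.h>
#include <stdint.h>

enum PTR_CLASSIFICATION {
    PTRCL_NULL = 0,         /* NULL */
    PTRCL_UNKNOWN = 1,      /* Unknown classification */
    PTRCL_ERR = 2,          /* Pointers we *know* are wrong */
    PTRCL_STATIC_DATA = 3,  /* In your data or program areas */
    PTRCL_AUTO = 4,         /* A local variable (live stack) */
    PTRCL_HEAP = 5,         /* In the heap */
    PTRCL_MAXIMUM = 5
};

struct region {
    uintptr_t start;
    size_t size;
    enum PTR_CLASSIFICATION cls;
};

#define MAX_REGIONS 64
static struct region regions[MAX_REGIONS];
static size_t region_count;

/* Hypothetical helper: record that the size bytes at start belong to
   class cls. Returns 0 on success, -1 if the table is full. */
int registerPtrRegion(const void *start, size_t size,
                      enum PTR_CLASSIFICATION cls)
{
    if (region_count == MAX_REGIONS)
        return -1;
    regions[region_count].start = (uintptr_t)start;
    regions[region_count].size = size;
    regions[region_count].cls = cls;
    region_count++;
    return 0;
}

/* Best-effort range check; anything not registered is UNKNOWN, never
   silently declared valid or invalid. */
enum PTR_CLASSIFICATION getPtrClassification(void *p)
{
    if (p == NULL)
        return PTRCL_NULL;
    uintptr_t a = (uintptr_t)p;
    for (size_t i = 0; i < region_count; i++)
        if (a >= regions[i].start && a - regions[i].start < regions[i].size)
            return regions[i].cls;
    return PTRCL_UNKNOWN;
}
```

    Removing a region when a block is freed or a function returns is
    left out here; without it the table goes stale.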

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Dec 3, 2003
    #14
  15. jacob navia

    Richard Bos Guest

    (Paul Hsieh) wrote:

    > So rather than making the hardline assertion about whether or not a
    > pointer is valid, why don't you instead try to determine the nature of
    > the pointer as best you can determine it? For example:
    >
    > enum PTR_CLASSIFICATION {
    > PTRCL_NULL = 0, /* NULL */
    > PTRCL_UNKNOWN = 1, /* Unknown classification */
    > PTRCL_ERR = 2, /* Pointers we *know* are wrong */
    > PTRCL_STATIC_DATA = 3, /* In your data or program areas */
    > PTRCL_AUTO = 4, /* A local variable (live stack) */
    > PTRCL_HEAP = 5 /* In the heap */
    > PTRCL_MAXIMUM = 5
    > };
    >
    > enum PTR_CLASSIFICATION getPtrClassification (void *p);


    Question: why do you give a damn about anything but "pointer is null",
    "pointer points to an object", "pointer points to a function" and
    "pointer is invalid"? To a well-written program, it should not matter if
    a pointer points to automatic, static or allocated memory.

    Richard
     
    Richard Bos, Dec 3, 2003
    #15
  16. jacob navia

    Paul Hsieh Guest

    says:
    > (Paul Hsieh) wrote:
    > > So rather than making the hardline assertion about whether or not a
    > > pointer is valid, why don't you instead try to determine the nature of
    > > the pointer as best you can determine it? For example:
    > >
    > > enum PTR_CLASSIFICATION {
    > > PTRCL_NULL = 0, /* NULL */
    > > PTRCL_UNKNOWN = 1, /* Unknown classification */
    > > PTRCL_ERR = 2, /* Pointers we *know* are wrong */
    > > PTRCL_STATIC_DATA = 3, /* In your data or program areas */
    > > PTRCL_AUTO = 4, /* A local variable (live stack) */
    > > PTRCL_HEAP = 5 /* In the heap */
    > > PTRCL_MAXIMUM = 5
    > > };
    > >
    > > enum PTR_CLASSIFICATION getPtrClassification (void *p);

    >
    > Question: why do you give a damn about anything but "pointer is null",
    > "pointer points to an object", "pointer points to a function" and
    > "pointer is invalid"? To a well-written program, it should not matter if
    > a pointer points to automatic, static or allocated memory.


    The most obvious use for this is *DEBUGGING*. In the realm of debugging, the
    more information you recover at the time of error, the better.

    The classic example is being passed a string which you then store into a
    structure. You check back at some other point and the data is corrupted --
    why? Because the string was really just a local char array that is long gone.
    The fact that a pointer is pointing to an object is useless information if the
    reason it's wrong is that it points to a local that's no longer there. This is one of
    many notoriously difficult debugging cases for which you would wish to have
    more information about what your data is, and where it came from.
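
    A sketch of that scenario and the usual way around it, with
    hypothetical names (struct record, record_set_name(), fill_record()).
    The buggy variant is only described in a comment, since running it
    would be undefined behaviour:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct record {
    char *name;     /* owned by the record; released by record_clear() */
};

/* The buggy version would be `r->name = (char *)name;`, which dangles as
   soon as the caller's local array goes away. Copying avoids that.
   Returns 0 on success, -1 if memory could not be allocated. */
int record_set_name(struct record *r, const char *name)
{
    assert(r != NULL && name != NULL);
    char *copy = malloc(strlen(name) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, name);
    free(r->name);      /* free(NULL) is a no-op */
    r->name = copy;
    return 0;
}

void record_clear(struct record *r)
{
    free(r->name);
    r->name = NULL;
}

/* The caller's buffer is automatic storage and is gone after return;
   the record stays usable only because it holds its own copy. */
int fill_record(struct record *r)
{
    char local[16] = "temporary";
    return record_set_name(r, local);
}
```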

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Dec 3, 2003
    #16
  17. jacob navia

    Richard Bos Guest

    (Paul Hsieh) wrote:

    > says:
    > > (Paul Hsieh) wrote:
    > > > enum PTR_CLASSIFICATION {
    > > > PTRCL_NULL = 0, /* NULL */
    > > > PTRCL_UNKNOWN = 1, /* Unknown classification */
    > > > PTRCL_ERR = 2, /* Pointers we *know* are wrong */
    > > > PTRCL_STATIC_DATA = 3, /* In your data or program areas */
    > > > PTRCL_AUTO = 4, /* A local variable (live stack) */
    > > > PTRCL_HEAP = 5 /* In the heap */
    > > > PTRCL_MAXIMUM = 5
    > > > };
    > > >
    > > > enum PTR_CLASSIFICATION getPtrClassification (void *p);

    > >
    > > Question: why do you give a damn about anything but "pointer is null",
    > > "pointer points to an object", "pointer points to a function" and
    > > "pointer is invalid"? To a well-written program, it should not matter if
    > > a pointer points to automatic, static or allocated memory.

    >
    > The most obvious use for this is *DEBUGGING*. In the realm of debugging, the
    > more information you recover at the time of error, the better.


    Of course. But that's what debuggers are for; you shouldn't be doing
    this yourself.

    Richard
     
    Richard Bos, Dec 4, 2003
    #17