Re: To find the size of array using its pointer

Discussion in 'C Programming' started by Keith Thompson, Jan 20, 2005.

  1. "Sontu" <> writes:
    > Consider the following code:
    >
    > int main(void)
    > {
    > char buffer[20];
    > func(buffer);
    > }
    >
    > void func(char *bufpas)
    > {
    > ......
    >
    > }
    >
    > Now can i find size of "buffer" using its pointer "bufpas"? In "main" i
    > can do it using "sizeof(buffer)" but in "func", "sizeof(bufpas)" would
    > give 4 (size of pointer variable) as output (if using gcc on red hat
    > linux).


    You can't. If you want func to know the length of the array, you have
    to pass that information to it.

    > I am thinking of some method to do it. For example modify the compiler
    > to put some distinguishing element at the end of every array and then
    > find the size using the pointer.


    C strings use that method (strings are terminated by a '\0'
    character), but at the expense of not being able to have a '\0'
    character within a string -- and the '\0' marks the end of the string,
    not necessarily the end of the array object. For an array of int, for
    example, there is no value you can use as a distinguishing element
    (unless you're willing to give up on storing that value in the array).
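
    As a rough illustration of that last point (a sketch only; the names
    visible_length and array_size_minus_string_length are made up here and
    do not come from the posts), a sentinel tells you where the string
    stops, not how large the array object is:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts characters up to the first '\0',
   which is all a sentinel can reveal. It simply wraps strlen. */
size_t visible_length(const char *s)
{
    return strlen(s);
}

/* Hypothetical demo: the array object has 20 bytes, but the sentinel
   only reveals the length of the string currently stored in it. */
size_t array_size_minus_string_length(void)
{
    char buffer[20] = "hello";
    return sizeof buffer - visible_length(buffer); /* 20 - 5 */
}
```

    For a 20-byte buffer holding "hello", the sentinel yields 5, so 15
    bytes of the array (the terminator included) are invisible to code
    that only has the pointer.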

    > But there is a problem. It will work in the case of those data types
    > that have 1-byte storage but for others for eg. array of type "int", if
    > no. of bytes is 16, sizeof(<integer array>) should return 4. Thus we
    > need to know the type of array too.
    >
    > If any of you have some other bright idea, please do share with me.


    void func(size_t buflen, char *bufpas)
    {
        ...
    }

    int main(void)
    {
        char buffer[20];
        func(sizeof buffer, buffer);
        ...
    }

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Jan 20, 2005
    #1

  2. Sontu Guest

    Thanks. So nice of you all, for replying, but I think if i tell you my
    intent then my problem will become much clearer.

    In C, parameters to functions are either passed using "call by value"
    or "call by reference" [in fact only "call by value"; "call by
    reference" can be considered as the former if the value is an address].

    Now if I need to implement "call by value-result", precisely on EXISTING
    CODE, what should I do?

    I mean,

    int main(void)
    {
    char buffer[20];
    func(buffer);
    }


    void func(char *bufpas)
    {
    char _buffer[20]; //temporary array
    memcpy(_buffer,bufpas,20);
    .......
    ....

    memcpy(bufpas,_buffer,20);
    }

    should happen. For this to happen I can either use source-to-source
    transformation or modify the compiler. But in any case I need to know
    the size of the array that "bufpas" points to, in order to allocate a
    temporary array of the same size.

    If it's not clear, kindly mention it and I will try to explain more fully.
    Sontu, Jan 20, 2005
    #2

  3. gooch Guest

    Sontu wrote:
    > Thanks. So nice of you all, for replying, but I think if i tell you my
    > intent then my problem will become much clearer.
    >
    > In C, parameters to functions are either passed using "call by value"
    > or "call by reference" [infact only "call by value", "call by
    > reference" can be considered as the former if value is address].


    They are not really the same. You are passing the pointer value to the
    function but that is really not relevant. The reason for the pointer is
    to allow you to access the memory that it points to.
    >
    > Now if need to implement "call by value-result", precisely on EXISTING
    > CODE, what should i do?


    What code is preexisting? What does "I need to implement call by
    value-result" mean?
    >
    > I mean,
    >
    > int main(void)
    > {
    > char buffer[20];
    > func(buffer);
    > }
    >
    >
    > void func(char *bufpas)

    You need to pass the size of the buffer to the function. If you do that
    you can perform the operations directly on the buffer passed and there
    is no need to copy the value to the function locally with memcpy. If
    you feel you need to copy the value locally you need to allocate memory
    for the buffer based on the size value passed and then copy it. If you
    do this don't forget to free the memory when you are done with it.
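
    A minimal sketch of the approach described above, assuming the caller
    can be changed to pass the size. The name func_with_copy and the
    uppercase step standing in for the real work are hypothetical:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: uppercases the first buflen bytes of bufpas by
   working on a heap copy. Returns 0 on success, -1 if allocation fails. */
int func_with_copy(char *bufpas, size_t buflen)
{
    if (buflen == 0)
        return 0;                           /* nothing to do */

    char *tmp = malloc(buflen);             /* sized from the argument */
    if (tmp == NULL)
        return -1;

    memcpy(tmp, bufpas, buflen);            /* copy in */
    for (size_t i = 0; i < buflen; i++)     /* stand-in for the real work */
        tmp[i] = (char)toupper((unsigned char)tmp[i]);
    memcpy(bufpas, tmp, buflen);            /* copy out */

    free(tmp);                              /* release the temporary */
    return 0;
}
```

    A caller with "char buffer[20]" would write
    func_with_copy(buffer, sizeof buffer).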
    > {
    > char _buffer[20]; //temporary array
    > memcpy(_buffer,bufpas,20);
    > ......
    > ...
    >
    > memcpy(bufpas,_buffer,20);
    > }
    >
    > should happen. For this to happen either i can use source-2-source
    > transformation or modify the compiler. But in any case i need to know
    > the size of "bufpas" to allocate a temporary array of same size.
    >
    > If its not clear, kindly mention, i will try to make it comprehensive.
    gooch, Jan 20, 2005
    #3
  4. On Thu, 20 Jan 2005 09:06:25 -0800, Sontu wrote:

    > Thanks. So nice of you all, for replying, but I think if i tell you my
    > intent then my problem will become much clearer.
    >
    > In C, parameters to functions are either passed using "call by value"
    > or "call by reference" [infact only "call by value", "call by
    > reference" can be considered as the former if value is address].
    >
    > Now if need to implement "call by value-result", precisely on EXISTING
    > CODE, what should i do?


    I think you'd really need to convince us that this is a sensible thing to
    do. Presumably the existing code is designed to work without these "call
    by value-result" semantics so why change it? What are you REALLY trying to
    achieve? What you are describing is a means, not an end.

    > I mean,
    >
    > int main(void)
    > {
    > char buffer[20];
    > func(buffer);
    > }
    >
    >
    > void func(char *bufpas)
    > {
    > char _buffer[20]; //temporary array
    > memcpy(_buffer,bufpas,20);


    How does the existing func() code know how big buffer is?

    > ......
    > ...
    >
    > memcpy(bufpas,_buffer,20);
    > }
    >
    > should happen. For this to happen either i can use source-2-source
    > transformation


    Then maybe you can add an extra size argument automatically. It may not be
    easy to figure out the needed size though, e.g. if the caller is passing a
    value from a pointer variable rather than a declared array directly.
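
    To illustrate why an automatically added size argument is fragile,
    here is a hedged sketch. func_sized and CALL_FUNC are hypothetical
    names standing in for whatever a source-to-source tool might emit;
    sizeof gives the array size only when its operand really is an array:

```c
#include <stddef.h>

/* Hypothetical callee that just reports the size it was given. */
size_t func_sized(char *bufpas, size_t buflen)
{
    (void)bufpas;
    return buflen;
}

/* Hypothetical rewrite of func(x) that appends sizeof (x). */
#define CALL_FUNC(x) func_sized((x), sizeof (x))

/* Operand is a declared array: the size is correct. */
size_t call_with_array(void)
{
    char buffer[20];
    return CALL_FUNC(buffer);      /* sizeof (char[20]) == 20 */
}

/* Operand is a pointer variable: the size of the pointer is passed. */
size_t call_with_pointer(void)
{
    char buffer[20];
    char *p = buffer;
    return CALL_FUNC(p);           /* sizeof (char *), not 20 */
}
```

    The second call compiles without complaint and passes the wrong size,
    which is the difficulty described above.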

    > or modify the compiler.


    I doubt whether that will be a viable alternative.

    > But in any case i need to know
    > the size of "bufpas" to allocate a temporary array of same size.
    >
    > If its not clear, kindly mention, i will try to make it comprehensive.


    My reaction to how to do something like this is "don't": reexamine the
    underlying problem and find a better solution. Which brings us back to my
    original question, what are you really trying to do?

    Lawrence
    Lawrence Kirby, Jan 20, 2005
    #4
  5. On 20 Jan 2005 09:06:25 -0800, Sontu
    <> wrote:

    > Thanks. So nice of you all, for replying, but I think if i tell you my
    > intent then my problem will become much clearer.
    >
    > In C, parameters to functions are either passed using "call by value"
    > or "call by reference" [infact only "call by value", "call by
    > reference" can be considered as the former if value is address].
    >
    > Now if need to implement "call by value-result", precisely on EXISTING
    > CODE, what should i do?


    Use Fortran or Ada, they are the only languages I know with "pass by
    value-result" semantics (I'm not certain about Ada for that matter, it's
    been over 20 years since I used it but I remember something like that).

    > I mean,
    >
    > int main(void)
    > {
    > char buffer[20];
    > func(buffer);
    > }


    You mean that you already have that code which calls your function and
    you aren't allowed to change it? Then you're out of luck.

    > void func(char *bufpas)
    > {
    > char _buffer[20]; //temporary array
    > memcpy(_buffer,bufpas,20);
    > ......
    > ...
    >
    > memcpy(bufpas,_buffer,20);
    > }
    >
    > should happen. For this to happen either i can use source-2-source
    > transformation or modify the compiler. But in any case i need to know
    > the size of "bufpas" to allocate a temporary array of same size.


    You do indeed. To make it worse, what if the pointer had been passed
    through another function? It's not possible without passing the size at
    each stage.
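
    One common way to pass the size at each stage is to bundle it with the
    pointer. This is only a sketch; struct sized_buf and the function
    names are invented for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of a pointer with the size of what it points to. */
struct sized_buf {
    char  *data;
    size_t len;
};

/* Innermost stage: fills the whole buffer with c, using the carried size. */
void inner_fill(struct sized_buf b, char c)
{
    memset(b.data, c, b.len);
}

/* Intermediate stage: forwards the size along with the pointer. */
void outer_fill(struct sized_buf b, char c)
{
    inner_fill(b, c);
}

/* Hypothetical demo: fills out with 'x' and returns the bytes filled. */
size_t fill_demo(char *out, size_t out_len)
{
    struct sized_buf b = { out, out_len };
    outer_fill(b, 'x');
    return b.len;
}
```

    The size still has to be captured once, at the point where the array
    is declared and sizeof is meaningful.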

    Sorry, C does not have that functionality. Possibly you can run things
    through a source converter as you suggest, but make very sure that it is
    well tested before using it on production code...

    (If you used C++ you could pass a vector by reference and copy that, but
    you would still have to change everywhere which declared such a buffer
    to be passed around...)

    Chris C
    Chris Croughton, Jan 20, 2005
    #5
  6. Sontu Guest

    Hi all,

    Actually I have designed a solution for the "Buffer Overflow Attack".
    There is a vulnerability in C and C++ (I can't comment on other
    languages because I haven't used them, except that Java has this but
    to a much lesser extent). If we try to write into an array beyond its
    defined size, and if the array is allocated on the stack (it is not
    global or static), then the write will overwrite the contents of the
    stack. We know that on the stack lie the frames corresponding to each
    called function. These frames contain important control structures
    like the RETURN ADDRESS and PREVIOUS FRAME POINTER that are necessary
    for maintaining the control flow. If someone is able to overflow the
    array (preferably of characters) and is able to overwrite these data
    structures selectively, he can change the flow. Most attackers use
    this vulnerability.

    What I have thought to do is that when a function is called, I am going
    to make all the previous frames write-protected, so that no
    operation in the current function can write into the crucial control
    structures and modify them. But this brings a new problem.
    If a pointer to some array that is allocated in the previous function
    is passed to the current function and there is an instruction in the
    current function that tries to write into that array using the passed
    pointer, it will generate an exception although the instruction is
    genuine.

    So what I've thought to do is that I will make copies of those
    variables whose addresses are passed (in the called function), do all
    modifications in them, and before returning from the function, I will
    copy them to the original ones.

    Since I am going to make the copy on the stack, I need not deallocate it
    explicitly because that is done as a part of the "return" sequence.
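
    The copy-in/copy-out scheme described above might look roughly like
    this if the size were passed in. This is a sketch under assumptions:
    func_value_result is a hypothetical name, the byte reversal is a
    stand-in for the real function body, and it relies on variable-length
    arrays, which are standard in C99 but optional from C11 onward:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "value-result" wrapper: the body works on a stack copy and
   the result is copied back just before returning. Needs VLA support. */
void func_value_result(char *bufpas, size_t buflen)
{
    if (buflen == 0)
        return;                      /* a zero-length VLA is not allowed */

    char tmp[buflen];                /* temporary on the stack, gone on return */
    memcpy(tmp, bufpas, buflen);     /* copy in */

    /* stand-in for the function body: reverse the bytes in the copy */
    for (size_t i = 0, j = buflen - 1; i < j; i++, j--) {
        char c = tmp[i];
        tmp[i] = tmp[j];
        tmp[j] = c;
    }

    memcpy(bufpas, tmp, buflen);     /* copy out */
}
```

    Note that the final memcpy still writes into the caller's frame, so a
    write-protection scheme would have to permit that one step, and a
    large buflen can exhaust the stack.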

    I don't want to use the heap because there are already many solutions to
    BOA that make use of the heap.
    "Existing code" means, for example, the code for an ftp daemon.

    Thanks
    Sontu, Jan 22, 2005
    #6
  7. Sontu Guest

    Can I know why C chose to use "call by value" or "call by reference"?
    Before suggesting the change that I propose (which effectively
    will use "call by value-result"), I need to know why its creators did not
    prefer "call by value-result". Was it purely due to memory concerns or
    something else? In fact I read that for RPC, call by value-result is a
    better alternative.

    Thanks
    Sontu, Jan 22, 2005
    #7
  8. On 22 Jan 2005 01:07:54 -0800, Sontu
    <> wrote:

    > Can I know why C chose to use "call by value" or "call by reference"


    C only supports "call by value", sometimes those values are pointers
    which can be dereferenced to access something elsewhere.
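
    A small sketch of what that means in practice (the function names are
    made up for the example): the pointer itself is copied, so reassigning
    it inside the callee has no effect on the caller, while dereferencing
    it reaches the caller's object:

```c
/* Reassigning the parameter changes only the callee's copy of the pointer. */
void reseat(int *p, int *other)
{
    p = other;
    (void)p;        /* the caller's pointer is untouched */
}

/* Dereferencing the copied pointer modifies the caller's object. */
void set_to_zero(int *p)
{
    *p = 0;
}
```

    After "int a = 1, b = 2; int *q = &a; reseat(q, &b);" q still points
    to a; after "set_to_zero(q);" a is 0.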

    > because before suggesting the change that i propose (that effectively
    > will use "call by value result"),


    This is a proposed change to what, exactly? You want to change the C
    standard? Sorry, that is very unlikely to be done (try converting a
    vegetarian to eating beef, it's easier). If you want to write your own
    language and take it through standardisation feel free.

    > i need to know why creators did not prefer "call by value result". Was
    > it purely due to memory concerns or something else?


    C supports call by value only. It's simple, efficient, and easy to
    validate (the original authors possibly had some of those as criteria).
    As I said before, the only language I know for certain that has call by
    value-result is Fortran, and I don't know why they decided on that. All
    more recent languages I've seen (with the possible exception of Ada,
    which is a "kitchen sink" language with bits from all over) have used
    either call by value or call by reference. So perhaps you need to ask
    why no other major recent languages use call by value/result...

    > Infact i read that for RPC, call by value result is a
    > better alternative.


    For RPC it's pretty much the only alternative, because the call is
    remote (that's what the R stands for) so all data must be passed whole
    in both directions. However, C isn't written for RPC; if you want a
    language that is, you need a different language.

    Chris C
    Chris Croughton, Jan 22, 2005
    #8
  9. Chris Torek Guest

    In article <>
    > <> wrote:
    >> Can I know why C chose to use "call by value" ...
    >> i need to know why creators did not prefer "call by value result". Was
    >> it purely due to memory concerns or something else?


    Chris Croughton <> wrote:
    >On 22 Jan 2005 01:07:54 -0800, Sontu
    >C supports call by value only. It's simple, efficient, and easy to
    >validate (the original authors possibly had some of those as criteria).
    >As I said before, the only language I know for certain that has call by
    >value-result is Fortran, and I don't know why they decided on that ...


    Actually, Fortran did not mandate value-result, it merely allowed
    it.

    The difference shows up if you do something that is illegal in
    Fortran (at least pre-F90; I am not particularly familiar with
    anything post-F77). In C, you may write the following program:

    #include <stdio.h>

    int global_var = 42;

    void test_for_value_result(int *p) {
        *p = 0;
        printf("in function: global_var is %d while *p is now 0\n",
               global_var);
    }

    int main(void) {
        printf("before call: global_var is %d\n", global_var);
        test_for_value_result(&global_var);
        printf("after call: global_var is %d\n", global_var);
        return 0;
    }

    Running this on any (correct) C system must produce:

    before call: global_var is 42
    in function: global_var is 0 while *p is now 0
    after call: global_var is 0

    A straightforward transliteration to Fortran (F77, probably with
    some extensions and not exactly idiomatic) produces the following
    illegal-but-does-not-require-a-diagnostic program:

    SUBROUTINE TEST(P)
    INTEGER P

    INTEGER GLOBAL
    COMMON GLOBAL

    P = 0
    PRINT *, 'in function: global var is', GLOBAL, 'while P is now 0'

    RETURN
    END

    PROGRAM MAIN
    INTEGER GLOBAL
    COMMON GLOBAL

    GLOBAL = 42
    PRINT *, 'before call: global var is', GLOBAL
    CALL TEST(GLOBAL)
    PRINT *, 'after call: global var is', GLOBAL

    STOP
    END

    If this program compiles and runs (it typically does), the output
    is typically either:

    before call: global var is 42
    in function: global var is 0 while P is now 0
    after call: global var is 0

    (in which case the compiler used call-by-reference) or:

    before call: global var is 42
    in function: global var is 42 while P is now 0
    after call: global var is 0

    (in which case the compiler used value-result).

    Programs that can distinguish between the two calling methods
    have undefined behavior in F77, which allows a compiler to use
    either one without affecting any (legal) program.
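
    The same distinction can be emulated by hand in C. The sketch below is
    an illustration, not how any Fortran compiler is implemented;
    seen_by_reference and seen_by_value_result are hypothetical names, and
    each returns the value of global_var as observed inside the function:

```c
int global_var = 42;

/* By reference: a write through p is visible via global_var at once. */
int seen_by_reference(int *p)
{
    *p = 0;
    return global_var;
}

/* Value-result, emulated by hand: work on a local copy, copy back on exit. */
int seen_by_value_result(int *p)
{
    int local = *p;          /* copy in */
    local = 0;               /* the "assignment to P" hits only the copy */
    int seen = global_var;   /* the aliased global still has its old value */
    *p = local;              /* copy out */
    return seen;
}
```

    Called with &global_var while global_var is 42, the first returns 0
    and the second returns 42; in both cases global_var is 0 afterwards,
    matching the two outputs listed above.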

    >more recent languages I've seen (with the possible exception of Ada,
    >which is a "kitchen sink" language with bits from all over) have used
    >either call by value or call by reference. So perhaps you need to ask
    >why no other major recent languages use call by value/result...


    Any language that rules out the kind of aliasing C allows, allows
    value-result for "by-reference" arguments, in the same way F77
    does. In other words, if you cannot tell one from the other (because
    a program that tells you which the compiler uses produces undefined
    results and therefore tells you nothing :) ), how can you tell
    which one the compiler actually used? (Or, equivalently: your
    program tells you that you got by-reference today -- but if you
    compile tomorrow, perhaps the compiler will use value-result,
    because it depends on the phase of the moon.)
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Jan 22, 2005
    #9
  10. Bill Reid Guest

    Sontu <> wrote in message
    news:...
    > Hi all,
    >
    > Actually i have designed a solution for "Buffer Overflow Attack". There
    > is a vulnerability in C and C++ ( i can't comment on other languages
    > because i haven't used them except that Java has this but to a much
    > lesser extent). If we try to write into an array beyond its defined size,
    > and if the array is allocated on stack (its not global or static) then
    > it will overwrite the contents of stack. We know that on stack lie the
    > frames corresponding to each called function. These frames contain
    > important control structures like RETURN ADDRESS and PREVIOUS FRAME
    > POINTER that are necessary for maintaining the control flow. If some
    > one is able to overflow the array (preferably of characters) and is
    > able to overwrite these data structures selectively, he can change the
    > flow. Most of the attackers use this vulnerability.
    >
    > What i have thought to do is that when a function is called, i am going
    > to make all the previous frames as write protected, so that no
    > operation in the current function can write into the crucial control
    > structures and modify them. But this brings a new problem.


    I'm not a programmer (at least not professionally), so I may be
    speaking way out of school here, but wouldn't it be easier just
    to not overwrite the array/buffer in the first place?

    Can't this be done by just installing error checks in some places
    where needed, but in general just by making the default termination
    of the buffer-filling loop the size of the array (rather than the much
    simpler and more typical "special character" like EOF)?

    Of course, then you have to add the generally-redundant
    check for EOF (or whatever) inside the loop, but if you're
    as committed to security as Microsoft claims to be...
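
    A sketch of the kind of loop described above, assuming a hypothetical
    helper named read_line_bounded: the array size is the primary
    termination condition, and EOF or newline is checked as well:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: reads characters from in until newline or EOF,
   never storing more than buflen bytes (terminator included).
   Returns the number of characters stored, not counting the '\0'. */
size_t read_line_bounded(FILE *in, char *buf, size_t buflen)
{
    size_t n = 0;
    int c;

    if (buflen == 0)
        return 0;                   /* no room even for the terminator */

    /* The size check comes first, so no character is consumed once full. */
    while (n + 1 < buflen && (c = getc(in)) != EOF && c != '\n')
        buf[n++] = (char)c;

    buf[n] = '\0';
    return n;
}
```

    The standard fgets function takes the buffer size as a parameter for
    the same reason.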

    ---
    William Ernest Reid
    Bill Reid, Jan 22, 2005
    #10