confused: vector<char*> and malloc

Discussion in 'C++' started by Richard, Aug 16, 2005.

  1. Richard

    Richard Guest

    vector<char*> m_Text;
    m_Text.resize(1);
    char* foo = "FOO";
    char* bar = "BAR";
    char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
    if (foobar)
    {
    strcpy(foobar, foo);
    strcat(foobar, bar);
    }
    m_Text[0] = foobar;

    // Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
    // call free(m_Text[0]) in the destructor?
     
    Richard, Aug 16, 2005
    #1

  2. Richard wrote:
    > vector<char*> m_Text;
    > m_Text.resize(1);


    This makes 'm_Text' one element long. Since 'm_Text' was empty before
    that, it adds one pointer to it and makes that pointer null.

    > char* foo = "FOO";
    > char* bar = "BAR";
    > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
    > if (foobar)
    > {
    > strcpy(foobar, foo);
    > strcat(foobar, bar);
    > }


    At this point 'foobar' is a pointer to 7 character array, allocated in
    the free store. The array has 'F', 'O', 'O', 'O', 'B', 'A', 'R', '\0'
    in it, in sequence.

    > m_Text[0] = foobar;


    This makes the value of the only element in the vector 'm_Text' to be the
    same as the pointer to that 7-character array.

    > // Will m_Text[0] get freed when m_Text goes out of scope?


    No.

    > If not, should I
    > call free(m_Text[0]) in the destructor?


    Probably. Assuming you're talking about the destructor of the class in
    which 'm_Text' is a data member.
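
    A minimal sketch of that destructor, assuming the vector is the only
    owner of the malloc'ed buffers. The class name TextHolder and its member
    functions are hypothetical, not from the original code. Copying is
    disabled because a copied vector of raw pointers would free each buffer
    twice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical owner class; only m_Text comes from the original post.
class TextHolder {
public:
    TextHolder() = default;
    // A copy would share the raw pointers and free them twice, so forbid it.
    TextHolder(const TextHolder&) = delete;
    TextHolder& operator=(const TextHolder&) = delete;

    ~TextHolder() {
        // The vector frees its own array of pointers, not what they point to.
        for (char* p : m_Text) std::free(p);  // free(nullptr) is a no-op
    }

    // Stores a malloc'ed concatenation of a and b; false if malloc fails.
    bool addConcat(const char* a, const char* b) {
        assert(a != nullptr && b != nullptr);
        m_Text.push_back(nullptr);  // may throw, but nothing is owned yet
        char* s = static_cast<char*>(
            std::malloc(std::strlen(a) + std::strlen(b) + 1));
        if (s == nullptr) {
            m_Text.pop_back();
            return false;
        }
        std::strcpy(s, a);
        std::strcat(s, b);
        m_Text.back() = s;
        return true;
    }

    const char* at(std::size_t i) const {
        assert(i < m_Text.size());
        return m_Text[i];
    }

    std::size_t size() const { return m_Text.size(); }

private:
    std::vector<char*> m_Text;
};
```

    With this in place, a TextHolder that goes out of scope releases every
    buffer it stored; nothing outside the class should call free on them.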

    V
     
    Victor Bazarov, Aug 16, 2005
    #2

  3. Richard

    Kyle Guest

    Richard wrote:
    > vector<char*> m_Text;
    > m_Text.resize(1);
    > char* foo = "FOO";
    > char* bar = "BAR";
    > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);


    malloc is C; why don't you use new?

    > if (foobar)
    > {
    > strcpy(foobar, foo);
    > strcat(foobar, bar);
    > }
    > m_Text[0] = foobar;
    >
    > // Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
    > call free(m_Text[0]) in the destructor?
    >


    Try this if you don't want to manage memory on your own:

    vector<string> m_Text;

    const char* foo = "FOO";  // string literals are const
    const char* bar = "BAR";

    string foobar(foo);
    foobar.append( bar );

    m_Text.push_back( foobar );
     
    Kyle, Aug 16, 2005
    #3
  4. "Kyle" <> wrote in message
    news:ddtj91$c16$...
    > Richard wrote:
    >> vector<char*> m_Text;
    >> m_Text.resize(1);
    >> char* foo = "FOO";
    >> char* bar = "BAR";
    >> char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);

    >
    > malloc is C, why dont you use new ?


    malloc is C++ too.

    It is used to allocate raw memory in C++. Since new (and new[]) allocates
    space and constructs object(s), new (and new[]) is not suitable when there
    is not enough information to construct the object(s).
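
    A rough illustration of that distinction, as a sketch only. The Label
    struct and both function names are made up for the example. With malloc,
    allocation, construction, destruction and release are four separate
    steps; new and delete pair them up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical type with a member that needs a constructor and destructor.
struct Label {
    std::string text;
    explicit Label(const char* t) : text(t) {}
};

// malloc yields raw bytes; the object must be constructed and destroyed by hand.
std::size_t labelLengthViaMalloc(const char* t) {
    assert(t != nullptr);
    void* raw = std::malloc(sizeof(Label));   // 1. raw memory, no Label yet
    if (raw == nullptr) throw std::bad_alloc();
    Label* label = nullptr;
    try {
        label = new (raw) Label(t);           // 2. placement new constructs
    } catch (...) {
        std::free(raw);                       // constructor failed: release bytes
        throw;
    }
    const std::size_t n = label->text.size();
    label->~Label();                          // 3. explicit destructor call
    std::free(raw);                           // 4. release the raw memory
    return n;
}

// new allocates and constructs; delete destroys and releases.
std::size_t labelLengthViaNew(const char* t) {
    assert(t != nullptr);
    Label* label = new Label(t);
    const std::size_t n = label->text.size();
    delete label;
    return n;
}
```

    For a plain char buffer, as in the original post, steps 2 and 3 are not
    needed, which is why malloc works there.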

    > try this if you dont want to manage memory on your own
    >
    > vector<string> m_Text;


    Great advice!

    Ali
     
    Ali Çehreli, Aug 16, 2005
    #4
  5. Richard

    Richard Guest

    "Kyle" <> wrote in message
    news:ddtj91$c16$...
    > Richard wrote:
    > > vector<char*> m_Text;
    > > m_Text.resize(1);
    > > char* foo = "FOO";
    > > char* bar = "BAR";
    > > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);

    >
    > malloc is C, why dont you use new ?
    >
    > > if (foobar)
    > > {
    > > strcpy(foobar, foo);
    > > strcat(foobar, bar);
    > > }
    > > m_Text[0] = foobar;
    > >
    > > // Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
    > > call free(m_Text[0]) in the destructor?
    > >

    >
    > try this if you dont want to manage memory on your own
    >
    > vector<string> m_Text;
    >
    > char* foo = "FOO";
    > char* bar = "BAR";
    >
    > string foobar(foo);
    > foobar.append( bar );
    >
    > m_Text.push_back( foobar );


    That leads me to another question. Is it common practice to pass std::string
    as a function argument? Is there much overhead?

    void Foo(string& str)

    If I went with vector<string>, would I need to convert all my functions that
    currently use char* as an argument?

    Thanks.
     
    Richard, Aug 16, 2005
    #5
  6. Richard wrote:
    > [...] Is it common practice to pass std::string
    > as a function argument?


    By value, no. By a reference, or by a reference to const, yes.

    > Is there much overhead?


    There can be. Just like with any other UDT.

    > void Foo(string& str)
    >
    > If I went with vector<string>, would I need to convert all my functions that
    > currently use char* as an argument?


    Yes, most likely. BTW, if your functions that currently use 'char*' do
    not change the characters in the arrays, you should declare 'char const*'
    as the argument type. Then you don't have to change much, but you will
    need to use 'c_str' member:

    void my_function(char const*);
    ...
    my_function(v.c_str());
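
    Put together, this could look roughly like the sketch below. The names
    legacyLength and totalLength are invented for illustration. The existing
    function keeps its 'char const*' parameter, while new code takes the
    container by reference to const so that no strings are copied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Existing-style function: reads the characters but never modifies them.
std::size_t legacyLength(char const* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// New-style function: reference to const, so no string is copied.
std::size_t totalLength(const std::vector<std::string>& v) {
    std::size_t total = 0;
    for (const std::string& s : v) {
        total += legacyLength(s.c_str());  // bridge to the char const* API
    }
    return total;
}
```

    Note that the pointer returned by c_str() is only valid until the string
    is next modified or destroyed, so the callee should not store it.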

    V
     
    Victor Bazarov, Aug 16, 2005
    #6
  7. Richard

    Richard Guest

    "Victor Bazarov" <> wrote in message
    news:X%sMe.30145$01.us.to.verio.net...
    > Richard wrote:
    > > [...] Is it common practice to pass std::string
    > > as a function argument?

    >
    > By value, no. By a reference, or by a reference to const, yes.
    >
    > > Is there much overhead?

    >
    > There can be. Just like with any other UDT.
    >
    > > void Foo(string& str)
    > >
    > > If I went with vector<string>, would I need to convert all my functions that
    > > currently use char* as an argument?

    >
    > Yes, most likely. BTW, if your functions that currently use 'char*' do
    > not change the characters in the arrays, you should declare 'char const*'
    > as the argument type. Then you don't have to change much, but you will
    > need to use 'c_str' member:
    >
    > void my_function(char const*);
    > ...
    > my_function(v.c_str());
    >
    > V



    My array of strings is very large. I did a small test using char* vs string
    and the results were really bad. I must be doing something wrong:

    vector<char*> test;
    test.resize(10000000);
    for(int t = 0; t != 10000000; ++t)
    {
    test[t] = "TEST 123";
    }

    That takes up a minimal amount of memory if using char*. But if you change
    it to vector<string> test; it takes up around 1000 times more memory! What
    am I doing wrong?
     
    Richard, Aug 16, 2005
    #7
  8. Richard wrote:
    > My array of strings is very large. I did a small test using char* vs string
    > and the results were really bad. I must be doing something wrong:
    >
    > vector<char*> test;
    > test.resize(10000000);
    > for(int t = 0; t != 10000000; ++t)
    > {
    > test[t] = "TEST 123";
    > }
    >
    > That takes up a minimal amount of memory if using char*.


    Yeah... You got a vector all elements of which are the same. No extra
    memory is allocated, just 10 million pointers. The consumption of memory
    (not considering the overhead for dynamic memory management) is quite easy
    to calculate:

    sizeof(vector<char*>) +
    sizeof(char*) * test.size() + sizeof("TEST 123")

    (which should give about 40000000, give or take a few bytes)

    > But if you change
    > it to vector<string> test; it takes up around 1000 times more memory! What
    > am I doing wrong?


    I am not sure, to be honest with you. Every 'string' maintains its own
    array of char internally. When you resize the 'test' vector to contain
    ten millions of 'string' objects, it first puts a bunch of empty ones
    there, and then when you in the loop assign those values, every string
    in the vector needs to allocate its own small array (and possibly a bit
    larger than asked for), which may lead to severe fragmentation of memory,
    especially considering that a temporary is probably created to accommodate
    your "TEST 123" literal... The final memory cost should be around

    sizeof(vector<string>) +
    sizeof(string) * test.size() +
    sizeof("TEST 123") * test.size()

    It is hard to believe it takes "around 1000 times more memory". The
    string objects themselves are not that big, so you might be looking at
    four, maybe ten, times the memory consumption, but definitely not the
    thousand times. Are you running on a 64-bit machine? 1000 times more
    with a vector of 10 million pointers is beyond what a 32-bit machine can
    handle, that's for sure.
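
    The two formulas above can be evaluated on whatever platform is at hand
    with a sketch like the following. The function names are made up, and
    allocator overhead is deliberately left out, so both figures are lower
    bounds rather than measurements.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Estimate for vector<char*> where every element points at one literal.
std::size_t pointerCaseBytes(std::size_t n) {
    return sizeof(std::vector<char*>) + sizeof(char*) * n + sizeof("TEST 123");
}

// Estimate for vector<string> where every element holds its own copy.
std::size_t stringCaseBytes(std::size_t n) {
    return sizeof(std::vector<std::string>) + sizeof(std::string) * n +
           sizeof("TEST 123") * n;
}

// How many times larger the string estimate is than the pointer estimate.
double estimatedRatio(std::size_t n) {
    assert(n > 0);
    return static_cast<double>(stringCaseBytes(n)) /
           static_cast<double>(pointerCaseBytes(n));
}
```

    Calling estimatedRatio(10000000) gives a single-digit or low double-digit
    factor on common implementations, nowhere near 1000.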

    V
     
    Victor Bazarov, Aug 16, 2005
    #8
  9. Richard

    Default User Guest

    Richard wrote:


    > My array of strings is very large. I did a small test using char* vs
    > string and the results were really bad. I must be doing something
    > wrong:
    >
    > vector<char*> test;
    > test.resize(10000000);
    > for(int t = 0; t != 10000000; ++t)
    > {
    > test[t] = "TEST 123";
    > }
    >
    > That takes up a minimal amount of memory if using char*. But if you
    > change it to vector<string> test; it takes up around 1000 times more
    > memory! What am I doing wrong?



    You stuffed 10 million copies of a pointer to the same piece of memory
    into the vector. In the real application, you would presumably have a
    different string in each slot in the vector.

    To do a fairer comparison:

    vector<char*> test;
    test.resize(10000000);
    const char* tmp = "TEST 123";  // string literals are const

    for(int t = 0; t != 10000000; ++t)
    {
    test[t] = new char[9];
    strcpy(test[t], tmp);
    }




    Brian
     
    Default User, Aug 16, 2005
    #9
  10. Victor Bazarov wrote:
    > Richard wrote:


    > > vector<char*> m_Text;
    > > m_Text.resize(1);


    <snip>

    > > char* foo = "FOO";
    > > char* bar = "BAR";
    > > char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
    > > if (foobar)
    > > {
    > > strcpy(foobar, foo);
    > > strcat(foobar, bar);
    > > }

    >
    > At this point 'foobar' is a pointer to 7 character array, allocated in
    > the free store. The array has 'F', 'O', 'O', 'O', 'B', 'A', 'R', '\0'
    > in it, in sequence.


    'F', 'O', 'O', 'B', 'A', 'R', '\0'

    <snip>
     
    Nick Keighley, Aug 17, 2005
    #10
  11. > It is hard to believe it takes "around 1000 times more memory".

    The pointer case:

    sizeof(vector<char*>) + sizeof(char*) * test.size() + sizeof("TEST 123")
    = 16 + 4 * 10,000,000 + 9

    The string case: each string, since it's given a const char*, will
    make a copy, I assume (sorry, I don't use std::string but that would be
    reasonable behaviour: share string only if it can reference count the
    memory)

    So we have:
    sizeof(vector<string>) + sizeof(string) * test.size()
    + sizeof("TEST 123") * test.size() + heap overhead * test.size()

    Sorry, don't know size of string, but perhaps 12 bytes (ref count,
    length, alloc length)
    On PC, request for 8 bytes requires 40 bytes including heap overhead.
    I don't recall the overhead on linux, but it's less. Perhaps this uses
    24 bytes.

    So our guess of memory usage is:
    16 + (12 + 40) * 10,000,000 on PC, or 13 times the amount of memory
    needed by your single pointer case.

    If you are seeing something significantly different then something is odd.
    Perhaps 1,000,000 vs. 10,000,000: easy typo.
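
    The arithmetic can be checked mechanically. The constants below are the
    guesses stated in this post (16-byte vector, 4-byte pointer, 12-byte
    string, 40 bytes per small heap request), not measured values, and the
    identifier names are invented for the sketch.

```cpp
#include <cassert>

// All sizes are the guesses from the post, in bytes.
constexpr long long kCount = 10000000;
constexpr long long kPointerCaseBytes = 16 + 4 * kCount + 9;
constexpr long long kStringCaseBytes = 16 + (12 + 40) * kCount;

static_assert(kPointerCaseBytes == 40000025LL, "pointer case total");
static_assert(kStringCaseBytes == 520000016LL, "string case total");

// Ratio of two byte counts as a floating-point factor.
double ratio(long long numerator, long long denominator) {
    assert(denominator > 0);
    return static_cast<double>(numerator) / static_cast<double>(denominator);
}
```

    ratio(kStringCaseBytes, kPointerCaseBytes) comes out just under 13,
    matching the figure quoted above.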

    Stuart
     
    Stuart MacMartin, Aug 20, 2005
    #11
  12. >> malloc is C, why dont you use new ?
    >malloc is C++ too.


    malloc is there for compatibility. It doesn't handle objects so is less
    general.

    It requires a corresponding free() whereas new requires corresponding
    delete or delete []. Since you need to use new elsewhere in your code,
    why risk using the wrong free/delete call? You just confuse the issue
    by intermixing malloc and new and this can cause a bug that takes
    months to find (experience talking). What if you change a structure to
    something requiring a destructor (e.g. you change a const char* member
    variable to a string). Do you want to go through all your code to
    change malloc/free to new/delete assuming you even notice the problem?

    malloc is dangerous.

    Stuart
     
    Stuart MacMartin, Aug 20, 2005
    #12
  13. Richard

    Old Wolf Guest

    Richard wrote:
    >
    > My array of strings is very large. I did a small test using char* vs string
    > and the results were really bad. I must be doing something wrong:
    >
    > vector<char*> test;
    > test.resize(10000000);
    > for(int t = 0; t != 10000000; ++t)
    > {
    > test[t] = "TEST 123";
    > }
    >
    > That takes up a minimal amount of memory if using char*. But if you change
    > it to vector<string> test; it takes up around 1000 times more memory! What
    > am I doing wrong?


    You are comparing apples with oranges. This test program maintains
    one string and keeps 10 million pointers to it. If you modify one
    string then they will all change. But with the string example, there
    are 10 million fat pointers and 10 million strings.
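
    A small sketch of the difference, using three elements instead of ten
    million. The function names are invented, and a writable char array
    stands in for the literal, because writing to a string literal is
    undefined behaviour.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True if a write through slot 0 is visible through the last slot.
bool pointersShareOneBuffer() {
    char shared[] = "TEST 123";          // one writable buffer, not a literal
    std::vector<char*> ptrs(3, shared);  // three pointers, one string
    assert(ptrs.size() == 3);
    ptrs[0][0] = 'B';
    return ptrs[2][0] == 'B';
}

// True if a write to slot 0 leaves the last slot untouched.
bool stringsAreIndependent() {
    std::vector<std::string> strs(3, std::string("TEST 123"));  // three copies
    assert(strs.size() == 3);
    strs[0][0] = 'B';
    return strs[2][0] == 'T';
}
```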

    I doubt it takes 1000 times more memory, unless you have 40 gigabytes
    of RAM, as Victor pointed out.

    Change your test program to:

    vector<char*> test;
    test.resize(10000000);
    for(int t = 0; t != 10000000; ++t)
    {
    test[t] = (char *)malloc(9);
    std::strcpy( test[t], "TEST 123" );
    }

    and then compare the memory usage. (You will probably find
    this example slightly smaller than the string example, but
    not by a lot).

    Finally, what compiler do you use? Many standard libraries
    use SSO (Small String Optimisation), meaning that if the
    string data is short enough to fit in a small buffer inside
    the string object, it is stored there directly, without
    having to allocate more memory. In this case, the string
    example might even use less memory than the malloc example.
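
    One way to probe for SSO is to check whether the character data lives
    inside the string object itself. This is an implementation-specific
    trick, not something the standard promises, and the function name
    storedInline is made up for the sketch.

```cpp
#include <cassert>
#include <functional>
#include <string>

// True if s keeps its characters inside its own object (SSO in effect).
// Heuristic only: the standard does not require or expose this.
bool storedInline(const std::string& s) {
    const char* objBegin = reinterpret_cast<const char*>(&s);
    const char* objEnd = objBegin + sizeof(std::string);
    const char* data = s.data();
    assert(data != nullptr);
    // std::less gives a total order even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, objBegin) && before(data, objEnd);
}
```

    Calling storedInline(std::string("TEST 123")) on a given compiler shows
    whether an 8-character string avoids a heap allocation there. A string
    longer than sizeof(std::string) can never be stored inline.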
     
    Old Wolf, Aug 21, 2005
    #13
  14. "Stuart MacMartin" <> wrote in message
    news:...
    >>> malloc is C, why dont you use new ?

    >>malloc is C++ too.

    >
    > malloc is there for compatibility. It doesn't handle objects so is less
    > general.


    That's why I said that it's used to allocate raw memory. A quote from my
    e-mail:

    <quote>
    It is used to allocate raw memory in C++.
    </quote>

    > It requires a corresponding free() whereas new requires corresponding
    > delete or delete []. Since you need to use new elsewhere in your code,
    > why risk using the wrong free/delete call?


    Because, like any good C++ code, my code doesn't contain a single delete or
    delete[]. Dynamic objects are handled by smart pointers (the RAII idiom).
    Code like mine is immune to such problems.

    Having said that, I don't use malloc either.
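
    For the buffer in the original question, the RAII approach might look
    like the sketch below. It assumes std::unique_ptr, which arrived in C++11
    and so postdates this thread. FreeDeleter, CString and concat are
    invented names.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <vector>

// Deleter that pairs malloc with free, so the two can never be mismatched.
struct FreeDeleter {
    void operator()(char* p) const noexcept { std::free(p); }
};

// Owning handle for a malloc'ed C string (hypothetical alias).
using CString = std::unique_ptr<char, FreeDeleter>;

// Returns a + b in a malloc'ed buffer; the result is null if malloc fails.
CString concat(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    CString s(static_cast<char*>(
        std::malloc(std::strlen(a) + std::strlen(b) + 1)));
    if (s) {
        std::strcpy(s.get(), a);
        std::strcat(s.get(), b);
    }
    return s;
}
```

    A std::vector<CString> then frees every buffer when it goes out of
    scope, with no destructor to write and no free call to forget.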

    > You just confuse the issue
    > by intermixing malloc and new and this can cause a bug that takes
    > months to find (experience talking). What if you change a structure to
    > something requiring a destructor (e.g. you change a const char* member
    > variable to a string).


    No, malloc is for raw buffers only. structs are not necessarily raw buffers.
    For example, I don't keep POD structs around either. Probably all of them
    have constructors.

    > malloc is dangerous.


    Sorry, I never heard that one before and I don't agree with the statement.
    (A search for "malloc is dangerous" on Google finds only your statement.) I
    agree that we can do bad things with malloc, but malloc would not be
    dangerous alone.

    Ali
     
    Ali Çehreli, Sep 1, 2005
    #14