confused: vector<char*> and malloc

Richard

vector<char*> m_Text;
m_Text.resize(1);
char* foo = "FOO";
char* bar = "BAR";
char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
if (foobar)
{
strcpy(foobar, foo);
strcat(foobar, bar);
}
m_Text[0] = foobar;

// Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
// call free(m_Text[0]) in the destructor?
 
Victor Bazarov

Richard said:
vector<char*> m_Text;
m_Text.resize(1);

This makes 'm_Text' one element long. Since 'm_Text' was empty before
the call, it adds one pointer and initializes it to null.
char* foo = "FOO";
char* bar = "BAR";
char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);
if (foobar)
{
strcpy(foobar, foo);
strcat(foobar, bar);
}

At this point 'foobar' is a pointer to 7 character array, allocated in
the free store. The array has 'F', 'O', 'O', 'O', 'B', 'A', 'R', '\0'
in it, in sequence.
m_Text[0] = foobar;

This sets the only element of the vector 'm_Text' to the same pointer
value, so it now points at that 7-character array.
// Will m_Text[0] get freed when m_Text goes out of scope?
No.

> If not, should I
call free(m_Text[0]) in the destructor?

Probably. Assuming you're talking about the destructor of the class in
which 'm_Text' is a data member.
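
A minimal sketch of what that destructor could look like, assuming C++11 and a hypothetical owner class named TextStore (the class name and its member functions are illustrative, not from the original code). Every pointer stored in the vector is assumed to come from malloc:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical owner class: every pointer in m_Text must come from malloc.
class TextStore
{
public:
    TextStore() {}

    // The vector's destructor only destroys the pointers themselves,
    // so each malloc'ed buffer has to be released by hand.
    ~TextStore()
    {
        for (std::size_t i = 0; i != m_Text.size(); ++i)
            std::free(m_Text[i]);   // free() on a null pointer is a no-op
    }

    // Copying would leave two objects freeing the same buffers.
    TextStore(const TextStore&) = delete;
    TextStore& operator=(const TextStore&) = delete;

    // Stores a malloc'ed concatenation of a and b.
    // Returns false if the allocation fails.
    bool addConcat(const char* a, const char* b)
    {
        assert(a && b);
        m_Text.push_back(nullptr);  // may throw, but nothing is owned yet
        char* joined = static_cast<char*>(
            std::malloc(std::strlen(a) + std::strlen(b) + 1));
        if (!joined)
        {
            m_Text.pop_back();
            return false;
        }
        std::strcpy(joined, a);
        std::strcat(joined, b);
        m_Text.back() = joined;
        return true;
    }

    const char* at(std::size_t i) const { return m_Text[i]; }
    std::size_t size() const { return m_Text.size(); }

private:
    std::vector<char*> m_Text;
};
```

Disabling copies is a design choice in this sketch: without it, a copied TextStore would free the same buffers twice.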

V
 
Kyle

Richard said:
vector<char*> m_Text;
m_Text.resize(1);
char* foo = "FOO";
char* bar = "BAR";
char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);

malloc is C; why don't you use new?
if (foobar)
{
strcpy(foobar, foo);
strcat(foobar, bar);
}
m_Text[0] = foobar;

// Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
call free(m_Text[0]) in the destructor?

Try this if you don't want to manage memory on your own:

vector<string> m_Text;

const char* foo = "FOO";
const char* bar = "BAR";

string foobar(foo);
foobar.append( bar );

m_Text.push_back( foobar );
 
Ali Çehreli

Kyle said:
malloc is C; why don't you use new?

malloc is C++ too.

It is used to allocate raw memory in C++. Since new (and new[]) allocates
space and constructs object(s), new (and new[]) is not suitable when there
is not enough information to construct the object(s).
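
As a rough illustration of that split between allocation and construction, here is a sketch that reserves raw storage with malloc first and constructs the object later with placement new. The Label type and the three helper functions are hypothetical names made up for this example:

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical type with a constructor, so it is more than a raw buffer.
struct Label
{
    explicit Label(const char* t) : text(t) {}
    std::string text;
};

// Reserves raw, uninitialized storage for one Label. No constructor runs.
// Returns a null pointer if the allocation fails.
void* reserveRawSlot()
{
    return std::malloc(sizeof(Label));
}

// Constructs a Label in previously reserved storage (placement new),
// once the information needed to build it is available.
Label* constructInSlot(void* slot, const char* text)
{
    assert(slot && text);
    return new (slot) Label(text);
}

// Runs the destructor explicitly, then hands the raw storage back to free().
void destroySlot(Label* label)
{
    if (!label)
        return;
    label->~Label();
    std::free(label);
}
```

The explicit destructor call is needed because free() knows nothing about objects; it only releases the bytes.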
Try this if you don't want to manage memory on your own:

vector<string> m_Text;

Great advice!

Ali
 
Richard

Kyle said:
Richard said:
vector<char*> m_Text;
m_Text.resize(1);
char* foo = "FOO";
char* bar = "BAR";
char* foobar = (char*)malloc(strlen(foo) + strlen(bar) + 1);

malloc is C; why don't you use new?
if (foobar)
{
strcpy(foobar, foo);
strcat(foobar, bar);
}
m_Text[0] = foobar;

// Will m_Text[0] get freed when m_Text goes out of scope? If not, should I
call free(m_Text[0]) in the destructor?

Try this if you don't want to manage memory on your own:

vector<string> m_Text;

char* foo = "FOO";
char* bar = "BAR";

string foobar(foo);
foobar.append( bar );

m_Text.push_back( foobar );

That leads me to another question. Is it common practice to pass std::string
as a function argument? Is there much overhead?

void Foo(string& str)

If I went with vector<string>, would I need to convert all my functions that
currently use char* as an argument?

Thanks.
 
Victor Bazarov

Richard said:
[...] Is it common practice to pass std::string
as a function argument?

By value, no. By a reference, or by a reference to const, yes.
> Is there much overhead?

There can be. Just like with any other UDT.
void Foo(string& str)

If I went with vector<string>, would I need to convert all my functions that
currently use char* as an argument?

Yes, most likely. BTW, if your functions that currently use 'char*' do
not change the characters in the arrays, you should declare 'char const*'
as the argument type. Then you don't have to change much, but you will
need to use 'c_str' member:

void my_function(char const*);
...
my_function(v.c_str());
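
A small sketch of how the two styles can coexist, assuming a read-only legacy function. The names legacyLength and totalLength are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for an existing function that only reads its argument,
// so it takes 'char const*' rather than 'char*'.
std::size_t legacyLength(char const* text)
{
    assert(text);
    return std::strlen(text);
}

// New code takes the strings by reference to const: nothing is copied.
std::size_t totalLength(const std::vector<std::string>& lines)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i != lines.size(); ++i)
    {
        // c_str() yields a null-terminated pointer that stays valid
        // until lines[i] is modified or destroyed.
        total += legacyLength(lines[i].c_str());
    }
    return total;
}
```

Only the call sites change; the legacy function itself keeps its signature.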

V
 
Richard

Victor Bazarov said:
Richard said:
[...] Is it common practice to pass std::string
as a function argument?

By value, no. By a reference, or by a reference to const, yes.
Is there much overhead?

There can be. Just like with any other UDT.
void Foo(string& str)

If I went with vector<string>, would I need to convert all my functions that
currently use char* as an argument?

Yes, most likely. BTW, if your functions that currently use 'char*' do
not change the characters in the arrays, you should declare 'char const*'
as the argument type. Then you don't have to change much, but you will
need to use 'c_str' member:

void my_function(char const*);
...
my_function(v.c_str());

V



My array of strings is very large. I did a small test using char* vs string
and the results were really bad. I must be doing something wrong:

vector<char*> test;
test.resize(10000000);
for(int t = 0; t != 10000000; ++t)
{
test[t] = "TEST 123";
}

That takes up a minimal amount of memory if using char*. But if you change
it to vector<string> test; it takes up around 1000 times more memory! What
am I doing wrong?
 
Victor Bazarov

Richard said:
My array of strings is very large. I did a small test using char* vs string
and the results were really bad. I must be doing something wrong:

vector<char*> test;
test.resize(10000000);
for(int t = 0; t != 10000000; ++t)
{
test[t] = "TEST 123";
}

That takes up a minimal amount of memory if using char*.

Yeah... You got a vector all elements of which are the same. No extra
memory is allocated, just 10 million pointers. The consumption of memory
(not considering the overhead for dynamic memory management) is quite easy
to calculate:

sizeof(vector<char*>) +
sizeof(char*) * test.size() + sizeof("TEST 123")

(which should give about 40,000,000 bytes with 4-byte pointers, give or
take a few bytes)
> But if you change
it to vector<string> test; it takes up around 1000 times more memory! What
am I doing wrong?

I am not sure, to be honest with you. Every 'string' maintains its own
array of char internally. When you resize the 'test' vector to contain
ten million 'string' objects, it first fills it with empty ones. Then,
when the loop assigns those values, every string in the vector needs to
allocate its own small array (possibly a bit larger than asked for),
which may lead to severe fragmentation of memory, especially considering
that a temporary is probably created to accommodate your "TEST 123"
literal... The final memory cost should be around

sizeof(vector<string>) +
sizeof(string) * test.size() +
sizeof("TEST 123") * test.size()

It is hard to believe it takes "around 1000 times more memory". The
string objects themselves are not that big, so you might be looking at
four, maybe ten, times the memory consumption, but definitely not a
thousand times. Are you running on a 64-bit machine? 1000 times more
than a vector of 10 million pointers is beyond what a 32-bit machine can
handle, that's for sure.
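
The two estimates above can be written out as code so the numbers come from your own compiler rather than from guesses. This is only a sketch: the function names are made up, and both formulas ignore allocator overhead and any small-string optimisation, so real usage will differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Estimate for vector<char*> when every element points at one shared literal:
// the vector object, 'count' pointers, and a single copy of the literal.
std::size_t estimateSharedPointers(std::size_t count)
{
    return sizeof(std::vector<char*>)
         + sizeof(char*) * count
         + sizeof("TEST 123");
}

// Estimate for vector<string> when every element owns its own copy:
// the vector object, 'count' string objects, and 'count' character arrays.
std::size_t estimateOwnedStrings(std::size_t count)
{
    return sizeof(std::vector<std::string>)
         + sizeof(std::string) * count
         + sizeof("TEST 123") * count;
}
```

Dividing estimateOwnedStrings(10000000) by estimateSharedPointers(10000000) gives the expected ratio for a given platform, which is useful as a sanity check against a measured figure.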

V
 
Default User

Richard wrote:

My array of strings is very large. I did a small test using char* vs
string and the results were really bad. I must be doing something
wrong:

vector<char*> test;
test.resize(10000000);
for(int t = 0; t != 10000000; ++t)
{
test[t] = "TEST 123";
}

That takes up a minimal amount of memory if using char*. But if you
change it to vector<string> test; it takes up around 1000 times more
memory! What am I doing wrong?


You stuff 10 million copies of a pointer to the same piece of memory
into the vector. In the real application, you would presumably have a
different string in each slot in the vector.

To do a fairer comparison:

vector<char*> test;
test.resize(10000000);
const char* tmp = "TEST 123";

for(int t = 0; t != 10000000; ++t)
{
test[t] = new char[9];
strcpy(test[t], tmp);
}




Brian
 
Nick Keighley

Victor said:
Richard wrote:


At this point 'foobar' is a pointer to 7 character array, allocated in
the free store. The array has 'F', 'O', 'O', 'O', 'B', 'A', 'R', '\0'
in it, in sequence.

'F', 'O', 'O', 'B', 'A', 'R', '\0'

<snip>
 
Stuart MacMartin

It is hard to believe it takes "around 1000 times more memory".

The pointer case:

sizeof(vector<char*>) + sizeof(char*) * test.size() + sizeof("TEST 123")
= 16 + 4 * 10,000,000 + 9

The string case: each string, since it's given a const char*, will
make a copy, I assume (sorry, I don't use std::string, but that would be
reasonable behaviour: share the string only if it can reference-count
the memory).

So we have:
sizeof(vector<string>) + sizeof(string) * test.size()
+ sizeof("TEST 123") * test.size() + heap overhead * test.size()

Sorry, don't know size of string, but perhaps 12 bytes (ref count,
length, alloc length)
On PC, request for 8 bytes requires 40 bytes including heap overhead.
I don't recall the overhead on linux, but it's less. Perhaps this uses
24 bytes.

So our guess of memory usage is:
16 + (12 + 40) * 10,000,000 on PC, or 13 times the amount of memory
needed by your single pointer case.

If you are seeing something significantly different then something is
odd. Perhaps 1,000,000 vs. 10,000,000: easy typo.

Stuart
 
Stuart MacMartin

malloc is C; why don't you use new?
malloc is C++ too.

malloc is there for compatibility. It doesn't handle objects so is less
general.

It requires a corresponding free() whereas new requires corresponding
delete or delete []. Since you need to use new elsewhere in your code,
why risk using the wrong free/delete call? You just confuse the issue
by intermixing malloc and new and this can cause a bug that takes
months to find (experience talking). What if you change a structure to
something requiring a destructor (e.g. you change a const char* member
variable to a string). Do you want to go through all your code to
change malloc/free to new/delete assuming you even notice the problem?

malloc is dangerous.

Stuart
 
Old Wolf

Richard said:
My array of strings is very large. I did a small test using char* vs string
and the results were really bad. I must be doing something wrong:

vector<char*> test;
test.resize(10000000);
for(int t = 0; t != 10000000; ++t)
{
test[t] = "TEST 123";
}

That takes up a minimal amount of memory if using char*. But if you change
it to vector<string> test; it takes up around 1000 times more memory! What
am I doing wrong?

You are comparing apples with oranges. This test program maintains
one string and keeps 10 million pointers to it. If you modify one
string then they will all change. But with the string example, there
are 10 million fat pointers and 10 million strings.

I doubt it takes 1000 times more memory, unless you have 40 gigabytes
of RAM, as Victor pointed out.

Change your test program to:

vector<char*> test;
test.resize(10000000);
for(int t = 0; t != 10000000; ++t)
{
test[t] = (char *)malloc(9);
std::strcpy( test[t], "TEST 123" );
}

and then compare the memory usage. (You will probably find
this example slightly smaller than the string example, but
not by a lot).

Finally, what compiler do you use? Many standard libraries
use SSO (Small String Optimisation), meaning that if the
string data is smaller than sizeof(string), then it actually
stores the entire string internally, without having to
allocate more memory. In this case, the string example
might even use less memory than the malloc example.
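
One rough way to see whether a given library keeps a short string inside the string object is to check where data() points. This is a sketch, not a portable guarantee: the function name storedInline is made up, and the result for "TEST 123" depends entirely on the standard library in use.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Returns true if the character buffer lies inside the string object itself,
// which is what a small-string optimisation would produce.
bool storedInline(const std::string& s)
{
    const char* objBegin = reinterpret_cast<const char*>(&s);
    const char* objEnd = objBegin + sizeof(std::string);
    const char* data = s.data();

    // std::less provides a total order even for pointers into
    // unrelated objects, where plain '<' is unspecified.
    std::less<const char*> before;
    return !before(data, objBegin) && before(data, objEnd);
}
```

Calling storedInline on a string holding "TEST 123" shows what your library does for the test in question. A string far longer than sizeof(std::string) cannot fit inside the object, so it should always report false.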
 
Ali Çehreli

Stuart MacMartin said:
malloc is there for compatibility. It doesn't handle objects so is less
general.

That's why I said that it's used to allocate raw memory. A quote from my
e-mail:

<quote>
It is used to allocate raw memory in C++.
</quote>
It requires a corresponding free() whereas new requires corresponding
delete or delete []. Since you need to use new elsewhere in your code,
why risk using the wrong free/delete call?

Because, like any good C++ code, my code doesn't contain a single delete or
delete[]. Dynamic objects are handled by smart pointers (the RAII idiom).
Code like mine is immune to such problems.
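
To illustrate the RAII approach applied to the original question, here is a sketch that assumes C++11 and uses std::unique_ptr with a deleter that calls free. The names FreeDeleter, CString, concat and appendConcat are hypothetical:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Deleter that pairs malloc with free, so the wrong release call cannot be used.
struct FreeDeleter
{
    void operator()(char* p) const { std::free(p); }
};

// Hypothetical alias: an owning pointer to a malloc'ed C string.
typedef std::unique_ptr<char, FreeDeleter> CString;

// Returns a malloc'ed concatenation of a and b, or an empty pointer
// if the allocation fails.
CString concat(const char* a, const char* b)
{
    assert(a && b);
    CString out(static_cast<char*>(
        std::malloc(std::strlen(a) + std::strlen(b) + 1)));
    if (out)
    {
        std::strcpy(out.get(), a);
        std::strcat(out.get(), b);
    }
    return out;
}

// Appends the concatenation to 'text'. Returns false on allocation failure.
bool appendConcat(std::vector<CString>& text, const char* a, const char* b)
{
    CString joined = concat(a, b);
    if (!joined)
        return false;
    text.push_back(std::move(joined));  // ownership moves into the vector
    return true;
}
```

With this arrangement, each buffer is freed when its element is destroyed, so the vector going out of scope releases everything without a hand-written destructor.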

Having said that, I don't use malloc either.
You just confuse the issue
by intermixing malloc and new and this can cause a bug that takes
months to find (experience talking). What if you change a structure to
something requiring a destructor (e.g. you change a const char* member
variable to a string).

No, malloc is for raw buffers only. structs are not necessarily raw buffers.
For example, I don't keep POD structs around either. Probably all of them
have constructors.
malloc is dangerous.

Sorry, I never heard that one before and I don't agree with the statement.
(A search for "malloc is dangerous" on Google finds only your statement.) I
agree that we can do bad things with malloc, but malloc would not be
dangerous alone.

Ali
 
