How to zero-initialize a C string (array of wchar_t)?

  • Thread starter Niels Dekker - no reply address

Niels Dekker - no reply address

Are all the following initializations semantically equivalent?

wchar_t a[8] = {L'\0'};
wchar_t b[8] = {'\0'};
wchar_t c[8] = {0};
wchar_t d[8] = {};

If so, why don't we all use an empty initializer list, {}, when
zero-initializing a C-style string? I was wondering, because the book
C++ Coding Standards (Sutter & Alexandrescu) says at item 19, "Always
initialize variables":

char path[MAX_PATH] = { '\0' };


Niels Dekker
http://www.xs4all.nl/~nd/dekkerware
 

Eric Boutin

Are all the following initializations semantically equivalent?

wchar_t a[8] = {L'\0'};
wchar_t b[8] = {'\0'};
wchar_t c[8] = {0};
wchar_t d[8] = {};


If so, why don't we all use an empty initializer list, {}, when
zero-initializing a C-style string? I was wondering, because the book
C++ Coding Standards (Sutter & Alexandrescu) says at item 19, "Always
initialize variables":

char path[MAX_PATH] = { '\0' };

You need to fill the array with '\0'. Say the array has dimension 10,
the last char is '\0', and you set the first 5 characters to "hello".
You'll get:

0 1 2 3 4 5 6 7 8 9
h e l l o t d r a '\0'

because characters 5 through 8 were never set and still hold whatever was
in memory. Maybe your compiler will clear the memory for you, but I'm not
sure that's standard behavior, and it's not a good habit to rely on.


However, if you do
memset(path, '\0', 10);
//set path value here
you'll get

hello'\0'
whatever the length of path is,
which is the wanted result.



Or even better!
Give the STL a try:

#include <string>
std::string path = "hello";


Besides, std::string is a lot more secure than char[] because its size
will automatically increase as you append data.

A small performance hit, but a more secure program...

Besides,
cin >> path; // where path is std::string

is almost impossible to overflow. However,
cin >> path; // where path is char[]

is really easy to overflow: if you allocated 10 chars and the user inputs
40 characters... your program may not recover




Hope it helped


========
Eric Boutin
(e-mail address removed)
 

Phlip

Eric said:
Niels Dekker wrote :
Are all the following initializations semantically equivalent?

wchar_t a[8] = {L'\0'};
wchar_t b[8] = {'\0'};
wchar_t c[8] = {0};
wchar_t d[8] = {};


If so, why don't we all use an empty initializer list, {}, when
zero-initializing a C-style string? I was wondering, because the book
C++ Coding Standards (Sutter & Alexandrescu) says at item 19, "Always
initialize variables":

char path[MAX_PATH] = { '\0' };

You need to fill the array with '\0'. Say the array has dimension 10,
the last char is '\0', and you set the first 5 characters to "hello".
You'll get:

0 1 2 3 4 5 6 7 8 9
h e l l o t d r a '\0'

because characters 5 through 8 were never set and still hold whatever was
in memory. Maybe your compiler will clear the memory for you, but I'm not
sure that's standard behavior, and it's not a good habit to rely on.


However, if you do
memset(path, '\0', 10);

The expression

char yo[3][5][7] = { 0 };

zero-fills everything, relatively optimally. No character contains the
memory's previous contents.

Avoid memset() in C++ as a rule of thumb. There are various reasons why.

I don't know about that, but typing the 0 won't kill you.

Or even better!
Give the STL a try:

#include <string>
std::string path = "hello";

Yep. Learn to use C++ as a high-level language first, before worrying about
bits.
 

Rob Williscroft

Niels Dekker - no reply address wrote in @this.is.invalid in comp.lang.c++:
Are all the following initializations semantically equivalent?

wchar_t a[8] = {L'\0'};
wchar_t b[8] = {'\0'};
wchar_t c[8] = {0};
wchar_t d[8] = {};

If so, why don't we all use an empty initializer list, {}, when
zero-initializing a C-style string? I was wondering, because the book
C++ Coding Standards (Sutter & Alexandrescu) says at item 19, "Always
initialize variables":

char path[MAX_PATH] = { '\0' };

Because not all of us know that {} works. I do, and I use it.

It may be that {} doesn't or didn't work with some C and pre-standard
C++ compilers, but AIUI { 0 } has always worked, which is probably
why the advice is repeated.

Rob.
 

Niels Dekker - no reply address

Phlip said:

I don't know about that, but typing the 0 won't kill you.

I'd rather not use an integer (0) when initializing a character. On the
other hand, using L'\0' is quite verbose. Actually I used to do:

wchar_t e[8] = L"";

But now I realize that this L"" literal might unnecessarily take up some
space (however small) while running my program.

An empty initializer list, {}, seems preferable, but apparently most C++
programmers aren't familiar with this notation. Except for Rob
Williscroft, luckily :)


Eric said:
give a try to the STL

#include <string>
std::string path = "hello";

In some cases, especially when you're depending on C-style libraries
(e.g., Windows API functions), using C-style strings might be more
convenient.

Learn to use C++ as a high-level language first, before
worrying about bits.

Okay, but I like the bits too :)


Regards,

Niels Dekker
http://www.xs4all.nl/~nd/dekkerware
 

Andrey Tarasevich

Niels said:
Are all the following initializations semantically equivalent?

wchar_t a[8] = {L'\0'};
wchar_t b[8] = {'\0'};
wchar_t c[8] = {0};
wchar_t d[8] = {};
Yes.

If so, why don't we all use an empty initializer list, {}, when
zero-initializing a C-style string?

Most likely it is inherited from C. The C language doesn't allow the
'{}' initializer. Sometimes C compatibility (of header files, for
example) might be important. Also, not all compilers (and not all
programmers) "know" that the '{}' initializer is allowed in C++, and they
continue to enforce the C specification for aggregate initializers.

I was wondering, because the book
C++ Coding Standards (Sutter & Alexandrescu) says at item 19, "Always
initialize variables":

char path[MAX_PATH] = { '\0' };

Maybe just to avoid confusing a reader who's using a compiler that
doesn't accept the '{}' initializer (MSVC++ 6, for example).
 

msalters

Niels said:
Phlip said:
wchar_t d[8] = {};

I don't know about that, but typing the 0 won't kill you.

I'd rather not use an integer (0) when initializing a character. On the
other hand, using L'\0' is quite verbose. Actually I used to do:

wchar_t e[8] = L"";

But now I realize that this L"" literal might unnecessarily take up some
space (however small) while running my program.

So may any other expression with the same result. An optimizer may
replace any expression with a more efficient form, as long as both
forms meet the requirements laid out in the standard. E.g. a CPU
that has a register hardwired to zero (e.g. Itanium, IIRC) can
simply use a store instruction with that register as the source.

Regards,
Michiel Salters
 

Niels Dekker - no reply address

I used to do:
wchar_t e[8] = L"";

But now I realize that this L"" literal might unnecessarily take up
some space (however small) while running my program.

Michiel Salters replied:
So may any other expression with the same result. An optimizer may
replace any expression with a more efficient form, as long as both
forms meet the requirements laid out in the standard.

So is there any difference (semantically, including their memory usage)
between the following initializations?

wchar_t a[8] = {L'\0'};
wchar_t e[8] = L"";


Kind regards,

Niels Dekker
http://www.xs4all.nl/~nd/dekkerware
 

msalters

Niels said:
I used to do:

wchar_t e[8] = L"";

But now I realize that this L"" literal might unnecessarily take up
some space (however small) while running my program.

Michiel Salters replied:
So may any other expression with the same result. An optimizer may
replace any expression with a more efficient form, as long as both
forms meet the requirements laid out in the standard.

So is there any difference (semantically, including their memory usage)
between the following initializations?

wchar_t a[8] = {L'\0'};
wchar_t e[8] = L"";

Notice the word "may" in my reply. That's not "must". That means
the answer here is "Perhaps". I don't know; it /will/ differ between
compilers and I don't know which version will be more efficient where.

Besides, your use of "memory usage" here and in earlier posts
covers concepts that are not included when the C++ standard talks
about "semantics".

Regards,
Michiel Salters
 

Niels Dekker - no reply address

So is there any difference (semantically, including their memory
usage) between the following initializations?

wchar_t a[8] = {L'\0'};
wchar_t e[8] = L"";

Michiel Salters replied:
Notice the word "may" in my reply. That's not "must". That means
the answer here is "Perhaps". I don't know; it /will/ differ between
compilers and I don't know which version will be more efficient where.

I don't see how the initialization to {L'\0'} could reasonably be
implemented less efficiently than the initialization to L"". On the
other hand I can imagine an implementation that reserves memory for each
string literal it encounters, therefore having a less efficient
initialization to L"". Am I missing the point?

Besides, your use of "memory usage" here and in earlier posts
covers concepts that are not included when the C++ standard talks
about "semantics".

By "memory usage" I refer to the fact that string literals have static
storage duration.


Thanks so far,

Niels Dekker
http://www.xs4all.nl/~nd/dekkerware
 
