Does std::string::c_str() always add a null byte?

jl_post

Hi,

I have a couple of questions about the std::string class that I
haven't been able to answer by looking in the documentation.

Basically, if I do this:

std::string str = "abc";
char s[4];
memcpy(s, str.c_str(), 4);

then the following will hold true:

s[0] == 'a'
s[1] == 'b'
s[2] == 'c'
s[3] == '\0'

(s[3] is a null character because ::c_str() guarantees that its return
string is null-terminated.)

But if I were to end my str variable with null bytes, like this:

std::string str("a\0\0", 3);
char s[4];
memcpy(s, str.c_str(), 4);

then I figure that the following two expressions are true:

s[0] == 'a'
s[1] == '\0'

But what about these expressions?

s[2] == '\0'
s[3] == '\0'

We know that s[1] HAS to be a null character (because ::c_str()
guarantees that its return string will be null-terminated).


So my questions (which pertain to the second example) are:

1. Since the second character is already a null byte, is ::c_str()
still guaranteed to return '\0' as its third character (since that's
what was passed in to the std::string constructor)? (Basically, I'm
asking if s[2] is guaranteed to hold '\0'.)

2. Since str already ends in a null byte (two of them, actually)
is ::c_str() still guaranteed to return a string with an extra null
byte at the end (like it did in the first example to the "abc"
string)? (Basically, I'm asking if s[3] is guaranteed to hold '\0'.)


Thanks in advance for any input.

-- Jean-Luc
 
Barry

jl_post said:
We know that s[1] HAS to be a null character (because ::c_str()
guarantees that its return string will be null-terminated).

Yes. From 21.3.6:

const charT* c_str() const;
1 Returns: A pointer to the initial element of an array of length
size() + 1 whose first size() elements equal the corresponding
elements of the string controlled by *this and whose last element is
a null character specified by charT().

jl_post said:
So my questions (which pertain to the second example) are:

1. Since the second character is already a null byte, is ::c_str()
still guaranteed to return '\0' as its third character (since that's
what was passed in to the std::string constructor)? (Basically, I'm
asking if s[2] is guaranteed to hold '\0'.)

IMHO, basic_string gives '\0' no special treatment in its
representation.

IMHO again, the representation of the string data can be summarized
as:

CharT* data;
size_type size;
size_type capacity;

After

string str("a\0\0", 3);

the representation would become:

data -> [ 'a', '\0', '\0', ... ]
(size is 3; capacity is at least 3)

After you call c_str():

data -> [ 'a', '\0', '\0', '\0', ... ]

Be aware that the above is my own understanding, without any standard
reference.
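
To illustrate the point that '\0' gets no special treatment, here is a minimal sketch. The helper names (make_embedded, stored_length, c_length) are made up for this example and are not part of any library. It contrasts the length std::string stores with the length a C function such as strlen sees through c_str().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: the string from the thread, 'a' plus two embedded nulls.
inline std::string make_embedded() {
    return std::string("a\0\0", 3);
}

// Length as std::string sees it: the stored count, embedded nulls included.
inline std::size_t stored_length(const std::string& s) {
    return s.size();
}

// Length as a C function sees it: strlen stops at the first null character.
inline std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string above, stored_length reports 3 while c_length reports 1, which is why code that needs every character should work from size() rather than from the C-string length.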
 
Erik Wikström

str.c_str() returns a const charT* pointing to an array of
str.size() + 1 elements: the first str.size() elements are the
characters in str, and the last element is a null character. In the
second example str.size() is 3, so both s[2] and s[3] are guaranteed
to be '\0'.
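
The second example from the question can be checked directly. This is only a sketch: the function name copy_second_example is made up here, and the buffer is pre-filled with 'x' so that any null character found afterwards must have come from c_str().

```cpp
#include <array>
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: copies size() + 1 bytes out of c_str(),
// mirroring the memcpy in the question's second example.
inline std::array<char, 4> copy_second_example() {
    const std::string str("a\0\0", 3);  // size() == 3, two embedded nulls
    std::array<char, 4> s;
    s.fill('x');                        // sentinel, overwritten by the copy
    std::memcpy(s.data(), str.c_str(), str.size() + 1);
    return s;
}
```

After the call, s[0] is 'a', s[1] and s[2] are the two nulls stored in the string, and s[3] is the terminator that c_str() adds after the size() stored characters.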
 
Andre Kostur

Barry said:
IMHO, basic_string gives '\0' no special treatment in its
representation.

IMHO again, the representation of the string data can be summarized
as:

CharT* data;
size_type size;
size_type capacity;

After

string str("a\0\0", 3);

the representation would become:

data -> [ 'a', '\0', '\0', ... ]
(size is 3; capacity is at least 3)

After you call c_str():

data -> [ 'a', '\0', '\0', '\0', ... ]

Be aware that the above is my own understanding, without any standard
reference.

AFAIK that is not necessarily true. The array returned by c_str()
doesn't necessarily have to be the same array returned by data().
Theoretically the string could allocate a completely new chunk of
memory, copy all of the data into it, add on the nul character, and
return a pointer to that memory. (Of course, it would be the string's
responsibility to dispose of that memory at the appropriate time.) (I
don't know of any implementation that does this, but it's theoretically
possible.)
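
If you would rather not depend on where c_str() or data() point at all, one option is to copy the characters out and terminate the buffer yourself. The following is a sketch under that assumption; to_terminated_buffer is a made-up name, while std::string::copy is the standard member that copies characters without appending a terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the characters of str followed by one
// null character, without relying on c_str() or data() sharing storage.
inline std::vector<char> to_terminated_buffer(const std::string& str) {
    std::vector<char> buf(str.size() + 1, 'x');
    // copy() writes at most str.size() characters and adds no terminator.
    const std::size_t n = str.copy(buf.data(), str.size());
    buf[n] = '\0';  // add the terminator explicitly
    return buf;
}
```

For std::string("a\0\0", 3) this yields four elements: 'a' followed by three null characters, the same contents c_str() is required to expose.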
 
James Kanze


Andre Kostur said:
AFAIK that is not necessarily true. The array returned by
c_str() doesn't necessarily have to be the same array returned
by data(). Theoretically the string could allocate a
completely new chunk of memory, copy all of the data into it,
add on the nul character, and return a pointer to that memory.
(Of course, it would be the string's responsibility to dispose
of that memory at the appropriate time.) (I don't know of any
implementation that does this, but it's theoretically
possible.)

That was definitely the intent in the original standard. It's
not certain that the wording actually adopted allows it,
however, and as you say, no implementation has made use of this
freedom. As a result, the next release of the standard will
constrain implementations more: there will be both a const
and a non-const data() function, and that function must return a
pointer to the actual character data, which must be contiguous.
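
The tighter constraint can be probed with a small check. This sketch assumes a compiler and library implementing C++11 or later, where contiguous storage is required. The function names are made up for illustration, and only the const data() is used so that nothing here depends on the non-const overload.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical check: true if element i of str lives at &str[0] + i
// for every i, i.e. the characters are stored contiguously.
inline bool is_contiguous(const std::string& str) {
    for (std::size_t i = 0; i < str.size(); ++i) {
        if (&str[i] != &str[0] + i) return false;
    }
    return true;
}

// Hypothetical check: true if data() and c_str() return the same pointer.
inline bool data_matches_c_str(const std::string& str) {
    return str.data() == str.c_str();
}
```

On a conforming C++11 or later implementation both checks should hold for any string, including std::string("a\0\0", 3). Under the older wording discussed above, neither result was something portable code could rely on.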
 
