returning char arrays from a function

Robert Smith

I am wondering why it is possible to return a pointer to a string literal
(i.e. 1) but not an array that has been explicitly allocated (i.e. 2)?
Both would be allocated on the stack, why does the first one not cause a
compiler warning?

#include <stdio.h>

char * funca() {
    char *a = "blah"; //1 - ok
    // char a[] = "blah"; //2 - not ok
    return a;
}

int main() {
    funca();
    return 0;
}
 
Chris Dollin

Robert said:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?
Both would be allocated on the stack,

Forward reference:
char *a = "blah"; //1 - ok
// char a[] = "blah"; //2 - not ok

No: string literals go into static store and exist for the program's
lifetime.
 
zhousqy

Robert said:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?
Both would be allocated on the stack, why does the first one not cause a
compiler warning?

#include <stdio.h>

char * funca() {
char *a = "blah"; //1 - ok
// char a[] = "blah"; //2 - not ok
return a;
}

int main() {
funca();
return 0;
}

The first a points to a *const* string; it is not allocated on
the stack, it's in the DATA segment.
 
Richard Bos

Robert said:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?
Both would be allocated on the stack, why does the first one not cause a
compiler warning?

#include <stdio.h>

char * funca() {
char *a = "blah"; //1 - ok
// char a[] = "blah"; //2 - not ok
return a;
}

int main() {
funca();
return 0;
}

The first a points to a *const* string,

Nitpick: no, it isn't. It points at an array of unmodifiable chars which
are, for historical reasons, _not_ const.
it is not allocated on the stack, it's in the DATA segment.

You don't know that. What you do know is that, wherever they are, they
have static duration. One possible way to give them static duration is
to put them on the stack in a frame which gets released only just before
the program exits.

Richard
 
Me

Robert said:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?

char *a = "blah";

is just syntax sugar for:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = dummy;
Both would be allocated on the stack, why does the first one not cause a
compiler warning?

#include <stdio.h>

char * funca() {
char *a = "blah"; //1 - ok
// char a[] = "blah"; //2 - not ok
return a;
}

int main() {
funca();
return 0;
}

This is like asking why:

static int foo = 50;
int *ret = &foo;
return ret;

is ok but not:

int foo;
int *ret = &foo;
return ret;

The second is obviously bad in C because local variables are destroyed
when you leave their scope. Static variables inside a function, by
contrast, are exactly like global variables except that their name is
visible only inside the function.
 
Keith Thompson

Me said:
Robert said:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?

char *a = "blah";

is just syntax sugar for:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = dummy;

Not quite. The characters of a string literal aren't const, but
attempting to modify them invokes undefined behavior.

Making string literals const would have been cleaner; the reasons for
not doing so are historical. Pre-ANSI C didn't have the "const"
keyword. A declaration such as

char *a = "blah";

would be illegal if string literals were const.
 
Jordan Abel

Me said:
Robert said:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?

char *a = "blah";

is just syntax sugar for:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = dummy;

Not quite. The characters of a string literal aren't const, but
attempting to modify them invokes undefined behavior.

So it's more like

static const char dummy[5] = { 'b','l','a','h',0 };
char *a=(char*)dummy;

attempting to modify the values of a const array via a non-const pointer
invokes undefined behavior too.
 
Barry Schwarz

I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?
Both would be allocated on the stack, why does the first one not cause a
compiler warning?

C does not know from a stack. It does know about static and automatic
objects.

Unless you specify otherwise, objects defined in a function (or block)
are automatic. They are created when the function (or block) is
entered and destroyed when the function (or block) is exited.

Static objects are created before the program begins execution and
survive for the life of the program. String literals are simply
static arrays of unmodifiable (but not const) char.
#include <stdio.h>

char * funca() {
char *a = "blah"; //1 - ok

a points to a static array. The array continues to exist even after
the function returns.
// char a[] = "blah"; //2 - not ok

In the return statement, a would evaluate to the address of a[0] which
is part of an automatic array. The array will not exist after the
function returns and any attempt to evaluate the returned value would
invoke undefined behavior.

If you changed this to
static char a[] = ...;
it would be OK also.
return a;
}

int main() {
funca();
return 0;
}


 
Keith Thompson

Jordan Abel said:
Me said:
Robert Smith wrote:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?

char *a = "blah";

is just syntax sugar for:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = dummy;

Not quite. The characters of a string literal aren't const, but
attempting to modify them invokes undefined behavior.

So it's more like

static const char dummy[5] = { 'b','l','a','h',0 };
char *a=(char*)dummy;

attempting to modify the values of a const array via a non-const pointer
invokes undefined behavior too.
[...]

Maybe. I can't think of any reason why that wouldn't be equivalent.

But I prefer to describe string literals the way the standard does,
rather than trying to pretend that they're syntactic sugar for
something else.
 
Jordan Abel

Jordan Abel said:
Robert Smith wrote:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?

char *a = "blah";

is just syntax sugar for:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = dummy;

Not quite. The characters of a string literal aren't const, but
attempting to modify them invokes undefined behavior.

So it's more like

static const char dummy[5] = { 'b','l','a','h',0 };
char *a=(char*)dummy;

attempting to modify the values of a const array via a non-const pointer
invokes undefined behavior too.
[...]

Maybe. I can't think of any reason why that wouldn't be equivalent.

But I prefer to describe string literals the way the standard does,
rather than trying to pretend that they're syntactic sugar for
something else.

Eh. people have no problem describing the -> and [] operators as
syntactic sugar.
 
Keith Thompson

Jordan Abel said:
So it's more like

static const char dummy[5] = { 'b','l','a','h',0 };
char *a=(char*)dummy;

attempting to modify the values of a const array via a non-const pointer
invokes undefined behavior too.
[...]

Maybe. I can't think of any reason why that wouldn't be equivalent.

But I prefer to describe string literals the way the standard does,
rather than trying to pretend that they're syntactic sugar for
something else.

Eh. people have no problem describing the -> and [] operators as
syntactic sugar.

Sure, because they're very easy to describe that way. As we've seen
in this thread, describing string literals as syntactic sugar isn't
quite as straightforward, and is very easy to get wrong.

But as I said, that's my preference; you're not required to share it.
 
Me

Keith said:
Me said:
char *a = "blah";

is just syntax sugar for:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = dummy;

Not quite. The characters of a string literal aren't const, but
attempting to modify them invokes undefined behavior.

Well I didn't intend to get *that* deep into the semantics of syntax
sugar but if you want to go there, this pretty much captures what C
does:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = *(char(*)[5])&dummy;

(except it doesn't work for explaining an implementation that shares
substrings without breaking an aliasing rule)
 
stathis gotsis

Everyone pointed out why the former is correct while the latter can lead to
undefined behaviour. But the compiler warning may be another thing. For
example, the following program does not cause any warning even in the
highest warning level in my compiler (gcc 4.0.1):

#include <stdio.h>

char * funca() {
    char b[] = "blah";
    char *a=b;
    return a;
}

int main() {
    funca();
    return 0;
}

The compiler is not required to issue a warning; whether it does depends on
the semantic checks it performs. And of course the above can lead to
undefined behaviour as well.
 
CBFalconer

stathis said:
.... snip ...

Everyone pointed out why the former is correct while the latter can
lead to undefined behaviour. But the compiler warning may be another
thing. For example, the following program does not cause any warning
even in the highest warning level in my compiler (gcc 4.0.1):

#include <stdio.h>

char * funca() {
char b[] = "blah";
char *a=b;
return a;
}

int main() {
funca();
return 0;
}

It is not required to issue a warning but whether it will depends
on the semantic checks the compiler performs. And of course the
above can lead to undefined behaviour as well.

Also you have hidden the invalid usage behind an unnecessary
intermediate variable. Contrast the following:

[1] c:\c\junk>cat junk.c
#include <stdio.h>

char * funca() {
    char b[] = "blah";
    char *a=b;
    return a;
}

char * funcb() {
    char b[] = "foo";
    return b;
}

int main() {
    funca();
    funcb();
    return 0;
}

[1] c:\c\junk>cc junk.c
junk.c: In function `funcb':
junk.c:11: warning: function returns address of local variable

Compilers can be helpful, but nothing can replace the use of the
Mark I brain.

 
Jordan Abel

Keith said:
Me said:
char *a = "blah";

is just syntax sugar for:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = dummy;

Not quite. The characters of a string literal aren't const, but
attempting to modify them invokes undefined behavior.

Well I didn't intend to get *that* deep into the semantics of syntax
sugar but if you want to go there, this pretty much captures what C
does:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = *(char(*)[5])&dummy;

(except it doesn't work for explaining an implementation that shares
substrings without breaking an aliasing rule)

An implementation is free to share parts of any const object, isn't it?
 
Martin Ambuhl

stathis said:
Everyone pointed out why the former is correct while the latter can lead to
undefined behaviour. But the compiler warning may be another thing. For
example, the following program does not cause any warning even in the
highest warning level in my compiler (gcc 4.0.1):

#include <stdio.h>

char * funca() {
char b[] = "blah";
char *a=b;
return a;
}

int main() {
funca();
return 0;
}

It is not required to issue a warning but whether it will depends on the
semantic checks the compiler performs. And of course the above can lead to
undefined behaviour as well.

And the absence of warnings might mislead you in your expectations.
Notice the output that my implementation (gcc 4.1.0) produces below:


$cat t.c
#include <stdio.h>

char *funca()
{
    char b[] = "blah";
    char *a = b;
    return a;
}

int main()
{
    printf("[output]\n\"%s\"\n", funca());
    return 0;
}

$./t
[output]
""
 
stathis gotsis

Martin Ambuhl said:
stathis said:
I am wondering why it is possible to return a pointer to a string literal
(ie. 1) but not an array that has been explicitly allocated. (ie. 2) ?
Both would be allocated on the stack, why does the first one not cause a
compiler warning?


Everyone pointed out why the former is correct while the latter can lead to
undefined behaviour. But the compiler warning may be another thing. For
example, the following program does not cause any warning even in the
highest warning level in my compiler (gcc 4.0.1):

#include <stdio.h>

char * funca() {
char b[] = "blah";
char *a=b;
return a;
}

int main() {
funca();
return 0;
}

It is not required to issue a warning but whether it will depends on the
semantic checks the compiler performs. And of course the above can lead to
undefined behaviour as well.

And the absence of warnings might mislead you in your expectations.

I did not have particular expectations anyway, the result is undefined. I
was trying to address one of the OP's questions which was why the compiler
does not yield a warning in the case of 1 (while it does so in the case of
2).

char * funca() {
    char *a = "blah"; /* 1 */
    /* char a[] = "blah";*/ /* 2 */
    return a;
}

Everyone suggested that was because a points to a string literal, which has
static duration. Maybe the compiler does not take into account what a
points to when it checks "return a;" for semantic correctness. The absence
of a warning in the following could support my argument:

char * funca() {
    char c='a';
    char *a=&c;
    return a;
}

I suspect that case 2 causes a warning because a is declared as an array.
Returning an array could be explicitly defined as a semantic error in the
compiler. All these are just assumptions of course.
 
Keith Thompson

Jordan Abel said:
Well I didn't intend to get *that* deep into the semantics of syntax
sugar but if you want to go there, this pretty much captures what C
does:

static const char dummy[5] = { 'b', 'l', 'a', 'h', '\0' };
char *a = *(char(*)[5])&dummy;

(except it doesn't work for explaining an implementation that shares
substrings without breaking an aliasing rule)

An implementation is free to share parts of any const object, isn't it?

I don't think so. According to the definition of the "==" operator
for pointers, the addresses of distinct objects must be distinct.
(The standard specifically allows the arrays corresponding to string
literals to be shared.)
 
hydro

I think the proper function for your aim is as follows.
void funca(char *a) /*a should point to a buffer of character*/
{
/*copy "blah" to the buffer that a pointed*/
int i;
for(i=0;i<4;i++)
{
a=....;
}
}
 
stathis gotsis

hydro said:
I think the proper function for your aim is as follows.
void funca(char *a) /*a should point to a buffer of character*/
{
/*copy "blah" to the buffer that a pointed*/
int i;
for(i=0;i<4;i++)
{
a=....;
}
}


You should consider including appropriate context in your messages to this
group. I do not find your suggestion too helpful either. For one thing, you
forgot to null terminate the string. There are also standard functions
capable of copying strings, such as strcpy().
 
