Strings in C

  • Thread starter Bilgehan.Balban

Bilgehan.Balban

Hi,

For a declaration such as:

char * mystring = "ABCDabcd123";

Is it a linker issue where such strings are stored in C, or is it
defined as part of the language definition?

Is there any difference between an array of strings, e.g.

char mystring[10];

and strings of type char *, in terms of where they're stored? If these
are compiler dependent, is there at least a general storage convention?

Thanks,
Bahadir
 

Simon Biber

> Hi,
>
> For a declaration such as:
>
> char * mystring = "ABCDabcd123";
>
> Is it a linker issue where such strings are stored in C, or is it
> defined as part of the language definition?

It is defined as "static storage duration", which means it is available
from program startup to program shutdown. The actual location in memory
is not specified.

> Is there any difference between an array of strings, e.g.
>
> char mystring[10];

That's not an array of strings. It's an array of char, which may be used
to hold a string. In fact, it could hold anywhere from zero to ten
strings. For example,
char mystring[10] = {'m', 'y', 0, 's', 't', 'r', 'i', 'n', 'g', 0};
contains two strings: "my" at offset zero, and "string" at offset 3.
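
To make the "several strings in one array" point concrete, here is a minimal sketch that walks the strings stored back to back in a char array. The function name count_packed_strings is made up for illustration, and the code assumes the last byte of the array is a null character (otherwise strlen would run off the end).

```c
#include <stddef.h>
#include <string.h>

/* Count the strings stored back to back in buf[0..size-1].
 * Each string ends at its terminating null character; the next
 * one (if any) starts on the following byte.
 * Assumption: buf[size - 1] is a null character. */
size_t count_packed_strings(const char *buf, size_t size)
{
    size_t count = 0;
    size_t pos = 0;

    while (pos < size) {
        pos += strlen(buf + pos) + 1;   /* skip the string and its terminator */
        count++;
    }
    return count;
}
```

Called on the mystring array above with sizeof mystring as the size, it finds "my" and then "string". Called on an array of ten null characters, it finds ten empty strings.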

Its storage depends on where it is defined. If that definition occurs
outside of any function, then it has static storage duration (available
at all times) and external linkage (the symbol is visible from other
translation units).

However, if that definition occurs inside a function, then it has
automatic storage duration (it only exists within the block it is
defined in) and no linkage (the identifier is not visible outside that
block).
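
A small sketch of the difference in lifetime, using made-up names (file_scope_buf, automatic_demo, static_demo): the same kind of array keeps its contents between calls when defined at file scope, and starts afresh on every call when defined inside a function.

```c
/* Defined outside any function: static storage duration,
 * external linkage, zero-initialized before program startup. */
char file_scope_buf[10];

/* The array here has automatic storage duration: a new one is
 * created and initialized each time the function is entered. */
int automatic_demo(void)
{
    char local_buf[10] = "";    /* all ten elements start as 0 */
    local_buf[0]++;
    return local_buf[0];        /* 1 on every call */
}

/* The file-scope array persists, so the count accumulates. */
int static_demo(void)
{
    file_scope_buf[0]++;
    return file_scope_buf[0];   /* 1, then 2, then 3, ... */
}
```

Note that this only shows how long each object lives. It says nothing about where in memory either array ends up.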
> and strings of type char *, in terms of where they're stored?

Any string can be pointed to by a 'char *'. The pointer type makes no
difference to the storage of the string.

There are three storage types defined in C:
static
automatic
allocated (i.e., malloc, calloc, realloc)

String literals always have static storage, and last until the end of
the program. Objects defined outside of any function, or with the
'static' keyword, have static storage, and last until the end of the
program.

Objects defined within a function body, without the 'static' keyword,
have automatic storage, and last until the end of the block.

A memory block allocated by malloc, calloc or realloc lasts until the
base address is passed to free or realloc.
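
Here is a rough sketch that puts the three kinds side by side. All the names (static_buf, copy_to_static, length_via_automatic, copy_to_allocated) and the buffer size of 16 are invented for the example.

```c
#include <stdlib.h>
#include <string.h>

static char static_buf[16];     /* static: exists for the whole program run */

/* Copy s (truncated to fit) into the static buffer. The returned
 * pointer stays valid until the program ends. */
char *copy_to_static(const char *s)
{
    strncpy(static_buf, s, sizeof static_buf - 1);
    static_buf[sizeof static_buf - 1] = '\0';
    return static_buf;
}

/* Use an automatic buffer as scratch space. The buffer is gone once
 * the function returns, so only the computed length is returned. */
size_t length_via_automatic(const char *s)
{
    char tmp[16];               /* automatic: lives until the block ends */
    strncpy(tmp, s, sizeof tmp - 1);
    tmp[sizeof tmp - 1] = '\0';
    return strlen(tmp);
}

/* Copy s into allocated storage. The block lasts until the caller
 * passes the pointer to free. Returns NULL if malloc fails. */
char *copy_to_allocated(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}
```

Returning tmp itself from length_via_automatic would be a bug, since the caller would get a pointer to an object whose lifetime has already ended.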
> If these are compiler dependent, is there at least a general storage
> convention?

Some platforms make additional constraints on memory layout, such as
dividing memory into "segments". That is not specified as part of the C
language. Ask in a group devoted to your particular platform or family
of platforms (for example comp.unix.programmer).
 

pete

> Hi,
>
> For a declaration such as:
>
> char * mystring = "ABCDabcd123";
>
> Is it a linker issue where such strings are stored in C, or is it
> defined as part of the language definition?
>
> Is there any difference between an array of strings, e.g.
>
> char mystring[10];
>
> and strings of type char *, in terms of where they're stored? If these
> are compiler dependent,
> is there at least a general storage convention?

Storage in C is characterized by duration.
There are 3 kinds:
1 automatic
2 static
3 allocated

When a string literal converts to a pointer,
it points to the first element of an array with static duration.
Arrays defined outside of any function have static duration.
Arrays defined with the static keyword have static duration.
Arrays and other variables defined inside of function definitions
without the static keyword have automatic duration.
malloc and friends return pointers to objects
with allocated duration.

Automatic duration lasts within the block where
the object is defined.
Static duration lasts from before program startup
until the end of the program.
Allocated duration lasts until the pointer is freed
or the program ends, whichever is first.
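
One practical consequence of the static duration of string literals, as a small sketch (bool_name is a made-up function): a function may return a pointer to a string literal, because the array it points into outlives the call.

```c
/* Each literal is an array with static duration, so the pointer
 * returned here remains valid after the function has returned.
 * The return type is const char * because writing through the
 * pointer would be undefined behavior. */
const char *bool_name(int flag)
{
    return flag ? "true" : "false";
}
```

Doing the same with a pointer to an automatic array defined inside the function would not be safe, since that array stops existing when the block ends.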
 

Simon Biber

Simon said:
> char mystring[10];
>
> That's not an array of strings. It's an array of char, which may be used
> to hold a string. In fact, it could hold anywhere from zero to ten
> strings. For example,
> char mystring[10] = {'m', 'y', 0, 's', 't', 'r', 'i', 'n', 'g', 0};
> contains two strings: "my" at offset zero, and "string" at offset 3.

You could also say that the array contains nine different strings:
"my" at offset 0
"y" at offset 1
"" at offsets 2 and 9
"string" at offset 3
"tring" at offset 4
"ring" at offset 5
"ing" at offset 6
"ng" at offset 7
"g" at offset 8
 

Gordon Burditt

> For a declaration such as:
> char * mystring = "ABCDabcd123";
>
> Is it a linker issue where such strings are stored in C, or is it
> defined as part of the language definition?

As far as the language definition is concerned, there is no "where
strings are stored" (The Bronx?). The closest thing there is is
the issue that some things you can write on and some things you
might not be able to. There is no stack, heap, text smegment, data
smegment, bss smegment, etc.
> and strings of type char *, in terms of where they're stored? If these
> are compiler dependent, is there at least a general storage convention?

Writing on a string literal invokes the wrath of undefined behavior.
Writing on an array does not (unless it's const).

Gordon L. Burditt
 

pete

Gordon Burditt wrote:
> Writing on a string literal invokes the wrath of undefined behavior.
> Writing on an array does not (unless it's const).

String literals and arrays are not mutually exclusive.
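
A short sketch of what that means in practice, with made-up function names: a string literal is itself an array of char, so sizeof and indexing can be applied to it directly.

```c
#include <stddef.h>

/* sizeof applied to a string literal gives the size of the whole
 * array, terminating null character included. */
size_t literal_array_size(void)
{
    return sizeof "ABCDabcd123";    /* 11 characters plus '\0' */
}

/* A literal can be indexed like any other array.
 * The caller must keep i below literal_array_size(). */
char literal_element(size_t i)
{
    return "ABCDabcd123"[i];
}
```

Reading elements this way is fine. It is only writing to the literal's array that is undefined.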
 

leo2100

I might have a clue as for where string literals are stored. From my
experience programming assembler code for PICs (microcontrollers), when
you need to bring a constant out of nowhere to the program, you store
it in the program memory. That is, program memory being the place for
where the actual code resides, the physical storage for the code, which
in this case is the compiled file or the executable file. That's why
you can't directly modify it, because modifying it means modifying the
actual file from which the code is being executed. But it is a
different case if you load the literal into a RAM-stored char array.
 

Jordan Abel

> Hi,
>
> For a declaration such as:
>
> char * mystring = "ABCDabcd123";
>
> Is it a linker issue where such strings are stored in C, or is it
> defined as part of the language definition?

They are stored in static-duration storage to which it is undefined
to write. How that's done is of course the linker's business, but
that doesn't affect C per se.

> Is there any difference between an array of strings, e.g.
>
> char mystring[10];
>
> and strings of type char *, in terms of where they're stored?

Often.

> If these are compiler dependent, is there at least a general storage
> convention?

ISO/IEC 9899.
 

Keith Thompson

> I might have a clue as for where string literals are stored. From my
> experience programming assembler code for PICs (microcontrollers), when
> you need to bring a constant out of nowhere to the program, you store
> it in the program memory. That is, program memory being the place for
> where the actual code resides, the physical storage for the code, which
> in this case is the compiled file or the executable file. That's why
> you can't directly modify it, because modifying it means modifying the
> actual file from which the code is being executed. But it is a
> different case if you load the literal into a RAM-stored char array.

First, please provide some context when you post a followup.
Read <http://cfaj.freeshell.org/google/> and follow its advice.

Second, what you describe is *extremely* system-specific. As far as
the C language is concerned, string literals are stored somewhere; as
long as they exist for the duration of the program's execution, it
doesn't matter where. Anything that depends on some particular scheme
is going to be non-portable.
 

Dag-Erling Smørgrav

> For a declaration such as:
>
> char * mystring = "ABCDabcd123";
>
> Is it a linker issue where such strings are stored in C, or is it
> defined as part of the language definition?

This defines a pointer to char and assigns to it the address of a
string literal. String literals are not writable, so { mystring[0] =
'X'; } triggers undefined behaviour.
> Is there any difference between an array of strings, e.g.
>
> char mystring[10];
>
> and strings of type char *, in terms of where they're stored?

This defines an array of char which is implicitly initialized to
all-zeroes at program start (assuming none of this code is within a
function).

The following code:

char mystring[] = "ABCDabcd123";

defines an array of char and initializes it with a copy of the
provided string literal. Since no explicit size is provided, the
array will be precisely large enough to contain its initial value
(including the terminating null character). Unlike in the first
example, { mystring[0] = 'X'; } is well-defined.
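
A minimal sketch of the array case, wrapped in a made-up function (array_copy_demo) so the results can be checked: the array gets its own copy of the characters, so it can be modified, and its size includes the terminating null character.

```c
#include <stddef.h>

/* Initialize an array from the literal, modify the array's copy,
 * report its first character through first_out, and return the
 * array's size in bytes. */
size_t array_copy_demo(char *first_out)
{
    char mystring[] = "ABCDabcd123";    /* array holds its own copy */

    mystring[0] = 'X';                  /* well-defined: writes to the array */
    *first_out = mystring[0];
    return sizeof mystring;             /* 11 characters plus '\0' */
}
```

With the pointer form, char *mystring = "ABCDabcd123";, the same assignment would write to the literal's own array and is undefined. Declaring such pointers as const char * lets the compiler reject the write.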
> If these are compiler dependent, is there at least a general storage
> convention?

No, these things vary widely from system to system.

DES
 
