how to store strings of any length into an array of type char*?

arkobose

my earlier post titled:
"How to input strings of any lengths into arrays of type: char
*array[SIZE] ?"
seems to have created a confusion. therefore i paraphrase my problem
below.

consider the following program:
#include<stdio.h>
#define SIZE 1
int main()
{
char *array[SIZE];
scanf("%s", array[0]); // type a string of any length whatsoever.
for(int i = 0; *(array[0] + i) != '\0'; i++)
printf("%c", *(array[0] + i));
return 0;
}

when you run this program, you will find that the "printf" outputs the
whole string which you entered through "scanf", no matter how long your
string was.

now suppose you change the constant SIZE to some bigger value, 4, for
example, and then modify the program to this:

#include<stdio.h>
#define SIZE 4

int main()
{
char *array[SIZE] = {"string of any size", "type",
"praetertranssubstantiationalistically", "another string"};
for(int i = 0; i < SIZE; i++){
for(int j = 0; *(array[i] + j) != '\0'; j++){
printf("%c", *(array[i] + j));
}
printf("\n");
}
return 0;
}

then this program will output the strings with which the array has been
initialized exactly.

but if you want that the strings be entered at run time by user, rather
than be given at initialization as above, then how do you do this?
that is, during execution you type your strings one by one and they get
stored exactly as in the above program.

any ideas?
-arko
 
Walter Roberson

consider the following program:
#include<stdio.h>
#define SIZE 1
int main()
{
char *array[SIZE];
scanf("%s", array[0]); // type a string of any length whatsoever.
for(int i = 0; *(array[0] + i) != '\0'; i++)
printf("%c", *(array[0] + i));
return 0;
}
when you run this program, you will find that the "printf" outputs the
whole string which you entered through "scanf", no matter how long your
string was.

It doesn't even compile on my C89 compiler.

If I compile it with gcc --std=c99 then it coredumps if I enter
even one character of input.

Purify complains about an uninitialized memory read. That's not
surprising, as you are passing in the -content- of array[0], which
you have not initialized.

When I put in input, Purify then complains about Null Pointer Write.
That's because it -happened- that array[0] had the value 0
(which happens to be the NULL pointer on my system), so when you
write to that buffer via the scanf() call, it tries to write to NULL.
That in turn triggers a COR, coredump because of a SIGSEGV
(Segmentation Violation.)


I didn't bother to read the rest of your question, as it was
predicated upon the truth of your program functioning properly,
which it absolutely does not do.
 
arkobose

dear walter,
the programs which i have posted run absolutely correctly on my
Turboc++ IDE compiler.
if you don't mind giving me your email id, then i can send you both the
source code and the .exe file. you can see for yourself.
-arko
 
Walter Roberson

dear walter,
the programs which i have posted run absolutely correctly on my
Turboc++ IDE compiler.
*Sigh.*

if you don't mind giving me your email id, then i can send you both the
source code and the .exe file. you can see for yourself.

I don't think so. I already explained the problems with the code,
and was echoed by another poster.

Re-read the documentation for scanf(). What are the values
that you pass in after the format? Addresses, right? And you
are passing the -content- of array[0] as the address, right?
Now, what is that -content-? Did you initialize array[0] ?
If not, then how do you know what the value will be? Is there
any rule in the C standard that controls what the value is
of variables of "automatic" storage if you do not explicitly
initialize them? If so, then what is the default value, and
what leads you to think that it will happen to be the location
of a chunk of storage which is indefinitely long? If there is
no default value for automatic variables, then what leads you
to think that the random value that happens to be there will be
the location of a chunk of storage which is indefinitely long?
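
The questions above come down to one rule: whatever follows the format must be the address of storage the program actually owns. A minimal sketch of that, assuming a limit of 63 characters per word is acceptable; `read_word` and `WORD_MAX` are names made up for this illustration and appear nowhere in the thread:

```c
#include <stdio.h>

#define WORD_MAX 63   /* hypothetical limit chosen for this sketch */

/* Reads one whitespace-delimited word into buf, which must have room
   for WORD_MAX + 1 chars. Returns 1 on success, 0 on EOF or error. */
int read_word(FILE *in, char buf[WORD_MAX + 1])
{
    /* The width 63 must match WORD_MAX; it stops fscanf from writing
       past the end of buf. */
    return fscanf(in, "%63s", buf) == 1;
}
```

With `char word[WORD_MAX + 1];` and `array[0] = word;`, a call such as `read_word(stdin, array[0])` passes a pointer that has a defined value. Longer words are cut at 63 characters rather than overrunning the buffer.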
 
Michael Mair

my earlier post titled:
"How to input strings of any lengths into arrays of type: char
*array[SIZE] ?"
seems to have created a confusion. therefore i paraphrase my problem
below.

No, the problem is that you do not understand the answers given
to you. This is no insult, just a matter of fact.
You have an understanding problem with the concepts of arrays,
pointers and strings in C.
consider the following program:
#include<stdio.h>
#define SIZE 1
int main()
{
char *array[SIZE];

array is an array of SIZE elements of type char*.
char* is a type that can hold the address of an object of
type char. As it is, the SIZE elements of array are
uninitialised.
If you want to store something through a pointer, then
this pointer must point to storage you own which is also
of the respective type, i.e. to use array[0], it has
to contain the address of ("point to")
a) a single char
b) the "first" char of an array of char (*)
c) the "start" of a malloc()ed storage region with
effective type char.

This means
a)
char c = 42;
array[0] = &c; /* *array[0] == 42 */
b)
char s[42];
array[0] = s; /* equiv. array[0] = &s[0] */
c)
array[0] = malloc(sizeof *array[0] * 42);
if (array[0] == NULL)
{
/* handle memory trouble */
....
}
scanf("%s", array[0]); // type a string of any length whatsoever.

You did none of the above.
You just have used the arbitrary bit pattern found at &array[0]
as if it were an address.
If you cleanly initialise array[0] to NULL, you will
see what we all are talking about.
for(int i = 0; *(array[0] + i) != '\0'; i++)
printf("%c", *(array[0] + i));
return 0;
}

when you run this program, you will find that the "printf" outputs the
whole string which you entered through "scanf", no matter how long your
string was.

I bet you that if you throw in a large enough file redirected
to stdin that you will get a segfault/access violation.

now suppose you change the constant SIZE to some bigger value, 4, for
example, and then modify the program to this:

#include<stdio.h>
#define SIZE 4

int main()
{
char *array[SIZE] = {"string of any size", "type",
"praetertranssubstantiationalistically", "another string"};

Now you initialised array[0] through array[3] to point to
a string literal each.
They could, for example, hold the addresses 4, 7456, 900000,
and 28.
for(int i = 0; i < SIZE; i++){
for(int j = 0; *(array[i] + j) != '\0'; j++){
printf("%c", *(array[i] + j));
}
printf("\n");
}
return 0;
}

then this program will output the strings with which the array has been
initialized exactly.

but if you want that the strings be entered at run time by user, rather
than be given at initialization as above, then how do you do this?
that is, during execution you type your strings one by one and they get
stored exactly as in the above program.


exactly in the way described to you by Chuck, Malcolm, me and maybe
others.
Get yourself some storage to point to and be done.
Reread their answers and have a look at sections 6 and 8 of the
c.l.c FAQ, available from
http://www.eskimo.com/~scs/C-faq/top.html
(maybe also 4, 5)
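
"Get yourself some storage to point to" could look roughly like the sketch below. It is one possible reading, not code from any of the earlier answers: `read_strings` and `MAX_LINE` are hypothetical names, and it assumes no input line is longer than 255 characters (a longer line would be split across several entries).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 255  /* hypothetical per-line limit for this sketch */

/* Reads up to count lines from in. Each array[i] is set to point at
   freshly malloc()ed storage holding one line without its newline.
   Returns the number of lines stored; the caller frees each one. */
size_t read_strings(FILE *in, char *array[], size_t count)
{
    char line[MAX_LINE + 2];   /* room for '\n' and '\0' */
    size_t n = 0;

    while (n < count && fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';      /* drop the newline, if any */
        array[n] = malloc(strlen(line) + 1);   /* storage we own */
        if (array[n] == NULL)
            break;                             /* out of memory: stop early */
        strcpy(array[n], line);
        n++;
    }
    return n;
}
```

Called as `read_strings(stdin, array, SIZE)`, this leaves array[0] through array[SIZE-1] pointing at separately allocated strings, which can then be printed with the same loops as in the initialised version.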


Cheers
Michael

----
(*) It is of course also possible to set the pointer _not_
to the beginning, e.g.
b)
char s[42];
array[0] = &s[1]; /* equiv. ....= s + 1 */
c)
char *ptr = malloc(sizeof *ptr * 458);
array[0] = &ptr[42];
array[1] = &ptr[142];
 
Quentarez

I highly recommend you read the excellent page "Getting Data from an Input
Stream" written by Richard Heathfield. It can be found at:
http://www.cognitiveprocess.com/~rjh/prg/writings/fgetdata.html

-Quentarez
 
arkobose

o.k
i solved the problem using an array of the following type:
char *array[SIZE][1];
for any defined value of constant SIZE, i was able to store arbitrary
length strings (o.k, i won't stick to "indefinite length strings").
i hope it works on your compiler!

-arko
 
Walter Roberson

i solved the problem

You didn't quote enough context to say which problem. You had at
least 3 different problems with the first program you posted,
and we don't know which of the 3 problems you are referring to.
using an array of the following type:
char *array[SIZE][1];
for any defined value of constant SIZE, i was able to store arbitrary
length strings

I suggest you post your modified sample code, along with a
description of what you think it is doing. Considering the
solution you have posted, I am suspicious that you have merely
traded one undefined behaviour for another.
 
arkobose

here's a sample code:

int main()
{
char *array[SIZE][1];

for(int i = 0; i < SIZE; i++)
gets(a[0]);

for(i = 0; i < SIZE; i++){
for(int j = 0; *(a[0] + j) != '\0'; j++)
printf("%c", *(a[0] + j));
printf("\n");
}
return 0;
}
when one enters strings (which may contain white spaces) one by one (of
course, one has to define the constant SIZE first) then the program
prints all the different strings of different lengths (the maximum
length of any string is not specified in the code, and this is the
point) one after another.
a word of caution, though. this code runs fine on my Turbo C++ IDE
(version 3.0) compiler but not on my Turbo C++ (version 4.5) compiler.
so, yes, it may not run on all compilers.

-arko
 
CBFalconer

here's a sample code:

int main()
{
char *array[SIZE][1];

for(int i = 0; i < SIZE; i++)
gets(a[0]);

for(i = 0; i < SIZE; i++){
for(int j = 0; *(a[0] + j) != '\0'; j++)
printf("%c", *(a[0] + j));
printf("\n");
}
return 0;
}
when one enters strings (which may contain white spaces) one by
one (of course, one has to define the constant SIZE first) then
the program prints all the different strings of different lengths
(the maximum length of any string is not specified in the code,
and this is the point) one after another.
a word of caution, though. this code runs fine on my Turbo C++
IDE (version 3.0) compiler but not on my Turbo C++ (version 4.5)
compiler. so, yes, it may not run on all compilers.


I have no idea what you are replying to, due to lack of relevant
quotes. See my sig. below for a cure.

This code is fatally flawed. It is storing data via uninitialized
pointers, and can do absolutely anything, including appearing to
work. It also uses gets, which is always a fatal error, because it
is uncontrollable. So even if you initialize the pointers behavior
will remain undefined.

For an example of code that can store an arbitrary number of
arbitrary strings in memory (barring memory exhaustion) see the
freverse demonstration in my ggets package. You can download it
at:

<http://cbfalconer.home.att.net/download/ggets.zip>
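
For readers without the download, here is a rough sketch of the general idea of keeping an arbitrary number of strings in memory. It is not taken from the ggets package and makes no claim about how freverse works; `struct strlist`, `strlist_push` and `strlist_free` are hypothetical names, and the doubling growth policy is just one common choice.

```c
#include <stdlib.h>
#include <string.h>

/* A growable list of owned strings (hypothetical type for this sketch). */
struct strlist {
    char **items;     /* items[0..count-1] are malloc()ed strings */
    size_t count;     /* strings stored so far */
    size_t capacity;  /* slots allocated in items */
};

/* Appends a copy of s. Returns 1 on success, 0 if memory ran out
   (no string is added in that case). */
int strlist_push(struct strlist *list, const char *s)
{
    char *copy;

    if (list->count == list->capacity) {
        size_t newcap = list->capacity ? list->capacity * 2 : 8;
        char **grown = realloc(list->items, newcap * sizeof *grown);
        if (grown == NULL)
            return 0;
        list->items = grown;
        list->capacity = newcap;
    }
    copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, s);
    list->items[list->count++] = copy;
    return 1;
}

/* Frees every string and the pointer array itself. */
void strlist_free(struct strlist *list)
{
    size_t i;
    for (i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

A list starts out as `struct strlist list = {NULL, 0, 0};`. Every pointer stored in it refers to allocated memory, so nothing is ever written through an uninitialized pointer.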
 
Flash Gordon

here's a sample code:

int main()
{
char *array[SIZE][1];

for(int i = 0; i < SIZE; i++)
gets(a[0]);


Wrong for at *least* two reasons. One of which I am sure you have been
told before. a[0] has not been initialised, so where it points to is
completely random.

gets should never be used because there is no way to specify the buffer
size.
for(i = 0; i < SIZE; i++){
for(int j = 0; *(a[0] + j) != '\0'; j++)
printf("%c", *(a[0] + j));
printf("\n");
}
return 0;
}
when one enters strings (which may contain white spaces) one by one (of
course, one has to define the constant SIZE first) then the program
prints all the different strings of different lengths (the maximum
length of any string is not specified in the code, and this is the
point) one after another.
a word of caution, though. this code runs fine on my Turbo C++ IDE
(version 3.0) compiler but not on my Turbo C++ (version 4.5) compiler.
so, yes, it may not run on all compilers.


As you were told with previous attempts. It might not work tomorrow. It
might not work later today. It might not work if you run it outside the
IDE. It might not work if you run a different program before running it.
If you are running a poor multi-tasking operating system such as Windows
9x, it might cause some *other* program to crash.

If you want to read a string of arbitrary length the *only* way to do it
is by reading it a bit at a time increasing the size of the buffer as
you go. People have written routines to do this such as, IIRC, ggets,
but they are *not* part of the C language and they do what I've just
described.
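
A minimal sketch of that "read a bit at a time, grow the buffer as you go" approach, under the assumption that doubling the buffer is an acceptable growth policy. `read_line` is a hypothetical helper written for this illustration; it is not ggets and not part of the standard library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line of any length from in. Returns a malloc()ed string
   without the trailing newline, or NULL on EOF with nothing read or
   when memory runs out. The caller frees the result. */
char *read_line(FILE *in)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {              /* keep one slot for '\0' */
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {            /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

In the program from the thread this would be used as `array[i] = read_line(stdin);`, followed by a check for NULL and a later `free(array[i])`. The only limit on line length is available memory.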
 
Keith Thompson

CBFalconer said:
This code is fatally flawed. It is storing data via uninitialized
pointers, and can do absolutely anything, including appearing to
work. It also uses gets, which is always a fatal error, because it
is uncontrollable. So even if you initialize the pointers behavior
will remain undefined.

A small quibble:

The gets() function is dangerous, evil, and its mother dresses it
funny. It should never be used. But a call to gets() invokes
undefined behavior *only* if the input happens to overflow the buffer.

Behavior that's undefined only for certain inputs isn't much better
than behavior that's always undefined, but it's an important
distinction.

The one argument I've seen for keeping gets() in the language is that
it can be used safely *if* you have complete control over the contents
of stdin. For example, if the program is part of a system that
(through means outside the language) creates an input file and then
executes the program with stdin redirected from that file, with no
interactive user input, gets() can be used "safely".
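
For comparison, the usual way to sidestep the overflow case entirely is fgets(), which takes the buffer size. The sketch below is only an illustration; `get_line` is a hypothetical wrapper name, and its return convention is an assumption of this example.

```c
#include <stdio.h>
#include <string.h>

/* Reads at most size - 1 characters of one line into buf and removes
   the newline. Returns 1 if a newline was seen (the whole line fit),
   0 if not (line truncated, or last line lacked a newline), and -1 on
   EOF or error. Unlike gets(), it never writes past the end of buf. */
int get_line(char *buf, int size, FILE *in)
{
    size_t len;

    if (size <= 0 || fgets(buf, size, in) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

With `char buf[8];`, a call `get_line(buf, sizeof buf, stdin)` on an over-long line stores only the first 7 characters and reports the truncation, so the behavior stays defined whatever the input is.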
 
Michael Wojcik

The one argument I've seen for keeping gets() in the language is that
it can be used safely *if* you have complete control over the contents
of stdin. For example, if the program is part of a system that
(through means outside the language) creates an input file and then
executes the program with stdin redirected from that file, with no
interactive user input, gets() can be used "safely".

It's still unsafe in that hypothetical design (without assuming
significant additional constraints), because it's vulnerable to
TOCTOU (Time of Check to Time of Use) attacks, for example.

If I understand correctly, you're not arguing for using gets even in
this sort of application, merely citing an argument you've seen
elsewhere, so this shouldn't be taken as anything more than a
demonstration of why arguments for "safely using gets" require situations
so contrived that they're largely moot. In practice, even using gets
with "controlled" input is dangerous, because in practice it's very
hard to completely control input.
 
