arrays of strings and pointers

raphfrk

I have the following code:

char buf[10][10000];

printf("%p\n", (void *)buf);
printf("%p\n", (void *)&buf[0]);
printf("%p\n", (void *)buf[0]);
printf("%p\n", (void *)buf[1]);
printf("%td\n", buf[1]-buf[0]);

The first 3 printfs give the same result and the last 2 show that
buf[1] is 10000 away from buf[0]. Is this the expected result?

I had assumed that the result would have been:

buf:

This should be a pointer to the first of a set of 10 pointers. These
pointers would be accessed by buf[0] ... buf[9]. They would point to
the 10 10000 char strings (which are a single block of memory).

&buf[0]

This should be the same as above. buf[0] is the same as *(buf+0), so
&buf[0] is &(*(buf+0)) which should be just buf.

buf[0]

This should point to the first string. It shouldn't be the same as
&buf[0].

buf[1]

This is as expected 10000 higher than buf[0].

I have a function that takes as its input an array of pointers:

void funct(char **base);

I had assumed that I could just pass buf to it, is it necessary to
create the 10 array of pointers manually ? However, it looks like buf
is of type pointer to a char rather than pointer to a pointer to a char.
 
Mark McIntyre

buf[0]

This should point to the first string. It shouldn't be the same as
&buf[0].

It is. Think about it.
void funct(char **base);

I had assumed that I could just pass buf to it, is it necessary to
create the 10 array of pointers manually ?

No, this'll work. What are you /actually/ trying to do? Stop trying to
guess what's happening!
However, it looks like buf
is of type pointer to a char rather than pointer to a pointer to a char.

If it's defined as above, it's an array of arrays of chars.

Note that this isn't the same as either a pointer or a pointer to a
pointer.
 
raphfrk

Mark said:
If its defined as above, its an array of arrays of chars.

Note that this isn't the same as either a pointer or a pointer to a
pointer.

Isn't an array and a pointer basically the same once memory has been
allocated etc. ? The difference being that you can't move where the
array base is and also that the array will have memory allocated
automatically. For example, these are basically equivalent (assuming
you ignore run-time and compile time memory allocation differences):

char buf[255];

strcpy(buf,"string");

printf("%c", buf[3]);

and

char *buf;

buf = malloc(sizeof(char)*7);

strcpy(buf,"string");

printf("%c", buf[3]);

You can also use *buf in both cases and that will give you the first
character of the string.

Anyway, perhaps this equivalence breaks down when you start dealing
with arrays of arrays.

So, I guess the question is, if I define buf as:

char buf[10][10000];

and want to pass it to a function, what should the function prototype
be ? I had assumed that a double array was effectively a pointer to a
pointer.

Also, I assume that it still passes by reference and doesn't copy the
entire array to the function's local variables.
 
Mike Deskevich

Anyway, perhaps this equivalence breaks down when you start dealing
with arrays of arrays.

So, I guess the question is, if I define buf as:

char buf[10][10000];

and want to pass it to a function, what should the function prototype
be ? I had assumed that a double array was effectively a pointer to a
pointer.

Also, I assume that it still passes by reference and doesn't copy the
entire array to the function's local variables.

yes, this is where you're confused. a multiply dimensioned array is
still just a pointer, not a pointer to a pointer.

char buf[10][10000] allocates the same memory as char buf[10*10000]
which is similar to buf=(char*)malloc(10*10000*sizeof(char));

the difference is that if you do char buf[10][10000] the compiler is
smart enough to know how to increment the memory addresses depending on
the two indices.

so you could have

char buf[10][10000];

printf("%c",buf[i][j])

which is equivalent to

char buf[10*10000]
printf("%c",buf[j*10+i]) //i think i got that right - gurus: please
check me on this

so passing a multiply dimensioned array to a function is a pain. here's
how you do it

char buf[10][10000]
dosomething(buf);

void dosomething(char buf[10][])
{
buf[i][j]
}

the compiler needs to know the size of all the dimensions except for
the last one. because internally the compiler is doing the j*10+i
step, it's just hiding it from you.

there are two ways around this:

1 - if you actually need to have the stuff in memory ordered as a
matrix (doing fast math stuff), then you can take care of the j*10+i
step yourself

char buf[10*10000];
dosomething(buf);

void dosomething(char* buf)
{
buf[j*10+i]; //same as buf[i][j]
}

2 - if you don't care about speed, then you can do what you were
thinking of with pointers to pointers

char** buf;

buf=(char**)malloc(10*sizeof(char*));
for (i=0;i<10;i++)
{
buf[i]=(char*)malloc(10000*sizeof(char));
}
dosomething(buf);
for (i=0;i<10;i++)
{
free(buf[i]);
}
free(buf);

void dosomething(char** buf)
{
buf[i][j]; //just as you expected
}

gurus: please feel free to correct any misstatements, i think i know
what i'm talking about, but when i read this newsgroup i realize how
much c i don't know.

mike
 
Old Wolf

I have the following code:

char buf[10][10000];


buf:

This should be a pointer to the first of a set of 10 pointers.

buf is an ARRAY. It is not a pointer, or a set of pointers, etc.
Arrays are not pointers. Arrays are a set of adjacent memory
locations for storing objects. Pointers store the address of
other objects. Pointer declarations are indicated by a '*'
symbol (except when it is a function formal parameter).

Please read this newsgroup FAQ, it has a section on arrays
and pointers.
These pointers would be accessed by buf[0] ... buf[9].

That would be:

char *buf[10];

Note the '*' which indicates that we have pointers.
They would point to the 10 10000 char strings (which are a
single block of memory).

That would be:

char *buf[10];
char bufmem[10][10000];
for (int i = 0; i != 10; ++i)
    buf[i] = bufmem[i];
buf[0]

This should point to the first string. It shouldn't be the same as
&buf[0].

buf[0] is an array of 10000 chars. The first element of an array
always starts at the same memory location as the entire array.
So that is why you are seeing the same value displayed.
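
A minimal sketch of why the first three printf calls agree, assuming the same char buf[10][10000] as in the question. The helper name show_addresses is made up for illustration. The three expressions have different types but the same address:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static char buf[10][10000];   /* static, so it does not live on the stack */

/* Hypothetical helper: prints the three addresses, returns the row distance. */
ptrdiff_t show_addresses(void)
{
    printf("%p\n", (void *)buf);      /* converts to char (*)[10000]      */
    printf("%p\n", (void *)&buf[0]);  /* char (*)[10000], no conversion   */
    printf("%p\n", (void *)buf[0]);   /* converts to char *               */

    /* Same address, three different types. */
    assert((void *)buf == (void *)&buf[0]);
    assert((void *)&buf[0] == (void *)buf[0]);

    /* Both operands convert to char *, so the difference counts chars. */
    return buf[1] - buf[0];
}
```

Called from main, show_addresses() should print the same address three times and return 10000.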
I have a function that takes as its input an array of pointers:

void funct(char **base);

I had assumed that I could just pass buf to it

Buf is not an array of pointers. So you cannot do this.
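
One possible way to still call a function that takes char **, sketched under the assumption that the storage stays a 10 by 10000 array. The body of funct here is only a stand-in (the original was never shown), and call_funct, rows and bufmem are illustrative names:

```c
#include <assert.h>
#include <string.h>

#define ROWS 10
#define COLS 10000

static char bufmem[ROWS][COLS];   /* the actual storage */

/* Stand-in for the poster's funct(): measures the second string. */
size_t funct(char **base)
{
    assert(base != NULL);
    return strlen(base[1]);
}

/* Builds the array of row pointers and passes it on. */
size_t call_funct(void)
{
    char *rows[ROWS];

    for (int i = 0; i != ROWS; ++i)
        rows[i] = bufmem[i];      /* bufmem[i] converts to char * */

    strcpy(bufmem[1], "second");
    return funct(rows);           /* rows converts to char ** */
}
```

Here the ten pointers really exist in memory (in rows), which is exactly what a char ** parameter needs.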
However, it looks like buf is of type pointer to a char rather
than pointer to a pointer to a char.

buf is of type "array[10] of array[10000] of char".
It can decay to "pointer to array[10000] of char", but not to
"pointer to pointer to char". Read the FAQ for a more detailed
explanation.
 
Flash Gordon

Isn't an array and a pointer basically the same once memory has been
allocated etc. ?

No. Try using sizeof on it. Taking the address also gives you a pointer
of a different type. I suggest you read the comp.lang.c FAQ, I'll point
you at a few of the appropriate sections, but you should read rather
more than I will specifically point you at.

Start with http://www.eskimo.com/~scs/C-faq/q6.3.html

Anyway, perhaps this equivalence breaks down when you start dealing
with arrays of arrays.

It breaks down with sizeof and & as well.
So, I guess the question is, if I define buf as:

char buf[10][10000];

and want to pass it to a function, what should the function prototype
be ? I had assumed that a double array was effectively a pointer to a
pointer.

No, see http://www.eskimo.com/~scs/C-faq/q6.18.html
Also, I assume that it still passes by reference and doesn't copy the
entire array to the function's local variables.

The array name decays to a pointer to its first element, see
http://www.eskimo.com/~scs/C-faq/q6.4.html, but since pointers and
arrays are different types a pointer to an array and a pointer to a
pointer are different.

Read the rest of section 6, and also read what your text book says about
arrays and pointers (get K&R2 if you don't have a text book, see the
bibliography of the FAQ for the full name).
 
Keith Thompson

Isn't an array and a pointer basically the same once memory has been
allocated etc. ? The difference being that you can't move where the
array base is and also that the array will have memory allocated
automatically. For example, these are basically equivalent (assuming
you ignore run-time and compile time memory allocation differences):

No. Arrays are arrays, and pointers are pointers.

The rule is that an expression of array type, in most contexts, is
implicitly converted to a pointer to the array's first element. The
exceptions are when the array expression is the operand of a unary "&"
or "sizeof" operator, or when it's a string literal in an initializer.

Read section 6 of the C FAQ, <http://www.eskimo.com/~scs/C-faq/faq.html>.
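
A short sketch of those three exceptions, assuming a C99 compiler. The function name check_exceptions is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Returns sizeof of an array initialized from the literal "string". */
int check_exceptions(void)
{
    char buf[255];
    char *p = buf;                 /* usual case: buf converts to &buf[0] */

    /* sizeof: the operand is not converted, so this is the whole array. */
    assert(sizeof buf == 255);
    assert(sizeof p == sizeof(char *));

    /* Unary &: yields char (*)[255], not char **. */
    char (*pa)[255] = &buf;
    assert((void *)pa == (void *)p);
    assert(sizeof *pa == 255);

    /* String literal as initializer: its characters are copied into s. */
    char s[] = "string";
    return (int)sizeof s;          /* six letters plus the terminating '\0' */
}
```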
 
bitshadow

char buf[10][10000]
dosomething(buf);

void dosomething(char buf[10][])
{
buf[i][j]

}



shouldn't that be:

void dosomething(char *[]);
or// void dosomething(char **)
or// void dosomething(char [][])

and its subsequent definition:

void dosomething(char [][COLS])
or// void dosomething(char (*ptr_chr)[COLS] )

the point being that a matrix is really a contiguous block of memory
stored in the computer. the abstraction of a dimension higher than one
is purely a way to simplify it for the programmer. As the computer
obviously cannot do that, it thinks in a linear pattern and the memory
is allocated thus. thus for multidimensional arrays all the compiler
really needs to know is how many elements it needs to read and what
data type it is pointing to.
 
Simon Biber

So, I guess the question is, if I define buf as:

char buf[10][10000];

and want to pass it to a function, what should the function prototype
be ? I had assumed that a double array was effectively a pointer to a
pointer.

That's where you went wrong. It's not a pointer to a pointer. It's
effectively a 'pointer to an array of 10000 char'. Read on for more
explanation of what that means.

The correct prototype is:
void function(char (*buf)[10000]);

An alternative prototype, which means *exactly* the same thing, is:
void function(char buf[][10000]);
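
A minimal sketch showing that the two spellings declare the same function. The names row_length and demo, and the extra row parameter, are assumptions added for illustration:

```c
#include <assert.h>
#include <string.h>

/* Both declarations name the same function type, so they are compatible. */
size_t row_length(char (*buf)[10000], int row);
size_t row_length(char buf[][10000], int row);

size_t row_length(char (*buf)[10000], int row)
{
    assert(row >= 0);
    return strlen(buf[row]);      /* buf[row] is an array of 10000 char */
}

static char buf[10][10000];

size_t demo(void)
{
    strcpy(buf[3], "hello");
    return row_length(buf, 3);    /* buf converts to char (*)[10000] */
}
```

If the two declarations were different types, the compiler would reject the second one as a conflicting redeclaration.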

Also, I assume that it still passes by reference and doesn't copy the
entire array to the function's local variables.

It doesn't copy the entire array. If you have
char buf[10][10000];
then buf is an array of ten elements. Yes, only ten elements. But what
type does each element have?

Hint: it is NOT a pointer type!

Each element is an array of 10000 chars.

The memory for the ten arrays of 10000 chars is packed together
as a single block of 100,000 bytes. There are no pointers stored in memory.

There is a fundamental RULE about arrays, which always applies:

When you use an array, including passing it to a function, what
was an "array" actually becomes a "pointer to the first element
of the array".

What is the first element of your array? It's buf[0].
What type does it have? Array of 10000 char. It is not a pointer.

A pointer to the first element of your array has a special new type:
"pointer to array of 10000 char". You may not have come across the C
syntax for writing such a thing yet. If you wanted to declare 'p' as a
pointer to array of 10000 char, you would write:
char (*p)[10000];

These pointers to arrays are clever beasties. If you add a number to one
of them, it will automatically skip that many blocks of 10000 char, in
effect finding the address of the N'th array of 10000 char. So,
buf + 0 is a pointer to the first block of 10000 char,
buf + 1 is a pointer to the second block of 10000 char,
buf + 2 is a pointer to the third block of 10000 char, etc.

If you dereference it (apply the * operator), it will resolve back into
an array of 10000 char, which will in turn become a pointer to the first
element of that array, ie. a pointer to char.

(buf + 2) is a pointer to the third block of 10000 char

*(buf + 2) is the third block itself, which resolves into
a pointer to the first element of the third block.

However, the pointer *(buf + 2) is never actually stored in the array.
It is *calculated*, probably by adding 2 * 10000 to the base address of
the array.
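
A small sketch of that calculation, assuming the same 10 by 10000 array. The function names byte_offset and first_char_matches are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

static char buf[10][10000];

/* Byte distance between buf + n and buf, measured through char pointers. */
ptrdiff_t byte_offset(int n)
{
    assert(n >= 0 && n <= 10);            /* one past the end is allowed */
    char (*p)[10000] = buf + n;           /* steps over n blocks of 10000 */
    return (char *)p - (char *)buf;
}

/* *(buf + n) is the block itself; it converts to a pointer to its first char. */
int first_char_matches(int n)
{
    assert(n >= 0 && n < 10);
    char *first = *(buf + n);
    return first == &buf[n][0];
}
```

byte_offset(2) should come out as 20000, with no pointer ever being read from memory.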

C has a nice piece of 'syntactical sugar', an unnecessary addition to
the language, which allows you to re-write an expression
*(a + b)
as
a[b]

The two forms are equivalent in *all* situations, no matter whether you
are using arrays, pointers, strings, etc.

So, *(buf + 2) is equivalent to buf[2]

Obviously, the second form is considered more stylish, but you must
understand their equivalence, to understand how C arrays and pointers work.
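
The equivalence can be checked directly. This is only a sketch, and subscripts_agree is an invented name:

```c
#include <assert.h>

static char buf[10][10000];

/* Returns 1 if the subscript and pointer forms reach the same element. */
int subscripts_agree(void)
{
    buf[2][5] = 'x';

    /* buf[2] is *(buf + 2); indexing that again gives *(*(buf + 2) + 5). */
    assert(&buf[2] == buf + 2);
    assert(*(*(buf + 2) + 5) == 'x');

    /* Addition commutes, so 5[buf[2]] is legal too (and best avoided). */
    return 5[buf[2]] == buf[2][5];
}
```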
 
Netocrat

a multiply dimensioned array is
still just a pointer, not a pointer to a pointer.

It's not a pointer, but it does automatically decay to one in most
contexts.

[...]
char buf[10][10000];

printf("%c",buf[i][j])

which is equivalent to

char buf[10*10000]
printf("%c",buf[j*10+i]) //i think i got that right - gurus: please

The index should be i*10000+j, not j*10+i.
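
A quick way to check that arithmetic, assuming the 10 by 10000 array from the thread. The names flat_index and matches_layout are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 10, COLS = 10000 };

static char buf[ROWS][COLS];

/* Offset of buf[i][j] from the start of the array, in chars. */
size_t flat_index(size_t i, size_t j)
{
    assert(i < ROWS && j < COLS);
    return i * COLS + j;
}

/* Compares the compiler's address for buf[i][j] with the hand computation. */
int matches_layout(size_t i, size_t j)
{
    char *base = (char *)buf;
    return (size_t)((char *)&buf[i][j] - base) == flat_index(i, j);
}
```

The row index is multiplied by the length of a row (10000), which is why the compiler must know the second dimension but not the first.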

[...]
void dosomething(char buf[10][])

Won't compile as written.

Try: buf[][10000],
or buf[10][10000]
or (*buf)[10000]

This allows a two-dimensional array with fixed-size second dimension to be
passed without recourse to the two alternatives you later suggested,
although the pointer-to-pointer technique is a common means of dealing
with a variable-size second dimension. C99 provides variable-length arrays
with further utility when passing array parameters as discussed in detail in
a prior thread.
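
A sketch of the C99 form, assuming a compiler that supports variable-length arrays (they became optional in C11). The names sum_all and demo are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* rows and cols come first so they are in scope for the array type. */
long sum_all(size_t rows, size_t cols, char a[rows][cols])
{
    long total = 0;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            total += a[i][j];
    return total;
}

long demo(void)
{
    char small[2][3] = { {1, 2, 3}, {4, 5, 6} };
    char wide[1][5]  = { {1, 1, 1, 1, 1} };

    /* The same function accepts arrays with different second dimensions. */
    assert(sum_all(1, 5, wide) == 5);
    return sum_all(2, 3, small);
}
```

The caller is responsible for passing dimensions that match the array it hands over.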

[For array parameters in function prototypes]
the compiler needs to know the size of all the dimensions except for the
last one.

That should read "except for the first one".
 
Netocrat

On Wed, 26 Oct 2005 19:27:19 -0700, bitshadow wrote:
[apparently quoting Mike Deskevich's code snippet]
char buf[10][10000]
dosomething(buf);

void dosomething(char buf[10][])
{
buf[i][j]

}

shouldn't that be:

void dosomething(char *[]);


That's OK.
or// void dosomething(char **)

Incompatible with char[10][10000].
or// void dosomething(char [][])

Dubious at best, and if I recall correctly, technically disallowed by the
standard - at least in the function definition. Better to specify at
least the final dimension size as you do below.
and its subsequent definition:

void dosomething(char [][COLS])
or// void dosomething(char (*ptr_chr)[COLS] )

These are fine.
the point being that a matrix is really a contiguous block of memory
stored in the computer.

Correct, but char ** is different and also not necessarily contiguous
(consult the FAQ for details).
 
Mark McIntyre

Isn't an array and a pointer basically the same once memory has been
allocated etc. ?

No, no, a thousand times no. Please read the FAQ 6.1, 6.2, 6.3 and
following.
char buf[255];

this is an array of 255 chars.
char *buf;
buf = malloc(sizeof(char)*7);

this is a pointer to a block of seven chars.

They're not remotely similar.
Anyway, perhaps this equivalence breaks down when you start dealing
with arrays of arrays.

Well before that......

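char buf[255]={0};
char *buf1;
buf = "hello";   /* error: an array cannot be assigned to */
buf1 = "hello";  /* fine: buf1 now points at a string literal */
buf[0] = 'a';    /* fine: modifies the array */
buf1[0] = 'a';   /* undefined behaviour: modifies a string literal */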
if I define buf as:

char buf[10][10000];

and want to pass it to a function, what should the function prototype
be ? I had assumed that a double array was effectively a pointer to a
pointer.

The prototype should match the variable - ie char[10][10000].
Optionally, the first dimension may be left empty, and the compiler
will work it out at compile-time.
Also, I assume that it still passes by reference

C /never/ passes by reference, only by value.
and doesn't copy the entire array to the function's local variables.

That's because it passes the value of the /address/ of the variable.
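
A small sketch of what "passing the value of the address" means in practice. The names touch, data and demo are invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Only a pointer to the first row is copied into the parameter. */
size_t touch(char (*buf)[10000])
{
    buf[0][0] = 'x';          /* writes through the pointer into the caller's array */
    buf = NULL;               /* changes only the local copy of the pointer */
    return sizeof buf;        /* size of a pointer, not 100000 */
}

static char data[10][10000];

int demo(void)
{
    size_t n = touch(data);   /* data converts to char (*)[10000] */

    assert(n == sizeof(char (*)[10000]));
    assert(sizeof data == 100000);
    return data[0][0] == 'x'; /* the caller's array was modified */
}
```

Setting buf to NULL inside touch has no effect on data, which is what distinguishes this from true pass by reference.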
 
bitshadow

or// void dosomething(char **)
Incompatible with char[10][10000].
yes i believe you're right. though a 2d array can be dereferenced with
** such as:
int main(int argc, char **) the number of elements is still missing.
better:

void dosomething(char (*optional_id)[10000]).
an oversight on my part. see below.
or// void dosomething(char [][])

Dubious at best, and if I recall correctly, technically disallowed by
the
standard - at least in the function definition. Better to specify at
least the final dimension size as you do below.

for the prototype the number of elements needs to be indicated as you
said - which is the inconvenience of it all, as that necessitates a
psychic ability. for the definition obviously an identifier must be
included. however this was done to show the prototype.
 
bitshadow

[blockquote]
char buf[10][10000] allocates the same memory as char buf[10*10000]

which is similar to buf=(char*)malloc(10*10000*sizeof(char));
[/blockquote]

just wanted to add, the first statement was correct in the allocation
of memory, however, we are coding in C and malloc doesn't need to be
cast in C. C++ yes, not here.
also the *sizeof(char) is redundant. malloc allocates bytes, and
sizeof(char) is 1 by definition, so 10 * 10000 is sufficient.
 
Netocrat

On Thu, 27 Oct 2005 11:16:17 -0700, bitshadow wrote:
[quoting Netocrat, in turn
quoting bitshadow]
or// void dosomething(char **)
Incompatible with char[10][10000].
yes i believe you're right. though a 2d array can be dereferenced with
** such as:
int main(int argc, char **) the number of elements is still missing.

Sure, you can dereference a 2d array using **, because an array
decays to a pointer in that context. They're different and incompatible
types though. What's passed in to main as the second argument
uses explicit pointers (and requires more memory) whereas char[X][Y] uses
implicit pointers only, and provides necessarily contiguous memory.

Check out the FAQ if that sounds a little vague.

[...]
 
Keith Thompson

Simon Biber said:
There is a fundamental RULE about arrays, which always applies:

When you use an array, including passing it to a function, what
was an "array" actually becomes a "pointer to the first element
of the array".

No, it doesn't always apply.

It's clearer to say that it's implicitly converted to a pointer, not
that it "becomes" a pointer (the array is still there, after all).

An expression of array type is implicitly converted to a pointer to
its first element *unless* it's the operand of a unary "&" or "sizeof"
operator, or it's a string literal in an initializer.

<HINT>This Question is Asked Frequently.</HINT> (See section 6.)
 
Keith Thompson

bitshadow said:
[blockquote]
char buf[10][10000] allocates the same memory as char buf[10*10000]

which is similar to buf=(char*)malloc(10*10000*sizeof(char));
[/blockquote]

The correct way to quote is to use a "> " prefix on each quoted line,
and to provide an attribution line to indicate who is being quoted.
See nearly every article in this newsgroup for examples. Google makes
this gratuitously difficult, which is why the following advice has
been offered here over 1000 times:

If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.

And please complain to Google about their broken interface.
just wanted to add, the first statement was correct in the allocation
of memory, however, we are coding in C and malloc doesn't need to be
cast in C. C++ yes, not here.
also the *sizeof(char) is redundant. malloc allocates bytes, and
sizeof(char) is 1 by definition, so 10 * 10000 is sufficient.

It's also possible that the expression 10*10000 could overflow (it's
of type int, which needn't be able to represent values greater than
32767).
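
One way to sidestep the int overflow is to do the multiplication in size_t and check it first. This is only a sketch, and alloc_block is a hypothetical helper, not something from the thread:

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates rows * cols chars, or returns NULL if the product would not fit. */
char *alloc_block(size_t rows, size_t cols)
{
    if (cols != 0 && rows > (size_t)-1 / cols)
        return NULL;              /* rows * cols would wrap around */
    return malloc(rows * cols);   /* sizeof(char) is 1 by definition */
}

int demo(void)
{
    char *p = alloc_block(10, 10000);   /* arithmetic done in size_t, not int */
    int ok = (p != NULL);

    free(p);
    assert(alloc_block((size_t)-1, 2) == NULL);
    return ok;
}
```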

The malloc() is similar to the array declaration in that it allocates
(or attempts to allocate) the same amount of space, and the variable
name is spelled the same way. It's very different in that
char buf[10][10000];
declares buf as an array object, whereas
buf = malloc(whatever);
implies that buf is a pointer. C FAQ, section 6.
 
Keith Thompson

bitshadow said:
or// void dosomething(char **)
Incompatible with char[10][10000].
yes i believe you're right. though a 2d array can be dereferenced with
** such as:
int main(int argc, char **) the number of elements is still missing.

No, argv is not a 2d array; it's a pointer-to-pointer-to-char. The
actual value passed is going to be a pointer to the first element of
an array of pointers to char; each element of the array points to the
first character of a string (or has the value NULL).
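
To make that layout concrete, here is a sketch that builds an argv-style structure by hand. The names count_args, fake_argv and demo are assumptions for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the strings in an argv-style array, which ends with a null pointer. */
int count_args(char **argv)
{
    int n = 0;
    while (argv[n] != NULL)
        ++n;
    return n;
}

int demo(void)
{
    char prog[] = "prog", opt[] = "-v", file[] = "file.txt";
    char *fake_argv[] = { prog, opt, file, NULL };  /* array of pointers */

    assert(strlen(fake_argv[2]) == 8);   /* each pointer leads to its own string */
    return count_args(fake_argv);        /* fake_argv converts to char ** */
}
```

The three strings live in separate arrays of different lengths, which a true 2d array could not represent.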

There are at least four different ways to implement a data structure
that acts like a 2-dimensional array.

You can declare an actual array:
int arr1[10][10];
but that's of fixed size (even if it's a VLA, the size can't change
once the object is created).

If you want a fixed number of variable-size arrays, you can declare
an array of pointers:
int *arr2[10];

If you want a variable number of fixed-size arrays, you can declare
a pointer to an array:
int (*arr3)[10];

If you want maximum flexibility, you can declare a pointer-to-pointer:
int **arr4;

For arr2, arr3, and arr4, you have to do your own memory management.
For arr4, you have to allocate an array of pointers *and* multiple
arrays of int, one for each row of the 2d-array-like data structure.

Now remember that the indexing operator x[y] actually operates on
pointers, not necessarily on arrays; it's equivalent to *(x+y). It
works on arrays because an array name is (usually) implicitly
converted to a pointer.

The declarations of arr1, arr2, arr3, and arr4 create four very
different things: a true two-dimensional array (actually an array of
arrays), an array of pointers, a pointer to an array, and a pointer to
a pointer. In a non-C-like language, these would have very little in
common. In C, they still have very little in common. But because of
the implicit conversion of arrays to pointers, the following
expressions (assuming a and b are integers) are *all* valid:

arr1[a][b]
arr2[a][b]
arr3[a][b]
arr4[a][b]
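
A compact sketch that sets up all four and indexes each with [a][b]. The function name four_ways and the ROWS and COLS constants are made up here, and to keep it short arr2 and arr4 borrow existing rows instead of allocating one block per row:

```c
#include <assert.h>
#include <stdlib.h>

enum { ROWS = 10, COLS = 10 };

/* Returns 1 if element [a][b] can be reached through all four structures. */
int four_ways(int a, int b)
{
    static int arr1[ROWS][COLS];                      /* array of arrays    */
    int *arr2[ROWS];                                  /* array of pointers  */
    int (*arr3)[COLS] = malloc(ROWS * sizeof *arr3);  /* pointer to array   */
    int **arr4 = malloc(ROWS * sizeof *arr4);         /* pointer to pointer */
    int ok;

    assert(a >= 0 && a < ROWS && b >= 0 && b < COLS);
    if (arr3 == NULL || arr4 == NULL) {
        free(arr3);
        free(arr4);
        return 0;
    }

    /* arr2 points at arr1's rows and arr4 at arr3's rows (a shortcut). */
    for (int i = 0; i < ROWS; ++i) {
        arr2[i] = arr1[i];
        arr4[i] = arr3[i];
    }

    arr1[a][b] = 1;
    arr3[a][b] = 3;
    ok = arr2[a][b] == 1 && arr4[a][b] == 3;   /* same syntax, different types */

    free(arr4);
    free(arr3);
    return ok;
}
```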

You can *sometimes* get away with using arrays and pointers in C
without understanding all of this, but if you want to program
effectively and understand what your code is doing, you should
understand how all this stuff actually works under the hood.

Section 6 of the C FAQ is a good starting point.
 
Tim Rentsch

There are at least four different ways to implement a data structure
that acts like 2-dimensional array. []
int arr1[10][10]; []
int *arr2[10]; []
int (*arr3)[10]; []
int **arr4;

Hey Keith,

I like what you wrote here. Just one suggestion: if you
have occasion to post it again, start with

int arr1[10][20];

so subsequent declarations for arr2 and arr3 can make
it more obvious which dimension corresponds to which.
 
