static char arrays size


daniel

Hello,

I've always had the feeling that it is better to have char arrays whose
size is a power of two.

For example:
char a_str[128]; // feels ok
char b_str[100]; // feels not ok

Is that true, and if it is true, what is the reason behind it? (I
suppose it is something related to memory alignment, but it's not
very clear in my mind.)

A detailed explanation would be very useful.

Thx,
Daniel.
 

Bartc

daniel said:
Hello,

I've always had the feeling that it is better to have char arrays whose
size is a power of two.

For example:
char a_str[128]; // feels ok
char b_str[100]; // feels not ok

Is that true, and if it is true, what is the reason behind it? (I
suppose it is something related to memory alignment, but it's not
very clear in my mind.)

You're thinking about array element sizes, where the calculation of the
address of a[i] (or its offset from a[0]) can be simpler when the size of
each element of a is a power of two.

(Although alignment can also come into it, when the element is a struct for
example. But this should be taken care of for you.)
 

James Kuyper

daniel said:
Hello,

I've always had the feeling that it is better to have char arrays whose
size is a power of two.

For example:
char a_str[128]; // feels ok
char b_str[100]; // feels not ok

Is that true, and if it is true, what is the reason behind it? (I
suppose it is something related to memory alignment, but it's not
very clear in my mind.)

A detailed explanation would be very useful.

An array should always be big enough to hold the biggest thing that you
plan to put in it. Making it any bigger than that is, in general,
wasteful. Therefore, you only have a choice to make if you're still at
the point in the design process where you get to decide what you plan to
put into that array, and where the answer to that question is not forced
upon you by other considerations.

In that case, a length that is a power of 2 does indeed provide a minor
benefit. Many computer systems push around data more efficiently when it
has a size which is a multiple of a certain number of bytes. There are no
guarantees that any such sizes exist, and no guarantee that they will be
powers of 2. However, as a practical matter the special
sizes are almost always powers of 2.

Since every power of 2 is a multiple of all smaller powers of 2,
rounding an array size up to the next power of 2 is a choice that in
general won't hurt, and on many systems will slightly improve the
efficiency of your code. However, I would strongly recommend not making
a big deal about this. Just about any other reason for choosing a
particular size for the array is going to be more important than this
one. Use this criterion only when all other constraints still leave you
with a choice to make.

Please note that this doesn't apply just to arrays of char; it applies
to all arrays.
 

Bartc

James Kuyper said:
daniel said:
Hello,

I've always had the feeling that it is better to have char arrays whose
size is a power of two.

For example:
char a_str[128]; // feels ok
char b_str[100]; // feels not ok

Is that true, and if it is true, what is the reason behind it? (I
suppose it is something related to memory alignment, but it's not
very clear in my mind.)

A detailed explanation would be very useful.

In that case, a length that is a power of 2 does indeed provide a minor
benefit. Many computer systems push around data more efficiently when it
has a size which is a multiple of a certain number of bytes.

I don't quite understand how having an unused 28 bytes on the end of a
100-byte array is going to help.

If a machine can do 8-byte moves then there might be a case for a 104-byte
array size instead of 100, because of the fiddly 4 bytes at the end, but no
need for 128. That's assuming C can move around entire arrays, which it
can't directly. And anyway this is a matter for the compiler to worry about.

Since every power of 2 is a multiple of all smaller powers of 2, rounding
an array size up to the next power of 2 is a choice that in general won't
hurt, and on many systems will slightly improve the efficiency of your
code.

Rounding a 1048577-byte array up to 2097152 bytes I think would definitely
hurt.
 

James Kuyper

Bartc said:
James Kuyper said:
daniel said:
Hello,

I've always had the feeling that it is better to have char arrays whose
size is a power of two.

For example:
char a_str[128]; // feels ok
char b_str[100]; // feels not ok

Is that true, and if it is true, what is the reason behind it? (I
suppose it is something related to memory alignment, but it's not
very clear in my mind.)

A detailed explanation would be very useful.

In that case, a length that is a power of 2 does indeed provide a
minor benefit. Many computer systems push around data more efficiently
when it has a size which is a multiple of a certain number of bytes.

I don't quite understand how having an unused 28 bytes on the end of a
100-byte array is going to help.

It won't. The only case where you should ever use this criterion for
deciding the length of the array is if you're still at the point of
deciding whether to design the program to work with a maximum of 100
bytes or a maximum of 128 bytes. I'm assuming that, regardless of which
decision is made, there will be cases that would use the remaining 28
bytes, and that at that point of the design cycle you're still free to
decide whether the program should handle those cases as normal or as
exceptions to be handled by some other method.

Prime example: fields meant to hold people's names. There's no
reasonable fixed length that can deal with every possible name. If
variable-length fields are not acceptable (which is often the case),
then the exact length you should use is fairly arbitrary, and might as
well be chosen based in part upon the minor efficiencies of power-of-2.

If a machine can do 8-byte moves then there might be a case for a
104-byte array size instead of 100, because of the fiddly 4 bytes at the
end, but no need for 128.

You're assuming 8 bytes is the special number, and for some systems, for
some purposes, it is. However, there are many different sizes that carry
special advantages, for different purposes. I've known the technical
details of systems where sizes of 16 bytes, 128 bytes, 256 bytes and
1024 bytes have been relevant, for a variety of different reasons. When
I consider the small number of machines which I'm personally familiar
with, compared to the huge variety of real-world machines for which C
code is written, I wouldn't recommend ignoring the possibility that any
particular power of 2 might be relevant on some system, somewhere. On
the flip side, I wouldn't recommend attaching any great significance to
that possibility, either.

That's assuming C can move around entire
arrays, which it can't directly.

I'm not sure what you mean by that comment. I use memcpy() frequently
for that purpose. Some versions of memcpy() do work more efficiently on
objects which have particular alignments, and a size which is a multiple
of that alignment, which is usually a power of 2.
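
For instance, a minimal sketch of what I mean (the names and the size are
made up purely for illustration):

#include <string.h>

#define LEN 128   /* a power of 2, picked arbitrarily for the example */

void copy_record(char dst[LEN], const char src[LEN])
{
    /* Copies the whole array object in one call. A good memcpy() is
       free to use wide, aligned moves internally when the size and
       alignment of the operands allow it. */
    memcpy(dst, src, LEN);
}
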
... And anyway this is a matter for the
compiler to worry about.

The issue I'm thinking about cannot be decided by the compiler; it lies
exclusively in the domain of the program's designer. You may be thinking
of a different issue than I am.

Rounding a 1048577-byte array up to 2097152 bytes I think would
definitely hurt.

If you know that 1048577 bytes is guaranteed to be sufficient, that's
the size you should use. The requirements for reasonably using this
criterion to select an array size of pow(2,N) are as follows: the chance
that a size greater than 'n' will be needed is unacceptably high for
n==pow(2,N-1), is acceptably low at n==pow(2,N+1), and the transition
point between "unacceptably high" and "acceptably low" is sufficiently
poorly known to justify estimating it as pow(2,N).

To use my example above; if I had a complete list of every name that
might ever be entered in my database, or at least a statistical summary
of the characteristics of such a list, and sufficient knowledge of how
those names might be used, I could put together a mathematical model
that calculates the costs and benefits of using different lengths for
the name field, and solve that model to determine the optimal length. If
the issue was sufficiently important, that is precisely what I would do,
and I would set the field length to that optimum. However, it's
relatively rare to have that kind of information available for free-form
text fields, and it's not unusual to have a similar lack of information
for other kinds of arrays. Even if it is technically feasible to collect
the relevant information, it might be unacceptably expensive. Under such
circumstances, rounding a rough estimate of the space needed to the next
power of 2 is entirely reasonable.
 

Flash Gordon

Bartc wrote, On 11/08/08 11:57:
daniel said:
Hello,

I've always had the feeling that it is better to have char arrays whose
size is a power of two.

For example:
char a_str[128]; // feels ok
char b_str[100]; // feels not ok

Is that true, and if it is true, what is the reason behind it? (I
suppose it is something related to memory alignment, but it's not
very clear in my mind.)

You're thinking about array element sizes, where the calculation of the
address of a[i] (or its offset from a[0]) can be simpler when the size of
each element of a is a power of two.


I've not come across that being an issue.
(Although alignment can also come into it, when the element is a struct
for example. But this should be taken care of for you.)

Indeed.

The one (and only) time where I seriously go for a power of two size is
when I'm implementing a circular buffer. Then I know that my "modulus
buffer_size" operation can be reduced to applying a simple mask. I still
use the modulus operator and leave it up to the compiler to optimise it
so that if someone changes the buffer size it all continues to work even
if the efficiency is reduced.
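
Something like this minimal sketch, say (the names are made up, and the
full/empty checks are left out):

#define BUF_SIZE 256   /* power of two: x % BUF_SIZE can compile down to
                          x & (BUF_SIZE - 1) */

struct ring {
    unsigned char data[BUF_SIZE];
    unsigned head;   /* index of the next slot to write */
    unsigned tail;   /* index of the next slot to read */
};

static void ring_put(struct ring *r, unsigned char c)
{
    r->data[r->head] = c;
    r->head = (r->head + 1) % BUF_SIZE;   /* still correct, just slower,
                                             if BUF_SIZE is changed to
                                             something that is not a
                                             power of two */
}

static unsigned char ring_get(struct ring *r)
{
    unsigned char c = r->data[r->tail];
    r->tail = (r->tail + 1) % BUF_SIZE;
    return c;
}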
 

Bartc

Flash Gordon said:
Bartc wrote, On 11/08/08 11:57:
daniel said:
Hello,

I've always had the feeling that it is better to have char arrays whose
size is a power of two.

For example:
char a_str[128]; // feels ok
char b_str[100]; // feels not ok

Is that true, and if it is true, what is the reason behind it? (I
suppose it is something related to memory alignment, but it's not
very clear in my mind.)

You're thinking about array element sizes, where the calculation of the
address of a[i] (or its offset from a[0]) can be simpler when the size of
each element of a is a power of two.


I've not come across that being an issue.


It can mean the difference between having to multiply to get at an array
element, and simply shifting (or, sometimes, just using a scale factor in
the address mode of the output code).
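
A rough illustration (the struct layouts are hypothetical and assume
4-byte ints with no padding):

struct rec16 { int a, b, c, d; };   /* 16 bytes: p16[i] needs offset i*16,
                                       a shift or a scaled addressing mode */
struct rec12 { int a, b, c; };      /* 12 bytes: p12[i] needs offset i*12,
                                       typically a multiply or shift+add */

int get16(const struct rec16 *p16, int i) { return p16[i].a; }
int get12(const struct rec12 *p12, int i) { return p12[i].a; }
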
The one (and only) time where I seriously go for a power of two size is
when I'm implementing a circular buffer. Then I know that my "modulus
buffer_size" operation can be reduced to applying a simple mask. I still
use the modulus operator and leave it up to the compiler to optimise it so
that if someone changes the buffer size it all continues to work even if
the efficiency is reduced.

I sometimes do the same thing for hashtables.
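
For example (a sketch; the table size and the hash value are placeholders):

#define TABLE_SIZE 1024u   /* power of two */

unsigned bucket_index(unsigned long hash)
{
    /* Equivalent to hash % TABLE_SIZE only because TABLE_SIZE is a
       power of two. */
    return (unsigned)(hash & (TABLE_SIZE - 1));
}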

But unless there's a good reason for it, there's no particular need for
power-of-two array sizes.

In the past I did do this a lot, for good reasons (for example an array of
4096 bytes would exactly fit into one page of my pseudo-virtual memory
system, or I had exactly N bits available for indexing so I might as well use
a full 2**N elements).
 

christian.bau

Bartc said:
It can mean the difference between having to multiply to get at an array
element, and simply shifting (or, sometimes, just using a scale factor in
the address mode of the output code).

The size of the array wouldn't make a difference to the code needed to
access an array element. Where it makes a difference is with two-dimensional
arrays, or with an array of structs, where you can by design influence
whether the struct size is a power of two or not.

HOWEVER, there are processors around that really dislike consecutive
memory accesses where the distance between consecutive accesses is a
large power of two. An example that I found was one platform where I
measured the time for a straightforward matrix multiplication, and
multiplying two matrices of size 128 x 128 took seven times longer
than the case 127 x 127 or 129 x 129.
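
A straightforward multiplication of that kind looks roughly like this
sketch (N is the matrix dimension; the inner loop walks down a column of
b, i.e. with a stride of N doubles, which is where a power-of-two N can
keep hitting the same cache sets):

#define N 128   /* try 127 or 129 for comparison */

static double a[N][N], b[N][N], c[N][N];

void matmul(void)
{
    int i, j, k;

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            double sum = 0.0;
            for (k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];   /* successive b[k][j] reads are
                                               N * sizeof(double) bytes apart */
            c[i][j] = sum;
        }
}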
 
