Interpreting mem_map init code

zolli

Hi,

This question is about a piece of Linux kernel code, but is in fact a C
language question. I was looking through some memory map init code and
ran into the following:

p = mem_map + MAP_NR(end_mem);
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);

(*) This is the line I don't understand.

Where MAP_NR is defined (on an i386, anyway) as:

#define MAP_NR(addr) (__pa(addr) >> 12)
#define __pa(x) ((unsigned long)(x)-0xC0000000)

The first line sets p to the beginning of the heap plus the size of the
memory map which is calculated using the MAP_NR macro.

Using reasonable (?) example values if I do this:

unsigned long start_mem, end_mem, mem_map, p;

start_mem = 0xC1000000;
end_mem = 0xC2000000;
mem_map = start_mem;

p = mem_map + MAP_NR(end_mem);
start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);
printf("p: 0x%lx\n", p); /* %lx, since p is unsigned long */
printf("start_mem: 0x%lx\n", start_mem);

I get this as output:

p: 0xc1002000
start_mem: 0xc1002000

So what does the (*) line of code above do? I know that it changes
start_mem to the point where the memory allocated for the memory map
ends, but wouldn't that be more easily accomplished by:

start_mem += MAP_NR(end_mem);

My guess is that the line of code somehow sets the size of the mem_map
according to free memory minus the size of the mem_map, but I don't see how.

Thanks in advance,
zolli
 
CBFalconer

zolli said:
This question is about a piece of Linux kernel code, but is in
fact a C language question. I was looking through some memory
map init code and ran into the following:

p = mem_map + MAP_NR(end_mem);
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);

(*) This is the line I don't understand.

Nothing complex about it. The portion "~(sizeof(long)-1)" creates
a mask, which will zero the low bits of an address so that it is
aligned for longs. That is a '~' complement operator, not a '-'
minus sign. The "+ sizeof(long) - 1" portion first advances the
address far enough that, after the mask chops off the low bits, the
result will not be smaller than the original input address, which
is the "(unsigned long)p" portion. It is all very system specific
and non-portable.
 
Chris Torek

zolli said:
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);
(*) This is the line I don't understand.
Using reasonable (?) example values ...

Chuck Falconer already described the goal, but the action may be
easier to see with additional "reasonable (?)" example values. :)

Let us assume that sizeof(long) is, numerically speaking, either
4 or 8, since those are in fact typical today. Ignoring all the
funky 0xc0000000 type numbers I snipped, let us take a look at what
happens to values in the range [0..8] if sizeof(long) is 4, and
[0..16] if sizeof(long) is 8.

If sizeof(long)==4, and assuming p and start are both "unsigned
long", we have:

start = (p + (size_t)4 - 1) & ~((size_t)4 - 1);

which is just:

start = (p + (size_t)3) & ~(size_t)3;

Of course, all we know about size_t -- the type of the result of
sizeof -- is that it is some unsigned integral type, probably either
unsigned int or unsigned long. For simplicity let us assume it
is also unsigned long:

start = (p + 3UL) & ~3UL;

The ~3UL presumably produces either 0xfffffffc or 0xfffffffffffffffc,
depending on sizeof(long) again (because we are also assuming that
CHAR_BIT is 8 and there are no "holes" in the value bits of an
unsigned long). As it happens, as long as we stick with numerically
small values, it does not really matter. (But note that if we have
sizeof(long)==8 and sizeof(size_t)==4, this expression goes awry
for larger values -- this is a small flaw in the code you are
looking at: it uses both "unsigned long" and "size_t", assuming
they are equally correct, when it is possible that only one, or
even neither, is the correct integer type for this kind of sneaky
pointer manipulation.)

Anyway, so, now we get a table of values for sizeof(long)==4:

 p   p+3   (p+3)&~3
 -   ---   --------
 0     3       0

 1     4       4
 2     5       4
 3     6       4
 4     7       4

 5     8       8
 6     9       8
 7    10       8
 8    11       8

The vertical white space shows how the results group. Note that the
next input (9) jumps to the next output (12).

The table for (p+7)&~7 is left as an exercise. :)

As another exercise, try computing (p + 7U) & ~7U when p (an unsigned
long) is 0x0123456789abcdef, and sizeof(long)==8; but sizeof(size_t)==4,
so that ~7U is actually 0x00000000fffffff8, instead of the (presumably
desired) 0xfffffffffffffff8.
 
zolli

Chris said:
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);
(*) This is the line I don't understand.
Using reasonable (?) example values ...


Chuck Falconer already described the goal, but the action may be
easier to see with additional "reasonable (?)" example values. :)

Let us assume that sizeof(long) is, numerically speaking, either
4 or 8, since those are in fact typical today. Ignoring all the
funky 0xc0000000 type numbers I snipped, let us take a look at what
happens to values in the range [0..8] if sizeof(long) is 4, and
[0..16] if sizeof(long) is 8.

If sizeof(long)==4, and assuming p and start are both "unsigned
long", we have:

start = (p + (size_t)4 - 1) & ~((size_t)4 - 1);

which is just:

start = (p + (size_t)3) & ~(size_t)3;

This simplification makes things much easier to understand.

Of course, all we know about size_t -- the type of the result of
sizeof -- is that it is some unsigned integral type, probably either
unsigned int or unsigned long. For simplicity let us assume it
is also unsigned long:

start = (p + 3UL) & ~3UL;

The ~3UL presumably produces either 0xfffffffc or 0xfffffffffffffffc,
depending on sizeof(long) again (because we are also assuming that
CHAR_BIT is 8 and there are no "holes" in the value bits of an
unsigned long). As it happens, as long as we stick with numerically
small values, it does not really matter. (But note that if we have
sizeof(long)==8 and sizeof(size_t)==4, this expression goes awry
for larger values -- this is a small flaw in the code you are
looking at: it uses both "unsigned long" and "size_t", assuming
they are equally correct, when it is possible that only one, or
even neither, is the correct integer type for this kind of sneaky
pointer manipulation.)

I tried this, and you're right: the results are unexpected.

Anyway, so, now we get a table of values for sizeof(long)==4:

 p   p+3   (p+3)&~3
 -   ---   --------
 0     3       0

 1     4       4
 2     5       4
 3     6       4
 4     7       4

 5     8       8
 6     9       8
 7    10       8
 8    11       8

Ahh. The light just came on!

The vertical white space shows how the results group. Note that the
next input (9) jumps to the next output (12).

The table for (p+7)&~7 is left as an exercise. :)

As another exercise, try computing (p + 7U) & ~7U when p (an unsigned
long) is 0x0123456789abcdef, and sizeof(long)==8; but sizeof(size_t)==4,
so that ~7U is actually 0x00000000fffffff8, instead of the (presumably
desired) 0xfffffffffffffff8.

Have you ever been a teacher? Thanks for the patient explanations.

Cheers,
zolli
 
CBFalconer

zolli said:
Chris Torek wrote:
.... snip ...

Ahh. The light just came on!


Have you ever been a teacher? Thanks for the patient explanations.

I think he is. Illustrating once again the enormous difference
between simply knowing what you are talking about and being able to
teach it. I would have gone on muttering about masks and never
thought of a table.
 
