Interpreting mem_map init code

zolli · Mar 27, 2005

Hi,

This question is about a piece of Linux kernel code, but is in fact a C
language question. I was looking through some memory map init code and
ran into the following:

p = mem_map + MAP_NR(end_mem);
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);

(*) This is the line I don't understand.

Where MAP_NR is defined (on an i386, anyway) as:

#define MAP_NR(addr) (__pa(addr) >> 12)
#define __pa(x) ((unsigned long)(x)-0xC0000000)

The first line sets p to the beginning of the heap plus the size of the
memory map which is calculated using the MAP_NR macro.

Using reasonable (?) example values if I do this:

unsigned long start_mem, end_mem, mem_map, p;

start_mem = 0xC1000000;
end_mem = 0xC2000000;
mem_map = start_mem;

p = mem_map + MAP_NR(end_mem);
start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);
printf("p: 0x%lx\n",p);
printf("start_mem: 0x%lx\n",start_mem);

I get this as output:

p: 0xc1002000
start_mem: 0xc1002000

So what does the (*) line of code above do? I know that it changes
start_mem to the point where the memory allocated for the memory map
ends, but wouldn't that be more easily accomplished by:

start_mem += MAP_NR(end_mem);

My guess is that the line of code somehow sets the size of the mem_map
according to free memory minus the size of the mem_map, but I don't see how.

Thanks in advance,
zolli

CBFalconer · Mar 27, 2005

zolli said:
This question is about a piece of Linux kernel code, but is in
fact a C language question. I was looking through some memory
map init code and ran into the following:

p = mem_map + MAP_NR(end_mem);
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);

(*) This is the line I don't understand.

Nothing complex about it. The portion "~(sizeof(long)-1)" creates
a mask, which will zero a portion of an address so that it is
aligned for longs. That is a '~' complement operator, not a '-'
minus sign. The "+ sizeof(long) - 1" portion advances any address
so that the mask can chop it off, and the result will not be
smaller than the original input address, which is the "(unsigned
long)p" portion. It is all very system specific and non-portable.
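
The two steps described above can be written out one at a time. This is only a sketch for illustration: the names align_up and show are made up here and are not from the kernel source, and the trick is assumed to be used only with power-of-two alignments.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not from the kernel source: round addr up to a
 * multiple of align.  align must be a power of two. */
unsigned long align_up(unsigned long addr, unsigned long align)
{
    unsigned long mask, bumped;

    assert(align != 0 && (align & (align - 1)) == 0);

    mask = ~(align - 1);         /* low bits clear, all higher bits set */
    bumped = addr + align - 1;   /* reaches or passes the next boundary,
                                    unless addr is already aligned */
    return bumped & mask;        /* chop back down to a boundary */
}

/* Print the input and the aligned result in hex. */
void show(unsigned long addr, unsigned long align)
{
    printf("0x%lx -> 0x%lx\n", addr, align_up(addr, align));
}
```

Calling show(0xc1002001UL, sizeof(long)) should give 0xc1002004 where sizeof(long) is 4 and 0xc1002008 where it is 8, while an already aligned address such as 0xc1002000 comes back unchanged.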

Chris Torek · Mar 27, 2005

zolli said:
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);
(*) This is the line I don't understand.
Using reasonable (?) example values ...

Chuck Falconer already described the goal, but the action may be
easier to see with additional "reasonable (?)" example values.

Let us assume that sizeof(long) is, numerically speaking, either
4 or 8, since those are in fact typical today. Ignoring all the
funky 0xc0000000 type numbers I snipped, let us take a look at what
happens to values in the range [0..8] if sizeof(long) is 4, and
[0..16] if sizeof(long) is 8.

If sizeof(long)==4, and assuming p and start are both "unsigned
long", we have:

start = (p + (size_t)4 - 1) & ~((size_t)4 - 1);

which is just:

start = (p + (size_t)3) & ~(size_t)3;

Of course, all we know about size_t -- the type of the result of
sizeof -- is that it is some unsigned integral type, probably either
unsigned int or unsigned long. For simplicity let us assume it
is also unsigned long:

start = (p + 3UL) & ~3UL;

The ~3UL presumably produces either 0xfffffffc or 0xfffffffffffffffc,
depending on sizeof(long) again (because we are also assuming that
CHAR_BIT is 8 and there are no "holes" in the value bits of an
unsigned long). As it happens, as long as we stick with numerically
small values, it does not really matter. (But note that if we have
sizeof(long)==8 and sizeof(size_t)==4, this expression goes awry
for larger values -- this is a small flaw in the code you are
looking at: it uses both "unsigned long" and "size_t", assuming
they are equally correct, when it is possible that only one, or
even none, are the correct integer type for this kind of sneaky
pointer manipulation.)

Anyway, so, now we get a table of values for sizeof(long)==4:

p   p+3   (p+3)&~3
-   ---   --------
0    3       0

1    4       4
2    5       4
3    6       4
4    7       4

5    8       8
6    9       8
7   10       8
8   11       8

The vertical white space shows how the results group. Note that the
next input (9) jumps to the next output (12).

The table for (p+7)&~7 is left as an exercise.
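
For anyone who would rather let the machine do the exercise, a table like the one above can be generated for any power-of-two size. The function names round_up and print_table are invented for this sketch, and everything is assumed to fit in an unsigned long.

```c
#include <assert.h>
#include <stdio.h>

/* Round p up to a multiple of size; size must be a power of two. */
unsigned long round_up(unsigned long p, unsigned long size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (p + size - 1) & ~(size - 1);
}

/* Print one row per input value in [0..max]:
 * the input, the input plus (size - 1), and the masked result. */
void print_table(unsigned long max, unsigned long size)
{
    unsigned long p;

    printf("%4s %8s %8s\n", "p", "p+size-1", "result");
    for (p = 0; p <= max; p++)
        printf("%4lu %8lu %8lu\n", p, p + size - 1, round_up(p, size));
}
```

Calling print_table(8, 4) from main should reproduce the numbers in the table above, and print_table(16, 8) gives the sizeof(long)==8 case.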

As another exercise, try computing (p + 7U) & ~7U when p (an unsigned
long) is 0x0123456789abcdef, and sizeof(long)==8; but sizeof(size_t)==4,
so that ~7U is actually 0x00000000fffffff8, instead of the (presumably
desired) 0xfffffffffffffff8.
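
The narrow-mask problem can be reproduced on any hosted implementation by using fixed-width types in place of "unsigned long" and "size_t". This is a sketch under that assumption: uint64_t stands in for an 8-byte long, uint32_t for a 4-byte size_t, and both function names are made up for the example.

```c
#include <stdint.h>

/* Mask computed in a 32-bit unsigned type, as happens when the type
 * of sizeof is narrower than the address being aligned. */
uint64_t round_up8_narrow_mask(uint64_t p)
{
    uint32_t mask = ~(uint32_t)7;   /* 0xfffffff8 */

    /* mask is zero-extended to 0x00000000fffffff8 before the AND,
     * so the upper 32 bits of the address are wiped out. */
    return (p + 7U) & mask;
}

/* Mask computed in the same width as the address. */
uint64_t round_up8_wide_mask(uint64_t p)
{
    uint64_t mask = ~(uint64_t)7;   /* 0xfffffffffffffff8 */

    return (p + 7U) & mask;
}
```

With p equal to 0x0123456789abcdef, the narrow version returns 0x0000000089abcdf0 and the wide version returns 0x0123456789abcdf0.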

zolli · Mar 28, 2005

Chris said:
(*) start_mem = ((unsigned long)p + sizeof(long) - 1) &
~(sizeof(long)-1);
(*) This is the line I don't understand.
Using reasonable (?) example values ...


Chuck Falconer already described the goal, but the action may be
easier to see with additional "reasonable (?)" example values.

Let us assume that sizeof(long) is, numerically speaking, either
4 or 8, since those are in fact typical today. Ignoring all the
funky 0xc0000000 type numbers I snipped, let us take a look at what
happens to values in the range [0..8] if sizeof(long) is 4, and
[0..16] if sizeof(long) is 8.

If sizeof(long)==4, and assuming p and start are both "unsigned
long", we have:

start = (p + (size_t)4 - 1) & ~((size_t)4 - 1);

which is just:

start = (p + (size_t)3) & ~(size_t)3;

This simplification makes things much easier to understand.

Of course, all we know about size_t -- the type of the result of
sizeof -- is that it is some unsigned integral type, probably either
unsigned int or unsigned long. For simplicity let us assume it
is also unsigned long:

start = (p + 3UL) & ~3UL;

The ~3UL presumably produces either 0xfffffffc or 0xfffffffffffffffc,
depending on sizeof(long) again (because we are also assuming that
CHAR_BIT is 8 and there are no "holes" in the value bits of an
unsigned long). As it happens, as long as we stick with numerically
small values, it does not really matter. (But note that if we have
sizeof(long)==8 and sizeof(size_t)==4, this expression goes awry

I tried this, and you're right: the results are unexpected.

for larger values -- this is a small flaw in the code you are
looking at: it uses both "unsigned long" and "size_t", assuming
they are equally correct, when it is possible that only one, or
even none, are the correct integer type for this kind of sneaky
pointer manipulation.)

Anyway, so, now we get a table of values for sizeof(long)==4:

p   p+3   (p+3)&~3
-   ---   --------
0    3       0

1    4       4
2    5       4
3    6       4
4    7       4

5    8       8
6    9       8
7   10       8
8   11       8

Ahh. The light just came on!

The vertical white space shows how the results group. Note that the
next input (9) jumps to the next output (12).

The table for (p+7)&~7 is left as an exercise.

As another exercise, try computing (p + 7U) & ~7U when p (an unsigned
long) is 0x0123456789abcdef, and sizeof(long)==8; but sizeof(size_t)==4,
so that ~7U is actually 0x00000000fffffff8, instead of the (presumably
desired) 0xfffffffffffffff8.

Have you ever been a teacher? Thanks for the patient explanations.

Cheers,
zolli

CBFalconer · Mar 28, 2005

zolli said:
Chris Torek wrote:
.... snip ...

Ahh. The light just came on!

Have you ever been a teacher? Thanks for the patient explanations.

I think he is. Illustrating once again the enormous difference
between simply knowing what you are talking about and being able to
teach it. I would have gone on muttering about masks and never
thought of a table.
