Malloc Query

Uncle Steve · May 24, 2011

Not quite - it only has to be in the same translation unit. A
translation unit consists of a given source file plus all of the other
files merged into it by #include statements. The use of the function
must also occur within the scope of the inline declaration, which
basically means that the declaration should occur prior to the use.
These requirements also apply to macros, though the scope rules are a
bit different for them.

Sorry, I'm not up on the technical nomenclature pertinent to
compilers. I *meant* to say 'translation unit', but all my brain had
was 'source file'.

If you replace your function-like macro, wherever it is that you have it
defined, with an inline function definition in that same location, that
function will be usable pretty much wherever the macro was usable.
It's probably feasible to come up with pathological contexts where a
simple in-place replacement won't work, but such contexts are not the norm.

Agreed.
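
As a rough illustration of that in-place replacement, here is a minimal sketch built on the arena_obj_addr macro that appears later in this thread. The name arena_obj_addr_fn is made up for the example, the struct is trimmed to the two fields the accessor needs, and the assert is an extra that the macro does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed-down stand-in for the thread's arena header (assumption). */
struct arena_head_s {
    int obj_size;
    unsigned char *arena;
};
typedef struct arena_head_s arena_head;

/* The function-like macro, with its arguments parenthesized. */
#define arena_obj_addr(x, n) ((void *) &(x)->arena[(n) * (x)->obj_size])

/* Hypothetical inline replacement, defined where the macro was.
 * 'static inline' keeps it local to the translation unit; the
 * compiler remains free to inline it or not. */
static inline void *arena_obj_addr_fn(arena_head *x, int n)
{
    assert(x != NULL && n >= 0);
    return &x->arena[(size_t) n * (size_t) x->obj_size];
}
```

Unlike the macro, the function evaluates each argument exactly once and has its parameter types checked, which is most of the practical argument for the swap.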

Perhaps not, but it's one that most popular modern compilers do
routinely and very well; that's part of what makes them popular.

Inline functions are easy to use, but who can measure the performance
advantage of inlining versus the code size tradeoff? I have no idea
how to accurately measure and quantify the real advantages, which may
be different for every architecture and application. I suspect the
trend of blindly setting CFLAGS=-O9, which is not all that uncommon in
makefiles, masks a profound ignorance of what really goes on in a
running program. I also suspect that the complex interaction of
compile and run-time factors defies simplistic optimization
heuristics.

Any decent compiler can be relied upon to take those issues into
consideration when deciding whether or not to inline a function. This is
something it can decide, independently of whether or not the function is
declared inline. As far as actual inlining is concerned, the 'inline'
keyword is only a hint, which a compiler is free to ignore; and it's
perfectly legal for a compiler to decide to inline a function that is
not declared 'inline', as long as doing so doesn't change the observable
behavior. In fact, that was one of the arguments given against the
introduction of the 'inline' keyword. Any static function written to meet
the same special requirements that currently apply to 'inline' functions
could already have been inlined, even if not so declared, so long as the
compiler thought that doing so would be a good idea.

At least these compilers allow one to individually tune the
optimization characteristic of the compilation process. If you don't
want inlining, it's trivial to turn it off.

Unless you know a lot more about the target platform than your compiler
does, it's probably best to rely upon it to make inlining decisions.

All things being equal, that is decent advice.

Defining a function (whether or not you use 'inline') gives it that
option. Defining it as a function-like macro does not - it only allows
inlining. Well, technically, I suppose a sufficiently sophisticated
compiler could perform anti-inlining: recognizing a common code pattern,
and replacing it wherever used with a call to a compiler-generated
function definition. The fact that the common code was the result of a
macro expansion would make it easier to recognize the feasibility of
such an optimization. However, it seems to me to be a harder
optimization to perform than inlining, and I doubt that it is a common
feature even of the most sophisticated compilers.

That kind of optimization would be the comp-sci equivalent of gilding
the lily.

Regards,

Uncle Steve

Ian Collins · May 26, 2011

Uncle Steve said:

*((int *) arena_obj_addr(h, i)) = -1;


The instruction above seems to write to a place where it should not.
Is it possible that the space is for n elements, but here element n+1
is written [i=n, not in 0..n-1]?

That's why I told him 5 days ago!

Uncle Steve · May 26, 2011

Uncle Steve said:

On Thu, May 19, 2011 at 03:22:18AM +0100, Ben Bacarisse wrote:
[snip]

Ok, here's a quick and dirty hack that measures a highly contrived
scenario. This code first allocates two arenas of different size, and
then ping-pongs the allocations between the two arenas for a defined
number of iterations. Then, I do more or less the same thing with
malloc.

The preliminary results show that my special-purpose allocator is 2-3
times faster than glibc malloc. Not quite an order-of-magnitude
difference in performance (unless you think in base-2), but very
acceptable. In some programs there may be a larger difference in
performance because of reduced memory fragmentation or reduced
overall memory use. I do not plan to isolate those factors at this
time to measure their effect.
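
For readers who want to try a comparison of this kind, here is a minimal sketch of the malloc half of such a test. It is not the code from this post: the block sizes, the function names malloc_pingpong and cpu_msec, and the use of clock() for CPU time are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical workload: alternate between two block sizes so that
 * each iteration does one allocation and one free of each size. */
static void malloc_pingpong(int iterations)
{
    int i;

    for (i = 0; i < iterations; i++) {
        volatile unsigned char *a = malloc(64);
        volatile unsigned char *b = malloc(256);

        assert(a != NULL && b != NULL);
        a[0] = 1;   /* touch the memory so the calls are less likely to be optimized away */
        b[0] = 2;
        free((void *) a);
        free((void *) b);
    }
}

/* CPU time, in milliseconds, consumed by one run of fn(iterations). */
static double cpu_msec(void (*fn)(int), int iterations)
{
    clock_t start = clock();

    fn(iterations);
    return (double) (clock() - start) * 1000.0 / CLOCKS_PER_SEC;
}
```

Calling cpu_msec(malloc_pingpong, ITERATIONS) and then the same harness with an arena-based workload gives two numbers to compare. A single run on one machine says little, so repeated runs at more than one optimization level would be needed before drawing conclusions.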

Code is compiled with "gcc -O0 -o arena_test arena_test.c -lrt"

The test platform here is an Intel Atom netbook running at 1.333GHz,
1G RAM, OpenSuSE 11.4, libc-2.11.3, gcc 4.5.1 20101208.

Typical result:

[31/22:32] nbts/10 stevet ~/stuff/src/libs/tools/test: ./arena_test
Arena: Iterations: 100000; elapsed CPU time (msec): 39181
Malloc: Iterations: 100000; elapsed CPU time (msec): 107265

Code follows:

#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>

#define TEST_SIZE (1024 * 1024)
#define ITERATIONS 100000

#define DIFFERENCE 5

struct arena_head_s {
    int obj_size;
    int arena_size;
    unsigned char *arena;
    int free;
    int free_list;
};

typedef struct arena_head_s arena_head;

/* Address of slot n in arena x; arguments are parenthesized so that
   expressions can be passed safely. */
#define arena_obj_addr(x, n) ((void *) &(x)->arena[(n) * (x)->obj_size])

/* Pop a slot index off the free list; returns -1 if no slot is free. */
int arena_qa(arena_head *x)
{
    int n;

    n = x->free_list;

    if(n != -1) {
        x->free_list = *((int *) arena_obj_addr(x, n));
        x->free--;
    }

    return(n);
}

/* Push slot n back onto the free list. */
void arena_free(arena_head *p, int n)
{
    *((int *) arena_obj_addr(p, n)) = p->free_list;
    p->free_list = n;
    p->free++;

    return;
}

arena_head * arena_creat(size_t s, int n)
{
    arena_head * h;
    int i;

    h = malloc(sizeof(arena_head));
    if(h == NULL) {
        perror("malloc()");
        exit(1);
    }

    h->obj_size = s;
    h->arena_size = n;
    h->free = n;

    h->arena = malloc(s * n);
    if(h->arena == NULL) {
        perror("malloc()");
        exit(1);
    }

    h->free_list = 0;

    for(i = 0; i < n; i++)
        *((int *) arena_obj_addr(h, i)) = i + 1;

    /* i == n here, so this store lands one slot past the end of the arena */
    *((int *) arena_obj_addr(h, i)) = -1;


The instruction above seems to write to a place where it should not.
Is it possible that the space is for n elements, but here element n+1
is written [i=n, not in 0..n-1]?

All that statement does is simulate defining the target arena object
as a union of int, and whatever the caller plans to store in each
object slot. It's actually a more convenient solution because in the
case where you do use a union, the code in the loop above would have
to look something like this.

for(i = 0; i < (n - 1); i++)
    h->arena[i].un.freelist_next = i + 1;

h->arena[i].un.freelist_next = -1;

And accessing whatever it is that is actually being stored there would
have to be similar to *arena_obj_addr(h, i) = whatever. As the free
list management is internal to the allocator, the application should
not care what is stored in the free slots of the 'array'.
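
To make the union alternative concrete, here is a minimal sketch of what such a layout might look like. The names payload, slot, uarena and uarena_init are invented for the example and do not come from the code in this thread; the payload type is a placeholder.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical payload type; any object type would do. */
struct payload {
    double value;
};

/* Each slot is either a free-list link or a live payload. */
union slot {
    int freelist_next;
    struct payload obj;
};

struct uarena {
    union slot *slots;
    int free_list;      /* index of the first free slot, or -1 */
};

/* Thread slots 0..n-1 into a free list; the last slot terminates it. */
static int uarena_init(struct uarena *a, int n)
{
    int i;

    assert(n > 0);
    a->slots = malloc((size_t) n * sizeof *a->slots);
    if (a->slots == NULL)
        return -1;

    for (i = 0; i < n - 1; i++)
        a->slots[i].freelist_next = i + 1;
    a->slots[i].freelist_next = -1;     /* i == n - 1 here */

    a->free_list = 0;
    return 0;
}
```

The union makes the compiler take care of slot size and alignment, at the price of tying the arena to one payload type. The cast-based version keeps the arena generic but leaves both concerns to the caller.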

I'm not very smart and I don't understand all of this code well,
but it seems to me the bigger error is that everything can overflow,
with too much confidence placed in the HLL that all will go well


There was an off-by-one error in the above code from a transcription
error, which was resolved in a subsequent message. Otherwise, an
out-of-bounds access is the fault of the application programmer.
Caveat emptor.
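
For reference, a corrected initialization might look roughly like the sketch below. It is not the fix from the subsequent message, which is not reproduced here. The names pool, pool_link, pool_init and pool_get are hypothetical, and the asserts are an optional debugging aid rather than part of the original design.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down version of the thread's arena header. */
struct pool {
    size_t obj_size;
    int count;
    unsigned char *mem;
    int free_list;      /* index of the first free slot, or -1 */
};

/* Address of slot n, viewed as the int that holds the free-list link. */
static int *pool_link(struct pool *p, int n)
{
    assert(n >= 0 && n < p->count);     /* catch out-of-bounds slots */
    return (int *) (p->mem + (size_t) n * p->obj_size);
}

static int pool_init(struct pool *p, size_t obj_size, int n)
{
    int i;

    /* Slots must be able to hold an int and stay int-aligned. */
    assert(n > 0 && obj_size >= sizeof(int) && obj_size % sizeof(int) == 0);
    p->mem = malloc(obj_size * (size_t) n);
    if (p->mem == NULL)
        return -1;
    p->obj_size = obj_size;
    p->count = n;

    for (i = 0; i < n - 1; i++)         /* stop one short of the end */
        *pool_link(p, i) = i + 1;
    *pool_link(p, i) = -1;              /* i == n - 1: last valid slot */

    p->free_list = 0;
    return 0;
}

/* Pop a free slot index, or -1 if none remain. */
static int pool_get(struct pool *p)
{
    int n = p->free_list;

    if (n != -1)
        p->free_list = *pool_link(p, n);
    return n;
}
```

With the loop bound at n - 1, the terminating -1 goes into the last slot instead of one past it, so a pool of three objects hands out indices 0, 1, 2 and then -1.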

how good assembly is
I have the defect that I see something well only one little instruction at a time
[too near and too far]

return(h);
}


As others have said, the requirements of kernel programming are
different from conventional application logic. But this code is not
part of an operating system, or anything even remotely similar to an
operating system. It is a kernel only in a limited and specialized
sense.

Regards,

Uncle Steve

Uncle Steve · May 26, 2011

Uncle Steve said:

*((int *) arena_obj_addr(h, i)) = -1;


The instruction above seems to write to a place where it should not.
Is it possible that the space is for n elements, but here element n+1
is written [i=n, not in 0..n-1]?


That's why I told him 5 days ago!

Did you miss the previous message that corrected the typo?

Regards,

Uncle Steve

Ian Collins · May 26, 2011

"Uncle Steve" <[email protected]> wrote in message

*((int *) arena_obj_addr(h, i)) = -1;

The instruction above seems to write to a place where it should not.
Is it possible that the space is for n elements, but here element n+1
is written [i=n, not in 0..n-1]?


That's why I told him 5 days ago!


Did you miss the previous message that corrected the typo?

No, why?

Uncle Steve · May 26, 2011

On 05/26/11 06:12 PM, io_x wrote:
"Uncle Steve" <[email protected]> wrote in message

*((int *) arena_obj_addr(h, i)) = -1;

The instruction above seems to write to a place where it should not.
Is it possible that the space is for n elements, but here element n+1
is written [i=n, not in 0..n-1]?

That's why I told him 5 days ago!


Did you miss the previous message that corrected the typo?


No, why?

No reason. Just curious.

Regards,

Uncle Steve

David Thompson · Jun 1, 2011

On Mon, 23 May 2011 14:44:18 +0100, Ben Bacarisse

gcc -std=c99 -pedantic if you want to avoid extensions. gcc -ansi
-pedantic if you want to avoid more non-portable features. <snip>

Other than those using (implementation-reserved) double-underscore,
which are easy enough to search for (unless you construct them with
token-pasting, and if so you deserve whatever problems you get).

And things which are implementation-dependent (unspec or impl-def) in
the standard. Those can be nonportable in general, but not from gcc
specifically, and gcc doesn't flag them as such. (In some cases, the
impl-dependent range or signedness of a type may trigger value-range,
conversion-range, or unsigned-comparison warnings.)
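
A small illustration of the kind of implementation-defined behaviour meant here, using the signedness of plain char. The function names are made up for the example; CHAR_MIN comes from <limits.h>.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, else 0.
 * The standard leaves the choice to the implementation. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widening the smallest char value to int: the result is negative
 * where char is signed and 0 where it is unsigned. */
static int widen_char_min(void)
{
    char c = CHAR_MIN;

    return (int) c;
}
```

Code that branches on the sign of a plain char compiles cleanly under -pedantic either way, yet behaves differently from one platform to the next, which is the sort of non-portability the compiler does not report as such.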
