sizeof([ALLOCATED MEMORY])

  • Thread starter ballpointpenthief

Robert Latest

The proposal is an expanded interface giving the programmer more
information about the runtime state of his program at little or no cost.
The information does not need to be generated, it already exists.

Correct. The information, to remind myself and the thread in
general, is the usable size of the memory block *really* obtained
in a malloc() call.

It's just that there is currently no interface with which to obtain this
information.

Correct.

It has been shown that this extra information can have a
positive impact on program performance (not standard library efficiency)
if utilized. Programs that do not need or want this extra information
are free to ignore it at no cost.
Correct.

If you mean that an efficient library interface is beyond the scope of
the C language definition, I think you do the creators of this
definition a disservice.
Probably.

Personally I'm impressed that so many aspects
of C have withstood the test of time over the past 30 years.

So am I.

Given all of the advances in software development in this time frame, it
could not have done so well without a fundamentally sound and useful
interface.

True. Although parts of the interface are a little braindead.
I believe only a small part of this interface needs a minor tweak to
keep up with changing times.

I would like to see it as a commonly available extension. There
is no need to put it into the language itself.

There's a funny conundrum: Let's say that the allocated size of a
memory were available to the programmer just by calling a
function on the pointer to it -- like you said, an easily
implemented feature because the information is there. This would
immediately render the interface to each and every function that
requires a "buffer size" argument to accompany a pointer to
buffer space (such as fgets()) clumsy and stupid. Like in
fgets(buffer, memsize(buffer), stream)

I would be interested to see if there is actually a sound
*technical* reason why the "allocated-size" information cannot be
made available through the C interface, such as that there could
be implementations on which this information is *not* available.

robert
 

Keith Thompson

Robert Latest said:
Correct. The information, to remind myself and the thread in
general, is the usable size of the memory block *really* obtained
in a malloc() call.

You're assuming the information is there. I can imagine that it might
be either nonexistent or meaningless in some implementations. (I
don't know enough about actual implementations to know how common this
is.)

[...]
There's a funny conundrum: Let's say that the allocated size of a
memory were available to the programmer just by calling a
function on the pointer to it -- like you said, an easily
implemented feature because the information is there. This would
immediately render the interface to each and every function that
requires a "buffer size" argument to accompany a pointer to
buffer space (such as fgets()) clumsy and stupid. Like in
fgets(buffer, memsize(buffer), stream)

Consider:
char buffer[80];
fgets(buffer, sizeof buffer, stream);
Since buffer isn't allocated by malloc(), there's no way fgets() can
determine how much space is allocated to it.

Presumably the memsize() function would invoke undefined behavior if
invoked with a pointer that doesn't point to space allocated by
malloc().
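
A minimal sketch of the distinction drawn above, assuming nothing beyond standard C. The helper names array_capacity and pointer_capacity are made up for illustration. sizeof reports the full size of an array object, but applied to a pointer it reports only the size of the pointer itself, which is why a size argument still has to travel with the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: sizeof applied to a real array object. */
size_t array_capacity(void)
{
    char buffer[80];
    return sizeof buffer;          /* 80: the array type carries its length */
}

/* Hypothetical helper: sizeof applied to a pointer to malloc()ed space. */
size_t pointer_capacity(void)
{
    char *buffer = malloc(80);
    size_t n = sizeof buffer;      /* size of a char *, not of the block */
    free(buffer);                  /* free(NULL) is harmless if malloc failed */
    assert(n == sizeof(char *));
    return n;
}
```

Calling array_capacity() yields 80, while pointer_capacity() yields sizeof(char *) no matter how many bytes were requested from malloc().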
 

CBFalconer

Robert said:
.... snip ...

I would be interested to see if there is actually a sound
*technical* reason why the "allocated-size" information cannot be
made available through the C interface, such as that there could
be implementations on which this information is *not* available.

Yes, there is. Any such function would obviously be system
specific, which is not a problem. However it could only be called
on pointers that had been allocated via malloc in the first place.
This does not apply to most pointers.

Another reason is that a system call will obviously never be as
efficient as a single data access, which is what is required for
simply remembering.

It is conceded that we already have the limitation to malloced
pointers for calls to free and realloc. This is one of the great
insecurities of the C pointer system. It is also the sort of
limitation that, in practice, makes runtime checking impossible.

There is also no reason why the information should be available.
Consider a system in which all memory is 'use once and discard'.
Not many such exist, outside the DeathStar series.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
 

Joe Wright

Robert said:
Correct. The information, to remind myself and the thread in
general, is the usable size of the memory block *really* obtained
in a malloc() call.


So am I.


True. Although parts of the interface are a little braindead.


I would like to see it as a commonly available extension. There
is no need to put it into the language itself.

There's a funny conundrum: Let's say that the allocated size of a
memory were available to the programmer just by calling a
function on the pointer to it -- like you said, an easily
implemented feature because the information is there. This would
immediately render the interface to each and every function that
requires a "buffer size" argument to accompany a pointer to
buffer space (such as fgets()) clumsy and stupid. Like in
fgets(buffer, memsize(buffer), stream)

I would be interested to see if there is actually a sound
*technical* reason why the "allocated-size" information cannot be
made available through the C interface, such as that there could
be implementations on which this information is *not* available.

robert

A couple of years ago I created just such a system. It involves spoofing
such that *alloc and free call a wrapper which saves the requested size
and the returned pointer in a node in a linked list. I implemented a new
function, 'size_t size(void *);' which trips through the list, finds the
pointer and returns the size. The free() function is also wrapped such
that if the pointer is not found, it is a NOP such that attempts to
'free' wild or indeterminate pointers are blocked.
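
The wrapper described above was not posted, so the following is only a rough sketch of how such a scheme might look, not the original code. The names tracked_malloc, tracked_size and tracked_free are assumptions made for illustration. Note that it records the size that was requested, not the possibly larger size the allocator really handed out, and that the lookup is a linear walk through the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One list node per live allocation: the pointer handed out and its size. */
struct track_node {
    void *ptr;
    size_t size;
    struct track_node *next;
};

static struct track_node *track_head = NULL;

/* Hypothetical malloc() wrapper: allocates, then records pointer and size. */
void *tracked_malloc(size_t size)
{
    struct track_node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->ptr = malloc(size);
    if (node->ptr == NULL) {       /* failure (or a NULL result for size 0) */
        free(node);
        return NULL;
    }
    node->size = size;
    node->next = track_head;       /* push onto the front of the list */
    track_head = node;
    return node->ptr;
}

/* Walks the list; returns the recorded size, or 0 for an unknown pointer. */
size_t tracked_size(const void *ptr)
{
    const struct track_node *node;
    for (node = track_head; node != NULL; node = node->next)
        if (node->ptr == ptr)
            return node->size;
    return 0;
}

/* Frees only pointers found in the list; anything else is a no-op. */
void tracked_free(void *ptr)
{
    struct track_node **link;
    for (link = &track_head; *link != NULL; link = &(*link)->next) {
        if ((*link)->ptr == ptr) {
            struct track_node *dead = *link;
            *link = dead->next;    /* unlink before releasing */
            free(dead->ptr);
            free(dead);
            return;
        }
    }
}
```

With this in place, tracked_size(p) returns 100 for a block obtained with tracked_malloc(100), and passing the address of a local variable to tracked_free() does nothing.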
 

Gordon Burditt

There's a funny conundrum: Let's say that the allocated size of a
memory were available to the programmer just by calling a
function on the pointer to it -- like you said, an easily
implemented feature because the information is there. This would

The information is there *FOR MALLOC'D MEMORY* and for the *beginning*
of a malloc'd buffer.
immediately render the interface to each and every function that
requires a "buffer size" argument to accompany a pointer to
buffer space (such as fgets()) clumsy and stupid. Like in
fgets(buffer, memsize(buffer), stream)

There is no rule that the pointer passed to fgets() must be obtained
from malloc(), or that you only pass a pointer to the beginning
of such an area to fgets().
I would be interested to see if there is actually a sound
*technical* reason why the "allocated-size" information cannot be
made available through the C interface, such as that there could
be implementations on which this information is *not* available.

I think trying to debug a program that uses the "allocated-size"
information incorrectly would be a problem, especially if it
malfunctions only on systems where "what-you-requested-is-what-you-get"
but the system you're testing on doesn't do that.

Note that there are systems where you cannot (at least not cheaply)
validate a pointer that supposedly came from malloc(), and you
cannot find the beginning of a block given a pointer into it. And
you can't find the size of a buffer not allocated by malloc on
almost every system.

Gordon L. Burditt
 

Keith Thompson

Joe Wright said:
A couple of years ago I created just such a system. It involves
spoofing such that *alloc and free call a wrapper which saves the
requested size and the returned pointer in a node in a linked list. I
implemented a new function, 'size_t size(void *);' which trips through
the list, finds the pointer and returns the size. The free() function
is also wrapped such that if the pointer is not found, it is a NOP
such that attempts to 'free' wild or indeterminate pointers are
blocked.

Sounds interesting. It would have been even more interesting if your
free() wrapper had indicated an error if the pointer is not found.
This could be used to discover bugs rather than just masking them.
(If your program free()s an invalid pointer, ignoring the problem
isn't likely to fix the underlying problem.)
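
One way the suggestion above could be sketched, assuming a small fixed table of live pointers rather than any real library. The names checked_malloc, checked_free and MAX_LIVE are hypothetical. The free wrapper reports an untracked pointer through its return value and on stderr instead of silently ignoring it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 64                /* arbitrary capacity for this sketch */

static void *live[MAX_LIVE];       /* NULL marks an empty slot */

/* Hypothetical malloc() wrapper: remembers each pointer it hands out. */
void *checked_malloc(size_t size)
{
    void *p;
    size_t i;
    for (i = 0; i < MAX_LIVE; i++)
        if (live[i] == NULL)
            break;
    if (i == MAX_LIVE)
        return NULL;               /* table full: refuse the request */
    p = malloc(size);
    if (p != NULL)
        live[i] = p;
    return p;
}

/* Returns 0 on success, -1 if ptr is not a currently tracked block. */
int checked_free(void *ptr)
{
    size_t i;
    if (ptr == NULL)
        return 0;                  /* free(NULL) is a defined no-op */
    for (i = 0; i < MAX_LIVE; i++) {
        if (live[i] == ptr) {
            live[i] = NULL;
            free(ptr);
            return 0;
        }
    }
    fprintf(stderr, "checked_free: %p is not a tracked block\n", ptr);
    return -1;
}
```

A caller can then treat a return value of -1 as a bug report, for example by aborting in debug builds, rather than letting the invalid free() pass unnoticed.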
 

Joe Wright

Keith said:
Sounds interesting. It would have been even more interesting if your
free() wrapper had indicated an error if the pointer is not found.
This could be used to discover bugs rather than just masking them.
(If your program free()s an invalid pointer, ignoring the problem
isn't likely to fix the underlying problem.)

I did think about that but I am not trying to 'fix' the problem, but
rather to make it go away. You can pass any pointer you like to my
free() and if the pointer is valid, the memory is freed. If it is not,
nothing happens.
 

Howard Hinnant

Keith Thompson said:
You're assuming the information is there. I can imagine that it might
be either nonexistent or meaningless in some implementations. (I
don't know enough about actual implementations to know how common this
is.)

I know. I've written commercial malloc systems for both desktop and
embedded (even bareboard) systems. If you're going to implement the C
realloc interface, you have to know the size of a pointer passed to you.
Furthermore, in such systems if the concept of adjacent blocks exists,
you can answer what the size of those blocks are, and whether or not
they are allocated. If the concept of adjacent blocks does not exist in
the allocator, then the answer to the question: Is there free memory
adjacent to this block of memory? Is: No.

It is pretty simple and cheap to answer these questions. It is
expensive not to be able to ask these questions.

-Howard
 

S.Tobias

Robert Latest said:
No it isn't: (n869.txt is a late draft of C99)

$ for f in {speed,efficiency,performance,resource}; do grep $f n869.txt; done
exceptions. The programmer can achieve the efficiency of

....by using "dedicated (and probably platform-specific) libraries"?
[checking...] No.
# exceptions. The programmer can achieve the efficiency of
# translation-time evaluation through static
# initialization, such as
# const static double one_third = 1.0/3.0;
(It's part of a footnote.)

From the same text file:
# This International Standard specifies the form and
# establishes the interpretation of programs expressed in the
# programming language C. Its purpose is to promote
# portability, reliability, maintainability, and efficient
# execution of C language programs on a variety of computing
# systems.
 

CBFalconer

Howard said:
.... snip ...

I know. I've written commercial malloc systems for both desktop and
embedded (even bareboard) systems. If you're going to implement the C
realloc interface, you have to know the size of a pointer passed to you.

Not necessarily. Consider the hypothetical "one time use"
allocator. realloc need only allocate a new chunk of the
appropriate size and copy the old over. It doesn't need to know
the size of the old, even for the copying if it can detect 'the
end' by some other means, analogous to encountering EOF.

The only known system that does this is the DeathStar, and then
only when it will expose coding failures.


Howard Hinnant

CBFalconer said:
Not necessarily. Consider the hypothetical "one time use"
allocator. realloc need only allocate a new chunk of the
appropriate size and copy the old over. It doesn't need to know
the size of the old, even for the copying if it can detect 'the
end' by some other means, analogous to encountering EOF.

I never said that sizeof_alloc(p) should have O(1) complexity. That
indeed would be over constraining. The "one time use" allocator still
must know the sizeof_alloc(p) in order to realloc; in the same way that
strlen(s) knows the size of the string s. One way or another, it must
know the size of the old memory to copy.

The smart interface would combine an allocation (e.g. malloc), with the
size query, so that an efficient implementation is more likely:

I'm requesting N bytes. Please give me the pointer (if able) and the
actual number of bytes it is pointing to.
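
A sketch of what such a combined call might look like. The name alloc_at_least and its signature are assumptions for illustration, not an existing standard function. Standard C has no way to ask the allocator how much it really reserved, so this portable fallback simply reports the requested size; an implementation with inside knowledge (or a platform extension such as glibc's malloc_usable_size) could report a larger number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical interface: request at least `requested` bytes and learn
 * through *actual how many bytes the returned block may hold.
 * Portable fallback: *actual is just `requested` (0 on failure).
 */
void *alloc_at_least(size_t requested, size_t *actual)
{
    void *p;
    assert(actual != NULL);        /* the caller must supply an out-parameter */
    p = malloc(requested);
    *actual = (p != NULL) ? requested : 0;
    return p;
}
```

A growable buffer would store the value written to *actual as its capacity, so any slack the allocator provides can be used before the next reallocation.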

-Howard
 

CBFalconer

Howard said:
I never said that sizeof_alloc(p) should have O(1) complexity. That
indeed would be over constraining. The "one time use" allocator still
must know the sizeof_alloc(p) in order to realloc; in the same way that
strlen(s) knows the size of the string s. One way or another, it must
know the size of the old memory to copy.

Read again please. I postulated a copy mechanism that detected the
equivalent of EOF. Much like:

while (*dest++ = *source++) continue;


Tomás

CBFalconer posted:
while (*dest++ = *source++) continue;


Redundant, yet nonetheless depictive, use of "continue".

I wonder, however, if it would hinder the compiler's optimization?


-Tomás
 

pete

Tomás said:
CBFalconer posted:


Redundant, yet nonetheless depictive, use of "continue".

There are some style guidelines against writing code like this:
while (*dest++ = *source++);
because it looks like an accident.

Myself, I would write that this way:
do {
    *dest = *source++;
} while (*dest++ != 0);

I'm not into cramming vertical space.
 

Tomás

pete posted:
There are some style guidelines against writing code like this:
while (*dest++ = *source++);
because it looks like an accident.


while (*dest++ = *source++); /* Not an accident */

;)

Myself, I would write that this way:
do {
*dest = *source++;
} while (*dest++ != 0);


More than one way to skin a cat. If I were looking for the most efficient
way, I'd use the "register" keyword for "dest" and "source", and I'd
probably do the following:

while ( *dest = *source ) ++dest, ++source;

This would remove the redundant last incrementation.

-Tomás
 

pete

Tomás wrote:
More than one way to skin a cat.
If I were looking for the most efficient way,
I'd use the "register" keyword for "dest" and "source",

I don't use the register keyword.

"Smaller, faster programs can be expected if register
declarations are used appropriately, but future improvements
in code generation may render them unnecessary."
-- K&R, A 8.1, 1978

and I'd probably do the following:

while ( *dest = *source ) ++dest, ++source;

This would remove the redundant last incrementation.


http://www.prism.uvsq.fr/~cedb/local_copies/lee.html
Optimization is simply waste of programmer time if any of these
statements are true:
parts of the program haven't been written yet
the program is not fully tested and debugged
it seems to run fast enough already

Jackson's Rules of Optimisation:
Rule 1: Don't do it.
Rule 2: (for experts only) Don't do it yet - that is, not until you
have a perfectly clear and unoptimized solution.
- Michael Jackson
 

Tomás

pete posted:
Jackson's Rules of Optimisation:
Rule 1: Don't do it.
Rule 2: (for experts only) Don't do it yet - that is, not until you
have a perfectly clear and unoptimized solution.
- Michael Jackson


I prefer to think for myself -- I tend to be more open-minded, more
creative, more inventive, more intuitive, and more intelligent than the
person who's trying to spoon-feed me guidelines.

The following code is A-OK by me:

void Strcpy( register char* dest, register const char* source )
{
    while ( *dest = *source ) ++dest, ++source;
}


You have your way of doing things, and I have mine. I'm sure we both get
the job done... but I get that extra sprinkle of satisfaction from knowing
I perfected the code to the best of my ability.

-Tomás
 

Al Balmer

pete posted:
Rule 2.5: It's much easier to make correct code fast than to make fast
code correct.
I prefer to think for myself -- I tend to be more open-minded, more
creative, more inventive, more intuitive, and more intelligent than the
person who's trying to spoon-feed me guidelines.

The following code is A-OK by me:

void Strcpy( register char* dest, register const char* source )
{
while ( *dest = *source ) ++dest, ++source;
}


You have your way of doing things, and I have mine. I'm sure we both get
the job done... but I get that extra sprinkle of satisfaction from knowing
I perfected the code to the best of my ability.

It's even possible that the compiler can optimize this code in spite
of your attempt to interfere ;-)
 

Zara

pete posted:



I prefer to think for myself -- I tend to be more open-minded, more
creative, more inventive, more intuitive, and more intelligent than the
person who's trying to spoon-feed me guidelines.

The following code is A-OK by me:

void Strcpy( register char* dest, register const char* source )
{
while ( *dest = *source ) ++dest, ++source;
}


You have your way of doing things, and I have mine. I'm sure we both get
the job done... but I get that extra sprinkle of satisfaction from knowing
I perfected the code to the best of my ability.

-Tomás

Well, you could be surprised to see the assembler code generated by
the compiler. Many times, the following lines all produce the same
final code:

while ( *dest = *source ) ++dest, ++source;

for(;*dest = *source;++dest,++source);

while ( *dest++ = *source++ );

for(;*dest++ = *source++;);

None of them is the most optimal until proven so with a profiler. You
may choose your preferred style, but don't take for granted that it is
the most optimized solution. Optimum solutions may be optimum in code size
but not in speed, or the contrary, or optimum in local stack size but
not in code size, or whatever.

Profile the results of your compiler. This is not a guideline, this is
the only recognized way to find where to optimize your code.
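
As a rough illustration of measuring rather than guessing, here is a minimal timing sketch using only clock() from the standard library. The names copy_postinc, copy_comma and time_copy are invented for this example, and a real profiler gives far better data. An optimizing compiler may also transform or hoist the loop, so the numbers only mean something for the exact compiler and flags used.

```c
#include <assert.h>
#include <time.h>

typedef void (*copy_fn)(char *dest, const char *source);

/* Post-increment form; the explicit comparison marks the assignment as intended. */
void copy_postinc(char *dest, const char *source)
{
    while ((*dest++ = *source++) != '\0')
        continue;
}

/* Comma form: the pointers are not incremented after the terminator is copied. */
void copy_comma(char *dest, const char *source)
{
    while ((*dest = *source) != '\0')
        ++dest, ++source;
}

/* Returns the CPU seconds spent on `reps` copies of `source` into `dest`. */
double time_copy(copy_fn fn, char *dest, const char *source, long reps)
{
    clock_t start = clock();
    long i;
    for (i = 0; i < reps; i++)
        fn(dest, source);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy() once per variant with the same source string and a large repetition count gives two figures to compare; if they differ by less than the run-to-run noise, the choice is purely one of style.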

Best regards,

Zara
 

CBFalconer

Tomás said:
pete posted:


while (*dest++ = *source++); /* Not an accident */

Not only have you fouled up the attributions, but you have fouled
the quote. It is considered very bad form to edit quotations. I
actually wrote:

while (*dest++ = *source++) continue;

as an analogy to something else.
