How to check whether malloc has allocated memory properly in case malloc(0) can return a valid pointer

  • Thread starter Shivanand Kadwadkar
  • Start date

Stephen Sprunk

...
If malloc was called 1,000,000 times then the memory used is still 0 bytes
(1,000,000 * 0 = 0), hence I still wonder: when malloc(0) allocates 0 bytes,
why did the system hang by eating all memory?

Allocating memory, even zero bytes of it, is not free. malloc() et al
must keep a record of each allocation, and that record consumes memory
of its own that is _not_ included in the (zero-byte, in this case)
object returned to the caller.

If you're making large allocations, or relatively few small ones, this
overhead is negligible. If you make lots of small allocations, though,
the overhead can consume significantly more memory than the actual
objects you're allocating--infinitely more in the case of zero-byte
objects. That's why programs that do so usually have special-purpose
allocators more efficient than a general-purpose one like malloc().
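
As a rough illustration of that last point (a sketch only, not code from
this thread), a special-purpose pool allocator can carve many small objects
out of one big malloc()ed block, so the bookkeeping cost is paid once per
block rather than once per object:

        #include <stdlib.h>

        /* Hypothetical fixed-size pool allocator, for illustration only. */
        #define POOL_OBJ_SIZE        32     /* size of each small object */
        #define POOL_OBJS_PER_BLOCK  1024   /* objects per malloc()ed block */

        struct pool_block {
            struct pool_block *next;   /* chain of big blocks, for freeing */
            unsigned char data[POOL_OBJ_SIZE * POOL_OBJS_PER_BLOCK];
        };

        struct pool {
            struct pool_block *blocks; /* all blocks allocated so far */
            size_t used;               /* objects taken from the newest block */
        };

        void *pool_alloc(struct pool *p)
        {
            if (p->blocks == NULL || p->used == POOL_OBJS_PER_BLOCK) {
                struct pool_block *b = malloc(sizeof *b);
                if (b == NULL)
                    return NULL;
                b->next = p->blocks;
                p->blocks = b;
                p->used = 0;
            }
            return p->blocks->data + POOL_OBJ_SIZE * p->used++;
        }

        void pool_free_all(struct pool *p) /* release everything at once */
        {
            while (p->blocks != NULL) {
                struct pool_block *next = p->blocks->next;
                free(p->blocks);
                p->blocks = next;
            }
            p->used = 0;
        }

Here one malloc() call serves 1024 small objects, so malloc()'s per-call
record keeping is incurred only once per block.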


Anand Hariharan

malloc(0) itself cannot work this way.  If it returns non-NULL, it
must return a value that is distinct from all the other values it
has returned (that have not yet been released).  That is, malloc(0)
must satisfy:

        void *p = malloc(0);
        void *q = malloc(0);
        assert (p == NULL || p != q);

I may have misunderstood Keith's post else-thread, but it (Message-ID:
<[email protected]>) contradicts what you say above
(i.e., the implementation could return a pointer that is non-null and
unique so that p == q holds).

- Anand
 

Ron Shepard

Nah. They did with the USMIL extensions, which had some later
effect, but had effectively no effect on Fortran before (really)
Fortran 90 and possibly a couple of things in Fortran 77.

A couple of years after the mil-std-1754 specification, there were a
couple of vendors that mentioned a DOE (Department of Energy)
specification for asynchronous I/O in fortran. I tried to track
this down a few times in the early 80's and never found it anywhere.
Then in later versions of their compiler documentation the DOE
references were removed. I suspect that there might have been some
kind of draft document at one time, but perhaps it never matured.

I think the late 70's through the mid 80's was a time when customers
were just beginning to shift over to the idea of writing standard
code as a means to achieve portability. Before then, the vendors
were happy to implement extensions and features to their big
customers. This kept the customers happy and it helped lock in
those customers to the vendor's hardware and software. But in the
80's, especially the late 80's, there were all kinds of new hardware
being adopted by new companies, minicomputers were becoming popular,
PCs were becoming useful for scientific programming, networking was
becoming popular allowing scientists to sit in front of one piece of
hardware and run programs on some remote piece of hardware, the
first commercial parallel computers were appearing, and so on. In
this new environment, portability of source code was more important
than before, and that is why the standards process became more
important to the end programmers. That is also why the failure of
the fortran standards committee during the decade of the 80's to
move forward was such a devastating blow to the popularity of the
language.

$.02 -Ron Shepard
 

nmm1

I think the late 70's through the mid 80's was a time when customers
were just beginning to shift over to the idea of writing standard
code as a means to achieve portability. ...

Some of us had been doing that for a decade before that.


Regards,
Nick Maclaren.
 

nmm1

Some of us had been doing that for a decade before that.

Unless you meant the difference between de facto and de jure
standards. I agree that the latter dated from Fortran 77, because
of the character-handling problem. But the widespread use of
PFORT shows the interest in that a long time earlier.


Regards,
Nick Maclaren.
 

Richard Maine

Some of us had been doing that for a decade before that.

I well believe you were. But my observation was that most programmers of
the day weren't until sometime around the time frame that Ron mentions.
 

Keith Thompson

Anand Hariharan said:
I may have misunderstood Keith's post else-thread, but it (Message-ID:
<[email protected]>) contradicts what you say above
(i.e., the implementation could return a pointer that is non-null and
unique so that p == q holds).

I think you did misunderstand it, presumably because I didn't state it
clearly enough. Eric is correct.

Here's what I wrote:

| I think the way the current definition came about is something
| like this: Before C89, some implementations had malloc(0) return a
| unique pointer value (that couldn't safely be dereferenced), and
| some had it return a null pointer. The former is arguably more
| consistent with the behavior of malloc() for non-zero sizes, and
| lets you distinguish between results of different malloc(0) calls; it
| makes malloc(0) a convenient way to generate a pointer value that's
| non-null, guaranteed to be unique, and consumes minimal resources.
| The latter avoids the conceptual problems of zero-sized objects.
| The ANSI C committee chose to allow either behavior, probably
| to avoid breaking existing implementations; they also defined the
| behavior of realloc() so it could deal consistently with either a
| null pointer or a pointer to a zero-sized object.
|
| Personally, I think it would have been better to define the behavior
| consistently and let implementations conform to what the standard
| requires.

What I meant by "unique pointer value" is a pointer value that's
unique *for each call*.

And something I missed: even an implementation that returns a
unique (per-call) non-null pointer value for malloc(0) cannot do
so arbitrarily many times. Each call (assuming nothing is free()d)
consumes some resources, at least address space if not actual memory.
Eventually there won't be enough left to allocate the bookkeeping
information, and malloc(0) will fail and return a null pointer
anyway.

But if you want a series of unique address values, and you're not
planning to dereference any of them, malloc(1) will serve the same
purpose. Or, with less overhead, you could return addresses of
successive elements of a big char array, expanding it with realloc()
as needed.
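
A minimal sketch of that last suggestion (illustrative only; the names here
are invented, and note the caveat in the comment):

        #include <stdlib.h>

        /* Hand out unique "token" addresses from one growable char array
         * instead of calling malloc(0) or malloc(1) repeatedly.  Caveat:
         * realloc() may move the array, in which case addresses handed out
         * earlier are no longer valid, so a safer variant would chain
         * fixed-size blocks instead of growing one array in place. */
        static char *tokens;
        static size_t token_cap;
        static size_t token_used;

        void *new_token(void)
        {
            if (token_used == token_cap) {
                size_t new_cap = token_cap ? 2 * token_cap : 64;
                char *p = realloc(tokens, new_cap);
                if (p == NULL)
                    return NULL;
                tokens = p;
                token_cap = new_cap;
            }
            return &tokens[token_used++];
        }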
 

Ron Shepard

Unless you meant the difference between de facto and de jure
standards. I agree that the latter dated from Fortran 77, because
of the character-handling problem. But the widespread use of
PFORT shows the interest in that a long time earlier.

Yes, I remember using PFORT on a decsystem-20 in the late 70's, and
that is one of the tools I had in mind when I wrote that sentence.
Are you saying that PFORT was commonly used a decade earlier?
Another useful tool from that era was FTNCHEK, which I think I
started using about 1986 or so.

$.02 -Ron Shepard
 

nmm1

Yes, I remember using PFORT on a decsystem-20 in the late 70's, and
that is one of the tools I had in mind when I wrote that sentence.
Are you saying that PFORT was commonly used a decade earlier?
Another useful tool from that era was FTNCHEK, which I think I
started using about 1986 or so.

Not a full decade earlier, obviously, because it dates from only
1974. There were tools that preceded it, though I can no longer
remember anything much about them.

Back in the late 1960s, most American codes were system-specific,
but the UK was always a hotchpotch, and MOST people expected to
use a wide variety of systems in a short period. NAG was founded
in 1970, initially for the ICL 1900, but the objective of portability
to all UK academic mainframes was adopted almost immediately, and
quite a lot of people wrote with the same objectives.

I contributed from 1973, and wrote code that was expected to work,
unchanged, on IBM 360, ICL 1900, several PDPs, CDCs, Univac, and
others that I can't remember. We wrote standard-conforming code
to do that, though the actual standard we worked to was adapted from
Fortran 66, rather than being the document itself.

For example, no assumption of one-trip DO-loops (or not), saved
data outside COMMON, extended ranges, etc. etc. But, essentially,
it was the sane subset of Fortran 66 that has remained unchanged
to this day - the code has been redone, but not because it stopped
working!


Regards,
Nick Maclaren.
 

David Resnick

You can easily write mymalloc.c. The problem is that your leaf
routines become dependent on it, and therefore no longer leaf
routines. So you can't move them to another program, and they can't be
corrected by someone who isn't an expert in your particular program.

Same applies to any utility function that you use widely in your
program. You can also, in a variety of platform-specific ways, override
the libc malloc to achieve this end while still calling "malloc" in
your program. But using your own code is completely portable and
achieves the desired end -- which is to say consistent behavior of
malloc(0) under your control. Can't really see your issue here.
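
For example, such a wrapper could be as small as this (a sketch; the name
my_malloc and the policy of bumping zero-byte requests to one byte are just
one of the consistent behaviors being discussed):

        /* mymalloc.c - pin down the behavior of zero-size requests.
         * Here malloc(0) becomes malloc(1), so a successful call always
         * yields a unique, freeable, non-NULL pointer; always returning
         * NULL for size 0 would be the other consistent choice. */
        #include <stdlib.h>

        void *my_malloc(size_t size)
        {
            return malloc(size != 0 ? size : 1);
        }

        void my_free(void *ptr)
        {
            free(ptr); /* kept for symmetry with my_malloc() */
        }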
 

Anand Hariharan

I think you did misunderstand it, presumably because I didn't state it
clearly enough.  Eric is correct.
(...)
What I meant by "unique pointer value" is a pointer value that's
unique *for each call*.

Thank you for the clarification.

- Anand
 

Richard Maine

Back in the late 1960s, most American codes were system-specific,
but the UK was always a hotchpotch, and MOST people expected to
use a wide variety of systems in a short period.

I'd believe that as consistent with my previously mentioned observation
that most programmers paid little attention to portability. I failed to
mention the qualifier, but my observations at the time were solely
within the US. Not until a bit later did I get any exposure to work
across the pond. In the environments I mostly saw, there was only a
single machine at your place of employment, and that machine was likely
to be the only one you used for somewhere close to a decade, with
software compatibility being a big argument for replacing it with
another from the same vendor when the decade was up.
 

nmm1

I'd believe that as consistent with my previously mentioned observation
that most programmers paid little attention to portability. I failed to
mention the qualifier, but my observations at the time were solely
within the US. Not until a bit later did I get any exposure to work
across the pond. In the environments I mostly saw, there was only a
single machine at your place of employment, and that machine was likely
to be the only one you used for somewhere close to a decade, with
software compatibility being a big argument for replacing it with
another from the same vendor when the decade was up.

Yes, this was really the point that I was trying to make before. I
think most programmers up through the 70's (in the US, that was my
only experience at the time) only worked on a single machine. It
might have been an IBM shop, or a Univac shop, or a CDC shop, or
whatever, but that was the hardware that was available to a single
programmer, and his job was to squeeze everything out of it that he
could for his set of applications.

The main drive wasn't that the individual programmer had access to
multiple machines at one time, but that he collaborated with people who
had other ones. The other driver was that the next machine might well be
very different, whether at another location or even at the same one,
especially in academia.

While we typically had a lot less computer power than people in the
USA, that helped to AVOID extreme coding of the above nature,
because there was no option but to use smarter algorithms.

That often meant using
machine-specific extensions for things like bit operators, or access
to some hardware widget or i/o device, or operations on character
strings, and so on. Given the choice between slow portable code or
fast machine-specific code, the pressure always seemed to be toward
the latter.

That is still true - even when speed isn't important :-(


Regards,
Nick Maclaren.
 

Ron Shepard

I'd believe that as consistent with my previously mentioned observation
that most programmers paid little attention to portability. I failed to
mention the qualifier, but my observations at the time were solely
within the US. Not until a bit later did I get any exposure to work
across the pond. In the environments I mostly saw, there was only a
single machine at your place of employment, and that machine was likely
to be the only one you used for somewhere close to a decade, with
software compatibility being a big argument for replacing it with
another from the same vendor when the decade was up.

Yes, this was really the point that I was trying to make before. I
think most programmers up through the 70's (in the US, that was my
only experience at the time) only worked on a single machine. It
might have been an IBM shop, or a Univac shop, or a CDC shop, or
whatever, but that was the hardware that was available to a single
programmer, and his job was to squeeze everything out of it that he
could for his set of applications. That often meant using
machine-specific extensions for things like bit operators, or access
to some hardware widget or i/o device, or operations on character
strings, and so on. Given the choice between slow portable code or
fast machine-specific code, the pressure always seemed to be toward
the latter.

Then in the early 80's when some of the new hardware became
available, such as Cray vector processors, some programmers kept
this same mindset. I've seen rewrites of electronic structure codes
for cray computers that would have almost no chance of compiling on
any other hardware. Every other line of code seemed to have some
kind of special vector operator or something in it. I remember
seeing someone multiply an integer by ten by doing a
shift2-add-shift2 sequence (or maybe it was shift4-shift2-add, I
forget) because he had counted clock cycles and determined that that
was the best way to do it with that version of the hardware and
compiler. But as more and more hardware became available, and it
became necessary to port your codes quickly from one machine to
another, or to be able to run your code simultaneously on multiple
combinations of hardware and software, this kind of coding died out
pretty quickly. Low-level computational kernels, such as the BLAS,
were still done that way, but the higher-level code was written to
be portable, even at the cost of an extra machine cycle here and
there if necessary. All of this was also driven by the need to work
with collaborators who were using different hardware than you, and
the need to access and contribute to software libraries such as
netlib which were used on a wide range of hardware, and network
access to remote machines at the various NSF/DOE/DoD supercomputer
centers around the country.
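
(For what it's worth, the shift-and-add trick described above amounts to
something like the following, shown here in C rather than the Fortran of
the day; the exact instruction sequence Ron saw is not recorded in the
post:)

        /* 10*x via shifts and adds: (4*x + x) * 2 == 10*x.
         * Modern compilers perform this strength reduction themselves. */
        unsigned times_ten(unsigned x)
        {
            return ((x << 2) + x) << 1;
        }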

In the old environment a programmer might take several months or
years to optimize some code for a specific machine, and then that
code might be used for a decade. In the new environment, the code
needed to be ported in a matter of days or weeks, and used for a few
months, at which time the machine might be replaced with new
hardware, or your NSF allocation would expire, or whatever. It was
this newer environment (at least in the US) that I think drove
programmers toward writing portable code, and in many ways that
meant conforming to the fortran standard.

There were some exceptions to this, of course, which have been
discussed often here in clf. One of them was the practical
observation that the nonstandard REAL*8 declarations were more
portable in many situations than the standard REAL and DOUBLE
PRECISION declarations. This was an example of the standard
actually inhibiting portability rather than promoting it. The KINDs
introduced finally in f90 solved this dilemma, but that is probably
part of f90 that should have been included in a smaller revision to
the standard in the early 80's rather than a decade later. There
may be other examples of this, but this is the only one that comes
to mind where I purposely and intentionally avoided using standard
syntax and chose to use instead the common nonstandard extension.
Then when f90 was finally adopted (first by ISO, then force fed to
the foot-dragging ANSI committee), this was one of the first
features that I incorporated into my codes. I even wrote some sed
and perl scripts to help automate these conversions, violating my
"if it ain't broke, don't fix it" guiding principle to code
maintenance.

$.02 -Ron Shepard
 

Richard Maine

Ron Shepard said:
...nonstandard REAL*8 declarations
...this is the only one that comes
to mind where I purposely and intentionally avoided using standard
syntax and chose to use instead the common nonstandard extension.

I mostly avoided this particular one because much of my work in the late
70's and early 80's was on CDC machines that did not support that syntax
(and didn't have an 8-byte real, or for that matter bytes at all).

Instead, I developed habits of using a style that made it easy to do
automated translation of "double precision" to "real". F77's generic
intrinsics made this at least reasonably practical, though it was still
a pain to go through the preprocessing/translation stage. I also was a
quick convert to F90 kinds.
 

Ron Shepard

I mostly avoided this particular one because much of my work in the late
70's and early 80's was on CDC machines that did not support that syntax
(and didn't have an 8-byte real, or for that matter bytes at all).

I generally found that REAL*8 worked alright on these kinds of
machines. I forget exactly which compilers I used on CDC machines,
but I remember some kind of compiler option or something that mapped
these declarations to the 60-bit floating point type, which is what
I wanted on that hardware (I remember using CDC 6600 and 7600
machines). This also worked fine on the univac and decsystem-20
compilers I used; these were both 36-bit word machines, and the
REAL*8 declaration mapped onto the 72-bit floating point type which
is what I wanted on those. I also used harris computers a little
(these had 3, 6, and 12-byte data types I think), and I remember
getting things to match up alright there too. And of course there
were the cray and cyber computers and the fps array processors which
had 64-bit words, not bytes, and REAL*8 worked fine there too.

When used this way, REAL*8 is sort of a poor man's
selected_real_kind() where the 8 didn't necessarily mean anything
specific, but it resulted in the right kind of floating point. As I
complained before, all of this should have been incorporated into a
minor fortran revision in 1980 or so, along with the mil-std-1754
stuff and maybe a few other similar things. It sure would have made
fortran easier to use in that time period 1980-1995 before f90
compilers eventually became available.

Instead, I developed habits of using a style that made it easy to do
automated translation of "double precision" to "real". F77's generic
intrinsics made this at least reasonably practical, though it was still
a pain to go through the preprocessing/translation stage.

Yes, I did some of this too with sed (and later perl) scripts. In
particular, I wrote and maintained some of my codes using the
nonstandard IMPLICIT NONE, but I included the scripts with my code
distributions to replace these with other declarations for those
compilers that did not support this declaration. And there were a
few compilers like that, I forget which ones, but I know it was
important enough for me to worry about keeping my code consistent
with my conversion scripts.

I know that some programmers went much farther with this approach
than I did. The LINPACK library, for example, was written using
something called TAMPR which took a source file and could output
REAL, DOUBLE PRECISION, or COMPLEX code, including both the correct
declarations and the correct form for floating point constants.
There were other tools like RATFOR and SFTRAN which I think also had
some of this capability. But I was not comfortable getting too far
away from fortran source, so I did not use these kinds of tools
routinely in my own codes.

$.02 -Ron Shepard
 

Richard Maine

Ron Shepard said:
I generally found that REAL*8 worked alright on these kinds of
machines. I forget exactly which compilers I used on CDC machines,
but I remember some kind of compiler option or something that mapped
these declarations to the 60-bit floating point type,

The compilers I mostly used didn't accept the syntax at all.
 

glen herrmannsfeldt

Metacommand: $DO66
The following F66 semantics are used:
* Statements within a DO loop are always executed at least once.
* Extended range is permitted; control may transfer into the syntactic
body of a DO statement.
The range of the DO statement is thereby extended to include,
logically, any statement that may be executed between a DO
statement and its terminal statement. However, the transfer of
control into the range of a DO statement prior to the execution
of the DO statement or following the final execution of its
terminal statement is invalid.

Before SUBROUTINE, FUNCTION, and CALL, subroutines were done
using GOTO and ASSIGNed GOTO. To allow for such within a DO loop,
one was allowed to GOTO out of a DO loop, do something else, and
then GOTO back again.

I believe that has been removed in newer versions of the
standard.

-- glen
 

glen herrmannsfeldt

No. It was just IBM's Fortran, generally with the particular compiler
version number. The term "standard" was neither applicable nor generally
used.

Well, I don't see that it requires a government agency to make
a standard, but yes, the IBM versions weren't quite constant, and
constancy is an important part of a standard.

It seems that with Fortran II and Fortran IV, IBM did try to name
their specific versions of Fortran. It doesn't seem too far off
to say that a program that conforms to all the implementations
of IBM Fortran II or IBM Fortran IV follows an IBM standard.

-- glen
 

glen herrmannsfeldt

A couple of years after the mil-std-1754 specification, there were a
couple of vendors that mentioned a DOE (Department of Energy)
specification for asynchronous I/O in fortran. I tried to track
this down a few times in the early 80's and never found it anywhere.
Then in later versions of their compiler documentation the DOE
references were removed. I suspect that there might have been some
kind of draft document at one time, but perhaps it never matured.

In the 1970's, some DOE labs ran IBM machines, and some CDC machines.
DOE headquarters, as far as I know, had IBM machines.

The OS/360 Fortran H Extended compiler supported asynchronous I/O.

I don't know about CDC Fortran and asynchronous I/O, though.

I would expect anything DOE related to follow one or the other.
(And note that DOE didn't exist before 1977.)

I think the late 70's through the mid 80's was a time when customers
were just beginning to shift over to the idea of writing standard
code as a means to achieve portability. Before then, the vendors
were happy to implement extensions and features to their big
customers. This kept the customers happy and it helped lock in
those customers to the vendor's hardware and software.

-- glen
 
