IF-free conception.

Evgeney Knyazhev · Feb 19, 2013

my Friends, your concerns about illegal read access seem a little bit exaggerated ;D Just write it in the code:

array2sort = (int64_t*)malloc((N0fItems+1) * size0fInt);
unsortedArr = (int64_t*)malloc(N0fItems * size0fInt);

So, last item becomes a buffer against out-of-bound >read< access.

well, now on topic:

i think we have no ways to get purely C-solution for no-if code, any method has been hardware-based.

Öö Tiib · Feb 19, 2013

For the usual malloc() implementation, the linkage information is
stored before the data. (Array element [-1].) In that case, you are
safer going off the low end of an array than the high end.

Yes, especially if you are writing there. You are then more likely to
screw that linkage information up and to result with noticeable and
reproducible bad behavior.

Worst case is when you read value of neigbour variable and that only
rarely deviates the decision made. That is the case that passes all
tests and blows up only on table of your richest customer, killing
his wife and sending his credit card info all over the world.

BartC · Feb 19, 2013

Evgeney Knyazhev said:
my Friends, your concerns about illegal read access seem a little bit
exaggerated ;D Just write it in the code:

array2sort = (int64_t*)malloc((N0fItems+1) * size0fInt);
unsortedArr = (int64_t*)malloc(N0fItems * size0fInt);

So, last item becomes a buffer against out-of-bound >read< access.

You can't always rely on being able to allocate such a margin: perhaps you
are writing a sort function and your caller provides the data; the user of
the function won't take kindly to such a restriction.

Or the array to be sorted might be a slice of a bigger one; maybe reading
past the end is safe, or it might still overlap onto the next page and slow
things down rather than speed them up.

If the advantages of no-IF are that great, it might be possible to do a
quick test on the last element's address to see if reading past ought to be
harmless, and then choose the best approach, or even to copy the data into a
local array that has the margin, then copy back. But this will all impact on
the advantages.

Evgeney Knyazhev · Feb 19, 2013

You can't always rely on being able to allocate such a margin: perhaps you

are writing a sort function and your caller provides the data; the user of

the function won't take kindly to such a restriction.

Or the array to be sorted might be a slice of a bigger one; maybe reading

past the end is safe, or it might still overlap onto the next page and slow

things down rather than speed them up.

If the advantages of no-IF are that great, it might be possible to do a

quick test on the last element's address to see if reading past ought to be

harmless, and then choose the best approach, or even to copy the data into a

local array that has the margin, then copy back. But this will all impact on

the advantages.

w/o buffer item, we can use another approach:

dv = (a[child] < a[(child+1])*dv) * dv;

So, access doesn't go bananas

Evgeney Knyazhev · Feb 19, 2013

Sorry

dv = (a[child] < a[(child+1)*dv]) * dv;

James Kuyper · Feb 19, 2013

my Friends, your concerns about illegal read access seem a little bit exaggerated ;D Just write it in the code:

array2sort = (int64_t*)malloc((N0fItems+1) * size0fInt);
unsortedArr = (int64_t*)malloc(N0fItems * size0fInt);

So, last item becomes a buffer against out-of-bound >read< access.

It was never a difficult problem to work around. The only reason we've
made such a big deal about it has been your stubborn insistence that it
didn't need to be worked-around, because it supposedly wasn't a problem.

Note: if you take this approach, you should also make sure to initialize
the extra array element. Left uninitialized, it could, in principle,
contain a trap representation, which would still give your code
undefined behavior, even when multiplied by 0 - the behavior becomes
undefined before the multiplication. Trap representations are not
possible for 'int' on most systems, but it's relatively cheap to protect
against the possibility that your code might someday be ported to a
system where 'int' does have trap representations.

Ben Bacarisse · Feb 20, 2013

Evgeney Knyazhev said:
Sorry

dv = (a[child] < a[(child+1)*dv]) * dv;

Or maybe

dv = (a[child] < a[child + dv]) * dv;

There are lots of solutions. Fixing the bug was never the problem!

Ken Brody · Feb 20, 2013

Ken Brody said:
Ken Brody said:

On 02/18/2013 09:28 PM, glen herrmannsfeldt wrote: [...]
It's sloppy programming to assume that, just because the chance is
small, it can be ignored. If you do enough allocations, allocating
exactly at the end of a page becomes a near certainty, even though each
individual allocation has only a small chance of ending up there.

Click to expand...

Click to expand...

The odds of a large meteor passing within the orbits of geosynchronous
satellites is small. The odds of a medium-sized meteor entering the Earth's
atmosphere is smaller. The odds of both happening within 24 hours of each
other is miniscule.

Click to expand...

But not zero.

Click to expand...

I haven't been following the detail so carefully, but it seems likely
that the two events aren't statistially independent. That is, they might
have the same source.

The orbits of both meteors have been plotted. They're not related:

http://blogs.nasa.gov/cm/blog/Watch the Skies/posts/post_1361037562855.html

Or:

http://www.slate.com/blogs/bad_astr...teors_why_are_we_suddenly_seeing_so_many.html

Believing events to be statistically independent
when they aren't is the cause for many mistakes, possibly including much
of the economic meltdown of 2008.

And believing things to be dependent when they're not is just as erroneous.
(Look up "post hoc ergo propter hoc" if you're not familiar with that phrase.)

[Lame attempt to drift back to C.]

This holds true in C when UB is involved. If a program "works" (ie: gives
the expected results, despite UB), and then you add a line and it stops
"working", the typical assumption is that the new code must be wrong.

There is the story of an old lady (why are stories always about old
ladies) who heard that the chance of a bomb on an airplane is 1 in
1000, but the chance of two bombs 1 in a million. Just to be sure,
the always brings her own bomb.

Ken Brody · Feb 20, 2013

In fact the odds of an allocation ending adjacent to a page boundary
are only about 1/1024, assuming 4K pages and 4 byte entries. Sizes
and alignment and granularities in the memory allocator may tweak that
a bit, of course.

It may be that on some systems it is always be safe to read the four
bytes after non-static allocations, because commonly the memory
allocator puts a length (for linking back to the actual allocation
header from the following block). Or in the case of a stack
allocation, that such a reference would only cause the next stack page
to be allocated. Depending on such is, of course, hideous.

There are platforms which give you the option to change malloc() to always
return a pointer in which the byte after the allocated buffer is at an
address guaranteed to not be valid. (This is for debugging, of course, as
such allocations are suboptimal. But it makes buffer overruns very easy to
track down.)

glen herrmannsfeldt · Feb 21, 2013

(snip)

There are platforms which give you the option to change malloc() to always
return a pointer in which the byte after the allocated buffer is at an
address guaranteed to not be valid. (This is for debugging, of course, as
such allocations are suboptimal. But it makes buffer overruns very easy to
track down.)

I might have previously noted that I used to do that in OS/2 in 16
bit mode. (Back to OS/2 1.0, and still supported in 2.x.)

Instead of calling malloc(), I directly called the OS/2 segment
allocation routine to allocate a segment of exactly the right size.

With the ring and global/local bit, each task can only have 8191
segments (segment 0 is special). For the programs I was working on
at the time, I never ran into the limit.

Which currently popular systems allow for this?

-- glen

glen herrmannsfeldt · Feb 21, 2013

(snip, I wrote)

For the usual malloc() implementation, the linkage information is
stored before the data. (Array element [-1].) In that case, you are
safer going off the low end of an array than the high end.

Click to expand...

While most do store a header in front of the allocated block, many
also store at least a length at the end of the allocated area. That
gets used when the following block gets deallocated, and allows a
quick link back to the prior header to allow merging freed areas and
whatnot. Certainly not a universal design, but a common one.

I might have thought about a doubly linked list with both at the
front, but at the end works, too.

The OS/360 GETMAIN is unlike malloc/free in that you can free part
of an allocated region, beginning, end, or middle, and it can keep
track of what is allocated and what isn't. As well as I understand it,
the same routine is used to allocate memory for user programs as to
allocate regions for programs to run in. (That is, for MVT.)

-- glen

glen herrmannsfeldt · Feb 21, 2013

(snip)

I'm curious too. Certainly many systems provide debug versions of
malloc and friends which add sentinels and other things to help catch
overwrites, but that usually won't catch a spurious read, and even
spurious writes are often caught only well after they've happened
(better late than never, but...).

Some of the debugging system, DynamoRIO, for example, manages
something similar by not actually executing your program directly,
rather doing binary recompilation of your binary into an instrumented
version, which then includes checks for overflows. I don't remember
if Valgrind does that too, but I don't think so.

An obvious technique for doing that would be to align each allocation
to the end of a page, and make sure the next page is not mapped. That
would leave an area in front of the allocation touchable (unless the
allocation is exactly a multiple of the page size), and that doesn't
necessarily catch really wild writes. It would also either trash
alignment, or leave a small are that could be accessed without a trap.
The effect on memory usage would also not be pretty.

That would be easy for an OS, but not so easy for user programs
on most systems.

BTW, you could do the same thing in (protected mode) Win16, and
allocate everything via GlobalAlloc, and get a new segment for each
allocation.

I thought you probably could, but I never did it. I was using OS/2 1.0
in early 1990, not so long before Win 3.0.

With OS/2, when the segmentation exception came it would show the
register contents so it wasn't hard to find out where the problem was.
(Just a little hex arithmetic.)

I didn't use Windows much in those days, so I don't know what it
did with the exception. After that came OS/2 2.x which would run
windows programs without Windows.

-- glen

Philip Lantz · Feb 21, 2013

Robert said:
I'm curious too. Certainly many systems provide debug versions of
malloc and friends which add sentinels and other things to help catch
overwrites, but that usually won't catch a spurious read, and even
spurious writes are often caught only well after they've happened
(better late than never, but...).

An obvious technique for doing that would be to align each allocation
to the end of a page, and make sure the next page is not mapped. That
would leave an area in front of the allocation touchable (unless the
allocation is exactly a multiple of the page size), and that doesn't
necessarily catch really wild writes. It would also either trash
alignment, or leave a small are that could be accessed without a trap.
The effect on memory usage would also not be pretty.

Valgrind has an option to do this -- put each allocation at the end of a
page, with the following page not mapped. It also has an option to put
each allocation at the beginning of a page, with the preceding page not
mapped. For best results, run your test suite once each way.

James Kuyper · Feb 21, 2013

Robert Wessel wrote: ....

Valgrind has an option to do this -- put each allocation at the end of a
page, with the following page not mapped. It also has an option to put
each allocation at the beginning of a page, with the preceding page not
mapped. For best results, run your test suite once each way.

Which option is that? I didn't notice it at
<http://valgrind.org/docs/manual/mc-manual.html#mc-manual.options>,
which seemed the obvious place to look.

glen herrmannsfeldt · Feb 21, 2013

(snip on malloc)
(snip, then I wrote)

There's only a fine line between the stuff stored at the end of one
block and at the beginning of the next. I know of one system that
allocated in 16B chunks, and the header was the 12 bytes immediately
preceding the allocated area, and the trailing length was stored in
the four bytes immediately after the allocated area - IOW, in the same
16 byte chunk as the 12 byte header for the next block. One area?
Two?

I suppose if you put both at each end, then you can't tell.

I have seen stories about doubly linked list storing only the XOR
(or difference) between previous and next. You can then go either way
along the list, as long as you start at one end or the other. I don't
know if anyone would do that for malloc().

(snip, I also wrote)

MVS still does that. But I've wondered for decades exactly what
allocation algorithm is used to support that.

I always thought it was linked list, but then I knew it from
the OS/360 days, when programs were smaller. Much of getting
OS/360 to work had to do with squeezing things into less memory.
For OS/360 PCP only 20K bytes of memory were used, allowing 44K
on a 64K machine for a user task. The actual algorithm could have
changed for MVS as long as the user interface is the same.

Apparently not enough
to go find out, though... The obvious answer of some sort of tree
seems far too dynamic and unstable in performance, unless you went to
a self-balancing tree of some sort, and that seems like it would have
too high overhead.

The overhead is high enough that PL/I allocates larger blocks and then
suballocates them itself, as needed.

-- glen

glen herrmannsfeldt · Feb 21, 2013

(snip, then I wrote, regarding illegal memory access past the end
of allocated memory)

FSVO "most": VirtualAlloc on Windows, mmap on Linux and most *nix. On
MVS you can have an authorized program change storage keys on selected
pages (for the simple implementation, you'd have to lock those pages
in storage, but we're wasting tons of memory anyway, what's a bit
more).

At least for SVS (where I used to know at least a little about how
these things worked) the key would have to go into the page table.
(Or maybe only segment table?)

SVS runs a single virtual address space for all tasks, like OS/360
inside a virtual address space. Storage keys were still used.

With MVS, and more than 15 tasks, ordinary user tasks all run under
the same key, with changes to the segment/page tables at task switch.

So, if you didn't lock the page then there would be a page fault before
the problem was detected, but that shouldn't be so bad.

-- glen

Jorgen Grahn · Feb 21, 2013

Which option is that? I didn't notice it at
<http://valgrind.org/docs/manual/mc-manual.html#mc-manual.options>,
which seemed the obvious place to look.

I bet he's talking about Electric Fence, not Valgrind. Valgrind has
its own layer of control between the code and the hardware, and AIUI
doesn't need the MMU to tell it if the memory being accessed exists or
not.

/Jorgen

James Kuyper · Feb 21, 2013

I bet he's talking about Electric Fence, not Valgrind. Valgrind has
its own layer of control between the code and the hardware, and AIUI
doesn't need the MMU to tell it if the memory being accessed exists or
not.

Well, when I used valgrind on the no-if code, it only complained about
the fact that a[child+1] was reading uninitialized memory. It didn't
complain about the fact that the memory was uninitialized because it was
one past the end of a requested allocation, which is a more serious
problem. If there's an option to turn on warnings about the more serious
problem, I'd appreciate knowing what it is.

Geoff · Feb 21, 2013

Ken Brody said:
Ken Brody said:

On 2/19/2013 11:22 AM, James Kuyper wrote:
On 02/18/2013 09:28 PM, glen herrmannsfeldt wrote: [...]
It's sloppy programming to assume that, just because the chance is
small, it can be ignored. If you do enough allocations, allocating
exactly at the end of a page becomes a near certainty, even though each
individual allocation has only a small chance of ending up there.

Click to expand...

The odds of a large meteor passing within the orbits of geosynchronous
satellites is small. The odds of a medium-sized meteor entering the Earth's
atmosphere is smaller. The odds of both happening within 24 hours of each
other is miniscule.

Click to expand...

But not zero.

Click to expand...

I haven't been following the detail so carefully, but it seems likely
that the two events aren't statistially independent. That is, they might
have the same source.

Click to expand...

The orbits of both meteors have been plotted. They're not related:

http://blogs.nasa.gov/cm/blog/Watch the Skies/posts/post_1361037562855.html

Or:

http://www.slate.com/blogs/bad_astr...teors_why_are_we_suddenly_seeing_so_many.html

Believing events to be statistically independent
when they aren't is the cause for many mistakes, possibly including much
of the economic meltdown of 2008.

Click to expand...

And believing things to be dependent when they're not is just as erroneous.
(Look up "post hoc ergo propter hoc" if you're not familiar with that phrase.)

[Lame attempt to drift back to C.]

This holds true in C when UB is involved. If a program "works" (ie: gives
the expected results, despite UB), and then you add a line and it stops
"working", the typical assumption is that the new code must be wrong.

There is the story of an old lady (why are stories always about old
ladies) who heard that the chance of a bomb on an airplane is 1 in
1000, but the chance of two bombs 1 in a million. Just to be sure,
the always brings her own bomb.

Click to expand...

The ecliptic projection of the orbital paths isn't the only clue.
The inclination of the 2012 DA14 object is over 11.6 degrees from the
ecliptic but the Russian meteor was quite different, given the entry
path. In the case of DA14 the trajectory is from south to north as it
crosses the orbit of Earth. It crosses the plane near Earth's orbit,
the "coincidence" is that we are occupying the nearby space at the
same time. I don't have specific data on the orbit calculated for the
Chelyabinsk object but here's a nice video of the two events involving
3 dimensions.

http://www.minorplanetcenter.net/db_search/show_object?object_id=2012+DA14&commit=Show

Philip Lantz · Feb 22, 2013

Jorgen said:
I bet he's talking about Electric Fence, not Valgrind.

Yes, you're right. Sorry for the error.

What should I do Before I give up programming?	6	Jan 14, 2023
a fast malloc/free implementation & benchmarks	0	Mar 20, 2011
CSS Grid in a nutshell (Part 1)	0	May 31, 2023
Optimization idea: put (y&1) in [] instead of if()	11	Nov 4, 2009
I don't know if im on the right forum but i think someone did something weird on my instagram	0	Jan 31, 2023
Help with Javascript on my wordpress page	1	Aug 31, 2023
There is something weird going on my instagram and i need to know if it was done on purpose by someone else	2	Jan 31, 2023
Free root access	1	Apr 1, 2009

IF-free conception.

Evgeney Knyazhev

Öö Tiib

BartC

Evgeney Knyazhev

Evgeney Knyazhev

James Kuyper

Ben Bacarisse

Ken Brody

Ken Brody

glen herrmannsfeldt

glen herrmannsfeldt

glen herrmannsfeldt

Philip Lantz

James Kuyper

glen herrmannsfeldt

glen herrmannsfeldt

Jorgen Grahn

James Kuyper

Geoff

Philip Lantz

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads