Avoiding recursive stack overflow in C on Unix/Linux?

I

Ian Collins

You're conveniently glossing over the number of lines required to adapt
to some particular library interface. Also, using a library function
usually involves callbacks for comparing elements, which adds runtime
overhead.

As I posted else-thread, the C++ binary_search interface is as simple as
it gets and the code happily expands in-line. So there isn't any
adapting or overhead.
Assuming you believe in unit tests at all, testing the unit containing
the search will cover it, and that has to be done regardless of
implementation. I have yet to see anyone write separate unit tests for
every line of code.

Well I only write code to pass tests, so I guess I'm one.
 
K

Keith Thompson

Måns Rullgård said:
A binary search is roughly 10 lines of code. Writing one really doesn't
take long.

In "Programming Pearls", Jon Bentley writes:

While the first binary search was published in 1946, the first
binary search that works correctly for all values of n did not
appear until 1962.
 
M

Michael Press

Joshua Maurice said:
I politely disagree, and I must express my extreme disbelief at this
statement that you can write a good correct binary search algorithm
faster than you can use C++ std::map, or C++ std::lower_bound, et al.

Does the library write your compare routine
and unit test it for me? Does the library
routine inline the code that examines the
return and executed the action dependent on
the return value? Does the library return
a value in exactly the form I want?
 
I

Ian Collins

Does the library write your compare routine
and unit test it for me?

If it has to be written, you have to write and test the compare routine
whether you use a library for the search or not.
Does the library
routine inline the code that examines the
return and executed the action dependent on
the return value?

That's the compiler's job, not the library's.
Does the library return
a value in exactly the form I want?

How many forms of true and false are there?
 
M

Michael Press

Keith Thompson said:
In "Programming Pearls", Jon Bentley writes:

While the first binary search was published in 1946, the first
binary search that works correctly for all values of n did not
appear until 1962.

I've had sixteen years to get it right too.

I feel sorry for those without the confidence
to write their own binary search.
 
D

Dr Nick

Måns Rullgård said:
Assuming you believe in unit tests at all, testing the unit containing
the search will cover it, and that has to be done regardless of
implementation. I have yet to see anyone write separate unit tests for
every line of code.

I'm not a unit test freak, but if you write a binary search (something
that's surprisingly difficult to get right in my limited experience) you
absolutely have to test it on - at least - "item is in the list"
(specifically testing first item, last item, exact middle item), "item
is not in the list" (specifically testing off the top, off the bottom)
and probably a couple of others.

Otherwise you will find that it works fine on your test data and fails
in actual use.

Obviously there has to be a line somewhere between code you write
yourself (or the library would be every programme you ever need to
write) and code you don't (or C would only have putchar and would leave
to write puts and printf etc). But unless you are desperate to avoid
the call to the comparison function (do I read the thread as saying that
C++ avoids this by inlining?) I think binary search lives on the library
side of the line.
 
I

Ian Collins

bsearch either
returns NULL or
returns a pointer to an element of an array.

It is also possible to want a binary search
which returns a pointer to the first occurance in the array
or to want a binary search
which returns a pointer to the last occurance in the array.

All of those are supplied by the C++ standard library (plus the option
of returning the range of matching elements).
 
B

boltar2003

I can still write them in less time than it takes me to learn C++.

Perhaps you should learn C++. While it seems to be the case that the standards
committee doesn't seem to know when to leave well enough alone and seem to
want to turn it into the largest, most unwieldy language the world has yet
seen by adding language "features" that should really be left in libraries,
some of its basic functionality is very useful. I'd rather not have to
go back to function pointers in structures to mimic OO again or have endless
function return value error checks instead of just throwing exceptions.

B2003
 
R

Rainer Weikusat

Kleuskes & Moos said:
STL is a library, not a language element. A superbly useful library at
that. That's exactly where stuff like lists, queues, stack and what
have SHOULD be implemented. After all, it's no use inventing the wheel
all over again every time you need one.

Funny as it may seem, but that's what people who construct vehicles
actually often do: They design different types of wheels for different
types of vehicles and actually even construct new types of wheels in
order to improve the designs of existing of wheels. But they
don't usually claim that they invented the basic design because they
also created an implementation of it. It takes a
programmer-turned-math-geek to be so out of touch with reality to not
notice that this analogy is nowhere near being accurate and to also
avoid noticing that a one-size-fits-all-wheel simply doesn't exist
while - at the same time - claiming to have invented something
comparable to 'the wheel' when all he has actually done is create
another implementation of an algorithm which has been well-known for
thirty years or so and was usually the topic of some kind of
undergraduate course.

Given these glaring deficiencies in judgement and the completely
ludicrous self-assessment, such a person is not one one can expect to
give sensible advice to others.
 
M

Malcolm McLean

On Tue, 10 May 2011 04:42:53 -0400


Perhaps you should learn C++.
I abandoned C++. I haven't written any new code in it for years.

There were many reasons. Mainly, I'd been getting more and more
sceptical about object-oriented design. Quite a lot of people have now
come round to this view. But the standard template library was a
factor. The problem is you end up writing copy constructors and
overloading comparators for too many, trivial objects. Then the syntax
isn't nice and readable. It also generated incomprehensible error
messages when something failed - that's probably been fixed now, but
it was a major nuisance at the time. Then there were too many ways of
doing things - the correct way, passing a controlled sequence, worked,
but it was too difficult to get people to use it consistently. A data
structure isn't just a way of storing information, it's also an
interface by which parts of the program talk to each other, if you've
got plain arrays and STL vectors and arrays of objects and arrays of
object pointers all holding similar data, then integration becomes
very difficult.
 
B

boltar2003

There were many reasons. Mainly, I'd been getting more and more
sceptical about object-oriented design. Quite a lot of people have now

It can lead to overblown , overdesigned programs certainly. But it has
its uses. Its just that some coders brought up on OO seem to think it
should be used for everything and don't stop to consider whether its
appropriate for certain tasks. Also the current fad of design patterns has
a lot to answer for - many new coders don't think about how to solve a
problem from the ground up, they just try and solve it by plugging together
the patterns they know in a lego brick style.
factor. The problem is you end up writing copy constructors and
overloading comparators for too many, trivial objects. Then the syntax

True, but I think the latest version of C++ has fixed the constructor issue -
if you don't provide the constructors for your child class it defaults to
using the parent class versions. A godsend if you want to inherit from
something like std::string without having to have a 50 line long class
definition to make it work properly.

B2003
 
K

Kleuskes & Moos

There are good reasons to write it oneself.

Oh sure... Please... It was merely an observations that there are good
reasons to have libs like BOOST and STL, too. Not some kind of fatwa
prohibiting this, that or the other. If you feel you're better off
rolling your own, then you should.
When I want
a binary search I simply write it on the fly; and embed
whatever I want to do with the search result right there.
It takes as long or longer to write the interfaces to a
library binary search as it does to write one that does
exactly what I want.

If that's the best solution for you... Who am i to protest?
 
K

Kleuskes & Moos

On May 10, 12:12 pm, (e-mail address removed) wrote:> On Tue, 10 May 2011 04:42:53 -0400



I abandoned C++. I haven't written any new code in it for years.

Where C provides a programmer with ample rope to hang himself, C++
provides a firing-squad, blindfold and last-cigarette. It requires
quite a lot of attention both to detail _and_ the big picture. If you
loose sight of the big picture or lack the discipline to stick to it, C
++ programs get get very messy, very fast.

If used properly, however, it's a very potent tool to write amazing
software and frankly, i like the language for the same reason i like
C: It does what i tell it to without being a nagging mother-in-law.

STL is one of the reasosns for that. I don't have to mess around to
get all the details of various container-classes right, but i can
concentrate on the problem at hand instead.
There were many reasons. Mainly, I'd been getting more and more
sceptical about object-oriented design. Quite a lot of people have now
come round to this view.

No single programming "paradigm" is a panacea and will suite everyone,
but to quote B. Stroustrup himself: The major cause of complaints is C+
+ undoubted success. As someone remarked: There are only two kinds of
programming languages: those people always bitch about and those
nobody uses.

<snip>
 
M

Michael Press

Dr Nick said:
I'm not a unit test freak, but if you write a binary search (something
that's surprisingly difficult to get right in my limited experience) you
absolutely have to test it on - at least - "item is in the list"
(specifically testing first item, last item, exact middle item), "item
is not in the list" (specifically testing off the top, off the bottom)
and probably a couple of others.

Otherwise you will find that it works fine on your test data and fails
in actual use.

Students are set exercises. The purpose of an exercise
is to get stronger. Sometimes binary search is offered.
The idea is to discover the loop invariant. Eventually
the student learns to write for the edge and corner cases
_first_. Thus a purpose of the exercises is realized.
I am surprised at how many people here admit to not
knowing how to write a binary search.
Obviously there has to be a line somewhere between code you write
yourself (or the library would be every programme you ever need to
write) and code you don't (or C would only have putchar and would leave
to write puts and printf etc). But unless you are desperate to avoid
the call to the comparison function (do I read the thread as saying that
C++ avoids this by inlining?) I think binary search lives on the library
side of the line.

The idea is that in writing a binary search in place
one avoids having to write an interface to a library
routine. I embed bits of code in the search routine
itself, rather than interpreting the library routine
results and then doing what I need.

I cannot see myself ever calling bsearch(3). I will
call qsort(3). Note the existence of qsort_r.
Sometimes I'll write an insertion sort with binary
search. Sometimes I'll write a heapsort. This stuff
is not hard.
 
M

Michael Press

Ian Collins said:
If it has to be written, you have to write and test the compare routine
whether you use a library for the search or not.

Or embed it in the binary search code.
It's typically one line.
That's the compiler's job, not the library's.

No. You missed the point. What do you do with
the result of the binary search? That too can
be embedded in the binary search routine.
How many forms of true and false are there?

Binary search does not refer to the return value---the
return is not binary valued. There are different
purposes for a binary search, and different ways to use
the result.
 
I

Ian Collins

Or embed it in the binary search code.
It's typically one line.

Which is what the C++ search functions do.
No. You missed the point. What do you do with
the result of the binary search? That too can
be embedded in the binary search routine.

Doesn't the search have to complete before it has a result?
Binary search does not refer to the return value---the
return is not binary valued. There are different
purposes for a binary search, and different ways to use
the result.

Indeed, I was hung up on binary_search, the C++ library also includes
upper and lower bounds as well as equal range search functions.

It is better design to decouple the algorithm from the result
processing. If nothing else, it gives you a reusable algorithm. Better
still if this can be done with little or no overhead. Overhead is often
seen as a blocker in C, but it isn't in C++ which is one reason C++ has
an algorithms library and C doesn't.
 
I

ImpalerCore

Students are set exercises. The purpose of an exercise
is to get stronger. Sometimes binary search is offered.
The idea is to discover the loop invariant. Eventually
the student learns to write for the edge and corner cases
_first_. Thus a purpose of the exercises is realized.
I am surprised at how many people here admit to not
knowing how to write a binary search.


The idea is that in writing a binary search in place
one avoids having to write an interface to a library
routine. I embed bits of code in the search routine
itself, rather than interpreting the library routine
results and then doing what I need.

I cannot see myself ever calling bsearch(3). I will
call qsort(3). Note the existence of qsort_r.
Sometimes I'll write an insertion sort with binary
search. Sometimes I'll write a heapsort. This stuff
is not hard.

In this day and age with more and more glue coders, knowing how to
write a binary search is not required anymore. Beneficial yes, but
not required for many programmers. However, having someone on the
team that can do algorithm work when needed is still vital.

And I agree that bsearch is not hard with some study, but you've then
doomed yourself to cut-and-paste search algorithms for every little
quirk of data type or search semantic. There are methods to abstract
bsearch within an array container that make it quite useful, and the
value of providing a canned search algorithm to less skilled coders
cannot be understated.

(crosses fingers that my bsearch is "right")

\code snippet
/*!
* \brief Searches a \c c_array for an object that matches the given
* value using a binary search algorithm.
* \param array A \c c_array.
* \param object The object to find.
* \param cmp_fn The function to call for each object comparison. It
* should return 0 when the objects match.
* \param type The object type.
* \return The address of the object in the array, or \c NULL if the
* object is not found in the array.
*
* The \c c_array_bsearch function requires the array to be sorted
* prior using this to search objects in the array. This sorting
* can be done during insertion using \c c_array_insert_sorted, or
* after all the objects have been inserted into the array using
* \c c_array_sort. If the \c c_array is not sorted, there is the
* possibility for false negatives.
*
* The binary search algorithm is preferred for large arrays of
* objects as the performance of the algorithm is O(log n) rather
* than O(n) for the \c c_array_search and \c c_array_rsearch
* functions. Keep in mind that if duplicate objects are in the
* array, the \c c_array_bsearch algorithm may point to any one of
* them rather than the first or last instance of the object.
*/
#define c_array_bsearch( array, object, cmp_fn, type ) ( (type*)
(gc_array_bsearch( (array), (object), (cmp_fn), sizeof(type) )) )

void* gc_array_bsearch( struct c_array* array,
const void* object,
c_compare_function cmp_fn,
size_t type_sz )
{
void* found = NULL;
ssize_t l, h, m;
void* m_p;
int cmp;

if ( array && cmp_fn )
{
C_ASSERT( array->element_size == type_sz );

if ( array->size )
{
/*
* l - low boundary of region to search
* h - high boundary of region to search
* m - median of region to search (if array is sorted)
* m_p - pointer to the corresponding median object in the
array.
*/
l = 0;
h = array->size - 1;

while ( l <= h )
{
/*
* The following statement gets the midpoint of the range,
* by obtaining the average using a slightly obtuse method.
*
* The normal method to get the midpoint of a range is to
* sum the low and high boundary and divide by 2.
*
* (high + low) / 2
*
* The problem is that this limits the maximum value of the
* mid-point to SSIZE_MAX / 2 without encountering overflow.
*
* low + ((high - low) / 2) or
*
* 2/2 low + 1/2 high - 1/2 low == (high + low) / 2
*
* This still evaluates to the expression '(high + low) / 2',
* but avoids the overflow issue from 'high + low'.
*/
m = l + ((h - l) / 2);
m_p = gc_array_at( array, m );

cmp = cmp_fn( m_p, object );
if ( cmp < 0 ) {
l = m + 1;
}
else if ( cmp > 0 ) {
h = m - 1;
}
else
{
found = m_p;
break;
}
}
}
}

return found;
}

/*!
* \brief Inserts a new object into the array using the comparison
* function.
* \param array A \c c_array.
* \param object The object.
* \param cmp_fn The function to compare objects in the array.
* \param type The object type.
*/
#define c_array_insert_bsorted( array, object, cmp_fn, type )
(gc_array_insert_bsorted( (array), (object), (cmp_fn), sizeof(type) ))

void gc_array_insert_bsorted( struct c_array* array,
void* object,
c_compare_function cmp_fn,
size_t type_sz )
{
size_t index = 0;
ssize_t l, h, m;
void* m_p;
int cmp;

if ( array )
{
C_ASSERT( array->element_size == type_sz );

if ( cmp_fn )
{
if ( array->size )
{
/*
* l - low boundary of region to search
* h - high boundary of region to search
* m - median of region to search (if array is sorted)
* m_p - pointer to the corresponding median pointer in the
array.
*/
l = 0;
m = 0;
h = array->size - 1;
cmp = 0;

while ( l <= h )
{
m = l + ((h - l) / 2);
m_p = gc_array_at( array, m );

cmp = cmp_fn( m_p, object );
if ( cmp < 0 ) {
l = m + 1;
}
else if ( cmp > 0 ) {
h = m - 1;
}
else {
break;
}
}

/*
* At the end of the loop, the value of cmp will determine
which
* side of the object referenced by 'm_p' to insert the
object.
* If the object at 'm_p' is equal or greater than the object
* to be inserted, the insert should place the object before
the
* location 'm'. If it is less, the object should be placed
* after location 'm'.
*/
index = m;
if ( cmp < 0 ) {
++index;
}
}
}

gc_array_insert( array, index, object, array->element_size );
}
}
\endcode

By creating appropriate comparison functions, one can sort and search
based on different fields of a structure quite easily. In my opinion,
writing a valid comparison function is much easier than man-coding
each search as you need them, especially if your struct has several
members to sort and search on. You admit to using qsort, but don't
see the benefit of bsearch? Doesn't seem logical to me.

The real question is whether one should in general trust the standard
C library bsearch implementation over yet another custom coded binary
search that some other programmer designed? If you're creating a
binary search with sprinkles, like one that modifies the container
while searching, that is something different.

Best regards,
John D.
 
J

Jonathan de Boyne Pollard

malloc isn't the system call that adds memory to the process, sbrk
is. [...]
Ah! Like Keith I'd often wondered about just why this happens. That
makes a lot of sense: malloc wants big chunks of memory to parcel out
from.
Yes. malloc() and free() are almost always just library functions, that
are layered on top of the operating system's actual memory allocator.
The operating system memory allocator varies quite widely. M. Margolin
mentioned sbrk() for simplicity, but nowadays that model, which doesn't
really match up very conveniently with the process address spaces that
one gets in these days of dynamically loaded libraries, multiple
threads, and so forth, is commonly relegated. Nowadays, a common
implementation is to suballocate from within usually sparsely committed
(i.e. the address space is reserved but not all pages are committed)
heaps, usually in the MiBs, created on demand via VirtualAlloc(),
mmap(), DosAllocMem(), or whatever mechanism the operating system
provides. Operating system memory allocators usually work in terms of
whole pages; and again a common implementation is to commit and decommit
one or more pages, within the sparse page range used for a heap, as
appropriate as memory is malloc()ed and free()d.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,175
Latest member
Vinay Kumar_ Nevatia
Top