Advanced data structures

J

jacob navia

I have started reading
"Advanced data structures" from Peter Brass.

He says:

<quote>
This is not a book on object-oriented programming, but on the
algorithmic side of data structures. In May 2009 I finally removed the
ps/pdf-files of the book; go and buy it. Indeed the printed version
looks nice, and the copyeditor and the production people from Cambridge
UP did good work.
<end quote>

I find this book excellent. Everything is written in C, it
is clear, and explains very well the (quite difficult) stuff.

I see that I will have to read this before going further in the
container library. This is exactly the book I was looking for.

It is obvious too, that C is a good language for conveying
algorithmic thought. No, it is NOT just a language for some
obscure embedded systems.

jacob
 
I

Ian Collins

jacob said:
It is obvious too, that C is a good language for conveying
algorithmic thought. No, it is NOT just a language for some
obscure embedded systems.

Who said it was?
 
B

bartc

Richard Heathfield said:
I'm glad you like it. I haven't read it, but you make it sound worth the
trouble.



I couldn't agree more. It is precisely because (well-written) C is so good
at describing algorithms in all their nitty-gritty detail that it is the
lingua franca of nearly all the best general programming books.

There might be more apt languages than C (not everyone is comfortable with
zero-based indexing for example), but if it's a choice between 'MIX' and C,
then C wins.
 
F

Frank

I have started reading
"Advanced data structures" from Peter Brass.

He says:

<quote>
This is not a book on object-oriented programming, but on the
algorithmic side of data structures. In May 2009 I finally removed the
ps/pdf-files of the book; go and buy it. Indeed the printed version
looks nice, and the copyeditor and the production people from Cambridge
UP did good work.
<end quote>

I find this book excellent. Everything is written in C, it
is clear, and explains very well the (quite difficult) stuff.

I see that I will have to read this before going further in the
container library. This is exactly the book I was looking for.

It is obvious too, that C is a good language for conveying
algorithmic thought. No, it is NOT just a language for some
obscure embedded systems.

jacob

Sounds great. How much is it? Does it come with a companion disc?
 
F

Frank

Who said it was?

I find a lot of C literature slanted this way. You certainly don't blame a
guy for publishing any technical book, but the embedded stuff usually has a
lot of proprietary and non-standard stuff. I know I'm not the owner, so I
lose interest.

No one will ever put a label on standard c and say "this belongs to rupert
murdoch." I'm fine with our Switzerland ethic of making it commercially
viable.

Chuck would never say a program were portable unless it could be ported to
a clicker. I think jacob looks at portability like I do.
 
J

jacob navia

Frank a écrit :
Sounds great. How much is it? Does it come with a companion disc?

I paid 50.62$ at Amazon.com (new) but maybe you can find it for less. There is
a used version (at amazon) starting at 45$, other sites may have better offers.

The code of the book is available at the author's web site. Just cut/paste
from there.

jacob
 
S

Seebs

For its power to communicate algorithmic ideas readably, I would choose
C not just over MIX, but over Java, C++, Li*p... everything. Have you a
more apt language in mind?

In terms of number of people who can read them, probably not. For many
algorithms, though, I'd probably find it easier to express them in Ruby,
and I'm told Icon is also excellent if you know it...

C is pretty expressive, but a lot of things end up with a bunch of
overhead that isn't actually the implementation of the algorithm, but
rather, the bookkeeping to allow that implementation to work.

-s
 
F

Frank

Anyone who is not comfortable with 0-based indexing needs to buy a new
mattress. The contortions you have to go through to use other bases just
melt away when you switch to 0.

It's not the indexing. I wend in an out of fortran and c. In support of
your point, I found that the fortran bit model for integers is right to
left and from 0 to wordsize - 1. It's a serious gotcha under
circumstances, e.g. your word is shifted off its axis by one, you've lost a
bit and have another that came out of nowhere.
For its power to communicate algorithmic ideas readably, I would choose
C not just over MIX, but over Java, C++, Li*p... everything. Have you a
more apt language in mind?

If I take your "you" to be familiar plural, then I would distinguish
between algorithm and caller. I love the C bindings in fortran! I can
look at output in a language where I can write
print *, anything
, while the computational capability came out of a can named C.
 
J

jacob navia

Seebs a écrit :
In terms of number of people who can read them, probably not. For many
algorithms, though, I'd probably find it easier to express them in Ruby,
and I'm told Icon is also excellent if you know it...

In ruby, you have built-in hash tables, sets, and many other containers.
You have to use the implementation that the language offers. Obviously
you *could* write your own, but all performance considerations would
be difficult to measure in an interpreter.

Ruby also, has a way of overloading operators that sometimes looks quite
weird:

[1,2,3]+[3,4,5] --> [4,6,8] ?
[1,2,3]-[3,4,5] --> [-2,-2,-2] ?

No. [1,2,3]-[3,4,5]-->[1,2], and [1,2,3]+[3,4,5]-->[1,2,3,4,5]

This is surely not really obvious for people not speaking ruby
fluently...
> C is pretty expressive, but a lot of things end up with a bunch of
overhead that isn't actually the implementation of the algorithm, but
rather, the bookkeeping to allow that implementation to work.

-s

But exactly THAT makes you AWARE of that overhead, that is an inherent
part of the algorithm's cost!

For instance if we design an algorithm without explicit free, using a
garbage collector (and with JAVA you do not have any other choice) the
cost of an algorithm that uses memory sparingly and another that
needlessly thrashes memory is not visible in the code you write.

Hiding "details" can be OK, but when designing an algorithm and
evaluating its cost, those details do matter. In ruby or java
memory management costs are hidden by the language, and you are
forced to use the OO paradigm.
 
J

James Dow Allen

[Check out Peter Brass's book. Source code on-line.]

I looked at the source for tries (digital search trees) and was
disappointed to see that he didn't use "traversor structures."
I know Ben Pfaff uses such things, and it seems excellent idea.
Do others use them? Can Brass' examples be recommended if he
*doesn't* use them?
Seebs a crit :

This seems to me to be a very apt description, though the notion
of *which* overhead is tedious might vary: there's "bookkeepping"
like includes and prototypes, and there's the tedium of pointer
manipulations or bit twiddling.
But exactly THAT makes you AWARE of that overhead, that is an inherent
part of the algorithm's cost!

I agree. The close similarity, at least in character, between
low-level C and a machine's likely instruction set, make it the
more comfortable choice for hardware-interfacing tasks like
memory management.

James Dow Allen
 
B

Bruce Cook

bartc said:
There might be more apt languages than C (not everyone is comfortable with
zero-based indexing for example), but if it's a choice between 'MIX' and
C, then C wins.

I'm interested in why you'd pick zero-based indexing as something people
would specifically have problems with in C.

I personally find zero-based indexing fairly natural, other languages which
1 based indexing such as Fortran and Cobol used to bug me, especially after
I had started programming in C and "knew there was a better way" :)

zero based indexing certainly makes more sense mathematically.

Bruce
 
F

Frank

Frank a écrit :

I paid 50.62$ at Amazon.com (new) but maybe you can find it for less. There is
a used version (at amazon) starting at 45$, other sites may have better offers.

The code of the book is available at the author's web site. Just cut/paste
from there.

jacob

Sounds good. If anyone in the beehive state wants to ask me what I want
for Christmas, I can be precise.

Cheers,
--
 
N

Nick

Bruce Cook said:
I'm interested in why you'd pick zero-based indexing as something people
would specifically have problems with in C.

I personally find zero-based indexing fairly natural, other languages which
1 based indexing such as Fortran and Cobol used to bug me, especially after
I had started programming in C and "knew there was a better way" :)

zero based indexing certainly makes more sense mathematically.

Once you get into "friendly" string manipulation, there are a few arguments
for 1-based indexing, particularly when combined with "zero is false,
everything else is true". For example, it would be "nicer" in
Javascript to be able to write »if(somestring.indexOf("test"))« rather
than having to add »!=-1« at the end. Of course, C using pointers for
this manages to do it quite well with the NULL return from strstr.
 
S

Stefan Ram

jacob navia said:
I find this book excellent. Everything is written in C, it

It seems to allocate a stack entry

st = (stack_t *) malloc( ...

and then, in the next line

st->base = ...

it seems to use »st« without ever taking account for the
possibility that »st« might be 0?

This might be fine when a program is used by the programmer
himself, who can cope with segment violations or other
strange runtime-behavior. But it might not be appropriate
when software is intended to be delievered as a product to a
customer who needs to depend on it.
 
J

jacob navia

James Dow Allen a écrit :
[Check out Peter Brass's book. Source code on-line.]

I looked at the source for tries (digital search trees) and was
disappointed to see that he didn't use "traversor structures."
I know Ben Pfaff uses such things, and it seems excellent idea.
Do others use them? Can Brass' examples be recommended if he
*doesn't* use them?

I do not know but the tries subject is treated very extensively
in that book, sorry. Pages 335 to 356 are about tries, and he
shows several implementations.

Maybe you didn't look at the correct one?

Because in many examples he shows a first "trivial" implementation that
he later refines more and more. If you would explain what do you
mean by "traversor structures" we could go further.

jacob
 
J

jacob navia

Stefan Ram a écrit :
It seems to allocate a stack entry

st = (stack_t *) malloc( ...

and then, in the next line

st->base = ...

it seems to use »st« without ever taking account for the
possibility that »st« might be 0?

This might be fine when a program is used by the programmer
himself, who can cope with segment violations or other
strange runtime-behavior. But it might not be appropriate
when software is intended to be delievered as a product to a
customer who needs to depend on it.
He does that, as he explains in the text, because they are replaced
in the next section by a heap implementation where he builds an
intermediate layer (as I did for the list container) that allocates
nodes from a linked list of blocks.

The book is not a fool proof receipt book, it is a book about
advanced data structures. The code is just to illustrate the text!
 
J

jacob navia

jacob navia a écrit :
The book is not a fool proof receipt book, it is a book about
advanced data structures. The code is just to illustrate the text!

What I mean with this sentence is that when reading code about an
algorithm, it is easier to read if the error handling code is not
shown.

What I am interested in with this book is to learn about the
data structures themselves, not about good software practices,
testing the return value of malloc, writing int main(void) or
similar problems.

In my implementation of the heap I did test for NULL return,
and I issue the corresponding error. In a book about advanced
data structures that is just a distracting detail.
 
S

Stefan Ram

jacob navia said:
What I mean with this sentence is that when reading code about an
algorithm, it is easier to read if the error handling code is not
shown.

I have another definition of »algorithm«: An algorithm is
the implementation of some feature using the means of a
specific target language. That is, for example, an algorithm
for a tree search in Prolog differs from the same algorithm
in Assembler. The algorithm depends on the elementary
operations available. It can not be abstracted away from the
target language too much. There is no such thing as a
»language-independent algorithm«. For example, in a language
with an operator »¤« for a tree search of x in the tree T,
the algorithm for a tree-search just ist »x ¤ T«.
What I am interested in with this book is to learn about the
data structures themselves, not about good software practices,
testing the return value of malloc, writing int main(void) or
similar problems.

To me, when programming in C, handling malloc in the correct
way is an integral part of the algorithm. It can not be
abstracted away. Otherwise, I'd use a language with more
automated memory management.
In my implementation of the heap I did test for NULL return,
and I issue the corresponding error. In a book about advanced
data structures that is just a distracting detail.

I would not actually issue from a library function in the
sense of a side-effect with access to external files or
consoles or so, but just return 0. For example, I »quote«
some code, which actually was simplified by me:

node_t *get_node() /* version simplified by Stefan Ram */
{ ...
currentblock = malloc( BLOCKSIZE * sizeof(node_t) );
return( currentblock ); }

http://www-cs.ccny.cuny.edu/~peter/dstest/li_stack.c

This still is ok, as it returns 0 in case malloc returns 0.

.-------------------------------------------------------.
| We do not need extra means to »signal« a failure, |
| but we must not hide it from the caller. |
'-------------------------------------------------------'

But then, later:

st = get_node();
st->next = NULL;

http://www-cs.ccny.cuny.edu/~peter/dstest/li_stack.c

Here »st« is used as if it was known at runtime that it was
not 0. Such code is appropriate for some other languages,
but IMHO not for C.
 
J

James Dow Allen

James Dow Allen a écrit :
[Check out Peter Brass's book.  Source code on-line.]
I looked at the source for tries (digital search trees) and was
disappointed to see that he didn't use "traversor structures."
I know Ben Pfaff uses such things, and it seems excellent idea.
Do others use them?  Can Brass' examples be recommended if he
*doesn't* use them?

Because in many examples he shows a first "trivial" implementation that
he later refines more and more. If you would explain what do you
mean by "traversor structures" we could go further.

I examined several modules linked from
http://www-cs.ccny.cuny.edu/~peter/dstest.html
all with the same unfortunate result. Let us admit,
however, that much or most other tree-traversing code also
does *not* use such "traversor structures". Ben Pfaff's
code does and, whether or not he claims it as original
invention, can doubtless do better than I at lucid
discussion thereof.

Not counting the rather trivial
typedef char object_t;
Brass's code declares a single object type
(node), where I'd use at least another two.
Add a table or table descriptor type more than
just a certain node (the root) becoming all
there is to a "tree." The other type I'd
surely introduce for all but the most trivial
search tree is "traversor", which holds the
state of a search in progress (or search completed).

To understand the need, consider that we operate
the tree searcher as a co-routine, returning the
entire tree in pre- or post-order. Brass's
code can't do that: it has create(), find(), insert()
and delete() but no treewalker(). When you write
treewalker() you'll see the need for the traversor
object; once you implement it you find it convenient
to use during find(), delete(), etc. as well.

James Dow Allen
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top