Questions on Using Python to Teach Data Structures and Algorithms

E

efrat

Hello,

I'm planning to use Python in order to teach a DSA (data structures
and algorithms) course in an academic institute. If you could help out
with the following questions, I'd sure appreciate it:
1. What exactly is a Python list? If one writes a[n], then is the
complexity Theta(n)? If this is O(1), then why was the name "list"
chosen? If this is indeed Theta(n), then what alternative should be
used? (array does not seem suited for teaching purposes.)
2. Suppose I have some file example.py, and I'd like to incorporate it
**into** part of an HTML page with nice syntax highlighting and all the
shebang. Is there an easy way to do so?
(Sorry, but any Google query involving "Python" and "HTML" (with any
other additional terms) led to Python HTML processing libraries.)
3. Are there any useful links for Python/DSA education? I found "Data
Structures and Algorithms with Object Oriented Design Patterns"
(http://www.brpreiss.com/books/opus7/html/book.html). It is a fine book,
but it is unsuitable: my students are electrical-engineers, and barely
know how to program; teaching them DSA, python, **and** stuff like the
visitor pattern seems impossible.

Python is such a cool language - I'm really hoping the students will
enjoy it as much as I do. Once again, many thanks for helping out with this.

Thanks,

Efrat
 
B

bearophileHUGS

efrat:
1. What exactly is a Python list?

A dynamic array that can grow geometrically on the right.

If one writes a[n], then is the complexity Theta(n)? If this is O(1),<

It is O(1).

then why was the name "list" chosen?

I'd too love to know why the wrong "list" name was chosen for them,
instead of "array". (Maybe because "list" is shorter, or because ABC
called them "lists"...)

2. Suppose I have some file example.py, and I'd like to incorporate it
**into** part of an HTML page with nice syntax highlighting and all the
shebang. Is there an easy way to do so?<

There are many programs that do this, I use a modified version of
PySourceColor:
http://bellsouthpwp.net/m/e/mefjr75/


Using Python to teach data structures and algorithms to
electrical-engineers students:
The following personal ideas may seem wrong, but if they are wrong,
than I'd like to know why.
I think Python is only partially fit for your purpose. If you want to
teach how complex data structures work in general, and some smart
algorithms on them, like teaching some interesting graph algorithms,
then Python is fit, because implementing such algorithms is often
simple, etc.
But to manage simple data structures Python isn't good, because it's
too much slow compared to the simple operations, and it uses too much
memory. One of the basic data structures is the chained list, you can
easly implement a chained list in Python, but the result is often
useless and without meaning, maybe even for teaching purposes. Python
is too much hi-level, while most of the basic data structures use
pointers and they have a meaning if done closer to the 'metal'. With
Python you can't have pointers (just names of objects) and some times
if you use a "fast" data structure you end doing things slower than
using the built-in data structures like dicts. So to teach some of the
basic data structures to your electrical-engineers students I think
Pascal is the best choice still :)
(Note: to teach DSA to CS students C can be fit too, and to teach a bit
of DSA to younger people Python can be better.)

Bye,
bearophile
 
G

George Sakkis

efrat said:
1. What exactly is a Python list? If one writes a[n], then is the
complexity Theta(n)? If this is O(1), then why was the name "list"
chosen? If this is indeed Theta(n), then what alternative should be
used? (array does not seem suited for teaching purposes.)

Indexing for python lists is O[1]. Why shouldn't they be named lists ?
2. Suppose I have some file example.py, and I'd like to incorporate it
**into** part of an HTML page with nice syntax highlighting and all the
shebang. Is there an easy way to do so?
(Sorry, but any Google query involving "Python" and "HTML" (with any
other additional terms) led to Python HTML processing libraries.)

Check out MoinMoin, it colorizes python as well as other languages and
text types: http://moinmoin.wikiwikiweb.de/HelpOnFormatting
3. Are there any useful links for Python/DSA education? I found "Data
Structures and Algorithms with Object Oriented Design Patterns"
(http://www.brpreiss.com/books/opus7/html/book.html). It is a fine book,
but it is unsuitable: my students are electrical-engineers, and barely
know how to program; teaching them DSA, python, **and** stuff like the
visitor pattern seems impossible.

"Beginning Python - From Novice to Professional" is approachable and
great as a textbook IMO. As a bonus, it covers up to python 2.4, which
very few existing books do.

George
 
M

Marc 'BlackJack' Rintsch

1. What exactly is a Python list? If one writes a[n], then is the
complexity Theta(n)? If this is O(1), then why was the name "list"
chosen?

Why not? It has all the methods one expect from an abstract data type
"list". It's not the O() behavior but the interface that defines abstract
data types. If it's a (double) linked list or a dynamical array under the
hood is an implementation detail.

Ciao,
Marc 'BlackJack' Rintsch
 
S

sturlamolden

efrat said:
1. What exactly is a Python list? If one writes a[n], then is the
complexity Theta(n)? If this is O(1), then why was the name "list"
chosen? If this is indeed Theta(n), then what alternative should be
used? (array does not seem suited for teaching purposes.)

A Python list is an array of object references that has some empty
slots (or one empty slot?) for growing quickly to the right.

If you want to make a chained structure, then perhaps you know LISP?
This is what the basic machinery of LISP looks like in Python:

def cons(a,b)
return [a,b]

def car(structure)
return structure[0]

def cdr(structure)
return structure[1]

Python lists are more powerful than you would think! You don't need
classes or OOP to create linked lists or tree structures in Python.

Remember that O(1) is not neccesarily faster than O(N)! Unless your
linked list is very big, you will get something called a 'cache miss'
inside your CPU. Thus it is usually more efficient to work with dynamic
arrays. Further, you can create hybrid array-list structures (e.g.
Java's ArrayList) that outperform lists and arrays with respect to
adding new elements. But they will have to be tuned to your particular
hardware architecture. Growing a linked list node by node is an
excercise for fools (and DSA students?) It may look good in DSA
textbooks, but it is extremely inefficient on real world computers.

Python's lists are implemented as dynamic arrays internally to be
efficient on the kind of data we normally work with. Not only do small
dynamic arrays grow faster than small lists, they also index much
faster. Why are they called "lists" then? Because Guido want you to
look at them conceptually as lists. That is what they are.

If you want real 'fixed size' arrays like Fortran and Matlab, then you
should add 'NumPy' to your Python (http://www.scipy.org). Your science
and engineering students will find NumPy, SciPy and Matplotlib
(http://matplotlib.sourceforge.net) valuable, so direct them to those
sources.
 
S

Steve Holden

efrat: [...]
then why was the name "list" chosen?


I'd too love to know why the wrong "list" name was chosen for them,
instead of "array". (Maybe because "list" is shorter, or because ABC
called them "lists"...)
I suspect it's because of their intrinsic one-dimensional nature.

regards
Steve
 
R

Ramon Diaz-Uriarte

Going back to the original question, a related question: does anybody
know why there are so few books on data structures and algorithms that
use Python?

I remember that, at least ~ 12 years ago there were many (and very
good) books that used Pascal for this topic. So when I did my own
search for one in Python (just for my own consumption and
enlightnment) and could only find the same one as the original poster
of this thread [1], I was very surprised. No publishers have felt the
need to fill this gap?



Best,

R.

[1] http://thread.gmane.org/gmane.comp.python.general/486698/focus=486698
"Data Structures and Algorithms with Object Oriented Design Patterns"
(http://www.brpreiss.com/books/opus7/html/book.html) and was surprised.
 
F

Fredrik Lundh

Ramon said:
Going back to the original question, a related question: does anybody
know why there are so few books on data structures and algorithms that
use Python?

Probably because Python has "better than textbook" implementations of
core structures, and "better than textbook" implementations of many core
algorithms, so lots of things can be done more efficiently by combining
existing structures and algorithms than by using "textbook" algorithms.

</F>
 
S

sturlamolden

sturlamolden said:
Remember that O(1) is not neccesarily faster than O(N)! Unless your
linked list is very big, you will get something called a 'cache miss'
inside your CPU. Thus it is usually more efficient to work with dynamic
arrays.

This was a bit ackwardly formulated. What I was trying to say is that
linked lists produces cache misses rather often, whereas small dynamic
arrays do not. This is because linked lists are not contigous in
memory, in contrast to dynamic arrays. Thus, adding an element to a
dynamic array is in most cases faster, even tough you have to make a
copy the whole array. The same is true when you try do delete some
elements from a list. Small dynamic arrays are faster than linked
lists, because they can be kept in cache. Creating a new array in cache
is faster than tracing after pointers. It is only when dynamic arrays
are to large to fit in cache that linked lists perform better. But in
this case, something like Java's ArrayList is the preferred data
structure.

That is the reason only fools and DSA students use linked lists. They
are a nice teoretical cobnstruct, but not friendly to real-world
computer hardware. Perhaps they were the better option some time in the
past, when CPUs had much less cache and could only accomodate very
short arrays.
 
R

Ramon Diaz-Uriarte

Probably because Python has "better than textbook" implementations of
core structures, and "better than textbook" implementations of many core
algorithms, so lots of things can be done more efficiently by combining
existing structures and algorithms than by using "textbook" algorithms.

OK, point taken. But having that shown explicitly in a (variety of)
traditional-looking DSA textbooks would be great. (And for some of us,
it might provide a conforting: "oh man, see how easy it is now with
Python"). After all, I think DSA classes are standard in CS
curricula. And what does the budding Python programmer answer to his
Pascal friend when he says "look, here is my linked list"?

Best,

R.
 
E

efrat

efrat said:
Hello,

I'm planning to use Python in order to teach a DSA (data structures
and algorithms) course ...


Hello,

Many thanks, repliers, for the informative and useful answers.

Bye,

Efrat
 
G

Gabriel G

At said:
Going back to the original question, a related question: does anybody
know why there are so few books on data structures and algorithms that
use Python?

I remember that, at least ~ 12 years ago there were many (and very
good) books that used Pascal for this topic. So when I did my own
search for one in Python (just for my own consumption and
enlightnment) and could only find the same one as the original poster
of this thread [1], I was very surprised. No publishers have felt the
need to fill this gap?

Maybe, because with Pascal you got *nothing* more than the bare
language, and you had to implement most of the structures and
algorithms yourself. (This was by design).
Python, on the other hand, comes with "batteries included". What's
the point in reimplementing another mapping/dictionary structure
using Python, having the built-in dict type which is rather efficient?
I would not use Python to teach *basic* data structures, instead, I'd
use it as a second stage to teach more complex structures and how to
design algorithms.


Gabriel Genellina
Softlab SRL





__________________________________________________
Preguntá. Respondé. Descubrí.
Todo lo que querías saber, y lo que ni imaginabas,
está en Yahoo! Respuestas (Beta).
¡Probalo ya!
http://www.yahoo.com.ar/respuestas
 
W

Wayne Brehaut

Going back to the original question, a related question: does anybody
know why there are so few books on data structures and algorithms that
use Python?

I remember that, at least ~ 12 years ago there were many (and very
good) books that used Pascal for this topic. So when I did my own
search for one in Python (just for my own consumption and
enlightnment) and could only find the same one as the original poster
of this thread [1], I was very surprised. No publishers have felt the
need to fill this gap?

No, and you'll know why if you check for the number of university and
college computer science students learning Python in their
introductory programming course (not the number of institutions that
teach a little bit in a physics course), and the number of textbooks
available to support that (and not free online or print tutorials).
There's just not a big enough market for (traditional) publishers to
be interested in publishing them or (traditional) authors in writing
them.

Preiss (http://www.brpreiss.com/page1.html) translated his original
C++ text (1999) into a number of other languages: Java (2000), C#
(2001), Python (2003), and Ruby (2004). So he could easily afford to
Translate the early money-makers into less used languages because the
initial writing overhead was no longer an issue--and much of the
tyranslation was "automated". And he uses his free online versions to
help market the publishe'rs small (relative to C++ and Java) print
runs, so they can afford to service this very small market.

DSA--formerly (i.e., in the "Good Old Days") just Data Structures-- is
or was, in the "usual" CS curriculum (a la ACM+IEEE) at least, a
second-level course based on CS1; hence, "most efficiently" taught
using the students' introductory language (if it's at all suitable,
and texts are available) so only some finer points of the language
needed covering and one can concentrate on implementation of the data
structures themselves.

So very little CS1 in Python translates into very little--and
probably even less--CS2, etc., in Python.

wwwayne
Best,

R.

[1] http://thread.gmane.org/gmane.comp.python.general/486698/focus=486698
"Data Structures and Algorithms with Object Oriented Design Patterns"
(http://www.brpreiss.com/books/opus7/html/book.html) and was surprised.
 
W

Wayne Brehaut

Probably because Python has "better than textbook" implementations of
core structures, and "better than textbook" implementations of many core
algorithms, so lots of things can be done more efficiently by combining
existing structures and algorithms than by using "textbook" algorithms.

But this answers a diiferent question: university and (academic)
college data structure courses don't care how efficient a particular
language's built-in data structures are, since the intent is for the
students to learn how to implement data structures in general, and
probably in a particular language--the "core" languasge used in their
core programs--and then be able to apply that knowledge and skill to
implementing at least "reasonably efficient" ones when they need to in
languages that don't have any to speak of built in.

And, since many students will go direct to business or industry, and
may even be apprenticing there in "co-op work terms" during their
education, most could care less how efficient Python's built-in data
structures are.

Also, it's a very rare DSA text that intends its DS code to be used
directly--especially in "serious" applications. Preiss's texts, noted
by the OP, are one exception, and many could be used "out of the box"
in industrial strength applications.

So far as "combining existing structures" is concerned, it's highly
unlikely that any such combination of linear structures could be more
efficient in both--if either--storage and running time than one
specifically designed and implemented for non-linear structures, such
as (many) trees and algorithms on them. For general graphs efficient
list processing may be sufficient for an adjacncy structure for the
graph itself, of course, but how does that translate to storing a
subtree found using a DFS, for example, and efficiently processing it
in some way?

For learning DSA it's more important to have a clear, well-written and
well-documented implementation in a language of interest (again,
especially, the core language in one's programs) than just "using" or
even inspecting and trying to learn subtle details of some particular
implementation of a related DS or A in some "other" language.

How many of those who devleoped and improved the very efficient data
structures in Python learned and honed their skills in Python (vs. C
or Pascal, for example)?

wwwayne
 
S

Scott David Daniels

Wayne Brehaut wrote:
....
For learning DSA it's more important to have a clear, well-written and
well-documented implementation in a language of interest (again,
especially, the core language in one's programs) than just "using" or
even inspecting and trying to learn subtle details of some particular
implementation of a related DS or A in some "other" language.

How many of those who devleoped and improved the very efficient data
structures in Python learned and honed their skills in Python (vs. C
or Pascal, for example)?

True, but these days I build my data structures first in Python to
measure their effectiveness, and (sometimes) cast them into C concrete
once I know the winner.

-Scott David Daniels
(e-mail address removed)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,011
Latest member
AjaUqq1950

Latest Threads

Top