Standard graph API?

Magnus Lie Hetland

Is there any interest in a (hypothetical) standard graph API (with
'graph' meaning a network, consisting of nodes and edges)? Yes, we
have the standard ways of implementing graphs through (e.g.) dicts
mapping nodes to neighbor-sets, but if one wants a graph that's
implemented in some other way, this may not be the most convenient (or
abstract) interface to emulate. It might be nice to have the kind of
polymorphic freedom that one has with, e.g., the DB-API. One could
always develop factories or adaptors (such as for PyProtocols) to/from
the dict-of-sets version...

So, any interest? Or am I just a lone nut in wanting this?
 
Steven Bethard

Magnus Lie Hetland said:
Is there any interest in a (hypothetical) standard graph API (with
'graph' meaning a network, consisting of nodes and edges)?

I don't need one right now, but I know I have a few times in the past.
Certainly seems like a good idea to me. We've got sets as builtins now, no
reason we shouldn't have a simple graph API, at least in the library.

Steve
 
wes weston

Magnus said:
Is there any interest in a (hypothetical) standard graph API (with
'graph' meaning a network, consisting of nodes and edges)? Yes, we
have the standard ways of implementing graphs through (e.g.) dicts
mapping nodes to neighbor-sets, but if one wants a graph that's
implemented in some other way, this may not be the most convenient (or
abstract) interface to emulate. It might be nice to have the kind of
polymorphic freedom that one has with, e.g., the DB-API. One could
always develop factories or adaptors (such as for PyProtocols) to/from
the dict-of-sets version...

So, any interest? Or am I just a lone nut in wanting this?

Magnus,
I know I'd appreciate it. It could be used to configure
neural nets and logic networks, where this API would make
it easy to build an abstraction and then "compile" it into a
faster representation for execution, or just run the
tree/graph in "interpreted" mode.
I don't think it would get a lot of use, but the use
would be high-end.
wes
 
David Eppstein

Magnus Lie Hetland said:
Yes, we have the standard ways of implementing graphs through (e.g.)
dicts mapping nodes to neighbor-sets, but if one wants a graph that's
implemented in some other way, this may not be the most convenient
(or abstract) interface to emulate.

Actually, my interpretation of this standard way is as a fairly abstract
interface, rather than a specific instantiation such as dict-of-sets:
Most of the time, I merely require that iter(G) produces a sequence of
the vertices of graph G, and iter(G[v]) produces a sequence of neighbors
of vertex v. I also sometimes use "v in G" and "w in G[v]" to test
existence of vertices or edges.
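A minimal sketch of how little this interface demands (not code from anyone in the thread; the names `edges` and `CycleGraph` are made up for the example). The function relies only on iter(G) and iter(G[v]), so it should accept a plain dict of lists as well as an object that computes its neighbors on demand:

```python
def edges(G):
    """Return all directed edges (v, w), using only iter(G) and iter(G[v])."""
    return [(v, w) for v in G for w in G[v]]


class CycleGraph:
    """Hypothetical graph on vertices 0..n-1 arranged in a cycle.

    Nothing is stored; neighbors are computed on demand.
    """

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return iter(range(self.n))

    def __contains__(self, v):
        return isinstance(v, int) and 0 <= v < self.n

    def __getitem__(self, v):
        if v not in self:
            raise KeyError(v)
        return [(v - 1) % self.n, (v + 1) % self.n]


as_dict = {0: [3, 1], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
as_object = CycleGraph(4)
print(edges(as_dict) == edges(as_object))  # prints True
```

Both "v in G" and "w in G[v]" work on either object, which is all the algorithms discussed here would need for read-only access.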

Pros and cons of this approach:

- You can use a list instead of a set in the adjacency list part of the
representation. This may be faster and more space efficient when the
vertex degrees are small.

- It's easy to create test graphs as code literals

G1 = {
    0: [1, 2, 5],
    1: [0, 5],
    2: [0, 3, 4],
    3: [2, 4, 5, 6],
    4: [2, 3, 5, 6],
    5: [0, 1, 3, 4],
    6: [3, 4],
}
G2 = {
    0: [2, 5],
    1: [3, 8],
    2: [0, 3, 5],
    3: [1, 2, 6, 8],
    4: [7],
    5: [0, 2],
    6: [3, 8],
    7: [4],
    8: [1, 3, 6],
}

- Any indexable object can be a vertex. The vertex identities can be
something meaningful to your program. On the other hand, that means
(unless you know where your graph came from) you can't rely on the
vertices being special vertex objects with nice properties and you can't
use objects like None as flag values unless you're sure they won't be
vertices.

- It doesn't provide an abstract way of changing the graph (although
that's relatively easy if G is e.g. a dict of sets)

- It doesn't directly represent multigraphs

- It doesn't directly represent undirected graphs (instead you have to
replace an undirected edge by two directed edges and hope your callers
don't give you a directed graph by mistake).

- There isn't an explicit object representing an edge, although you can
create one by using a tuple (v,w) or (for undirected edges) a set. This
can be an advantage in terms of memory usage but a disadvantage in terms
of number of object creations. Also it means that if you want to store
information on the edges you have to use a dict indexed by the edge
instead of attributes on an edge object (probably better style anyway
since it prevents different algorithms on the same graph from colliding
with each other's attributes).
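The last point can be sketched as follows. This is only an illustration with made-up names (`length`, `visited_edge`, `capacity`); it assumes hashable vertices, and it uses frozenset rather than set for undirected edges because a plain set cannot serve as a dict key:

```python
# A small directed graph in the dict-of-lists style.
G = {"a": ["b", "c"], "b": ["c"], "c": []}

# Directed edge data: a dict keyed by the tuple (v, w).
length = {("a", "b"): 4, ("a", "c"): 1, ("b", "c"): 2}

# A second algorithm keeps its own dict, so the two cannot collide.
visited_edge = {(v, w): False for v in G for w in G[v]}

# Undirected edge data: key by frozenset, so {v, w} == {w, v}.
# (A plain set is unhashable and cannot be a dict key.)
capacity = {frozenset(("a", "b")): 10}

print(length["a", "c"])                  # prints 1
print(capacity[frozenset(("b", "a"))])   # prints 10
```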
 
Phil Frost

+1 for standard graph API!

I don't have a "high-end" use for it, but I did write a program which
graphs the revision history of a software repository. It would have been
nice to have most of that code in a library, and if such a library
existed, it would probably implement operations I was too lazy to
implement, such as coloring.
 
David Eppstein

Phil Frost said:
+1 for standard graph API!

I don't have a "high-end" use for it, but I did write a program which
graphs the revision history of a software repository. It would have been
nice to have most of that code in a library, and if such a library
existed, it would probably implement operations I was too lazy to
implement, such as coloring.

I have a random sample of graph algorithms implemented in
http://www.ics.uci.edu/~eppstein/PADS/

I use the existing Guido-standard graph representation, that is:
iter(G) and iter(G[v]) list vertices of G and neighbors of v in G
v in G and w in G[v] test existence of vertices and edges in G

It includes both simple basic graph algorithm stuff (copying a graph, a
DFS implementation that works non-recursively so it doesn't run into
Python's recursion limit) and some much more advanced algorithms (e.g.
non-bipartite maximum matching).
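For readers wondering what a non-recursive DFS over this representation might look like, here is a rough sketch. It is not the PADS code; the function name `dfs_preorder` is an assumption for the example, and vertices are assumed to be hashable:

```python
def dfs_preorder(G, start):
    """Yield vertices reachable from start in depth-first preorder.

    Uses an explicit stack of neighbor iterators instead of recursion,
    so deep graphs cannot hit Python's recursion limit.
    """
    visited = {start}
    yield start
    stack = [iter(G[start])]
    while stack:
        for w in stack[-1]:
            if w not in visited:
                visited.add(w)
                yield w
                stack.append(iter(G[w]))
                break  # descend into w; resume this iterator later
        else:
            stack.pop()  # all neighbors handled, backtrack


# A path of 5001 vertices, deeper than a typical default recursion limit.
chain = {i: [i + 1] for i in range(5000)}
chain[5000] = []
print(len(list(dfs_preorder(chain, 0))))  # prints 5001
```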
 
Paul McGuire

Magnus Lie Hetland said:
Is there any interest in a (hypothetical) standard graph API (with
'graph' meaning a network, consisting of nodes and edges)? Yes, we
have the standard ways of implementing graphs through (e.g.) dicts
mapping nodes to neighbor-sets, but if one wants a graph that's
implemented in some other way, this may not be the most convenient (or
abstract) interface to emulate. It might be nice to have the kind of
polymorphic freedom that one has with, e.g., the DB-API. One could
always develop factories or adaptors (such as for PyProtocols) to/from
the dict-of-sets version...

So, any interest? Or am I just a lone nut in wanting this?

Not sure if this falls under the category of an API, but it may be relevant
to what you are doing.

This is a Python API to the Graphviz DOT language:
http://dkbza.org/pydot.html

So I think this is evidence you are not alone.

-- Paul
 
Istvan Albert

Magnus said:
So, any interest? Or am I just a lone nut in wanting this?

I have often needed to use simple graph concepts and wrote a bunch
of code; at some point I started to unify it and (slowly)
put together a consistent model. When my research brings me back to
graphs I'll try to finish it.

http://www.personal.psu.edu/staff/i/u/iua1/python/graphlib/html/

I do have a lot of functionality working: you can associate arbitrary
data with the nodes and edges, and there is BFS, DFS, topological sort,
graph generation, David Eppstein's Python Dijkstra algorithm, and
Graphviz visualization.

Istvan.
 
Jeremy Bowers

David Eppstein said:
- It doesn't directly represent multigraphs

- It doesn't directly represent undirected graphs (instead you have to
replace an undirected edge by two directed edges and hope your callers
don't give you a directed graph by mistake).

- There isn't an explicit object representing an edge, although you can
create one by using a tuple (v,w) or (for undirected edges) a set.

I think these three things speak to why there isn't a graph type and
probably won't be one any time soon; unlike "Sets", there are just too
many types of "graphs" in use, all fundamentally different in
implementation, and with all differences having massive performance
implications. As you indirectly point out, each of the following is an
independent dimension:

* Directed, undirected
* Multi or non-multi
* Explicit edges/explicit nodes with links/node and edge objects
* Simple and fast implementation of nodes/nodes and edges that take
attributes

That's a good 24 possible types of graph library, each with implications
w.r.t. algorithms and performance.

While the abstract idea of a standard graph library is appealing to some
people, any actual concrete implementation will likely leave the majority
of people who want to use it out in the cold, resulting either in
something only useful in the simplest of cases, or suffering from major
feeping creaturitis as it tries to cover too many bases at once.
 
wes weston

Magnus said:
Is there any interest in a (hypothetical) standard graph API (with
'graph' meaning a network, consisting of nodes and edges)? Yes, we
have the standard ways of implementing graphs through (e.g.) dicts
mapping nodes to neighbor-sets, but if one wants a graph that's
implemented in some other way, this may not be the most convenient (or
abstract) interface to emulate. It might be nice to have the kind of
polymorphic freedom that one has with, e.g., the DB-API. One could
always develop factories or adaptors (such as for PyProtocols) to/from
the dict-of-sets version...

So, any interest? Or am I just a lone nut in wanting this?

Magnus,
Do you have any design thoughts? It would be good to have weighted,
directed graphs and depth first traversal.
wes
 
Magnus Lie Hetland

Yes, we have the standard ways of implementing graphs through (e.g.)
dicts mapping nodes to neighbor-sets, but if one wants a graph that's
implemented in some other way, this may not be the most convenient
(or abstract) interface to emulate.

Actually, my interpretation of this standard way is as a fairly abstract
interface, rather than a specific instantiation such as dict-of-sets:
Most of the time, I merely require that iter(G) produces a sequence of
the vertices of graph G, and iter(G[v]) produces a sequence of neighbors
of vertex v. I also sometimes use "v in G" and "w in G[v]" to test
existence of vertices or edges.

Yes, I agree, to some extent. I guess the problems start when you want
to manipulate the graph. I think it would be nice to be able to use an
empty graph object to build a given graph without knowing the
implementation. I guess you could do that in this implementation too
(if all the neighbor sets were initialized).

But if this does turn out to be an acceptable API, I'm all for it. I
just think it would be nice to have a Recommended Standard(tm), to
create interoperability.

[snip]
- It doesn't provide an abstract way of changing the graph (although
that's relatively easy if G is e.g. a dict of sets)

Right.

- It doesn't directly represent multigraphs

Unless you insist on having neighbor-sets, it does, doesn't it?
Neighbor-lists can be used for this...?

[snip]
 
Magnus Lie Hetland

[snip]

As I tried to state in the original post (I probably wasn't clear
enough) I'm not talking about a standard *implementation*, just a
standard *API*, like the DB-API. This could easily cover all kinds of
strange beasts such as directed or undirected, weighted or unweighted
(etc.) graphs; multigraphs, chain graphs, hypergraphs, who knows.

I'm basically just suggesting that it might be useful to have a
"standard" interface for these things. It may be that the simple de
facto standard that David cites is sufficient (although it certainly
doesn't cover hypergraphs -- but that's possibly going a bit too far
anyway.)
 
Magnus Lie Hetland

wes weston wrote:
[snip]
Magnus,
Do you have any design thoughts? It would be good to have weighted,
directed graphs and depth first traversal.

I've thought of several alternatives; basically, I just thought about
defining the "standard" API for the basic abstract data type
(including weights, direction, labels, colours etc.). Concrete
implementations and algorithms would be a separate issue.
 
David Eppstein

Magnus Lie Hetland said:
Unless you insist on having neighbor-sets, it does, doesn't it?
Neighbor-lists can be used for this...?

If you're doing anything serious with a multigraph you need to have some
way of distinguishing different edges between the same pair of vertices.
For instance, an edge object for each edge, that you can use as an index
to store information about that edge. A neighbor list that has multiple
copies of the same neighbor won't let you do that, you can iterate
through the edges but not distinguish one from another.

Another possibility, which fits into the same general abstract API but
is more specialized, would be to represent a multigraph by a dict of
dicts, where the outer dict maps each vertex to its neighbors and the
inner dict maps each neighbor to the number of edges; then you could
represent each edge by a tuple (v,w,index) with index in range(G[v][w]).
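A small sketch of that dict-of-dicts idea, with made-up names (`M`, `multi_edges`, `label`) and example data invented for illustration:

```python
# Outer dict: vertex -> neighbors; inner dict: neighbor -> number of edges.
M = {
    "a": {"b": 2, "c": 1},
    "b": {"c": 3},
    "c": {},
}


def multi_edges(G):
    """Yield one (v, w, index) tuple per parallel edge."""
    for v in G:
        for w in G[v]:
            for index in range(G[v][w]):
                yield (v, w, index)


# Per-edge data can then be stored in a dict keyed by the edge tuple.
label = {e: "%s->%s #%d" % e for e in multi_edges(M)}
print(len(label))  # prints 6
```

Since iterating over an inner dict yields its keys, iter(M[v]) still lists each neighbor once and "w in M[v]" still tests for an edge, so code written against the simpler interface should keep working on this structure.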
 
Paramjit Oberoi

Magnus Lie Hetland said:
enough) I'm not talking about a standard *implementation*, just a
standard *API*, like the DB-API. This could easily cover all kinds of
strange beasts such as directed or undirected, weighted or unweighted
(etc.) graphs; multigraphs, chain graphs, hypergraphs, who knows.

I believe the equivalent thing in the C++ world is the Boost Graph Library
described here:

http://www.boost.org/libs/graph/doc/table_of_contents.html
http://www.awprofessional.com/bookstore/product.asp?isbn=0201729148

I tried using it once, and it was so horrendously complicated that I gave
up. Some of the complexity came from having to abstract all the different
kinds of graphs that were supported, but a lot of it was also a result of
the static nature of C++.

Still, it may be useful as a source of ideas and/or warning of problems.

-param
 
David Eppstein

Magnus Lie Hetland said:
I've thought of several alternatives; basically, I just thought about
defining the "standard" API for the basic abstract data type
(including weights, direction, labels, colours etc.). Concrete
implementations and algorithms would be a separate issue.

I would strongly prefer not to have weights or similar attributes as
part of a graph API. I would rather have the weights be a separate dict
or function or whatever passed to the graph algorithm. The main reason
is that I might want the same algorithm to be applied to the same graph
with a different set of weights.

A secondary reason is that we already have in Python a good general
mechanism (dicts) for associating arbitrary information with objects, I
don't see a need for reinventing a more specific mechanism for doing so
when the objects are pieces of graphs and the information is some list
of weight, label, etc that some graph API designer thinks is sufficient.

I think this may contradict some things I said a year or two ago about
using a dict-of-dicts representation in which G[v][w] provides the
weight; I've changed my mind.
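To illustrate the argument for keeping weights outside the graph, here is a rough sketch of a shortest-path routine that takes the weights as a separate mapping. It is a plain textbook Dijkstra written for this example, not the implementation referred to elsewhere in the thread; the names `shortest_distances`, `miles` and `minutes` are invented, and weights are assumed to be non-negative:

```python
import heapq


def shortest_distances(G, weight, source):
    """Dijkstra's algorithm; weight maps (v, w) to a non-negative number."""
    dist = {}
    heap = [(0, 0, source)]
    counter = 1  # tie-breaker so vertices themselves are never compared
    while heap:
        d, _, v = heapq.heappop(heap)
        if v in dist:
            continue  # already settled with a shorter distance
        dist[v] = d
        for w in G[v]:
            if w not in dist:
                heapq.heappush(heap, (d + weight[v, w], counter, w))
                counter += 1
    return dist


# One graph, two independent sets of weights.
G = {"a": ["b", "c"], "b": ["c"], "c": []}
miles = {("a", "b"): 1, ("a", "c"): 5, ("b", "c"): 1}
minutes = {("a", "b"): 9, ("a", "c"): 3, ("b", "c"): 9}

print(shortest_distances(G, miles, "a"))
print(shortest_distances(G, minutes, "a"))
```

The graph object is never touched when the weights change; only the second argument differs between the two calls.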
 
Jeremy Bowers

[snip]

Magnus Lie Hetland said:
As I tried to state in the original post (I probably wasn't clear enough)
I'm not talking about a standard *implementation*, just a standard *API*,
like the DB-API. This could easily cover all kinds of strange beasts such
as directed or undirected, weighted or unweighted (etc.) graphs;
multigraphs, chain graphs, hypergraphs, who knows.

Point conceded about API and not a library, but I'm not sure that changes
my point much. Your API is going to assume something about how edges are
represented (which will conflict with somebody), *or* it will be so vague
as to not have any particular advantage over the nothing we have now. And
so on for most of the other dimensions.
 
Magnus Lie Hetland

[snip]
David Eppstein said:
I would strongly prefer not to have weights or similar attributes as
part of a graph API. I would rather have the weights be a separate dict
or function or whatever passed to the graph algorithm. The main reason
is that I might want the same algorithm to be applied to the same graph
with a different set of weights.

I can see that.

[snip stuff about using dicts]

This can be said about all objects, really; no reason to have
attributes as long as we can associate values with the objects using
dicts. This is where things start to look implementation-specific,
even though it is *possible* to keep it abstract using this interface.

One of my motivations is allowing arbitrary structures behind the
scenes, e.g. the graph may be a front-end for something that is
computed on-the-fly using specialized hardware (in fact a very real
possibility in my case). I could have something like this be
represented by several distinct objects (e.g. one for the topological
structure, one for the weights), of course, but I'd certainly have to
implement the weight mapping myself, and not use built-in dicts.

I do think your API is nice in that it is simple, but I also have the
feeling that using it with other implementations would be sort of
unnatural; one would be trying to emulate the "dict-of-lists with
dicts for properties" implementation, because that implementation
*would* have been simple -- had one used it.

Also, again, this doesn't lend itself very well to manipulating
graphs. If I set one weight to infinity, I might expect (perhaps) the
corresponding edge to disappear (otherwise the graph would have to be
complete in the first place) or similar things; there may also be
other dependencies between properties. Not easily handled with a
separate object for each property. And using functions for everything
that needs calculating doesn't easily lead to polymorphism...

It's not horribly inconvenient, of course (just a matter of defining a
few objects referring to the same underlying mechanism, each with
a different __getitem__ method). I'm just airing my thoughts about why
it *might* be useful to have something else -- possibly in addition.

Perhaps one could have something like two levels? The Level 1 Graph
API would support the "graph as mapping from nodes to neighbors" with
"properties as separate mappings" and the Level 2 Graph API could add
some convenience methods/properties for encapsulation and
manipulation?
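One possible reading of the two-level idea, purely as a sketch: the class name `Level2Graph` and the method names `add_node`, `add_edge` and `remove_edge` are assumptions made up for this example, not part of any agreed API.

```python
class Level2Graph:
    """Hypothetical 'Level 2' wrapper around a dict of sets.

    The Level 1 part (iteration, membership, G[v]) is what algorithms
    read; the add/remove methods are the assumed convenience layer.
    """

    def __init__(self):
        self._adj = {}

    # Level 1: graph as a mapping from nodes to neighbors.
    def __iter__(self):
        return iter(self._adj)

    def __contains__(self, v):
        return v in self._adj

    def __getitem__(self, v):
        return self._adj[v]

    # Level 2: manipulation without knowing the representation.
    def add_node(self, v):
        self._adj.setdefault(v, set())

    def add_edge(self, v, w):
        self.add_node(v)
        self.add_node(w)
        self._adj[v].add(w)

    def remove_edge(self, v, w):
        self._adj[v].discard(w)


G = Level2Graph()
G.add_edge("a", "b")
G.add_edge("b", "c")
G.remove_edge("a", "b")
print(sorted(G), "c" in G["b"], "b" in G["a"])  # prints ['a', 'b', 'c'] True False
```

An implementation backed by something other than a dict would only need to provide the same methods; callers building a graph would not have to know which one they were given.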

[snip]
I think this may contradict some things I said a year or two ago
about using a dict-of-dicts representation in which G[v][w] provides
the weight;

Yeah, I remember you saying that :)

I've changed my mind.

Fair enough. FWIW, I agree with your new position when it comes to the
simple dict-based implementation; this is basically how it's done in
pseudocode, usually (e.g. having pi[v] represent the predecessor of v
and so forth).
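The pseudocode convention carries over to Python almost literally. As a small sketch (the function name `bfs_predecessors` is made up, and a breadth-first search is used only as a convenient example of an algorithm that fills in pi):

```python
from collections import deque


def bfs_predecessors(G, s):
    """Return pi, where pi[v] is the predecessor of v in a BFS tree rooted at s."""
    # The root's predecessor is recorded as None; this assumes None is
    # not itself used as a vertex.
    pi = {s: None}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in G[v]:
            if w not in pi:
                pi[w] = v
                queue.append(w)
    return pi


G = {0: [1, 2], 1: [3], 2: [3], 3: []}
pi = bfs_predecessors(G, 0)
print(pi[3])  # prints 1
```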
 
