Which graph library is best suited for large graphs?

W

Wolodja Wentland

Hi all,

I am writing a library for accessing Wikipedia data and include a module
that generates graphs from the Link structure between articles and other
pages (like categories).

These graphs could easily contain some million nodes which are frequently
linked. The graphs I am building right now have around 300.000 nodes
with an average in/out degree of - say - 4 and already need around 1-2GB of
memory. I use networkx to model the graphs and serialise them to files on
the disk. (using adjacency list format, pickle and/or graphml).

The recent thread on including a graph library in the stdlib spurred my
interest and introduced me to a number of libraries I have not seen
before. I would like to reevaluate my choice of networkx and need some
help in doing so.

I really like the API of networkx but have no problem in switching to
another one (right now) .... I have the impression that graph-tool might
be faster and have a smaller memory footprint than networkx, but am
unsure about that.

Which library would you choose? This decision is quite important for me
as the choice will influence my libraries external interface. Or is
there something like WSGI for graph libraries?

kind regards

--
.''`. Wolodja Wentland <[email protected]>
: :' :
`. `'` 4096R/CAF14EFC
`- 081C B7CD FF04 2BA9 94EA 36B2 8B7F 7D30 CAF1 4EFC

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQIcBAEBCAAGBQJLIhr8AAoJEIt/fTDK8U78fb4P/12wcB7Kg6Z+5TSsTgflROl0
jXVLADPRyeWNeuAEWglnhkpyp2Ft0/ifaajN/jtSeB6N7hgugdIc22eC1Yip7p7m
1K1o397TG5jKEUbN1uLnA/DVNvxakP4WMDUapg21yOhSRoKZQvp0vkuM4v4BQ8Ob
w6nDMiSvdTC9qIk/UDvlO50SZfXwWbMhWecEU6V1pO8ju4qKo3PvqgAWWhSIp+Zy
QUYYxaEB7D4rJZvec2QexTsd39Zfj+xaEEzajMPEzz4am7QNeI54yYJt6cOKY3T0
IlrcbwICM2VTU/oM1fzjQE5mACSltmONIqSnrG4hdXDi2RswBpqkSHD/NbdUhB2u
UcWgI7xSIaxOU4Msq6iLbnEthOTA8s1+8X78aBvu9HV6ZWf1ALhJQqPcN0sAG8zd
fso3PWQMBbIQ3jpTgfccwLCDmGXNjsJZMYtuvMednu9fKJjC9gTa5NS+YHNa6Ia0
1D6PU/ZF6k60E6OCq5iu1gyod7htAPxJaZFQRZ11yBe+hMan9fxtq2eXd4RNzNgI
+HntD1ChJBVL3hgyhn8WAa3iEb+baD7LUt04eOt5yoDCQ9mE8+W906w+Ekdss7Gc
hyRnDcHWTlL94ABnl8oYrb19vo+aQRriqbttoMmb01AEBF9PMYHWzQjuUnfi2+u7
Zdw5en+q26ML4zi2dcCi
=In+e
-----END PGP SIGNATURE-----
 
W

Wolodja Wentland

Wolodja Wentland:

This one probably uses low memory, but I don't know if it works still:
http://osl.iu.edu/~dgregor/bgl-python/

That project looks not that maintained and graph-tool [1] is based on
boost as well, so I don't see the advantage in choosing bgl-python over
graph-tool.

The point is, that I am not sure if using graph-tool has any advantages
over networkx at all. It looks like a great library, supports filtered
graphs which I find pretty useful, but have not used it yet.

[1] http://projects.forked.de/graph-tool/
--
.''`. Wolodja Wentland <[email protected]>
: :' :
`. `'` 4096R/CAF14EFC
`- 081C B7CD FF04 2BA9 94EA 36B2 8B7F 7D30 CAF1 4EFC

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQIcBAEBCAAGBQJLIipXAAoJEIt/fTDK8U78qtQP/22adnDHDQwd0ZMqOaDPKyKp
hIaCrIk4QJH5NQoGDm4Fo0JMEmI11qELKXlEFj2FHzXONvHuSzeSGrQYug13y8mq
P81yQ5zze8WEueNeWc0LfQfZtQzkp+2MlEYCUBEPqGjxbN6U5v7KVgvvENBWVLVp
Mjnj4Ou4hWdO3hwskzcHe+oQZt0YExk26kUdm3/hCgLLe+J/jgxIKtjPZO9Q3Ksc
qMVBGpeFCwHhGYvj9s6qaxEwEDje4J36PBaCYMd6krB8xPAyLo9KiImHNMJ/yRyH
OegFxbVrlkBhuqMMF8WHu/eAST1/GDsGq5VAyr9wtffZRnAvs42YK3Bcm/Y0GgsE
BpB5X5Knm9+IrjGuzgyTXxEUm0UALaCYuy3GnregqsUsnqWNWDk0vEDRy/779+Dj
vbLWhTVZM1APHDKmSrkL8AXaWkxUO5T+RwhietfW7lTW5DJPrpCyeyxj0eB5NNoM
Uv5q8KZXwindt2QhWD7zCHTmahA9dzqwWi74ftJfS1lF0FeV8xOoSoAP+EmqgSCA
bHnVq+sPFrc7wo2yArXVRsebHESn7hjzcPiqcEPIPj1cXL7nVg18o28OsIl863Nt
gmRC4fOy5yzYz6WeKBK5UxfEIk1+8GqjQzZn86GILSJqrUcwSiOmQ8aU2r1/0pqt
DD9xP+RkwTo3JPv36iY1
=s6mx
-----END PGP SIGNATURE-----
 
W

Wolodja Wentland

How about python interface to igraph?

Don't know :) as I have not yet worked with it. Why do you recommend it?
--
.''`. Wolodja Wentland <[email protected]>
: :' :
`. `'` 4096R/CAF14EFC
`- 081C B7CD FF04 2BA9 94EA 36B2 8B7F 7D30 CAF1 4EFC

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQIcBAEBCAAGBQJLIlItAAoJEIt/fTDK8U781M4P/3ZeNik7CnogmfxPD7grPBHV
lVe0o+oBj9qCU3gMp9CgefozbIUX3Zhp6n2Bc4yO3rZLQhOljXz4mNkyQBhwJNf9
IIFt847s3ykAIyBeY5N5Be0ahApcvYPLjWkZnMNjIIpIyRJo+ogP9ekQNF0lzwuv
xW3LOdwIlz23MOx7pvpY0ub1DsAXmftG27DcAo+5LRcne21O9guiJnBb87OBG/oA
jiNG5dO2P/M5atD0BD1XaiXt1PoYoG3yHFPuDME1jCWT++7fwzbR2ToOZHNmDE1+
Nhj0v3wlwwtjmuy7vyA5qDlAM572nPYTZs0Cbm4ZexrpXDe0LWfTujjtjVsRwEd0
DAOImHa7Ej+MuIpTx330zexSZYrVXUGCztm/9YTdzJH22it13k05sfRVpvuzRCVZ
FoFTw1Cnj3uyVN9ftqQzgo7B/FemBVLOGEgNW5bSotcAfKh8yXMQFgotaPHJsB1g
z8ybORVD7tpxJLUWVod+SIoGTOx1xdnnku6ex2mIbgKtUmMLiSiiY4EhL/EEGubX
VboUDwIpfdlANNoLR2ffr73PzdmeEMqU71iN253aK8IDtdZVqyx1ef23Ul8CYOb0
bW0zo/6zHQkMJN1mCTR86cUthFo8snfMX4NHuECd4tod5xq7EblHVhnj3YdU0Qpw
uxDEYe74YVT1OSQbRG6a
=1VSV
-----END PGP SIGNATURE-----
 
I

IngoognI

Which library would you choose?

looking at the galery at networx, it seems to be all balls 'n sticks,
how about writing the data to a file POV-Ray can read and render it
there?
 
W

Wolodja Wentland

On Dec 11, 11:12 am, Wolodja Wentland <[email protected]>
wrote:

looking at the galery at networx, it seems to be all balls 'n sticks,
how about writing the data to a file POV-Ray can read and render it
there?

Huh? I am not really concerned about rendering the graphs but after a
library with a small memory footprint. Preferably one that contains a
number of typical algorithms.
--
.''`. Wolodja Wentland <[email protected]>
: :' :
`. `'` 4096R/CAF14EFC
`- 081C B7CD FF04 2BA9 94EA 36B2 8B7F 7D30 CAF1 4EFC

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQIcBAEBCAAGBQJLIm2PAAoJEIt/fTDK8U78nqcP/2ydeP5SOF/0tA1hHsAeXKDn
FBtOP73hyls+lnIZzmOzvx7ULFc2sxF4PYdya6rDXCAT9SXSe+5ypfkXlRdlrzyg
zASf/HUuOAhaQ7Wy5HKqWu/DX9d9GfRr91ecQKcOyRqBf9DJSELOmFIzsuBKZTRJ
fMEHrgvtvhoobqbHPZHTovQttDSs8qFrUr9+Suofywgc5Tp0WZML7HqOHpTgR4Fl
M+X9UsmDd5vE3+BWKwjt+FpWr5QzOi23maEaGufBibLpm4pycDfhh0g8dY4Pn0xy
Y02vXZo86XB2dxTWGopUM0atKWkhRcsafK3fR6NBkHuQ4XcA+kXhHY1/uYMZJpGv
9suFQsAAlYtEV0n1zK8U6Jt/0yqABXS3KQ19lPe8osSoFJ8qBswlbKyXeOa6PXOI
1XC81+xv6Hy9Ku1lYDCMOC+DMlxRXh8+1yBZLRCed8B/IbJu0exfNGbSJaoGInfa
PROpW1wQR9SsZhcPkhTHj/rxpZIebjgioq3BfR4UKWx3FHi5RUfeUIe7WKNsSLlE
/AEZEtcTCyNyneeTNDHwJjRjKEpMeFdmnGoMOwF3hxu1uFzNeRSeaZNjtid444HJ
q+41ZWM1UNxxNlCHFWb+QXoaQSdBMD6Aczvjb2bW1O8l0S2p3PjdzRNhpNO2N1zU
o91ZV7Qv7wVSxu6KjsAY
=UmG5
-----END PGP SIGNATURE-----
 
N

Neal Becker

Wolodja said:
Don't know :) as I have not yet worked with it. Why do you recommend it?

My understanding is that igraph is a high performance graph library (all
implemented in C). It seems to have a very active user community.

There is also boost graph library, which IIRC also has a python interface.
 
A

anand jeyahar

Which library would you choose?
Hmm.... i have tried python-graph and was happy with it....but the most
use i did was for complete graphs of 60-65 nodes..

Also there is an experimental branch for faster implementations, which
is under development.

--
==============================================
Anand J
http://sites.google.com/a/cbcs.ac.in/students/anand
==============================================
The man who is really serious,
with the urge to find out what truth is,
has no style at all. He lives only in what is.
~Bruce Lee

Love is a trade with lousy accounting policies.
~Aang Jie
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,790
Messages
2,569,637
Members
45,346
Latest member
EstebanCoa

Latest Threads

Top