Wolodja Wentland
Hi all,
I am writing a library for accessing Wikipedia data. It includes a module
that generates graphs from the link structure between articles and other
pages (such as categories).
These graphs can easily contain several million heavily linked nodes.
The graphs I am building right now have around 300,000 nodes with an
average in/out degree of, say, 4 and already need around 1-2 GB of
memory. I use networkx to model the graphs and serialise them to files
on disk (using the adjacency list format, pickle and/or GraphML).
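To give a feeling for where the memory goes: as far as I understand,
networkx keeps adjacency in nested dictionaries, so every node and every
edge pays for dict overhead. The following is only a rough, stdlib-only
sketch of that effect. It does not use networkx itself, all function
names are made up for illustration, the graph is random rather than
real Wikipedia data, and the sizes are shallow sys.getsizeof estimates
on CPython (the node objects themselves are not counted).

```python
import random
import sys
from array import array


def build_dict_graph(n_nodes, out_degree, seed=0):
    """Directed graph as dict-of-dicts: node -> {successor: attribute dict}."""
    rng = random.Random(seed)
    graph = {}
    for node in range(n_nodes):
        # Pick out_degree distinct random link targets for this node.
        link_targets = rng.sample(range(n_nodes), out_degree)
        graph[node] = {target: {} for target in link_targets}
    return graph


def to_csr(graph):
    """Flatten the graph into two integer arrays (compressed sparse row).

    Assumes nodes are the integers 0..n-1, as produced by build_dict_graph.
    """
    offsets = array("I", [0])
    targets = array("I")
    for node in range(len(graph)):
        targets.extend(sorted(graph[node]))
        offsets.append(len(targets))
    return offsets, targets


def successors(offsets, targets, node):
    """Successors of node, read back from the flat arrays."""
    return list(targets[offsets[node]:offsets[node + 1]])


def dict_graph_bytes(graph):
    """Shallow estimate: container overhead only, node objects ignored."""
    total = sys.getsizeof(graph)
    for neighbours in graph.values():
        total += sys.getsizeof(neighbours)
        total += sum(sys.getsizeof(attrs) for attrs in neighbours.values())
    return total


def csr_bytes(offsets, targets):
    """Size of the two flat arrays."""
    return sys.getsizeof(offsets) + sys.getsizeof(targets)


# Small demo with the average out-degree mentioned above.
demo_graph = build_dict_graph(1000, 4)
demo_offsets, demo_targets = to_csr(demo_graph)
print("dict-of-dicts bytes:", dict_graph_bytes(demo_graph))
print("flat arrays bytes:  ", csr_bytes(demo_offsets, demo_targets))
```

The flat layout gives up cheap edge insertion and per-edge attributes,
which is exactly the convenience the dictionary layout buys, so this is
a trade-off rather than a free win.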
The recent thread on including a graph library in the stdlib spurred my
interest and introduced me to a number of libraries I have not seen
before. I would like to reevaluate my choice of networkx and need some
help in doing so.
I really like the API of networkx but have no problem switching to
another library at this point. I have the impression that graph-tool
might be faster and have a smaller memory footprint than networkx, but
I am unsure about that.
Which library would you choose? This decision is quite important for me
as the choice will influence my library's external interface. Or is
there something like WSGI for graph libraries?
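To make clearer what I mean by that: I am thinking of a thin adapter
layer that my library would program against, so that the graph library
underneath could be swapped. A minimal sketch follows. Every name in it
(GraphBackend, DictBackend, build_link_graph) is hypothetical and not
part of any existing library, and the dictionary backend only stands in
for a real wrapper around networkx, graph-tool or similar.

```python
from abc import ABC, abstractmethod
from typing import Dict, Hashable, Iterable, Iterator, Set, Tuple


class GraphBackend(ABC):
    """Hypothetical minimal interface a graph library wrapper would implement."""

    @abstractmethod
    def add_edge(self, source: Hashable, target: Hashable) -> None:
        """Add a directed edge, creating both nodes if needed."""

    @abstractmethod
    def successors(self, node: Hashable) -> Iterator[Hashable]:
        """Iterate over the nodes that node links to."""

    @abstractmethod
    def number_of_nodes(self) -> int:
        """Return the number of nodes in the graph."""


class DictBackend(GraphBackend):
    """Stand-in backend using plain dictionaries and sets."""

    def __init__(self) -> None:
        self._succ: Dict[Hashable, Set[Hashable]] = {}

    def add_edge(self, source: Hashable, target: Hashable) -> None:
        self._succ.setdefault(source, set()).add(target)
        # Register the target too, so pages without outgoing links count.
        self._succ.setdefault(target, set())

    def successors(self, node: Hashable) -> Iterator[Hashable]:
        return iter(self._succ[node])

    def number_of_nodes(self) -> int:
        return len(self._succ)


def build_link_graph(
    links: Iterable[Tuple[Hashable, Hashable]], backend: GraphBackend
) -> GraphBackend:
    """Fill any backend from (source page, target page) pairs."""
    for source, target in links:
        backend.add_edge(source, target)
    return backend


# Example: page titles here are invented for illustration.
example = build_link_graph(
    [("A", "B"), ("A", "Category:X"), ("B", "A")], DictBackend()
)
print(example.number_of_nodes(), sorted(example.successors("A")))
```

The external interface would then only mention GraphBackend, and the
decision for one library or another would stay an internal detail.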
kind regards
--
.''`. Wolodja Wentland <[email protected]>
: :' :
`. `'` 4096R/CAF14EFC
`- 081C B7CD FF04 2BA9 94EA 36B2 8B7F 7D30 CAF1 4EFC