I
IRR
Hi all,
I'm fairly new to perl and graph theory, so hopefully this question isn't
too ignorant on both counts. I'm trying to input basically an edge list for
a graph, and get an output of the connected graphs that all of these edges
comprise. So for example:
Edge list input:
A B
A C
B A
D E
Perl script output:
A+B+C,D+E
using strongly_connected_graph (described in Graph::Base). Admittedly,
strongly_connected_graph is a black box to me -- I'm not sure if its doing a
DFS versus BFS or ?.
But what is clear is that this scales exponentially -- O(n^2.3) or so, based
on a few test sets) -- in time versus the total number of input edges. This
is fine for a few smaller edge lists that I have, but my eventual goal is to
do this for a set with ~50,000 vertices and ~2 million edges total. Based
on those test sets I mentioned, this size dataset will take ~7000 hours to
run on my current system, which obviously isn't reasonable -- and that's not
even accounting for things getting choked up in inputing that amount of data
with add_edge!
So my question is, am I approaching this problem with the right tools? Is
there another way within the Graph module to do this more efficiently?
Thanks,
I.R.
I'm fairly new to perl and graph theory, so hopefully this question isn't
too ignorant on both counts. I'm trying to input basically an edge list for
a graph, and get an output of the connected graphs that all of these edges
comprise. So for example:
Edge list input:
A B
A C
B A
D E
Perl script output:
A+B+C,D+E
using strongly_connected_graph (described in Graph::Base). Admittedly,
strongly_connected_graph is a black box to me -- I'm not sure if its doing a
DFS versus BFS or ?.
But what is clear is that this scales exponentially -- O(n^2.3) or so, based
on a few test sets) -- in time versus the total number of input edges. This
is fine for a few smaller edge lists that I have, but my eventual goal is to
do this for a set with ~50,000 vertices and ~2 million edges total. Based
on those test sets I mentioned, this size dataset will take ~7000 hours to
run on my current system, which obviously isn't reasonable -- and that's not
even accounting for things getting choked up in inputing that amount of data
with add_edge!
So my question is, am I approaching this problem with the right tools? Is
there another way within the Graph module to do this more efficiently?
Thanks,
I.R.