Approach to match segments, opinions welcome

F

Frank Maas

Hi Perl Group,

I am fussing about a problem and I would very much like some input of
others. I'll explain the problem:

given a list of numbers 1,2,3,15,13,19,5,6
and a set of data triples (1,15,9),(15,13,2),(13,6,4),(20,23,99)

I am looking for a fast algorithm that would convert the list into
the sum of 9, 2 and 4. To explain some more: the triples in effect
are the begin and endpoint in the list and the data that is valid
for that subsegment. No triple contains overlap, but for the begin
and end. There are more triples than subsegments and it is possible
that a subsegment cannot be found. In that case something special
must be done.

I have tried two approaches:
(1) use a series of regexp created out of the triple list (+/ 1000)
and match this with the list:

$list="1:2:3:15:13:19:5:6"; $d=0;
for (@triples) {
if ($list =~ s/(?:^|:)($_->[0]):(?:.+:)?($_->[1])(?::|$)/:$1:-:$2:/g)
{
$d+=$_->[2]
}
}
$list=~s/:\d+:(-|$)//;
print "success $d\n" unless $link

(2) walk through the list (converted to an array) and find matches

%triples=( '1:15'=>9, ...);
@list=qw(1 2 3 15 13 19 5 6 X); $d=0;
for $i (shift @list) {
for $j (shift @list) {
exit if $j eq 'X';
$d+=$triples{"$i:$j"}, last if exists $triples{"$i:$j"};
}
}
print "success $d\n";

Both work (perhaps I have oversighted an error in the sample code
above), but a tad bit slow. Since this is must once be part of a
heavy webserver any time optimisation is very helpful. Any ideas?

--Frank
 
S

Steven Kuo

On Thu, 21 Aug 2003, Steven Kuo wrote:

(snippped)
I think the second approach is better. You've not mentioned what
should be done if the first number in the list has no stopping
delimiter. Do you stop? Or do you attempt to find a valid starting
point? Assuming the former, here's what I'd do:

#!/usr/local/bin/perl

use strict;
use warnings;
my @triples = ( [1,15,9],[15,13,2],[13,6,4],[20,23,99]);
my %find;

for (@triples) {
$find{$_->[0]} = {
stop => $_->[1],
add => $_->[2],
};
}

my @list = qw(1 2 3 15 13 19 5 6);
my $save = my $start = shift @list;
my $stop;

unless ($start and exists $find{$start}->{stop}) {
^^^^^^^

I suppose this should be

unless (defined ($start) and exists $find{$start}->{stop}) {
 
F

Frank Maas

Hi Steven,
You're using certain numbers to delimit a list of numbers?

I am not exactly sure what you mean by this. If your question
was to what I am doing exactly: this is about calculating travel
times in a network of nodes. Where the string is the node list and
the triples are travel times between two nodes.
I think the second approach is better.

Yes... When I wrote this I had an error in the code that I found later.
The 2nd approach is way (way!) faster.
You've not mentioned what
should be done if the first number in the list has no stopping
delimiter. Do you stop? Or do you attempt to find a valid starting
point? Assuming the former, here's what I'd do:
# Loop once and exit as soon as possible -- should be faster than
# what you had previously written.

I can use this, but not exactly your solution. What I did not write
is that it is possible to have two triples with the same "start".
Your solution does not accommodate this. But the one time loop (w/o
shifting) seems promising.

Thanks!

Frank
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top