Is my algorithm wrong?

Looking

Here is what I am trying to do:

For a given string, break it into two parts. The first part should be as
long as possible but less than 25 characters. The split delimiter is
whitespace. The second part is whatever is left in the string.

My algorithm is to break the string into many small parts, then join them
with spaces.

@parts= split (/\W/, $original);

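$thelength=0; $newpart1='';
foreach (@parts) {
    $thelength=$thelength+length($_)+1;   # +1 for the joining space
    if ($thelength > 25) {last;}           # Perl leaves a loop with 'last', not 'break'
    else {$newpart1 = $newpart1 eq '' ? $_ : "$newpart1 $_";}   # join with a space
}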

I think my algorithm has a problem. Such a simple task should take no more
than 2 lines of regex.

I am thinking of substr(), but I don't know how to find the position of the
last whitespace before character #25. Or I could do substr($original, 0, 25),
then check whether character #25 is whitespace; if not, check #24, and so on.

Both of my algorithms take n*n time, which is really slow.
 
thundergnat

How about something like:


use warnings;
use strict;

my $original = 'This is a test string about 46 characters long';

my $end = $original;
my $start = substr($end,0,25,'');
$start =~ s/ ([^ ]+)$//;
$end = $1.$end if defined $1;

print "$start --- $end\n";
print '$original length - '.length($original).' --- $start length - '.
length($start).' --- $end length - '.length($end)."\n";


If there is no space in the first 25 characters, it will just return
the first 25 characters as the start value; otherwise it will return the
longest run of space-delimited words that is shorter than 25 characters.
 
thundergnat

(edited message; I cancelled a previous one but it may still show up)

How about something like:


use warnings;
use strict;

my $original = 'This is a test string about 46 characters long';

my $end = $original;
my $start = substr($end,0,25,'');
$start =~ s/\s+(\S+)$//;
$end = $1.$end if defined $1;

print "$start --- $end\n";
print '$original length - '.length($original).' --- $start length - '.
length($start).' --- $end length - '.length($end)."\n";


If there is no space in the first 25 characters, it will just return
the first 25 characters as the start value; otherwise it will return the
longest run of space-delimited words that is shorter than 25 characters.
It allows for multiple spaces between words. Check the length of the
$start value to determine whether there was a space: 24 or less means
there WAS a space; 25 means there wasn't.
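
A minimal sketch of the same idea wrapped in a reusable sub. The name split_head and the early return for short strings are assumptions added for illustration, not part of the code above; without that guard, a string shorter than the limit would still have its last word moved into the second part.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into a head of at most $width characters
# and the remaining tail, breaking at whitespace when possible.
sub split_head {
    my ($text, $width) = @_;

    # Assumed guard: a string that already fits is returned whole.
    return ($text, '') if length($text) <= $width;

    my $tail = $text;
    my $head = substr($tail, 0, $width, '');   # 4-arg substr removes the head from $tail
    if ($head =~ s/\s+(\S+)$//) {               # cut off the trailing partial word, if any
        $tail = $1 . $tail;                     # and give it back to the tail
    }
    return ($head, $tail);
}

my ($head, $tail) = split_head('This is a test string about 46 characters long', 25);
print "[$head] [$tail]\n";   # prints: [This is a test string] [about 46 characters long]
```

As in the original, the whitespace at the break point is dropped, and a first chunk with no whitespace at all is cut hard at the limit.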
 
Looking

Here is my new code. Please tell me if there are better ways:


$s=qq("sadf content= "this is what i' want " asd " sdf " adfa " sdf');

$sub= substr ($s, 0, 25);
$offset=0;
while ($sub=~ m/ /g) {$offset=$-[0];}; #find the offset of the last space
print $offset."\n";
$sub= substr ($s, 0, $offset); # from start to offset
print "$sub\n";
$sub= substr ($s, $offset+1); # from the offset to end
print "$sub\n";

I still think I did something wrong at
while ($sub=~ m/ /g) {$offset=$-[0];};
There has to be a better way than running this loop.
 
Looking

I still think I did something wrong at
while ($sub=~ m/ /g) {$offset=$-[0];};
There has to be a better way than running this loop.

$sub=~ s/\s//g;
does the same thing, faster without loop;
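
Note that s/\s//g deletes every whitespace character from $sub; it does not locate the last one. A loop-free way to find that offset is Perl's built-in rindex, which searches backwards from a given position. The sketch below is only an illustration: the sub name split_at_last_space, the sample string, and the hard cut when no space is found are assumptions, not code from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: break $s at the last space among its first $limit characters.
sub split_at_last_space {
    my ($s, $limit) = @_;

    return ($s, '') if length($s) <= $limit;     # assumed: nothing to split

    # rindex scans backwards from index $limit - 1 and returns -1 if no space is found
    my $offset = rindex($s, ' ', $limit - 1);

    # Assumed fallback: no usable space, so cut hard at the limit
    return (substr($s, 0, $limit), substr($s, $limit)) if $offset <= 0;

    return (substr($s, 0, $offset), substr($s, $offset + 1));
}

my ($first, $rest) = split_at_last_space('a few words that run well past the limit', 25);
print "$first\n";   # prints: a few words that run
print "$rest\n";    # prints: well past the limit
```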
 
David K. Wall

[Not a very descriptive Subject.]

Looking said:
Here is what I am trying to do:

For a given string, break it into two parts. The first part should
be as long as possible but less than 25 characters. The split delimiter
is whitespace. The second part is whatever is left in the string.

My algorithm is to break the string into many small parts, then
join them with spaces.

@parts= split (/\W/, $original);

\W means a non-word character, not whitespace. Maybe that's what you
wanted, but it doesn't match the description above.
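
To make the difference concrete, here is a small sketch (the sample text is made up) comparing a split on \W with a split on whitespace:

```perl
use strict;
use warnings;

my $text = 'twenty-five words, or so';

my @by_nonword = split /\W/, $text;   # breaks at '-', ',' and each space
my @by_space   = split ' ',  $text;   # breaks only at runs of whitespace

print join('|', @by_nonword), "\n";   # prints: twenty|five|words||or|so
print join('|', @by_space),   "\n";   # prints: twenty-five|words,|or|so
```

The empty field in the first result comes from the comma and the space being two separate \W delimiters.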

[snip rest of program]

Quick and dirty:

my $string = 'something at least twenty-five characters long';
my ($part1, $part2) = $string =~ /^(.{1,23}\S)\s(.*)/s;

Ideally you should check to see if the match succeeded.
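
One way to do that check, as a sketch only: the sub name try_split is hypothetical, and returning an empty list on failure is an assumed convention rather than anything from the code above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the regex above: returns ($part1, $part2),
# or an empty list when the pattern does not match.
sub try_split {
    my ($string) = @_;
    if (my ($part1, $part2) = $string =~ /^(.{1,23}\S)\s(.*)/s) {
        return ($part1, $part2);
    }
    return;   # no whitespace break point within the first 25 characters
}

for my $s ('something at least twenty-five characters long',
           'thisistheoriginalstringblahblahblah') {
    if (my @parts = try_split($s)) {
        print "$parts[0] | $parts[1]\n";
    }
    else {
        print "no break point: $s\n";
    }
}
```

The first string splits into "something at least" and "twenty-five characters long"; the second has no whitespace, so the match fails and the caller has to decide what to do with it.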

I do hope this isn't homework.
 
Looking

Quick and dirty:

my $string = 'something at least twenty-five characters long';
my ($part1, $part2) = $string =~ /^(.{1,23}\S)\s(.*)/s;

Ideally you should check to see if the match succeeded.

I do hope this isn't homework.

It is not homework. It is a feature I am adding to a site.

Yours works except in the case that there is no space in the first 25.
My code has problems too.

thundergnat's solution works perfectly. This is his:
#my $original = 'This is a test string about 46 characters long';
my $original = 'ThisSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS is a test string about 46 characters long';

my $end = $original;
my $start = substr($end,0,25,''); # cut the string into 2 parts
# remove the trailing non-whitespace that follows at least 1 whitespace
$start =~ s/\s+(\S+)$//;
$end = $1.$end if defined $1; # add it to the second part
 
Michele Dondi

For a given string, break it into two parts. The first part should be as
long as possible but less than 25 characters. The split delimiter is
whitespace. The second part is whatever is left in the string.

My algorithm is to break the string into many small parts, then join them
with spaces.

I'm not sure if I understand what you mean, but couldn't something
along these lines suit your needs?


#!/usr/bin/perl -l

use strict;
use warnings;

$_='foo bar baz ' x 5;

my $pos;
while (/ /g) {
    last if pos > 25;
    $pos=pos;
}

print for unpack 'A' . $pos . 'A*', $_;

__END__


Michele
 
David K. Wall

Looking said:
It is not homework. It is a feature I am adding to a site.

OK. No offense meant.

Looking said:
Yours works except in the case that there is no space in the first
25. My code has problems too.

I didn't think of that. (Obviously.)

Looking said:
thundergnat's solution works perfectly.

Yeah, I like it. Looks a bit like the code in Text::Wrap, but I'd
guess he/she came up with it independently.

Just to redeem myself a little, here's a slightly altered version of
the single regex solution. (Although I suspect there's a more
succinct way to express it.)


my $string = 'thisistheoriginalstringblahblahblah';

my ($part1, $part2) = grep defined,
$string =~ /^
(?: (.{1,24}\S) \s (.*) )
|
(?: (.{1,24}\S) \s? (.*) )
/sx;


Oh, and I changed it to grab at most 25 characters, not 24; at first
I just saw the text that said "less than 25 characters", that is,
< 25, not <= 25. <shrug>
 
Anno Siegel

David K. Wall said:
[...]

Just to redeem myself a little, here's a slightly altered version of
the single regex solution. (Although I suspect there's a more
succinct way to express it.)


my $string = 'thisistheoriginalstringblahblahblah';

my ($part1, $part2) = grep defined,
$string =~ /^
(?: (.{1,24}\S) \s (.*) )
|
(?: (.{1,24}\S) \s? (.*) )
/sx;

pos($string) = 24;
my ($part1, $part2) = $string =~ /(.*\s|.*)(\S*\G.*)/;

That's more compact, but not necessarily better than other solutions.

Anno
 
