Filtering a string

Bill H

Can someone point me to some docs on how I would do this without
iterating over the whole string (pattern matching?):

$original = "a malformed%string/containi\"ng characters I don'~t want! ...";

$filter = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890-_";

$new = &fix($original);

$new would now equal:

amalformedstringcontainingcharactersidontwant

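sub fix
{
    my $o = shift;
    my $r = "";
    for (my $i = 0; $i < length($o); $i++)
    {
        # Keep the character only if its uppercase form is in $filter;
        # append it lowercased so the result matches the output above.
        my $c = substr($o, $i, 1);
        $r .= lc($c) if index($filter, uc($c)) != -1;
    }
    return $r;
}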

I just typed this in to give you the gist of what I want to do, so any
errors are typos on my part.

Bill H
 
Dave B

Is the following acceptable for you?

$new = lc(join("",grep(m/[$filter]/i,split(//,$original))));
 
Ben Morrow

$original =~ tr/a-zA-Z0-9_-//cd;

See tr/// under 'Regexp Quote-Like Operators' in perlop.

Note that '-' must come first or last, as otherwise it will be
interpreted as part of an X-Y range.
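
As a minimal sketch of how the /c and /d modifiers combine here (the sample string is the one from the original post; the variable names are only illustrative): /c complements the search list and /d deletes every matched character that has no replacement, so with an empty replacement list everything outside the set is removed. tr/// also returns the number of characters it matched.

```perl
use strict;
use warnings;

my $original = "a malformed%string/containi\"ng characters I don'~t want! ...";

# Work on a copy so $original stays untouched.
my $new = $original;

# /c complements the search list, /d deletes matches without a replacement.
# '-' is last in the list, so it is a literal hyphen, not a range.
my $removed = ($new =~ tr/a-zA-Z0-9_-//cd);

print "$new\n";      # amalformedstringcontainingcharactersIdontwant
print "$removed\n";  # number of characters that were deleted
```

Note that case is preserved; lowercasing is a separate step.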

Ben
 
John W. Krahn

$ perl -le'
$original = "a malformed%string/containi\"ng characters I don\047~t want\041 ...";
$filter = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890-_";

( $new = $original ) =~ s/[^\Q$filter\E]+//ig;
print for $original, $new;
'
a malformed%string/containi"ng characters I don'~t want! ...
amalformedstringcontainingcharactersIdontwant



John
 
Bill H

Ben, this one works nicely for me; the only issue is that I have to lc()
the results (not really a problem). Is there a way to have that done
in the same operation? Also, thanks for pointing out the part in perlop
that deals with it.

Bill H
 
Ben Morrow

Well, tr/-_a-zA-Z0-9\000-\177/-_a-za-z0-9/d will work, but personally
I'd probably just use lc, and probably beforehand, to avoid two
alphabetic ranges. Something like

(my $new = lc $original) =~ tr/a-z0-9_-//cd;

seems clearer than relying on the details of tr///, which is an operator
a lot of Perl programmers don't know very well.
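
To compare the two forms above, here is a small sketch (the input is the string from the original post; the variable names are illustrative). In the single tr/// form, /d deletes every character in the search list that has no counterpart in the replacement list, and a character listed twice uses its first occurrence, which is why the letters survive the trailing \000-\177 range. That form only touches ASCII: characters above \177 are not in its search list and would pass through unchanged, whereas the /cd form deletes them.

```perl
use strict;
use warnings;

my $original = "a malformed%string/containi\"ng characters I don'~t want! ...";

# Variant 1: a single tr/// that both lowercases and deletes.
(my $one_pass = $original) =~ tr/-_a-zA-Z0-9\000-\177/-_a-za-z0-9/d;

# Variant 2: lowercase first, then delete everything outside the set.
(my $two_step = lc $original) =~ tr/a-z0-9_-//cd;

print "$one_pass\n";
print "$two_step\n";
print $one_pass eq $two_step ? "same\n" : "different\n";
```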

Ben
 
Dave Weaver

Dave B said:
Is the following acceptable for you?

$new = lc(join("",grep(m/[$filter]/i,split(//,$original))));

That won't do what the OP wants - try it with "=", "[" or "<" in
$original, for example.
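
A short sketch of why those characters slip through (the test string is made up for illustration): interpolated into a bracketed character class, the tail of $filter reads as the range 0-_, which spans ASCII 0x30 to 0x5F and so also covers : ; < = > ? @ [ \ ] ^. Escaping the filter with \Q...\E, as in John's s/// version, makes the hyphen literal.

```perl
use strict;
use warnings;

my $filter = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890-_";
my $input  = "a=b[c<d";   # made-up test string

# Unescaped: "0-_" inside [...] is a range, so "=", "[" and "<" are kept.
my $unescaped = lc join "", grep { /[$filter]/i } split //, $input;

# Escaped: \Q...\E quotes the hyphen, so only the listed characters match.
my $escaped = lc join "", grep { /[\Q$filter\E]/i } split //, $input;

print "$unescaped\n";   # a=b[c<d
print "$escaped\n";     # abcd
```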
 
Dave B

Yes, the "-" in the filter should be at the very beginning or end; I
overlooked that. However, after seeing the solutions based on tr///, I
realized that mine is not by any means a good solution. I'm still learning...
 
