Jason C
I'm currently limiting repeated characters like so:
$text =~ s#(.)\1{6,}#$1$1$1$1$1$1#gsi;
I'm wanting to modify it to only limit repeated characters if they're not within <img...> or <a href=...></a> tags.
I'm guessing that this would be done with a negative lookbehind, like this:
# Note, these aren't tested, just here for the explanation
$text =~ s#(?<!<img)(.)\1{6,}#$1$1$1$1$1$1#gsi;
$text =~ s#(?<!<a href)(.)\1{6,}#$1$1$1$1$1$1#gsi;
Neither of these is going to be perfect, though, because:
1. in the first one, I need to test for both an opening <img and an ending >; otherwise, I think it would not catch something like "<img src='aaa.jpg'> bbbbbbbbbb" (since the repeated "b" comes after "<img").
2. in the second one, I also need to test for the ending >, but also for the closing </a>. Even if I fixed the ending >, I could still end up with a confusing "<a href='http://www.aaaaaaaaaa.com'>http://www.aaaaaa.com</a>"
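For illustration, here is a minimal sketch of one alternative to the lookaround idea: match whole <img ...> tags and whole <a href=...>...</a> elements first in an alternation, put them back unchanged, and only collapse repeats that were matched outside of them. The sub name limit_repeats is made up for this example, and the sketch assumes the tags are well-formed, not nested, and contain no ">" inside attribute values.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from any module).
# Collapses runs of 7 or more identical characters down to 6, except
# inside <img ...> tags and <a href=...>...</a> elements.
sub limit_repeats {
    my ($text) = @_;
    $text =~ s{
        (                                      # group 1: a protected span
            <a\s[^>]*\bhref\b[^>]*> .*? </a\s*>   # a whole anchor element
          | <img\b[^>]*>                          # a whole img tag
        )
      |
        (.)\2{6,}                              # group 2: one char, repeated 7 or more times
    }{
        defined $1 ? $1 : $2 x 6               # keep protected span, else trim the run
    }gsixe;
    return $text;
}

print limit_repeats("<img src='aaaaaaaaaa.jpg'> bbbbbbbbbb"), "\n";
# prints: <img src='aaaaaaaaaa.jpg'> bbbbbb
```

Because the protected alternatives are tried first at each position, the whole tag or element is consumed before the repeat pattern can see its contents, so both the href and the link text of an anchor are left alone.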
Any suggestions on how to do either of these better? TIA,
Jason