Jane
Hi,
I'm attempting to make sure that any ampersands I produce in a webpage
are using the proper entity code (i.e. "&amp;" as opposed to simply
"&"). I can get so far with it, and it works fine, except for one
small, but important detail.
Importantly, I have to make sure, when swapping any "&" I find, that it
is not already part of an entity, which could include "&amp;" itself, or
something like "›", for example. It may also simply be preceded or
followed by nothing other than a space.
So, I produced a little test sentence, below, to try out my regex, and
it swaps everything it's supposed to swap, but it gobbles up an extra
character as well, when I don't want it to. The regex, the original
test sentence, and the sentence after being regexed, are below
(hopefully the ampersands etc don't get escaped when I post this):
THE ORIGINAL SENTENCE:
$x="Apples & oranges from T&J are really good and tasty …, &amp; I should know...\n";
(which contains two ampersands to be regexed, one between apples and
oranges, and one between "T" and "J". The others are, of course,
already in good shape).
THE REGEX:
$x=~s/&[^#a]/&amp;/g;
THE RESULT:
Apples &amp;oranges from T&amp; are really good and tasty …, &amp;
I should know...
.... so it gobbles up the space character before oranges, and also the
J, from T&J. I've tried all sorts of things, but can only seem to make
it worse! ... Any assistance would be mightily appreciated.
Thanks!
Jane
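One possible fix, sketched under a few assumptions: the character class [^#a] is part of the match, so whatever character follows the "&" is consumed and replaced along with it. A negative lookahead, (?!...), tests the following characters without consuming them. The helper name escape_bare_ampersands is made up for this illustration, and the test string assumes the ellipsis was written as a numeric entity (&#8230; here); the post does not say which form was used. The pattern treats "&name;", "&#123;" and "&#x1F;" as existing entities and leaves them alone.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post.
sub escape_bare_ampersands {
    my ($text) = @_;
    # Replace "&" only when it does NOT start something shaped like an
    # entity (&name; or &#123; or &#x1F;). The (?!...) lookahead only
    # inspects the following characters, so none of them are eaten.
    $text =~ s/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/&amp;/g;
    return $text;
}

# Assumed form of the test sentence, with the ellipsis as a numeric entity.
my $x = "Apples & oranges from T&J are really good and tasty &#8230;, &amp; I should know...\n";
print escape_bare_ampersands($x);
# prints: Apples &amp; oranges from T&amp;J are really good and tasty &#8230;, &amp; I should know...
```

A smaller change to the original regex would be to capture the extra character and put it back, as in s/&([^#a])/&amp;$1/g, but that still misses an "&" at the very end of the string and still mangles any named entity that does not start with "a", so the lookahead version is probably the safer of the two.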