I'm trying to use Perl to replace a line in a few XML files I have.

Example XML is below. I want to change the Id attribute from
Id="/Local/App/App1" to Id="/App1". I know there's an easy way to do
this with Perl alone; however, I'm trying to use XML::Simple or any
XML plugin for Perl.

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

<Profile xmlns="xxxxxxxxx" name="" version="1.1" xmlns:xsi="http://

<Application Name="App1" Id="/Local/App/App1" Services="1" policy=""
StartApp="" Bal="5" sessInt="500" WaterMark="1.0"/>




I don't think that processing XML with Perl alone (i.e. without any
module) is easy.
I'm trying to use XML::Simple
or any XML plugin for perl.

Have a look first at the excellent web site
Ways to Rome: Processing XML with Perl
(original version by Ingo Macherius, maintained by Michel Rodriguez)

If you don't find a solution there,
then you can always employ a combination of the CPAN modules
XML::Reader and XML::Writer

A sample program would look as follows:

use strict;
use warnings;

use XML::Reader;
use XML::Writer;

my $rdr = XML::Reader->newhd(\*DATA, {filter => 3});
# The original call was cut off after OUTPUT; the data-mode options
# on the next lines are assumed, based on the note that follows.
my $wrt = XML::Writer->new(OUTPUT => \*STDOUT,
    DATA_MODE => 1, DATA_INDENT => 2);

# If you write mixed-content XML with XML::Writer (that is, tags
# and characters at the same level, for example
# <data>abc<sub>def</sub>ghi</data>), then XML::Writer will abort
# with the message "Mixed content not allowed". To allow mixed
# content, change the parameters to
# XML::Writer->new(NEWLINES=>0, DATA_MODE=>0, DATA_INDENT=>0);
# or even to
# XML::Writer->new(NEWLINES=>1, DATA_MODE=>0, DATA_INDENT=>0);

$wrt->xmlDecl('UTF-8', 'no');

while ($rdr->iterate) {
    my $tag = $rdr->tag;
    my $val = $rdr->value;
    my %att = %{$rdr->att_hash};

    if ($rdr->path eq '/Profile/Application'
        and defined $att{Id}) {
        # change '/../../zzz' into 'zzz'
        $att{Id} =~ s{\A .* /}{}xms;
    }

    if ($rdr->is_start) { $wrt->startTag($tag, %att); }
    if ($val ne '')     { $wrt->characters($val); }
    if ($rdr->is_end)   { $wrt->endTag($tag); }
}

# Finish the document
$wrt->end;


__DATA__
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<Profile>
<Application
Name="App1" Id="/Local/App/App1" Services="1"
policy="" StartApp="" Bal="5" sessInt="500"/>
<Application
Name="App99" Id="/Dummy/Test/iii" Services="3"
policy="99" StartApp="2" Bal="7" sessInt="27"/>
<Application
Name="Yyee" Id="/Dat/Inp/Out" Services="5"
policy="88" StartApp="" Bal="1" sessInt="8"/>
</Profile>


Jürgen Exner

Klaus said:
I don't think that processing XML with Perl alone (i.e. without any
module) is easy.

Well, XML is a rather straightforward, well-structured language. If you
are familiar with compiler construction then it should be no big deal. At
least it is much easier to parse than, let's say, C or Perl itself or even
HTML (there are too many special cases in HTML).



I agree, XML is straightforward and well structured, that's why I
like to use it wherever I can.

...and if I was a compiler writer, I would say that processing XML was
easy :)

By the way, I have now released a new version of XML::Reader (ver
0.35) with some bug fixes, warts removed, relicensing, etc...

The line I wrote in my previous post (which was for XML::Reader ver
0.34) was:

my $rdr = XML::Reader->newhd(\*DATA, {filter => 3});

With the new version 0.35 of XML::Reader, the same line would be

my $rdr = XML::Reader->new(\*DATA, {mode => 'attr-in-hash'});


If what you need is all you state,
this code should fix up your XML.
It's restricted to just a single tag-attribute pair.
It works by matching two kinds of markup: constructs that hide
markup (comments, CDATA sections), which are skipped, and the
specific tag and attribute being sought.

The advantage here is that nothing else changes in the
original markup; only the string content of Id is changed,
via the replacement side of the regex.
This avoids formatting headaches with some writers.

The regex may look simple for a parser; that's because it
is custom to the specific task.
The markup interaction is correct.
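
To illustrate the "nothing else changes" point: the trick is \K, which
drops everything matched so far from the text that gets replaced. What
follows is only a minimal sketch on a made-up one-line tag, not the
script from this post; the helper name strip_id_path is hypothetical,
and it handles neither comments nor CDATA.

```perl
use strict;
use warnings;
use 5.010;    # \K needs Perl 5.10 or later

# Hypothetical helper: shorten the Id value inside one tag string and
# leave every other character of the tag untouched.
sub strip_id_path {
    my ($tag) = @_;
    # Everything up to and including the opening quote is matched but,
    # because of \K, kept. Only the path prefix up to the last slash
    # is replaced (with nothing).
    $tag =~ s{\b Id \s* = \s* (["']) \K [^"']* /}{}x;
    return $tag;
}

print strip_id_path(q{<Application Name="App1" Id="/Local/App/App1" Bal="5"/>}), "\n";
```

Since the replacement side only ever sees the part after \K, spacing,
attribute order and quoting in the rest of the tag come out exactly as
they went in.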


# -------------------------------------------
# rx_xml_fixval.pl
# -sln, 5/2/2010
# Util to extract some attribute/val's from
# xml/xhtml
# -------------------------------------------

use strict;
use warnings;

my $rxopen = "(?: Application )"; # Open tag , cannot be empty alternation
my $rxattr = "(?: Id )"; # Attribute we seek, cannot have an empty alternation

my $Rxmarkup = qr/
  (?:
      # Things that hide markup
      (?: <! (?: \[CDATA\[.*?\]\] | --.*?-- | \[[A-Z][A-Z\ ]*\[.*?\]\] ) > ) \K
    |
      # Specific markup
      (?: < (?<OPEN> $rxopen ) \s+[^>]*? (?<=\s) (?<ATTR> $rxattr) \s*=\s* \K(?<VAL> ".+?"|'.+?')
          (?= [^>]*? \s* \/? > )
      )
    |
      # Any other tag start, step over it
      < \K
  )
/xs;

my $html = join '', <DATA>;
$html =~ s/$Rxmarkup/ fixval( $+{VAL} ) /eg;
print "\n",$html;

exit (0);

# Return the quoted value reduced to its last path component,
# or the value unchanged if it contains no slash.
sub fixval {
    return '' unless defined $_[0];
    if ($_[0] =~ / \/ \s* (?<val>[^\/]+?) \s* (?<delim>["']) $/x) {
        return "$+{delim}$+{val}$+{delim}";
    }
    return $_[0];
}

__DATA__


<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

<Profile xmlns="xxxxxxxxx" name="" version="1.1" xmlns:xsi="http://

<Application Name="App1" Id="/Local/App/App1" Services="1" policy=""
StartApp="" Bal="5" sessInt="500" WaterMark="1.0"/>




With a slight modification, multiple attribute-value pairs can be
replaced within a single tag. This version relies on regex code
blocks (?{ }) and a conditional (?(cond) yes | no), but it does the
same search and replace, now on multiple attributes.
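
For anyone who has not met these two constructs together, here is a
small sketch of the idea: a flag set from a (?{ }) block decides, via
(?(?{ }) yes | no), which half of the pattern is tried on the next
match. It is a toy, not the script from this post; the names
list_tokens and $in_tag are made up, and it only understands
double-quoted attributes.

```perl
use strict;
use warnings;
use 5.010;    # named captures need Perl 5.10 or later

# State flag shared with the regex code blocks (hypothetical name).
our $in_tag = 0;

# List tag names and attribute names in the order they appear.
sub list_tokens {
    my ($text) = @_;
    my @tokens;
    $in_tag = 0;
    while ($text =~ m{
        (?(?{ $in_tag })
            # Inside a tag: next attribute, or the end of the tag
            (?: \s+ (?<attr> \w+ ) = "[^"]*"
              | \s* /? > (?{ $in_tag = 0 })
            )
          |
            # Outside a tag: look for the next opening tag
            < (?<open> \w+ ) (?{ $in_tag = 1 })
        )
    }gx) {
        push @tokens, "open:$+{open}" if defined $+{open};
        push @tokens, "attr:$+{attr}" if defined $+{attr};
    }
    return @tokens;
}

print join(' ', list_tokens('<a x="1" y="2"/><b z="3">')), "\n";
```

Each code block sits at the very end of its alternative, so the flag
is only changed when that alternative is about to succeed.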


Some output:
Id = "/Local/App/App1", (valnew = "App1")
Id2 = "/Local/App/App2", (valnew = "App2")
Id = '/Dummy/Test/iii', (valnew = 'iii')
Id = "/testing", (valnew = "testing")
Id = "/Dum
", (valnew = "iii")
Id = "/Dat/Inp/Out", (valnew = "Out")
Id = "/Local/App/App1", (valnew = "App1")
Id = "/Dummy/Test/iii", (valnew = "iii")
Id = "/Dat/Inp/Out", (valnew = "Out")
Tt = "TT/tt hello", (valnew = "tt hello")
Id = "/he llo", (valnew = "he llo")

# -------------------------------------------
# rx_html_fixval2.pl
# -sln, 5/5/2010
# Util to search/replace attribute/val's from
# xml/html
# -------------------------------------------

use strict;
use warnings;

## Initialization

my $rxopen = "(?: Application )"; # Open tags , cannot be empty alternation
my $rxattr = "(?: Id.?|Tt )"; # Attributes we seek, cannot have an empty alternation
# "(?: \\w+ )";

use re 'eval';
my $topen = 0;

my $Rxmarkup = qr{
 (?(?{$topen}) # Begin Conditional

    # Have <OPEN> ?
    (?:
        # Try to match next attr-val pair
        \s+[^>]*? (?<=\s) (?<ATTR> $rxattr) \s*=\s* \K(?<VAL> ".+?"|'.+?')
        (?= [^>]*? \s* /? > )
      |
        # No more attr-value pairs
        (?{$topen = 0})
    )
  |
    # Look for new <OPEN>
    (?:
        # Things that hide markup:
        # - Comments/CDATA
        (?: <! (?: \[CDATA\[.*?\]\] | --.*?-- | \[[A-Z][A-Z\ ]*\[.*?\]\] ) > ) \K
      |
        # Specific markup we seek:
        # - OPEN tag
        (?: < (?<OPEN> $rxopen \K) )
        (?{$topen = 1})
      |
        < \K
    )
 ) # End Conditional
}xs;

## Code

my $html = join '', <DATA>;
$html =~ s/$Rxmarkup/ fixval( $+{ATTR}, $+{VAL} ) /eg;
print "\n",$html;

exit (0);

## Subs

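# Print what was found and return the quoted value reduced to its
# last path component (or unchanged if it contains no slash).
sub fixval {
    return '' unless defined $_[1];
    print "$_[0] = $_[1], ";
    if ($_[1] =~ / \/ \s* (?<val>[^\/]+?) \s* (?<delim>["']) $/x) {
        my $valnew = $+{delim}.$+{val}.$+{delim};
        print "(valnew = $valnew)\n";
        return $valnew;
    }
    print "(val unchanged)\n";
    return $_[1];
}

__DATA__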


<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

<Profile xmlns="xxxxxxxxx" name="" version="1.1" xmlns:xsi="http://

<Application Name="App1" Id="/Local/App/App1"
Id2="/Local/App/App2" Services="1" policy=""
StartApp="" Bal="5" sessInt="500" WaterMark="1.0"/>



<Application
Name="App99" Id='/Dummy/Test/iii' Services="3"
policy="99" StartApp="2" Bal="7" sessInt="27"
WaterMark="4.3" />

<Application Id="/testing"
Name="App100" Id="/Dum
" Services="4"
policy="99" StartApp="2" Bal="7" sessInt="27"/>

<Application
Name="Yyee" Id="/Dat/Inp/Out" Services="5"
policy="88" StartApp="" Bal="1" sessInt="8"/>

<![INCLUDE CDATA [ <Application Name="App99" Id="//Test/can't see me"/> ]]>

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

<Application
Name="App1" Id="/Local/App/App1" Services="1"
policy="" StartApp="" Bal="5" sessInt="500"/>

<Application
Name="App99" Id="/Dummy/Test/iii" Services="3"
policy="99" StartApp="2" Bal="7" sessInt="27"/>

<Application
Name="Yyee" Id="/Dat/Inp/Out" Services="5"
policy="88" StartApp="" Bal="1" sessInt="8"
WaterMark="2.1" Tt = "TT/tt hello"/>

<Application
Name="Yyee" Id="/he llo" Services="5"
policy="88" StartApp="" Bal="1" sessInt="8"/>


