Lax
Hello all,
I'm trying to search and replace the value of a tag in an XML file.
I'm not in a position to use the usual XML parsers, as the version of
Perl I'm required to use doesn't contain any of the XML libraries.
I can use Text::Balanced, but I want to deal with the XML file on a
line-by-line basis, as the value of my tag could stretch over multiple
lines.
Perl Version:
This is perl, v5.8.7 built for sun4-solaris
Sample xml file:
-------------------------
<project xmlns="xml:header">
<version>1.0.0</version>
<SomeTag>
<version>invalid version</version>
</SomeTag>
<SomeAnotherTagNested1>
<SomeAnotherTagNested2>
<SomeAnotherTagNested3>
<version>invalid version</version>
</SomeAnotherTagNested3>
</SomeAnotherTagNested2>
</SomeAnotherTagNested1>
<version>stand-alone, but not valid either</version>
</project>
-------------------------
I only want version tags that are not enclosed in any other tag
(apart from the top-level project tag).
I want to replace the 1.0.0 (an example value) with 2.0.0 in the first
occurrence of a stand-alone "version" tag.
I came up with the following:
--------------------
#!/usr/local/bin/perl
use strict ;
use File::Copy ;
die "Usage: replace.pl <xml file>!\n" unless ( $#ARGV == 0 ) ;
my $file = shift ;
open(IN,"$file") or die "Can't open file: $!\n" ;
chomp(my @arr = <IN> ) ;
close(IN) ;
open(OUT,"> bak") or die "Can't open file: $!\n" ;
# Two flags,
# $tag_flag -- to check if we're inside a tag
# $version_flag -- to check if we've replaced version tag already.
my $tag_flag = "off" ;
my $version_flag = "off" ;
foreach my $line ( @arr )
{
# Don't consider the open and close of the top-level <project> tag.
if ( $line =~ /^\s*\<(\/)?project/ )
{
print OUT "$line\n" ;
next ;
}
# Found <version>: replace the version string if tag_flag is off and
# version_flag is off.
elsif ( ($line =~ /^\s*\<version\>/) && ( $tag_flag eq "off" ) &&
( $version_flag eq "off" ) )
{
# print "Flag: $flag\n" ;
print OUT "<version>2.0.0</version>\n" ;
$tag_flag = "on" ;
$version_flag = "on" ;
}
# Line starts with an open tag "<": tag_flag on.
elsif ( ( $line =~ /^\s*\<.*\>/ ) && ( $line !~ /^\s*\<\/.*\>/ ) )
{
print OUT "$line\n" ;
$tag_flag = "on" ;
}
# Line starts with a close tag "</": tag_flag off.
elsif ( $line =~ /^\s*\<\/.*\>/ )
{
print OUT "$line\n" ;
$tag_flag = "off" ;
} else {
print OUT "$line\n" ;
}
}
close(OUT) ;
# Move bak file to original
------------------------------------------
The above script works, and a "diff bak <xml-file>" gives me the
expected result, when the stand-alone <version> is all on one line. I
can't get it working when the element extends over multiple lines.
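One possible direction for the multi-line case, given here only as a minimal sketch, is to stop working line by line: read the whole file into one string and count tag depth while scanning it. The helper name replace_first_top_level_version is hypothetical and not part of the script above. The sketch assumes simple XML: no CDATA sections, no ">" inside attribute values or comments, and a closing tag written exactly as "</version>". It uses only core Perl, so it should run on 5.8.

```perl
#!/usr/local/bin/perl
use strict ;
use warnings ;

# Hypothetical helper (not from the original script): replace the text of
# the first <version> element that is a direct child of the root element.
# Assumes simple XML: no CDATA, no ">" inside attributes or comments.
sub replace_first_top_level_version
{
    my ( $xml, $new_value ) = @_ ;
    my $depth = 0 ;    # number of elements currently open

    # Walk over every tag in the string, regardless of line breaks.
    while ( $xml =~ m{<(/?)([^\s<>/]+)[^<>]*?(/?)>}g )
    {
        my ( $closing, $name, $empty ) = ( $1, $2, $3 ) ;
        next if $name =~ /^[?!]/ ;    # skip <?xml ...?> and <!-- ... -->
        if ( $closing ) { $depth-- ; next ; }
        next if $empty ;              # <tag/> opens and closes at once
        $depth++ ;

        # Depth 2 means: the root is open and this tag is its direct child.
        if ( $depth == 2 && $name eq 'version' )
        {
            my $start = pos($xml) ;   # offset just after "<version>"
            my $end   = index( $xml, '</version>', $start ) ;
            return $xml if $end < 0 ; # no closing tag: leave input alone
            substr( $xml, $start, $end - $start ) = $new_value ;
            return $xml ;
        }
    }
    return $xml ;                     # no stand-alone <version> found
}

# Usage sketch: slurp the file given on the command line and print the result.
if ( @ARGV )
{
    my $file = shift ;
    local $/ ;                        # slurp mode: read the file in one go
    open( my $in, '<', $file ) or die "Can't open $file: $!\n" ;
    my $xml = <$in> ;
    close($in) ;
    print replace_first_top_level_version( $xml, '2.0.0' ) ;
}
```

Because the scan works on offsets in one string rather than on lines, a value spread over several lines is replaced as a whole, line breaks included. Only the first match is changed, since the function returns as soon as it has made the replacement.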
Could anyone give me some pointers, please?
Thanks,
Lax