newbie help

Ram · Feb 3, 2004

How do I search for just the ordsts start(<ordsts>) and end tags(</ordsts>)
and the data between them, and get just the last matched one. Also would
need an idea of how to get the last two matches.

Thanks for the pointers.

Sample Input file:
<logos>
<ordsts>
<gname>
</gname>
</ordsts>
<ordadd>
<aname>
</aname>
</ordadd>
</logos>
<customer>
<contact>
<pname>
</pname>
</contact>
<ordsts>
<name>
</name>
</ordsts>
<shipname>
<sname>
</sname>
</shipname>
</customer>
<ordsts>
<doc_hdr>
<type_code>ORDSTS</type_code>
<type_suffix>LE</type_suffix>
<direction>IN</direction>
</doc_hdr>
<ord_keys>
<ordno>200000</ordno>
</ord_keys>
<req_obj>
<obj>order_header</obj>
<obj>order_line</obj>
</req_obj>
</ordsts>
<order> <doc_hdr> <type_code>ORDER</type_code>
<type_suffix>LE</type_suffix> <direction>IN</direction>
<client_data>User Supplied Data</client_data>
<client_id>User Supplied Data</client_id>
<correlation_id>414D51204C455553434330332020202040001EEE00042583</correlation_id>
<response_channel>CC.ORDER.REPLY</response_channel>
<correlation_id>41,4d,51,20,4c,45,55,53,43,43,30,33,20,20,20,20,40,0,1e,ee,0,4,25,83,</correlation_id>
<response_channel>LEUSCS01::CC.ORDER.REPLY.CS.S.Q</response_channel>
</doc_hdr> <customer> <cus_num>3374831</cus_num>
<bill_to> <contact> <con_num>2</con_num> </contact> </bill_to>
<ship_to> <address> <adr_num>1</adr_num> </address>
<taxwaregeocode> <geocode>331003600</geocode></order>
<ordsts> <doc_hdr> <type_code>ORDER</type_code>
<type_suffix>LE</type_suffix> <direction>IN</direction>
<client_data>User Supplied Data</client_data>
<client_id>User Supplied Data</client_id>
<correlation_id>414D51204C455553434330332020202040001EEE00042583</correlation_id>
<response_channel>CC.ORDER.REPLY</response_channel>
<correlation_id>41,4d,51,20,4c,45,55,53,43,43,30,33,20,20,20,20,40,0,1e,ee,0,4,25,83,</correlation_id>
<response_channel>LEUSCS01::CC.ORDER.REPLY.CS.S.Q</response_channel>
</doc_hdr> <customer> <cus_num>3374831</cus_num>
<bill_to> <contact> <con_num>2</con_num> </contact> </bill_to>
<ship_to> <address> <adr_num>1</adr_num> </address>
<taxwaregeocode> <geocode>331003600</geocode></ordsts>

Gunnar Hjalmarsson · Feb 3, 2004

Ram said:
How do I search for just the ordsts start(<ordsts>) and end
tags(</ordsts>) and the data between them, and get just the last
matched one.

Assuming the data is in $_:

my ($lastmatch) = /.*(<ordsts>.*<\/ordsts>).*/s;

Ram said:
Also would need an idea of how to get the last two matches.

I leave that as an exercise to you.

Thanks for the pointers.

http://www.perldoc.com/perl5.8.0/pod/perlre.html

J Krugman · Feb 3, 2004

Gunnar Hjalmarsson said:
Assuming the data is in $_:

my ($lastmatch) = /.*(<ordsts>.*<\/ordsts>).*/s;

Why doesn't this match everything between the very first <ordsts>
in the file and the last </ordsts>? Isn't the regexp engine supposed
to give the longest match?

jill

Gunnar Hjalmarsson · Feb 3, 2004

J Krugman said:
Why doesn't this match everything between the very first <ordsts> in
the file and the last </ordsts>?

Because the first .* is greedy.

Isn't the regexp engine supposed to give the longest match?

Nope.

Please read about greediness in perldoc perlre.
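
To make the greediness point concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not from the original post. It compares the suggested pattern with two variants: one without the leading .* and one with a non-greedy inner .*?

```perl
use strict;
use warnings;

# Hypothetical sample data: two <ordsts> elements in one string.
my $data = "<ordsts>A</ordsts><x>B</x><ordsts>C</ordsts>";

# The leading greedy .* grabs as much as it can and backtracks only until
# the rest of the pattern fits, so the capture starts at the LAST <ordsts>.
my ($last) = $data =~ /.*(<ordsts>.*<\/ordsts>)/s;

# Without the leading .*, the match starts at the leftmost <ordsts>, and
# the greedy inner .* then runs on to the last </ordsts>.
my ($span) = $data =~ /(<ordsts>.*<\/ordsts>)/s;

# A non-greedy inner .*? stops at the first </ordsts> instead.
my ($first) = $data =~ /(<ordsts>.*?<\/ordsts>)/s;

print "last:  $last\n";
print "span:  $span\n";
print "first: $first\n";
```

The engine returns the leftmost match it can complete, not the longest one overall; greediness only decides how much each quantifier tries to take first.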

J Krugman · Feb 3, 2004

Because the first .* is greedy.

OK, I missed that. Thanks.

jill

Ram · Feb 4, 2004

This string does not match if <ordsts> and </ordsts> has child tags spread
across multiple lines.

If I stick this to the end of file, it does not match:
<ordsts>
<gname>
</gname>
</ordsts>
But it matches:
<ordsts> <gname> </gname> </ordsts>

For my case, it should match the both, including the child tags.

Thanks!!

Chris · Feb 4, 2004

Ram said:
How do I search for just the ordsts start(<ordsts>) and end tags(</ordsts>)
and the data between them, and get just the last matched one. Also would
need an idea of how to get the last two matches.

Thanks for the pointers.

[snipped sample XML]

If this is XML, as it appears to be, you might do better parsing and get
better overall mileage from using XML::Simple or one of its close cousins.

(Wondering if this is the "Ram" that *I* know. If so, I hope you are
doing well.)

Chris

Gunnar Hjalmarsson · Feb 4, 2004

[ Please do not top post! ]

This string does not match if <ordsts> and </ordsts> has child
tags spread across multiple lines.

It's not a string, it's a regular expression, and it does match over
multiple lines.

If I stick this to the end of file, it does not match:
<ordsts>
<gname>
</gname>
</ordsts>
But it matches:
<ordsts> <gname> </gname> </ordsts>

Would you mind showing us the code you used to reach that conclusion?

For my case, it should match the both, including the child tags.

And my suggestion does that perfectly well.

Have you begun to study perldoc perlre yet? You'd better do so right
away, and don't forget to read about the /s modifier.
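
As a rough illustration of what the /s modifier changes, the sketch below runs the same pattern with and without it. The multi-line sample string is an assumption modelled on the element from the question.

```perl
use strict;
use warnings;

# Hypothetical multi-line element, modelled on the one in the question.
my $data = "<ordsts>\n<gname>\n</gname>\n</ordsts>\n";

# Without /s, "." does not match a newline, so the pattern cannot span lines.
my $without_s = $data =~ /<ordsts>.*<\/ordsts>/  ? "match" : "no match";

# With /s, "." matches any character, including a newline.
my $with_s    = $data =~ /<ordsts>.*<\/ordsts>/s ? "match" : "no match";

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
```

This only helps when the whole element is in the string being matched; /s cannot make the regex see lines that have not been read yet.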

gnari · Feb 4, 2004

[note: if you do not top-post then it is more likely we want to help.
it is annoying when you put your follow-up at the top of your message,
quoting the message you are replying to under that (in this case in whole)]

This string does not match if <ordsts> and </ordsts> has child tags spread
across multiple lines.
...

key sentence, perhaps?

are you matching one line at a time?

gnari

James Willmore · Feb 4, 2004

[please don't top post - reordered to proper format]

This string does not match if <ordsts> and </ordsts> has child tags
spread across multiple lines.

If I stick this to the end of file, it does not match: <ordsts>
<gname>
</gname>
</ordsts>
But it matches:
<ordsts> <gname> </gname> </ordsts>

For my case, it should match the both, including the child tags.

I'd follow the suggestion offered by Chris Olive - use an XML module to
parse your data. It will save you lots of time and effort - and reduce
the amount of "mistakes" made in parsing. Right now, if someone changes
the format of the file, you'll have to go through a similar type exercise
again in the future.

Again, it's just a suggestion

HTH

--
Jim

Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.

a fortune quote ...
You never know how many friends you have until you rent a house
on the beach.

Ram · Feb 4, 2004

Script I used:

#!/usr/bin/perl
use strict;
my $el;
open(ONE, "ordsts.txt" ) or die "Can't open file $! \n";
while (<ONE>) {
#print "$_ \n";
my @lastmatch = /.*(<ordsts>.*<\/ordsts>)/s;
print "@lastmatch \n";
$el= my @lastmatch;
}
print "$el \n";

I am not the Ram you know!!

Chris said:
Ram said:
How do I search for just the ordsts start(<ordsts>) and end tags(</ordsts>)
and the data between them, and get just the last matched one. Also would
need an idea of how to get the last two matches.

Thanks for the pointers.

[snipped sample XML]

If this is XML, as it appears to be, you might do better parsing and get
better overall mileage from using XML::Simple or one of its close cousins.

(Wondering if this is the "Ram" that *I* know. If so, I hope you are
doing well.)

Chris
-----
Chris Olive
chris -at- technologEase -dot- com
http://www.technologEase.com
(pronounced "technologies")

Tad McClellan · Feb 4, 2004

Ram said:
Subject: newbie help

Please put the subject of your article in the Subject of your article.

Gunnar Hjalmarsson · Feb 4, 2004

Ram said:
Script I used:

#!/usr/bin/perl
use strict;
my $el;
open(ONE, "ordsts.txt" ) or die "Can't open file $! \n";
while (<ONE>) {
#print "$_ \n";
my @lastmatch = /.*(<ordsts>.*<\/ordsts>)/s;
print "@lastmatch \n";
$el= my @lastmatch;
}
print "$el \n";

It proves that gnari guessed right: You are applying the regex to one
line at a time, which obviously can't work.

Try this instead:

#!/usr/bin/perl
use strict;
use warnings;
open ONE, "ordsts.txt" or die "Can't open file $!";
$_ = do { local $/; <ONE> }; # slurp file into $_
close ONE;
my ($el) = /.*(<ordsts>.*<\/ordsts>).*/s;
print "$el\n";
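
For the earlier question about getting the last two matches, one possible approach (a sketch, not the only way) is to collect every element with a global match in list context and take a slice from the end. The sample data and the names @all and @last_two are hypothetical, and the pattern assumes <ordsts> elements are never nested inside one another.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slurped file contents.
my $data = join "\n",
    "<ordsts>\n<gname>\n</gname>\n</ordsts>",
    "<ordadd>\n</ordadd>",
    "<ordsts> <name> </name> </ordsts>",
    "<ordsts>\n<ordno>200000</ordno>\n</ordsts>";

# In list context, /g returns every capture; the non-greedy .*? keeps each
# match to a single element instead of running on to the last end tag.
my @all = $data =~ /(<ordsts>.*?<\/ordsts>)/sg;

# Take the last two, guarding against input with fewer than two elements.
my @last_two = @all >= 2 ? @all[-2, -1] : @all;

print scalar(@all), " elements found\n";
print "$_\n---\n" for @last_two;
```

With a real file, $data would be the slurped contents, as in the script above.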

Tad McClellan · Feb 5, 2004

[ Please do not post upside-down followups ]

Ram said:
This string does not match

Does not match *what* ?

if <ordsts> and </ordsts> has child tags spread
across multiple lines.

How are you getting the multiple lines into $_ ?

That _will_ match across multiple lines.

You are probably running afoul of this Frequently Asked Question:

I'm having trouble matching over more than one line. What's wrong?
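
The difference between reading line by line and reading the whole file can be sketched without a real file by opening a filehandle on a string. The contents of $text and the counter names are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical file contents held in a string; opening a reference to a
# scalar gives an in-memory filehandle, so no real file is needed.
my $text = "<ordsts>\n<gname>\n</gname>\n</ordsts>\n";

# Line at a time: each $_ holds one line, so no single line has both tags.
open my $fh, '<', \$text or die "Can't open in-memory file: $!";
my $line_hits = 0;
while (<$fh>) {
    $line_hits++ if /<ordsts>.*<\/ordsts>/s;
}
close $fh;

# Slurped: with $/ undefined, one read returns the whole file.
open $fh, '<', \$text or die "Can't open in-memory file: $!";
my $whole = do { local $/; <$fh> };
close $fh;
my $slurp_hits = $whole =~ /<ordsts>.*<\/ordsts>/s ? 1 : 0;

print "line by line: $line_hits, slurped: $slurp_hits\n";
```

The regex is identical in both cases; only the amount of data in the string being matched differs.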

Ram · Feb 5, 2004

Excellent, a lot to learn!!

Gunnar Hjalmarsson said:
It proves that gnari guessed right: You are applying the regex to one
line at a time, which obviously can't work.

Try this instead:

#!/usr/bin/perl
use strict;
use warnings;
open ONE, "ordsts.txt" or die "Can't open file $!";
$_ = do { local $/; <ONE> }; # slurp file into $_
close ONE;
my ($el) = /.*(<ordsts>.*<\/ordsts>).*/s;
print "$el\n";

gnari · Feb 5, 2004

Ram said:
Excellent, a lot to learn!!

especially about top-posting.

[snipped top-posted quoted whole article]

gnari

Ram · Feb 5, 2004

Is this the correct way to post (not top-posting), while responding to
gnari!!

Tad McClellan · Feb 5, 2004

Ram said:
Is this the correct way to post (not top-posting),

No it isn't.

*plonk*

[ snip TOFU ]
