Date in CSV/TSV question

  • Thread starter Dr Eberhard Lisse

Dr Eberhard Lisse

I have a Tab Separated File of roughly 1000 lines with the first fields like

"07 Jan 2011" "TFR"
"05 Jan 2011" "DR"

I need to change the first field to look like

2011-01-07 "TFR"
2011-01-05 "DR"

for all lines, of course :)-O

Can someone point me to where I can read this up? Or send me a code
fragment?

Thanks, el
 
Dave Saville


It's not clear whether the file really contains the quotes or you are
only using them to show the fields. Assuming you have extracted the
first field, split it on spaces into day, month and year. Set up an
array of month names and find the index of the given month. Then
regenerate the field with sprintf: $new =
sprintf("$year-%2.2d-$day", $index); For simplicity, put a dummy month
on the front of the list, because Perl arrays index from 0, so
@months = qw(crap Jan Feb ..........

HTH
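
The steps described in this reply can be sketched roughly as follows. This is only an illustration under assumptions: the helper name iso_date and the placeholder entry 'none' are made up for the example, and the field is assumed to have already been extracted without its quotes.

```perl
use strict;
use warnings;

# Placeholder at index 0 so that Jan lands on index 1 (assumption: English
# three-letter month names, spelled exactly as in the sample data).
my @months = qw(none Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: turns "07 Jan 2011" into "2011-01-07".
# Returns undef if the field is incomplete or the month is not recognised.
sub iso_date {
    my ($field) = @_;
    my ($day, $month, $year) = split ' ', $field;
    return unless defined $year;

    # Find the index of the month name, skipping the placeholder.
    my ($index) = grep { $months[$_] eq $month } 1 .. $#months;
    return unless defined $index;

    return sprintf '%04d-%02d-%02d', $year, $index, $day;
}

print iso_date('07 Jan 2011'), "\n";   # prints 2011-01-07
```

A hash lookup would avoid the linear search, but with twelve entries the difference hardly matters.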
 
Dr Eberhard W Lisse

Thanks.

el

I have a Tab Separated File of roughly 1000 lines with the first
fields like

"07 Jan 2011" "TFR"
"05 Jan 2011" "DR"

I need to change the first field to look like

2011-01-07 "TFR"
2011-01-05 "DR"

OK, couldn't resist having a bash at this. Didn't spend a lot of time
on it but this does what you want.

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

use Date::Calc qw( Decode_Date_EU );
use Text::CSV;

my $csv = Text::CSV->new( { sep_char=>"\t", quote_char=>'"' } )
    or die "Failed to create CSV object: $!\n";
while ( 1 ) {
    my $row = $csv->getline( \*DATA );
    last unless $row->[0]; # getline returns zero-length arrayref; irritating
    my ( $year, $month, $day ) = Decode_Date_EU( $row->[0] );
    die "Bad date" unless $year;
    printf "%04d-%02d-%02d\t%s\n", $year, $month, $day, $row->[1];
}

__DATA__
"07 Jan 2011" "TFR"
"05 Jan 2011" "DR"
henry@eris:~/Perl/tryout$ ./tryout
2011-01-07 TFR
2011-01-05 DR

It could be improved, and made more Perlish (I write code in isolation,
rather, which isn't a good idea). In particular I was maddened by the
need to check the EOF condition explicitly. "while my $row =
getline..." returns a one-element array containing a null value when it
hits EOF; you'd think it would return undef. (And yes I did try
"defined" as suggested in perldoc IO::Handle but the arrayref is
actually defined, despite not containing anything useful).
 
Rainer Weikusat

-----------
%months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

while (<>) {
s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
print;
}
-----------
 
Keith Thompson

Henry Law said:
You could use Date::Calc, particularly the Decode_Date_EU function; it's
overkill if what you've described is really all there is, but it saves
programming. A truly lazy^H^H^H^Hcreative programmer would look for
something to decode the tab-separated file too; maybe Text::CSV would do
that? I've only ever used it for comma separated data, (which, er, is
what it's for).

Yes, quoting "perldoc Text::CSV":

The module accepts either strings or files as input and
can utilize any user-specified characters as delimiters,
separators, and escapes so it is perhaps better called ASV
(anything separated values) rather than just CSV.
 
Rainer Weikusat

Henry Law said:
You could use Date::Calc, particularly the Decode_Date_EU function;
it's overkill if what you've described is really all there is, but it
saves programming. A truly lazy^H^H^H^Hcreative programmer would look
for something to decode the tab-separated file too; maybe Text::CSV
would do that?

Nice example how it 'saves programming':

,----
| #!/usr/bin/perl
| use strict;
| use warnings;
| use 5.010;
|
| use Date::Calc qw( Decode_Date_EU );
| use Text::CSV;
|
| my $csv = Text::CSV->new( { sep_char=>"\t", quote_char=>'"' } )
| or die "Failed to create CSV object: $!\n";
| while ( 1 ) {
| my $row = $csv->getline( \*DATA );
| last unless $row->[0]; # getline returns zero-length arrayref; irritating
| my ( $year, $month, $day ) = Decode_Date_EU( $row->[0] );
| die "Bad date" unless $year;
| printf "%04d-%02d-%02d\t%s\n", $year, $month, $day, $row->[1];
| }
`----

That's 14 lines of code. Alternate version without Date::Calc and
Text::CSV

,----
| %months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
|
| while (<>) {
| s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
| print;
| }
`----

That's good enough for the problem which was described and it's four
lines of code. "Truly creative", -10 lines of code were saved here
and a comment explaining an 'ugly' workaround for deficiency in the
downloaded code had to be added as well[*],

while (1) {
 
C.DeRykus

-----------
%months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

while (<>) {
s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
print;
}
-----------

Maybe even shrink it to a long one-liner:

perl -MDate::Manip -pi.bak -le 's{^"(\d+)\s+(\S+)\s+(\d+)"}
{"$3-" . UnixDate("$1 $2 $3","%m") . "-$1"}e' infile
 
Rainer Weikusat

C.DeRykus said:
[...]
-----------
%months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

while (<>) {
s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
print;
}
-----------

Maybe even shrink it to a long one-liner:

perl -MDate::Manip -pi.bak -le 's{^"(\d+)\s+(\S+)\s+(\d+)"}
{"$3-" . UnixDate("$1 $2 $3","%m") . "-$1"}e' infile

Considering the situation of the OP, he has a 'zero line' solution
because all code was written by someone else. I don't know how this is
for other people, however, I can type

qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)

much faster than I can download anything from the net, especially
considering that I'd have to read the documentation for this anything,
too, making this a very bad tradeoff. And if I had to rely on someone
else's code for totally trivial stuff such as splitting a text file
with n 'somehow separated' data columns into an array, I would have a
very hard time solving the much more complicated problems I usually
need to deal with. Actually, I regularly search CPAN whenever I have a
reasonably complex and self-contained subtask for which 'using a
module', if one existed, would be a good idea. The most common result
of these searches, however, is 'nada', the second most common is some
totally bizarre implementation of 25% of the features I actually need
and the third is 'implementation is total crap' aka 'IO::poll' (and the
original author abandoned the code in question in 1975 in order to
become a missionary in Gabon or something like that).

CPAN is mostly a load of tripe resulting from fifteen years of bored
'hobbyists' (here supposed to mean people whose actual job isn't
programming) trying whatever weirdo-approach for solving fifty
different but vaguely related _trivial_ problems with the help of a
steam-engine powered motor umbrella constructed out of yellow,
magenta and purple lego bricks happened to come to their mind. And
downloading all these 'incredible machines' is - except in case of
500 SLOC throw-away 'oneliners' - not the end of the story: I have to
maintain the code because the people who use the software I'm
responsible for come to me with any problems resulting from that.

The rule of thumb I usually follow is that 'using a library' (or -
something I very much prefer - an already written program somebody
actually used to solve a real problem) is only worth the effort if it
saves a significant amount of work, at least something like 500 lines
of code and preferably a few thousand. And even then, I end up
'maintaining' seriously byzantine workarounds for all the problems in
the 'free' code until I grow tired of that and replace it with
something which actually works (in the sense that it reliably does
what is needed to solve the problem I have to solve and nothing else)
more often than not.
 
Dr Eberhard Lisse

The OP is an elderly Obstetrician & Gynecologist, who occasionally needs
to Practically Extract and Report stuff.

el

 
C.DeRykus

Considering the situation of the OP, he has a
'zero line' solution because all code was written
by someone else.

Hm, it sounded like he just had a separate tab-delimited
file he needed in a different format (ideal for a
one-liner). The -i switch is especially useful for just
this if the scenario allows it.

I can appreciate your viewpoint. Date::Manip though
is well-maintained and extraordinarily useful. There
are several other very good Date modules as well.

Leveraging a small bit of module code for a tedious,
surprisingly frequent little chore appeals to the
very lazy. So, it's worth it IMO :)
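
For comparison, a date-module version does not necessarily need a CPAN download. The sketch below uses Time::Piece, which ships with Perl since 5.10; it is offered only as an illustration and is not the Date::Manip approach discussed above. The helper name iso_date is made up for the example, and strptime dies if the text does not match the format.

```perl
use strict;
use warnings;
use Time::Piece;   # core module since Perl 5.10

# Hypothetical helper: parses "07 Jan 2011" (day, abbreviated English month
# name, four-digit year) and returns the date as "2011-01-07".
sub iso_date {
    my ($text) = @_;
    return Time::Piece->strptime($text, '%d %b %Y')->ymd;
}

print iso_date('07 Jan 2011'), "\n";   # prints 2011-01-07
```

If the input format changes later, only the format string passed to strptime needs to be adjusted.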
 
Rainer Weikusat

[and need to translate that to]
[...]
Considering the situation of the OP, he has a
'zero line' solution because all code was written
by someone else.

Hm, it sounded like he just a separate tab-delimited
file he needed in a different format (ideal for a 1-
liner.) The -i switch is especially useful for just
this if the scenario allows it.

If you weren't using -i, it wasn't necessary to worry about creating a
backup file since the modified content would end up in a new file.
I don't know how his
for other people, however, I can type

qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)
much faster than I can download anything from the net,
[...]

Date::Manip though is well-maintained and extraordinarily
useful. There are several other very good Date modules as well.

Leveraging a small bit of module code for a tedious,
surprisingly frequent little chore appeals to the
very lazy. So, it's worth it IMO :)

I would call this a case of 'false laziness': You happen to be
familiar with a certain 'date munging' module. The OP wanted to modify
some 'structured text field' which happened to be a date. Ergo:
Clearly, a case for using the date manipulation code. But nothing in
the described problem is related to dates. A sequence of text of the
form

"number0 string number1"

is supposed to be changed such that it becomes

number1-number2-number0

that is, the quotes are supposed to be deleted (I didn't realize
that), the first and the last subfield should be transposed and the
middle string replaced by a two-digit number using a simple,
"well-known" static mapping from twelve three character strings to
numbers. This is exactly the kind of stuff which can be done very
easily with perl, ie

-------------
%months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

s/^"(\d+)\s+(\S+)\s+(\d+)"/$3-$months{$2}-$1/, print while (<>);
-------------

and telling the OP that he should instead download a couple of
thousands (probably, I've only counted the DM6 file which figures at
691 LOC) of lines of code consisting of 972(!) different files, most
of which are documented(!) as broken and are totally useless for the
problem at hand is not something I'd call a sound piece of technical
advice. It is probably possible to use a combine harvester instead of
a lawnmower but nobody in his right mind would ever do that or suggest
that others do it.
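
The transformation described in this post (drop the quotes, swap the outer subfields, map the month name to a two-digit number) can also be written so that it runs under strict and warnings. This is a sketch under assumptions: convert_line is a made-up helper name, and leaving lines with an unknown month untouched is a choice made for the example, not something the thread specifies.

```perl
use strict;
use warnings;

# Static mapping from month abbreviation to two-digit number, Jan => '01' etc.
my $n = 0;
my %months = map { $_ => sprintf('%02d', ++$n) }
             qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: rewrites the leading "DD Mon YYYY" field of one line
# and drops its quotes; lines that do not match are returned unchanged.
sub convert_line {
    my ($line) = @_;
    if ($line =~ /^"(\d+)\s+(\S+)\s+(\d+)"/ && exists $months{$2}) {
        my $iso = sprintf '%04d-%s-%02d', $3, $months{$2}, $1;
        $line =~ s/^"[^"]*"/$iso/;   # replace the first quoted field
    }
    return $line;
}

print convert_line(qq{"07 Jan 2011"\t"TFR"\n});   # prints 2011-01-07<TAB>"TFR"
```

Wrapped in while (<>) { print convert_line($_) } it behaves as a plain filter from input to stdout.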
 
C.DeRykus

[and need to translate that to]


2011-01-07 "TFR"
2011-01-05 "DR"


[...]


Maybe even shrink it to a long one-liner:

perl -MDate::Manip -pi.bak -le 's{^"(\d+)\s+(\S+)\s+(\d+)"}
{"$3-" . UnixDate("$1 $2 $3","%m") . "-$1"}e' infile
Considering the situation of the OP, he has a
'zero line' solution because all code was written
by someone else.
Hm, it sounded like he just a separate tab-delimited
file he needed in a different format (ideal for a 1-
liner.) The -i switch is especially useful for just
this if the scenario allows it.



If you weren't using -i, it wasn't necessary to worry about creating a

backup file since the modified content would end up in a new file.

-i is useful in case you're one of those whose
code never works the first time though...
And you can always remove -i later.
I don't know how his
for other people, however, I can type

qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)
much faster than I can download anything from the net,


[...]



Date::Manip though is well-maintained and extraordinarily
useful. There are several other very good Date modules as well.

Leveraging a small bit of module code for a tedious,
surprisingly frequent little chore appeals to the
very lazy. So, it's worth it IMO :)



...


Sure, if you don't deal with this kind of
transform often, yet another incantation is
no big deal. And a simple regex can remain
blissfully ignorant of the fact that it's
dealing with dates. But then, if tweaks are
needed, it's "deja vu all over again". Can't
remember where to cut'n paste your old tweak..
No problem. Just wade in and watch out for typo's.

and telling the OP that he should instead download a couple of
thousands (probably, I've only counted the DM6 file which figures at
691 LOC) of lines of code consisting of 972(!) different files, most
of which are documented(!) as broken and are totally useless for the
problem at hand is not something I'd call a sound piece of technical
advice.


I'd agree there are probably better solutions
than pulling in the bloat of Date::Manip. But
there are several good Date modules and it's
all about leveraging code already written and
working. Concern with "pulling in a big module"
is almost always FUD - especially speed concerns.
Additionally, if the input format changes, and
those are dates after all, a good Date module
probably has a method to cinch the code tweaks.
One that's already written...

It is probably possible to use a combine harvester instead of
a lawnmower but nobody in his right mind would ever do that or suggest
that others do it.

Then why do we use a simple module function to
escape HTML for instance.. rather than rolling
our own? Sometimes a Swiss army knife - rather
than scrounging around for a small pen knife -
is worth the extra weight in a knapsack.
 
Dr Eberhard W Lisse

Ah, the Plonkers.

el


By the way, Meinheer Doctor, you might be interested to know that quite
a lot of people who frequent this group won't have seen the article
which you followed up here, having decided some time ago to block posts
from its author at source.

I leave it to you to determine the significance of this.

PS I bet you're no more elderly than I am :)
 
Rainer Weikusat

Leading remark: I'm going to cut this somewhat short. I don't agree
with your opinion on this, however, essentially repeating myself
doesn't seem very useful to me, so I'm just going to address a few
isolated points.
[...]

If you weren't using -i, it wasn't necessary to worry about creating a
backup file since the modified content would end up in a new file.

-i is useful in case you're one of those whose
code never works the first time time though...
And you can always remove -i later.

What I was trying to get at was that it wouldn't be necessary to use
the 'automatic backup' feature of -i if 'overwriting' (aka
'destroying') the input file hadn't been requested to begin with: In this
case, the processed data would go to stdout, immediately available for
interactive inspection, and could be redirected to some other file if
so desired at the user's discretion.

[...]
Sure, if you don't deal with this kind of
transform often, yet another incantation is
no big deal.

'Incantation' is IMO a very unfortunate choice for describing this. It
is a sequence of instructions with exactly defined meaning which
causes a machine to perform a specific function. That's a completely
mundane thing with absolutely no 'magic' of any kind involved (except
insofar as 'any sufficiently advanced technology is indistinguishable
from magic' [as seen by someone who doesn't understand any of it]).

[...]
Then why do we use a simple module function to
escape HTML for instance.. rather than rolling
our own?

Hmm ... why would I?

$text =~ s/([<>"'&])/'&#'.ord($1).';'/ge;
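
Wrapped in a subroutine, the substitution above might look like the sketch below. The name escape_html is made up for the example; the code only replaces the five characters listed in the character class with numeric character references and makes no claim to cover everything a dedicated module handles.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution above: replaces each of
# < > " ' & with a numeric character reference such as &#60;.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([<>"'&])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print escape_html(q{<a href="x">Tom & Jerry</a>}), "\n";
```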
 
ccc31807

"07 Jan 2011" "TFR"
"05 Jan 2011" "DR"

I need to change the first field to look like

2011-01-07 "TFR"
2011-01-05 "DR"

For each line in the file, do something like this, assuming that $date contains a string that matches the date you want to change:
1. my ($day, $month, $year) = split(/ /, $date);
2. $date = sprintf("%04d-%02d-%02d", $year, $mo2num{$month}, $day);

Line 1 splits your date string into the three components: day, month, year.
Line 2 reassembles those three components and assigns the result back to $date.
The hash table %mo2num maps each month name, spelled as it appears in the data, to its number:
my %mo2num = (
    Jan => 1,
    Feb => 2,
    Mar => 3,
    # etc., through Dec => 12
);

CC.
 
Dr Eberhard Lisse

Thanks,

el

 
Rainer Weikusat

Dr Eberhard Lisse said:
Thanks,

el

And assuming the hash exists (I posted a command generating it two
times), the format can be transformed with a substitution expression (I
also posted that two times), namely

s/"(\d+)\s+(\S+)\s+(\d+)"/$3-$mo2num{$2}-$1/
 
Ben Goldberg

-----------
%months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

while (<>) {
s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
print;
}
-----------

Don't forget that you can use perl's "command line" switches even when you put your program in a file.
#!/usr/bin/perl -pi.bak
BEGIN {
%months = map {;$_, sprintf('%02d', ++$n)}
qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
}
s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
__END__
 
Rainer Weikusat

Ben Goldberg said:
-----------

%months = map { $_, sprintf('%02d', ++$n); } qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

while (<>) {
s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
print;
}

-----------

Don't forget that you can use perl's "command line" switches even when you put your program in a file.
#!/usr/bin/perl -pi.bak
BEGIN {
%months = map {;$_, sprintf('%02d', ++$n)}
qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
}
s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
__END__

The 'BEGIN' serves no useful purpose here: %months needs to be
initialized before the while-loop uses it. Since statements in a file
are executed consecutively (anything else would probably be 'a little
confusing' :), this will be the case with either variant.

As I wrote in another posting: If perl hadn't been told to destroy the
input file, also telling it to make a backup of that before doing so
wasn't necessary. While this probably doesn't matter much for a
trivial example like this, 'not using -i' also means that the code can
be debugged and fixed without constantly renaming files or losing the
original input file altogether in case the 'backup request' was
accidentally forgotten. This also enables use of the script(let) as
'another filter' in a more complicated pipeline.
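
Whether the BEGIN block matters may be worth checking against perlrun, which describes -p as wrapping the whole program text in an implicit while (<>) { ... } loop. If that is how the script is run, a %months assignment outside BEGIN would be repeated for every input line while $n keeps counting. The following is only a simulation of that situation, not the switch itself; run_body_per_line is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical simulation of a -p script whose body is NOT inside BEGIN:
# the body, including the %months assignment, runs once per input line.
sub run_body_per_line {
    my @lines = @_;
    my ($n, %months);
    for (@lines) {
        # --- start of script body ---
        %months = map { $_ => sprintf('%02d', ++$n) }
                  qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
        s/^"(\d+)\s+(\S+)\s+(\d+)"/"$3-$months{$2}-$1"/;
        # --- end of script body ---
    }
    return @lines;
}

# First line comes out as "2011-01-07", the second as "2011-13-05",
# because Jan has been renumbered to 13 on the second pass.
print for run_body_per_line(qq{"07 Jan 2011"\t"TFR"\n},
                            qq{"05 Jan 2011"\t"DR"\n});
```

With an explicit while (<>) loop in the file and no -p on the #! line, the hash is built once and the BEGIN block is indeed unnecessary.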
 
Uri Guttman

BM> There's no need to muck about with the #! line and BEGIN blocks, both of
BM> which would make it impossible to turn this into a subroutine later:

BM> my %months = ...;

BM> local $^I = ".bak";
BM> while (<>) { ... }

BM> The edit-in-place handling, including renaming the old file and opening
BM> and selecting ARGVOUT, is done by the no-filehandle <> operator (or an
BM> explicit <ARGV> or readline(ARGV)) whenever $^I is set. If you want to
BM> in-place edit a custom list of files, you can also localise @ARGV.

and File::Slurp has edit_file and edit_file_lines which are even easier
to use.

i do need to add a backup file option to those.

uri
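
The local $^I idiom quoted above, filled in with the month hash and substitution from earlier in the thread, might look like this. It is a sketch under assumptions: fix_dates_in_place is a made-up name, the quotes around the date are dropped as the OP's sample output suggests, and an unknown month name would produce an "uninitialized value" warning rather than being handled.

```perl
use strict;
use warnings;

# Static mapping from month abbreviation to two-digit number.
my $n = 0;
my %months = map { $_ => sprintf('%02d', ++$n) }
             qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical subroutine: edits the named files in place, keeping a .bak
# copy of each, by localising $^I and @ARGV around the <> loop.
sub fix_dates_in_place {
    return unless @_;          # with an empty list <> would read STDIN
    local $^I   = '.bak';      # same effect as -i.bak on the command line
    local @ARGV = @_;
    while (<>) {
        s/^"(\d+)\s+(\S+)\s+(\d+)"/$3-$months{$2}-$1/;
        print;                 # goes to the file being rewritten
    }
    return;
}

# Usage, assuming a file called 'infile' exists:
#   fix_dates_in_place('infile');
```

Because $^I and @ARGV are localised, the rest of the program is unaffected once the subroutine returns.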
 
