How to concatenate 'like' files in a dir?

Discussion in 'Perl Misc' started by wilson_work@yahoo.com, Feb 18, 2006.

  1. Guest

    Hi All,
    I have a directory of .txt files and need to concatenate all files
    belonging to each user (oldest first, no set number per user). The
    username (M08x) is embedded in the filename, along with other info. I
    would like to delete the smaller individual logs/files once they have
    been concatenated. Any advice is greatly appreciated!

    Here is a sample of the filenames...
    -rw-r--r-- 1 christine christine 28046 Oct 11 21:40 KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
    KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
    KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
    KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
    KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
    KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
    KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt
    .......


    Thank you,
    Christine
     
    , Feb 18, 2006
    #1

  2. wrote in news:1140288000.457752.320040
    @z14g2000cwz.googlegroups.com:

    > Hi All,
    > I have a directory of .txt files and need to concatenate all files
    > belonging to each user (oldest first, no set number per user). The
    > username (M08x) is embedded in the filename, along with other info. I
    > would like to delete the smaller individual logs/files once they have
    > been concatenated. Any advice is greatly appreciated!


    Well, please first read the posting guidelines for this group. You have
    a much better chance of getting useful help if you post some code.

    > Here is a sample of the filenames...
    > -rw-r--r-- 1 christine christine 28046 Oct 11 21:40
    > KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
    > KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
    > KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
    > KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
    > KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
    > KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
    > KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt
    > ......


    Simple ... use a hash ;-)

    1. opendir and readdir to read the filenames
    2. Use a capturing regex match to grab the user name
    3. Add the filename to the list of filenames belonging to the user
    4. Custom sort routine to sort filenames by date component.
    5. Open a file for user to write to.
    6. Read and write each file in required order.

    Here is something quick and dirty to get you started:

    #!/usr/bin/perl

    use strict;
    use warnings;

    my %months = ( Jan => '01', Feb => '02', Mar => '03',
                   Apr => '04', May => '05', Jun => '06',
                   Jul => '07', Aug => '08', Sep => '09',
                   Oct => '10', Nov => '11', Dec => '12',
    );

    my %users;

    while (my $filename = <DATA>) {
        chomp $filename;
        if ( $filename =~ m{
                \A
                KCD-
                (M\d{6})-
                NA-server\.name-
                (\d{4})(\w{3})(\d{2})-
                (\d{2}:\d{2}:\d{2})
                \.txt
                \z
            }x ) {
            # Build a sortable key such as 2005101120:42:14
            my ($user, $date) = ($1, "$2$months{$3}$4$5");
            push @{ $users{$user} },
                { filename => $filename, date => $date };
        }
    }

    for my $user (keys %users) {
        print "Files for $user:\n";
        # $b before $a sorts newest first
        my @files = sort {
            $b->{date} cmp $a->{date}
        } @{ $users{$user} };

        print $_->{filename}, "\n" for @files;
        print "\n";
    }


    __DATA__
    KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
    KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
    KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
    KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
    KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
    KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
    KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt

    D:\Home\asu1\UseNet\clpmisc\dir> files
    Files for M087350:
    KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
    KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
    KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt

    Files for M087326:
    KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
    KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
    KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
    KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
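
The script above takes its filenames from __DATA__. Step 1 of the outline (opendir and readdir) could be wired in roughly as sketched below. The helper names parse_name and group_files_by_user are made up for illustration and are not part of the original post, and the sort runs oldest first, as the original question asked. The filename pattern is assumed to be exactly the one in the sample listing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %months = ( Jan => '01', Feb => '02', Mar => '03', Apr => '04',
               May => '05', Jun => '06', Jul => '07', Aug => '08',
               Sep => '09', Oct => '10', Nov => '11', Dec => '12' );

# Hypothetical helper: turn a log filename into (user, sortable date key).
# Returns an empty list if the name does not follow the sample pattern.
sub parse_name {
    my ($filename) = @_;
    return unless $filename =~ m{
        \A KCD- (M\d{6}) -NA-server\.name-
        (\d{4}) ([A-Z][a-z]{2}) (\d{2}) - (\d{2}:\d{2}:\d{2})
        \.txt \z
    }x;
    my ($user, $year, $mon, $day, $time) = ($1, $2, $3, $4, $5);
    return unless exists $months{$mon};
    return ($user, "$year$months{$mon}$day$time");
}

# Hypothetical helper: read a directory and return a hash reference of
# the form { user => [ filenames sorted oldest first ] }.
sub group_files_by_user {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    my %users;
    while (defined(my $filename = readdir $dh)) {
        # Skip anything that is not one of the per-user log files
        my ($user, $date) = parse_name($filename) or next;
        push @{ $users{$user} }, { filename => $filename, date => $date };
    }
    closedir $dh;

    my %sorted;
    for my $user (keys %users) {
        $sorted{$user} = [
            map  { $_->{filename} }
            sort { $a->{date} cmp $b->{date} }
            @{ $users{$user} }
        ];
    }
    return \%sorted;
}

# Demo: list the groups found in the current directory (adjust as needed)
my $groups = group_files_by_user('.');
for my $user (sort keys %$groups) {
    print "Files for $user:\n";
    print "  $_\n" for @{ $groups->{$user} };
}
```

Note that readdir returns bare names without the directory part, so the directory has to be prepended before the files are opened from anywhere other than that directory.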



    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Feb 18, 2006
    #2

  3. A. Sinan Unur wrote:
    > wrote in news:1140288000.457752.320040
    > @z14g2000cwz.googlegroups.com:
    >
    > > Hi All,
    > > I have a directory of .txt files and need to concatenate all files
    > > belonging to each user (oldest first, no set number per user). The
    > > username (M08x) is embedded in the filename, along with other info. I
    > > would like to delete the smaller individual logs/files once they have
    > > been concatenated. Any advice is greatly appreciated!

    >
    > Well, please first read the posting guidelines for this group. You have
    > a much better chance of getting useful help if you post some code.
    >
    > > Here is a sample of the filenames...
    > > -rw-r--r-- 1 christine christine 28046 Oct 11 21:40
    > > KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
    > > KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
    > > KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
    > > KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
    > > KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
    > > KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
    > > KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt
    > > ......

    >
    > Simple ... use a hash ;-)
    >
    > 1. opendir and readdir to read the filesnames
    > 2. Use a capturing regex match to grab the user name
    > 3. Add the filename to the list of filenames belonging to the user
    > 4. Custom sort routine to sort filenames by date component.
    > 5. Open a file for user to write to.
    > 6. Read and write each file in required order.
    >
    > Here is something quick and dirty to get you started:
    >
    > #!/usr/bin/perl
    >
    > use strict;
    > use warnings;
    >
    > my %months = ( Jan => '01', Feb => '02', Mar => '03',
    > Apr => '04', May => '05', Jun => '06',
    > Jul => '07', Aug => '08', Sep => '09',
    > Oct => '10', Nov => '11', Dec => '12',
    > );
    >
    > my %users;
    >
    > while (my $filename = <DATA>) {
    > chomp $filename;
    > if ( $filename =~ m{
    > \A
    > KCD-
    > (M\d{6})-
    > NA-server.name-
    > (\d{4})(\w{3})(\d{2})-
    > (\d{2}:\d{2}:\d{2})
    > \.txt
    > \z
    > }x ) {
    > my ($user, $date) = ($1, "$2$months{$3}$4$5");
    > push @{ $users{$user} }, { filename => $filename, date => $date
    > };
    > }
    > }
    >
    > for my $user (keys %users) {
    > print "Files for $user:\n";
    > my @files = sort {
    > $b->{date} cmp $a->{date}
    > } @{ $users{$user} };
    >
    > print $_->{filename}, "\n" for @files;
    > print "\n";
    > }
    >
    >
    > __DATA__
    > KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt
    > KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
    > KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
    > KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
    > KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
    > KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
    > KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt
    >
    > D:\Home\asu1\UseNet\clpmisc\dir> files
    > Files for M087350:
    > KCD-M087350-NA-server.name-2005Oct05-21:13:19.txt
    > KCD-M087350-NA-server.name-2005Oct03-19:20:56.txt
    > KCD-M087350-NA-server.name-2005Sep27-19:26:09.txt
    >
    > Files for M087326:
    > KCD-M087326-NA-server.name-2005Oct19-21:35:06.txt
    > KCD-M087326-NA-server.name-2005Oct17-22:44:00.txt
    > KCD-M087326-NA-server.name-2005Oct16-23:55:26.txt
    > KCD-M087326-NA-server.name-2005Oct11-20:42:14.txt



    i realize this is just an example, but why did you choose to sort such
    that the oldest file is last?
     
    it_says_BALLS_on_your_forehead, Feb 18, 2006
    #3
  4. "it_says_BALLS_on_your_forehead" <> wrote in
    news::

    > A. Sinan Unur wrote:
    >> wrote in news:1140288000.457752.320040
    >> @z14g2000cwz.googlegroups.com:
    >>
    >> > Hi All,
    >> > I have a directory of .txt files and need to concatenate all
    >> > files belonging to each user (oldest first,


    ....

    >> my @files = sort {
    >> $b->{date} cmp $a->{date}


    > i realize this is just an example, but why did you choose to sort such
    > that the oldest file is last?


    I misread the OP's statement and thought that she wanted oldest last,
    not first.

    In any case, please quote only the relevant parts of the message to
    which you are replying.

    Sinan

    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Feb 18, 2006
    #4
  5. A. Sinan Unur wrote:
    > "it_says_BALLS_on_your_forehead" <> wrote in
    > news::
    >
    > > A. Sinan Unur wrote:
    > >> wrote in news:1140288000.457752.320040
    > >> @z14g2000cwz.googlegroups.com:
    > >>
    > >> > Hi All,
    > >> > I have a directory of .txt files and need to concatenate all
    > >> > files belonging to each user (oldest first,

    >
    > ...
    >
    > >> my @files = sort {
    > >> $b->{date} cmp $a->{date}

    >
    > > i realize this is just an example, but why did you choose to sort such
    > > that the oldest file is last?

    >
    > I misread the OP's statement and thought that she wanted oldest last,
    > not first.


    gotcha. i didn't know if there was some esoteric file concat method
    that took a reverse sorted list as an argument. sorry about the
    over-quoting.
     
    it_says_BALLS_on_your_forehead, Feb 18, 2006
    #5
  6. "it_says_BALLS_on_your_forehead" <> wrote in
    news::

    >
    > A. Sinan Unur wrote:
    >> "it_says_BALLS_on_your_forehead" <> wrote in
    >> news::
    >>
    >> > A. Sinan Unur wrote:
    >> >> wrote in news:1140288000.457752.320040
    >> >> @z14g2000cwz.googlegroups.com:
    >> >>
    >> >> > Hi All,
    >> >> > I have a directory of .txt files and need to concatenate all
    >> >> > files belonging to each user (oldest first,

    >>
    >> ...
    >>
    >> >> my @files = sort {
    >> >> $b->{date} cmp $a->{date}

    >>
    >> > i realize this is just an example, but why did you choose to sort
    >> > such that the oldest file is last?

    >>
    >> I misread the OP's statement and thought that she wanted oldest last,
    >> not first.

    >
    > gotcha. i didn't know if there was some esoteric file concat method
    > that took a reverse sorted list as an argument.


    No, there isn't (not that I know of ;-). But my code was missing a map
    that would have made life much easier:

    for my $user (keys %users) {
        print "Files for $user:\n";
        my @files = map  { $_->{filename} }
                    sort { $a->{date} cmp $b->{date} }
                    @{ $users{$user} };
        system "cat @files > $user.txt";
    }
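
If shelling out to cat is not an option, or if the individual logs should be deleted afterwards as the original question asked, the concatenation can also be done in Perl itself. What follows is only a minimal sketch, assuming the file list is already sorted oldest first. The name concat_and_remove is a hypothetical helper, not something from the thread, and the demo works on throwaway files in a temporary directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write every file in @files to $out in the given
# order, then delete the originals. Nothing is deleted unless the output
# file was written and closed without error.
sub concat_and_remove {
    my ($out, @files) = @_;
    open my $out_fh, '>', $out or die "Cannot write '$out': $!";
    binmode $out_fh;
    for my $file (@files) {
        open my $in_fh, '<', $file or die "Cannot read '$file': $!";
        binmode $in_fh;
        local $/ = \65536;    # read in fixed-size chunks, not lines
        while (defined(my $chunk = <$in_fh>)) {
            print {$out_fh} $chunk or die "Write to '$out' failed: $!";
        }
        close $in_fh;
    }
    close $out_fh or die "Cannot close '$out': $!";

    for my $file (@files) {
        unlink $file or warn "Could not delete '$file': $!";
    }
    return scalar @files;
}

# Demo on two throwaway files
my $demo_dir = tempdir(CLEANUP => 1);
my @parts;
for my $i (1 .. 2) {
    my $path = "$demo_dir/part$i.txt";
    open my $fh, '>', $path or die "Cannot create '$path': $!";
    print {$fh} "line from part $i\n";
    close $fh or die "Cannot close '$path': $!";
    push @parts, $path;
}
my $count = concat_and_remove("$demo_dir/combined.txt", @parts);
print "Concatenated and removed $count files\n";
```

In the loop from the post, the system call would then become something like concat_and_remove("$user.txt", @files). The output name must not itself appear in @files, otherwise the result would be deleted along with the inputs.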

    > sorry about the over-quoting.


    No problem.

    Sinan

    --
    A. Sinan Unur <>
    (reverse each component and remove .invalid for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Feb 18, 2006
    #6
  7. Guest

    Many many thanks! I'm fairly new to Perl and this was a
    life/time-saver for me. I really appreciate all the help.

    It's just what I needed.
    Thank you,
    Christine
     
    , Feb 20, 2006
    #7
  8. wrote in news:1140449875.621920.248070
    @g44g2000cwa.googlegroups.com:

    > Many many thanks! I'm fairly new to Perl and this was a
    > life/time-saver for me. I really appreciate all the help.


    You are welcome.

    The best thing you can do to improve your Perl skills would be to read the
    posting guidelines, and try to come up with a short script that aims to do
    what you want, and exhibits the problems you are having.

    Then, post it here, and listen to the critique. You'll find that it works
    like an accelerated training course (the likes of which you would have to
    pay thousands of dollars to receive).

    Sinan
     
    A. Sinan Unur, Feb 20, 2006
    #8
