parsing file name assigning extension to a variable

Discussion in 'Perl Misc' started by Alexander Heimann, Jun 11, 2004.

  1. Hi guys. I am new to Perl (four days). I am having a blast playing with
    it. There seems to be a hundred ways to solve each problem I am faced
    with.
    Currently I am working on migrating data to a new database. I need to
    read the contents of a few thousand files and then insert the contents
    into a database. The trick is the files are named desc.121655 with
    121655 being the record number or primary key in the database. So I
    need to parse the filename and save the extension to a variable to
    later use in my SQL statement. The steps I think I need to take are
    below. Any comments would be great.

    1. open directory
    2. go file by file
    3. assign extension of file to a variable @recordNum
    4. assign contents of file to a variable @content
    5. then insert content with SQL statement where PK = @recordNum
    6. then do next file until end of directory

    Sorry if that sounds confusing; I am a bit new at this.

    Alex
     
    Alexander Heimann, Jun 11, 2004
    #1

  2. Alexander Heimann

    gnari Guest

    "Alexander Heimann" <> wrote in message
    news:...

    [snip problem without actual question]

    > 1. open directory..
    > 2. go file by file
    > 3 assign extension of file to a variable @recordNum


    $recordNum ?

    > 4 assign contents of file to a variable @content


    $content ?

    > 5 then insert content with SQL statement where PK = @recordNum


    ditto

    > 6 then do next file until end of directory


    sounds good. go for it and let us know how it goes.

    gnari
     
    gnari, Jun 11, 2004
    #2

  3. Alexander Heimann

    Sam Holden Guest

    On 10 Jun 2004 17:01:28 -0700,
    Alexander Heimann <> wrote:
    > Hi guys. I am new to Perl(four days). I am having a blast playing with
    > it. There seems to be a hundred ways to solve each problem I am faced
    > with.
    > Currently I am working on migrating data to a new database. I need to
    > read the contents of a few thousand files and then insert the contents
    > into a database. The trick is the files are named desc.121655 with
    > 121655 being the record number or primary key in the database. So I
    > need to parse the filename and save the extension to a variable to
    > later use in my SQL statement. The steps I think i need to take are
    > below any comments would be great
    >
    > 1. open directory..


    perldoc -f opendir

    > 2. go file by file


    perldoc -f readdir
    perldoc perlsyn [look for for, foreach, while]

    > 3 assign extension of file to a variable @recordNum


    perldoc File::Basename

    You certainly don't want to use an array for a single extension
    (and you don't seem to need to keep all the data at once)

    > 4 assign contents of file to a variable @content


    perldoc -f open
    perldoc -f readline
    perldoc -f read
    perldoc -f close

    > 5 then insert content with SQL statement where PK = @recordNum


    perldoc DBI

    > 6 then do next file until end of directory


    '}'


    perldoc is a command on most perl installs for reading the documentation.
    You may instead have the documentation available as HTML or in some other
    format, in which case "perldoc -f foo" means the entry for the foo function
    in the perlfunc documentation. "perldoc File::Basename" means the
    documentation for the File::Basename module, and "perldoc perlsyn" means
    the perlsyn documentation.

    --
    Sam Holden
     
    Sam Holden, Jun 11, 2004
    #3
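A minimal sketch of steps 1 to 3 using the functions Sam points to (opendir, readdir, File::Basename). The sub name record_numbers and the rule of keeping only all-digit extensions are assumptions made for illustration, based on the desc.121655 naming described in the question.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Return a hash reference mapping record number => file name for every
# entry in $dir whose extension is all digits (e.g. desc.121655).
# The sub name and the digits-only rule are illustrative assumptions.
sub record_numbers {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "couldn't opendir $dir: $!\n";
    my %file_for;
    while (defined(my $file = readdir $dh)) {
        # fileparse returns (name, directory, suffix); the pattern makes
        # the last dot and everything after it the suffix.
        my (undef, undef, $ext) = fileparse($file, qr/\.[^.]*/);
        $ext =~ s/^\.//;                 # drop the leading dot
        next unless $ext =~ /^\d+$/;     # also skips '.' and '..'
        $file_for{$ext} = $file;
    }
    closedir $dh;
    return \%file_for;
}
```

A scalar ($ext) holds the single extension, which is the point of the remark about not using an array.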
  4. Alexander Heimann

    John Bokma Guest

    Sam Holden wrote:

    > On 10 Jun 2004 17:01:28 -0700,
    > Alexander Heimann <> wrote:
    >
    >>Hi guys. I am new to Perl(four days). I am having a blast playing with
    >>it. There seems to be a hundred ways to solve each problem I am faced
    >>with.
    >>Currently I am working on migrating data to a new database. I need to
    >>read the contents of a few thousand files and then insert the contents
    >>into a database. The trick is the files are named desc.121655 with
    >>121655 being the record number or primary key in the database. So I
    >>need to parse the filename and save the extension to a variable to
    >>later use in my SQL statement. The steps I think i need to take are
    >>below any comments would be great
    >>
    >>1. open directory..

    >
    > perldoc -f opendir
    >
    >>2. go file by file

    >
    >
    > perldoc -f readdir
    > perldoc perlsyn [look for for, foreach, while]


    Or File::Find

    >>3 assign extension of file to a variable @recordNum

    >
    > perldoc File::Basename
    >
    > You certainly don't want to use an array for a single extension
    > (and you don't seem to need to keep all the data at once)
    >
    >>4 assign contents of file to a variable @content

    >
    > perldoc -f open
    > perldoc -f readline
    > perldoc -f read
    > perldoc -f close


    File::Slurp (you probably have to install that one, see ppm)

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
     
    John Bokma, Jun 11, 2004
    #4
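For comparison, a small sketch of the File::Find route. Unlike readdir, find() descends into subdirectories, which may or may not be wanted here. The sub name plain_files_under is made up for this example.

```perl
use strict;
use warnings;
use File::Find qw(find);

# Collect the full paths of all plain files at or below $top.
# (plain_files_under is a hypothetical helper name.)
sub plain_files_under {
    my ($top) = @_;
    my @found;
    find(
        sub {
            # Inside the callback, $_ is the bare file name (find has
            # changed into the directory) and $File::Find::name is the
            # path including the directory part.
            push @found, $File::Find::name if -f $_;
        },
        $top,
    );
    my @sorted = sort @found;
    return @sorted;
}
```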
  5. Alexander Heimann

    Tore Aursand Guest

    On Thu, 10 Jun 2004 19:54:58 -0500, John Bokma wrote:
    > [...]
    > Or File::Find


    ...or File::Find::Rule, which I find a lot easier to work with. :)


    --
    Tore Aursand <>
     
    Tore Aursand, Jun 11, 2004
    #5
  6. Alexander Heimann

    John Bokma Guest

    John Bokma, Jun 11, 2004
    #6
  7. "gnari" <> wrote in message news:<caat96$7ne$>...
    > "Alexander Heimann" <> wrote in message
    > news:...
    >
    > [snip problem without actual question]
    >
    > > 1. open directory..
    > > 2. go file by file
    > > 3 assign extension of file to a variable @recordNum

    >
    > $recordNum ?
    >
    > > 4 assign contents of file to a variable @content

    >
    > $content ?
    >
    > > 5 then insert content with SQL statement where PK = @recordNum

    >
    > ditto
    >
    > > 6 then do next file until end of directory

    >
    > sounds good. go for it and let us know how it goes.
    >
    > gnari

    thanks everyone for all your help. i will let you guys know how it goes.

    have an awesome weekend...
     
    Alexander Heimann, Jun 11, 2004
    #7
  8. >>>>> "Tore" == Tore Aursand <> writes:

    Tore> On Thu, 10 Jun 2004 19:54:58 -0500, John Bokma wrote:
    >> [...]
    >> Or File::Find


    Tore> ...or File::Find::Rule, which I find a lot easier to work with. :)

    ... or File::Finder, which is at least two characters shorter. :)

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <> <URL:http://www.stonehenge.com/merlyn/>
    Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
    See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
     
    Randal L. Schwartz, Jun 11, 2004
    #8
  9. Alexander Heimann

    John Bokma Guest

    Randal L. Schwartz wrote:

    >>>>>>"Tore" == Tore Aursand <> writes:

    >
    > Tore> On Thu, 10 Jun 2004 19:54:58 -0500, John Bokma wrote:
    >
    >>>[...]
    >>>Or File::Find

    >
    >
    > Tore> ...or File::Find::Rule, which I find a lot easier to work with. :)
    >
    > ... or File::Finder, which is at least two characters shorter. :)


    More on my list to check out :-D. Thanks.

    --
    John MexIT: http://johnbokma.com/mexit/
    personal page: http://johnbokma.com/
    Experienced Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
     
    John Bokma, Jun 11, 2004
    #9
  10. Maybe someone can tell me why I am unable to read the file. When I do
    each step individually it works, but I am having trouble putting
    it all together.


    use File::Basename;
    fileparse_set_fstype("MSDOS");


    opendir (DIR, "D:/D2") or die "couldn't open directory\n";
    while (defined($file = readdir(DIR))) {



    ($name, $dir, $ext) = fileparse($file, '\..*');
    $ext =~s/^\.//;
    print " dir is $dir, name is $name, extension is $ext\n";

    my $input;
    open($input, "<", "$file")
    #or die "Couldn't open file :!\n";
    while(<$input>){
    undef $/;
    $content = <INPUT>;
    print if /of/;

    print $content;


    }
    close($input);







    }


    closedir DIR;
     
    Alexander Heimann, Jun 14, 2004
    #10
  11. Alexander Heimann

    Paul Lalli Guest

    On Mon, 14 Jun 2004, Alexander Heimann wrote:

    > maybe someone can tell me why I am unable to read the file when i do
    > each step individually it was working but i am having trouble putting
    > it all together..


    You've left off at least two vital pieces of information, necessary for
    anyone to effectively help you:

    1) What is your desired goal and/or output?
    2) What is the result / output of the code you tried? (this includes all
    errors and warnings that may be printed).

    Paul Lalli

    >
    >
    > use File::Basename;
    > fileparse_set_fstype("MSDOS");
    >
    >
    > opendir (DIR, "D:/D2") or die "couldn't open directory\n";
    > while (defined($file = readdir(DIR))) {
    >
    >
    >
    > ($name, $dir, $ext) = fileparse($file, '\..*');
    > $ext =~s/^\.//;
    > print " dir is $dir, name is $name, extension is $ext\n";
    >
    > my $input;
    > open($input, "<", "$file")
    > #or die "Couldn't open file :!\n";
    > while(<$input>){
    > undef $/;
    > $content = <INPUT>;
    > print if /of/;
    >
    > print $content;
    >
    >
    > }
    > close($input);
    >
    >
    >
    >
    >
    >
    >
    > }
    >
    >
    > closedir DIR;
    >
     
    Paul Lalli, Jun 14, 2004
    #11
  12. Paul Lalli <> wrote in message news:<>...
    > On Mon, 14 Jun 2004, Alexander Heimann wrote:
    >
    > > maybe someone can tell me why I am unable to read the file when i do
    > > each step individually it was working but i am having trouble putting
    > > it all together..

    >
    > You've left off at least two vital pieces of information, necessary for
    > anyone to effectively help you:
    >
    > 1) What is your desired goal and/or output?
    > 2) What is the result / output of the code you tried? (this includes all
    > errors and warnings that may be printed).
    >
    > Paul Lalli

    Paul, My apologies.
    1) My desired goal and output
    1. open directory..
    2. go file by file
    3 assign extension of file to a variable @ext
    4 assign contents of file to a variable $content
    5 then insert content with SQL statement where PK
    6 then do next file until end of directory
    2) I am getting the die error output when trying to read the file. The
    code parses the filename fine when i comment out the open file portion
     
    Alexander Heimann, Jun 15, 2004
    #12
  13. Alexander Heimann

    gnari Guest

    "Alexander Heimann" <> wrote in message
    news:...
    > 1) My desired goal and output
    > 1. open directory..
    > 2. go file by file
    > 3 assign extension of file to a variable @ext
    > 4 assign contents of file to a variable $content
    > 5 then insert content with SQL statement where PK
    > 6 then do next file until end of directory
    > 2) I am getting the die error output when trying to read the file. The
    > code parses the filename fine when i comment out the open file portion


    you obviously are forgetting the directory part of the
    filename when opening it

    gnari
     
    gnari, Jun 15, 2004
    #13
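To illustrate gnari's point: readdir returns bare entry names, not paths, so the directory has to be put back in front before open can find the file. A rough sketch, with full_paths as a hypothetical helper name:

```perl
use strict;
use warnings;
use File::Spec;

# Return the full path of every entry in $dir except '.' and '..'.
# (full_paths is an illustrative name, not something from the thread.)
sub full_paths {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "couldn't opendir $dir: $!\n";
    # readdir yields names such as "desc.121655"; catfile joins the
    # directory and the name with the right separator for the platform.
    my @paths = map  { File::Spec->catfile($dir, $_) }
                grep { $_ ne '.' && $_ ne '..' }
                readdir $dh;
    closedir $dh;
    return @paths;
}
```

Plain string interpolation such as "$mydir/$file" does the same job when forward slashes are acceptable.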
  14. gnari wrote:
    > "Alexander Heimann" <> wrote in message
    > news:...
    >
    >>1) My desired goal and output
    >>1. open directory..
    >>2. go file by file
    >>3 assign extension of file to a variable @ext
    >>4 assign contents of file to a variable $content
    >>5 then insert content with SQL statement where PK
    >>6 then do next file until end of directory
    >>2) I am getting the die error output when trying to read the file. The
    >>code parses the filename fine when i comment out the open file portion

    >
    >
    > you obviously are forgetting the directory part of the
    > filename when opening it
    >
    > gnari
    >
    >
    >

    gnari, in the code below
    I am using $file as the filename to open. If I comment out the open file
    and read content portion, I am able to parse the file.
    Is there a reason I can't use $file again?



    use File::Basename;
    fileparse_set_fstype("MSDOS");


    opendir (DIR, "D:/D2") or die "couldn't open directory\n";
    while (defined($file = readdir(DIR))) {
    ($name, $dir, $ext) = fileparse($file, '\..*');
    $ext =~s/^\.//;
    print " dir is $dir, name is $name, extension is $ext\n";

    my $input;
    open($input, "<", "$file")
    or die "Couldn't open file :!\n";
    while(<$input>){
    undef $/;
    $content = <INPUT>;
    print if /of/;

    print $content;


    }
    close($input);
     
    Alexander Heimann, Jun 15, 2004
    #14
  15. Alexander Heimann

    gnari Guest

    "Alexander Heimann" <> wrote in message
    news:r8wzc.6636$US1.3423@fed1read02...
    > gnari wrote:
    > >
    > > you obviously are forgetting the directory part of the
    > > filename when opening it
    > >

    > gnari in the code below
    > I am using $file as the filename to open, if i comment out the open file
    > and read content portion i am able to parse the file
    > is there a reason i can't use $file again


    [snip code where OP is forgetting the directory part]

    > opendir (DIR, "D:/D2") or die "couldn't open directory\n";


    here 'D:/D2' is the directory, you are reading. call this
    the 'directory part'

    > print " dir is $dir, name is $name, extension is $ext\n";


    here is your problem. your stupid debugging. why print
    a bunch of variables that have nothing to do with the problem?
    they are not used in the open

    > open($input, "<", "$file")
    > or die "Couldn't open file :!\n";


    always include the filename in the die()
    or die "Couldn't open file '$file': $!\n";

    if you had done this you would have seen no
    directory part ('D:/D2')

    gnari
     
    gnari, Jun 15, 2004
    #15
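A short sketch of the error-reporting advice. Note that in a double-quoted string ":!" is just literal text, while $! is the variable holding the operating system's reason for the failure. The sub name open_for_reading is an assumption for the example.

```perl
use strict;
use warnings;

# Open $path for reading, or die with a message that names both the
# path that was tried and the system error ($!).
# (open_for_reading is a hypothetical helper name.)
sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Couldn't open file '$path': $!\n";
    return $fh;
}
```

With the path in the message, a missing directory part shows up immediately in the error text.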
  16. "gnari" <> wrote in message news:<camck9$qv6$>...
    > "Alexander Heimann" <> wrote in message
    > news:r8wzc.6636$US1.3423@fed1read02...
    > > gnari wrote:
    > > >
    > > > you obviously are forgetting the directory part of the
    > > > filename when opening it
    > > >

    > > gnari in the code below
    > > I am using $file as the filename to open, if i comment out the open file
    > > and read content portion i am able to parse the file
    > > is there a reason i can't use $file again

    >
    > [snip code where OP is forgetting the directory part]
    >
    > > opendir (DIR, "D:/D2") or die "couldn't open directory\n";

    >
    > here 'D:/D2' is the directory, you are reading. call this
    > the 'directory part'
    >
    > > print " dir is $dir, name is $name, extension is $ext\n";

    >
    > here is your problem. your stupid debugging. why print
    > a bunch of variables that have nothing to do with the problem?
    > they are not used in the open
    >
    > > open($input, "<", "$file")
    > > or die "Couldn't open file :!\n";

    >
    > always include the filename in the die()
    > or die "Couldn't open file '$file': $!\n";
    >
    > if you had done this you would have seen no
    > directory part ('D:/D2')
    >
    > gnari


    gnari,
    I added a variable for the directory part. When i took the die() out
    of the open file it worked ok, but when the die was in there it
    didn't. For some reason the open file was reading two files with no
    filenames in the directory. I don't see the files. i am not really
    sure why that is happening. So the open file wouldn't work because
    there was no name

    Anyways it is working without the die and I think i will be able to
    use this now and use the extension and the content of the file
    variable in my SQL insert statement. The reason i am printing out
    (stupid error checking) is to make sure the variables are holding the
    correct values to later use in a SQL statement



    use File::Basename;


    fileparse_set_fstype("MSDOS");

    $mydir = "D:/D2";
    opendir (DIR, $mydir) || die "couldn't opendir $mydir: $!\n";
    while ($file = readdir(DIR)) {

    ($name, $dir, $ext) = fileparse($file, '\..*');
    $ext =~s/^\.//;
    print "extension is $ext\n";

    open($input, "<$mydir/$file "); #|| die "couldn't open $mydir/$file for reading :!\n";
    while(<$input>){
    undef $/;
    $content = <INPUT>;
    print if /of/;
    print $content;
    }
    close($input);
    }

    closedir (DIR);
     
    Alexander Heimann, Jun 15, 2004
    #16
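A side note on the read loop above: <INPUT> reads from a bareword handle named INPUT, not from $input, and "undef $/" changes the input record separator for the rest of the program. One common core-Perl way to read a whole file into a scalar is sketched below; the sub name slurp is chosen for illustration only.

```perl
use strict;
use warnings;

# Read the entire contents of $path into a single scalar.
# (slurp is an illustrative name; File::Slurp's read_file is the
# module-based alternative mentioned earlier in the thread.)
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "couldn't open $path for reading: $!\n";
    # "local $/" undefines the record separator only inside this do
    # block, so a single <$fh> returns the whole file and $/ is
    # restored afterwards.
    my $content = do { local $/; <$fh> };
    close $fh;
    return $content;
}
```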
  17. Alexander Heimann

    gnari Guest

    "Alexander Heimann" <> wrote in message
    news:...

    [using readdir and open]

    some of the entries returned by readdir may not be readable files,
    for examples directories.
    for example '.' and '..'

    gnari
     
    gnari, Jun 15, 2004
    #17
  18. If anyone is interested, I am pasting the code that worked for the
    above problem. The main problem I was having was that when reading
    the directory I forgot to add
    next if $file =~/^\.\.?$/; after the while (defined($file =
    readdir(DIR))) to skip over the . and .. entries.

    After that I stopped getting crazy errors.

    Thanks for all your help, guys. I will try to contribute as much as I
    can; I have only been playing with Perl for a week now.

    # modules used
    use DBI;
    use File::Basename;
    use File::Slurp;

    # create connection to database
    # ($dbname, $username and $password8 appear here as single-quoted
    # literals, which Perl does not interpolate; substitute real values)
    $dbh = DBI->connect('dbi:mysql:$dbname:localhost:3306',
        '$username', '$password8',
        { RaiseError => 1, AutoCommit => 1 });
    fileparse_set_fstype("MSDOS");

    # open directory, loop while there is still a file
    $mydir = "D:/desc";
    opendir (DIR, $mydir) || die "couldn't opendir $mydir: $!\n";
    while (defined($file = readdir(DIR))) {
        next if $file =~ /^\.\.?$/;    # skip the . and .. entries

        # read the content of the file into $content
        my $content = read_file("$mydir/$file");

        # parse the file name and assign the extension to $ext
        ($name, $dir, $ext) = fileparse($file, '\..*');
        $ext =~ s/^\.//;

        # SQL insert statement to insert $ext and $content into DB:
        # prepare, then execute
        $sth = $dbh->prepare("INSERT INTO `desc` VALUES (?,?)");
        $sth->execute($ext, $content);
    }

    # close directory
    closedir(DIR);
    # disconnect from database
    $dbh->disconnect();
     
    Alexander Heimann, Jun 16, 2004
    #18
  19. Alexander Heimann

    Joe Smith Guest

    Alexander Heimann wrote:

    > If anyone is interested. I am pasting the code that worked for the
    > above problem. The main problem i was having was that when I was
    > reading the directory i forgot to add
    > next if $file =~/^\.\.?$/; after the while (defined($file =
    > readdir(DIR))) to skip over the . and .. entries.


    But what if someone creates a subdirectory in D:/D2 ?
    The check on /^\.\.?$/ is for the cases where files and subdirectories
    will both be processed.

    In your case, it is more robust to use
    next unless -f "$mydir/$file";
    to skip anything that is not a plain file (which will skip '.' and '..').

    -Joe
     
    Joe Smith, Jun 18, 2004
    #19
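Putting Joe's -f test together with the earlier steps, the whole loop might look roughly like this. It uses only core modules; import_dir and the $insert callback are assumptions for the sketch, standing in for the DBI prepare/execute calls from the posted script.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Call $insert->($record_num, $content) once for every plain file in
# $dir and return how many files were handled.
# import_dir and $insert are hypothetical; with DBI the callback could
# wrap $sth->execute($record_num, $content).
sub import_dir {
    my ($dir, $insert) = @_;
    opendir(my $dh, $dir) or die "couldn't opendir $dir: $!\n";
    my $count = 0;
    while (defined(my $file = readdir $dh)) {
        my $path = "$dir/$file";
        next unless -f $path;    # skips '.', '..' and any subdirectory

        # the extension is the record number (desc.121655 -> 121655)
        my (undef, undef, $ext) = fileparse($file, qr/\.[^.]*/);
        $ext =~ s/^\.//;

        # read the whole file into one scalar
        open(my $fh, '<', $path) or die "couldn't open $path: $!\n";
        my $content = do { local $/; <$fh> };
        close $fh;

        $insert->($ext, $content);
        $count++;
    }
    closedir $dh;
    return $count;
}
```

In a DBI version, the statement handle would typically be prepared once before the loop and only execute would be called per file, since the SQL text does not change between iterations.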
