What could potentially be wrong in this script?

Discussion in 'Perl Misc' started by Chris, Mar 29, 2007.

  1. Chris

    Chris Guest

    I am writing a Perl script to check that the dependency files
    referenced in our definition files actually exist. However, I am
    getting a problem I can't explain. For one specific file only, the
    algorithm doesn't seem to work, or at least the string saved from
    the regex somehow doesn't work properly with certain Perl calls. The
    format of the strings is the same in every file, and in this one in
    particular I have even modified the line, moved it around, etc.

    I am basically going through each file, opening it, and, based on
    the type of file, saving a regex to use: in one case
    "^#tagdef\\s*(.*)" and in the other "^#include\\s*(.*)" (they are
    saved off in variables). As I read each line of the file, I search
    with the currently selected regex and then use $1 to get what was
    captured by the (), which is the file I want to search for. That
    file should be in a list of files that I gather (and it is in the
    list; I've printed the list and see it there). I do a foreach() over
    the file list and use index(fileInList, $1) != -1 to indicate I have
    found the file.

    However, for only one specific file, this doesn't work. What is even
    stranger is that if I print "$1\n", the file name prints just fine.
    But if I do something like print "$1 after\n", the whole output is
    messed up. If I print "before $1\n", nothing prints at all. If I
    print "before $1 after\n", only "after" prints.

    Here is a cut-paste of my script:

    #!/usr/bin/perl -w

    use lib "./scripts/perl/FILES-1.0";

    use strict;
    use Files::FileModules;
    use Cwd;

    my $boxPath = $ARGV[0];
    my $myPath = "path";

    my $startingDir = getcwd;

    my @filePostfix = ("\.def\$", "\.cfg\$");

    chdir("$boxPath/$myPath");

    my @myFiles = FindAllFiles(@filePostfix, "./");

    my @myFileCopy = @myFiles;

    my $DEFILE = 0;
    my $CFGFILE = 1;

    foreach my $files (@myFiles)
    {
        my $searchTag = " ";

        if(index($files, 'imgGame') != -1)
        {
            next;
        }

        open FILEHANDLE, "< $files" or die "Can't open $files\n";

        print "\nIn File $files\n";

        if($files =~ /\.def$/)
        {
            $searchTag = "^#include\\s*(.*)";
        }
        elsif($files =~ /\.cfg$/)
        {
            $searchTag = "^#tagdef\\s*(.*)";
        }

        while(<FILEHANDLE>)
        {
            if($_ =~ /$searchTag/)
            {
                my $findThis = $1;
                my $foundFile = 0;
                print "\nSearching for $findThis\n";
                foreach my $thing (@myFileCopy)
                {
                    if(index($thing, 'imgGame') != -1)
                    {
                        next;
                    }

                    if(index($thing, $findThis) != -1)
                    {
                        print "\nFound dependency: $findThis\n";
                        $foundFile = 1;
                        last;
                    }
                }
                die "Did not find: $findThis in $files\n" if !$foundFile;
            }
        }

        close FILEHANDLE;
    }


    The index call isn't working on this bizarre string. However, if I do
    things like length() on it, it shows the correct length, but other
    calls, even the die call at the end, can't print it out. Only if it
    is printed by itself with nothing else does it even print out.
     
    Chris, Mar 29, 2007
    #1

  2. Klaus

    Klaus Guest

    On Mar 29, 1:36 am, "Chris" <> wrote:

    [snip]

    > if I print "$1\n",
    > the file prints just fine. But, if I do something like print "$1 after
    > \n", the whole output is messed up. If I print "before $1\n", nothing
    > prints at all. If I print "before $1 after\n", only after prints.


    not really sure, but could be a rogue "\r" in $1,
    try dumping out the content in hex: print unpack('H*', $1);
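
    A minimal sketch of that check, assuming a made-up captured value
    ("common.def" and the helper name hex_dump are illustrative, not from
    the original script). Printing each byte as a separate hex pair makes
    a trailing 0d (carriage return) easy to spot. A carriage return sends
    the terminal cursor back to the start of the line, which would explain
    why text printed after $1 seems to overwrite what came before it.

```perl
use strict;
use warnings;

# Hypothetical captured value: a file name followed by a stray carriage return.
my $captured = "common.def\r";

# Return the bytes of a string as space-separated two-digit hex pairs.
sub hex_dump {
    my ($string) = @_;
    return join ' ', unpack('(H2)*', $string);
}

# A final "0d" in the dump is the carriage return.
print hex_dump($captured), "\n";
print "ends with CR\n" if $captured =~ /\r\z/;
```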

    --
    Klaus
     
    Klaus, Mar 29, 2007
    #2

  3. Chris

    Chris Guest

    On Mar 28, 7:33 pm, "Klaus" <> wrote:
    > On Mar 29, 1:36 am, "Chris" <> wrote:
    >
    > [snip]
    >
    > > if I print "$1\n",
    > > the file prints just fine. But, if I do something like print "$1 after
    > > \n", the whole output is messed up. If I print "before $1\n", nothing
    > > prints at all. If I print "before $1 after\n", only after prints.

    >
    > not really sure, but could be a rogue "\r" in $1,
    > try dumping out the content in hex: print unpack('H*', $1);
    >
    > --
    > Klaus


    Hi, Klaus:
    Thanks for the suggestion and for making me aware of unpack! :)
    There is a rogue carriage return (0xd) in the string that isn't
    appearing in the other strings. As far as the naked eye can see, the
    file characteristics are the same across all my files, yet this is
    the only one with that strange character at the end. chomp() doesn't
    seem to get rid of it either. Is there something I can do to deal
    with this situation? I've tried retyping the string by hand in the
    file, but it doesn't seem to be going away for some reason. In fact,
    I can delete the line and retype it in any other file and it works.
    For some reason, this file isn't happy...
     
    Chris, Mar 29, 2007
    #3
  4. Peter J. Holzer

    On 2007-03-29 13:15, Chris <> wrote:
    > Thanks for suggestion and for making me aware of unpack! :) There
    > is a rogue carriage return (0xd) in the string that isn't appearing in
    > the other strings, even though the file characteristics are the same
    > as far as the naked eye can see between all my files, this is the only
    > one with that strange character at the end. chomp() doesn't seem to
    > get rid of it either.


    chomp() only removes a trailing $/, which is "\n" by default; it does
    not touch "\r".
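
    A small sketch of the difference, assuming a made-up line from a
    DOS-format file (the helper name strip_eol is hypothetical). With the
    default $/, chomp drops only the "\n"; a substitution that accepts an
    optional "\r" handles both line-ending styles.

```perl
use strict;
use warnings;

# Remove a trailing LF or CRLF, whichever is present.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Hypothetical line as read from a file with MS-DOS line endings.
my $dos_line = "#include common.def\r\n";

my $chomped = $dos_line;
chomp $chomped;    # removes "\n" only; the "\r" is still there

print "chomp left a CR behind\n" if $chomped =~ /\r\z/;
print "strip_eol removed it\n"   if strip_eol($dos_line) !~ /\r/;
```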

    > Is there something I can do to deal with this situation?


    You probably want to ignore any whitespace at the end of the line, so
    you could change your pattern from:

    $searchTag = "^#include\\s*(.*)";

    to

    $searchTag = "^#include\\s*(.*?)\\s*\$";

    This would not only get rid of the \r, but also of any spaces or tabs at
    the end of the line (which normally aren't visible to the naked eye
    either).
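
    The same idea can be sketched with precompiled qr// patterns, which
    avoid the double-quote escaping altogether. The hash %search_tag, the
    helper dependency_name and the sample line are assumptions made for
    illustration; the mapping of .def to #include and .cfg to #tagdef
    follows the posted script.

```perl
use strict;
use warnings;

# (.*?) is lazy, so the trailing \s* can absorb "\r", spaces and tabs.
my %search_tag = (
    def => qr/^#include\s*(.*?)\s*$/,
    cfg => qr/^#tagdef\s*(.*?)\s*$/,
);

# Return the referenced file name, or undef if the line is not a directive.
sub dependency_name {
    my ($type, $line) = @_;
    return $line =~ $search_tag{$type} ? $1 : undef;
}

# Hypothetical line with a trailing space and an MS-DOS line ending.
my $name = dependency_name('def', "#include common.def \r\n");
print "[$name]\n";
```

    Because the captured name no longer carries the "\r", a later
    index($thing, $findThis) comparison has a chance to match.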


    > I've tried retyping the string by hand in the file, but it
    > doesn't seem to be going away for some reason.


    Your editor is probably detecting the MS-DOS line endings and acting
    accordingly. It should have a way to change this. For example, in vim,
    you can switch to unix-style line endings with

    :set fileformat=unix

    and then just save the file. There are also utilities like dos2unix
    which do this, or you can write a simple one-line perl script ...

    hp


    --
    _ | Peter J. Holzer | Blaming Perl for the inability of programmers
    |_|_) | Sysadmin WSR | to write clearly is like blaming English for
    | | | | the circumlocutions of bureaucrats.
    __/ | http://www.hjp.at/ | -- Charlton Wilbur in clpm
     
    Peter J. Holzer, Mar 29, 2007
    #4
  5. Tad McClellan

    Chris <> wrote:
    > On Mar 28, 7:33 pm, "Klaus" <> wrote:
    >> On Mar 29, 1:36 am, "Chris" <> wrote:
    >>
    >> [snip]
    >>
    >> > if I print "$1\n",
    >> > the file prints just fine. But, if I do something like print "$1 after
    >> > \n", the whole output is messed up. If I print "before $1\n", nothing
    >> > prints at all. If I print "before $1 after\n", only after prints.

    >>
    >> not really sure, but could be a rogue "\r" in $1,



    > There
    > is a rogue carriage return (0xd) in the string


    > Is there something I can do to deal with this
    > situation?



    Repair the corrupted file:

    perl -p -i -e 'tr/\r//d' bad_file
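
    Before repairing anything, it may help to find out which files are
    affected. The following is only a sketch: the helper name
    has_carriage_return is hypothetical, and the demo writes a temporary
    file rather than touching real data. The file is read in raw mode so
    that no I/O layer hides the "\r", and tr/\r// without /d counts
    carriage returns without changing the line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return true if the file at $path contains at least one carriage return.
sub has_carriage_return {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Can't open $path: $!\n";
    while (my $line = <$fh>) {
        return 1 if $line =~ tr/\r//;    # counts CRs, does not modify $line
    }
    return 0;
}

# Demo on a temporary file with one MS-DOS style line.
my ($out, $demo_path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "#include common.def\r\n";
close $out;

print has_carriage_return($demo_path) ? "needs repair\n" : "clean\n";
```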


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Mar 30, 2007
    #5
