check file exists with case sensitive on a case insensitive filesystem

Discussion in 'Perl Misc' started by Xah Lee, Apr 5, 2009.

  1. Xah Lee

    Xah Lee Guest

    been a while i coded in perl...

    how to check if a file exists with case sensitiveness on os x?

    a quick google search says call system ls & grep. That seems too slow.
    I have some over 10 thousand files to check.

    Xah
    ∑ http://xahlee.org/

    ☄
     
    Xah Lee, Apr 5, 2009
    #1
    1. Advertising

  2. Xah Lee

    Uri Guttman Guest

    Re: check file exists with case sensitive on a case insensitivefile system

    >>>>> "XL" == Xah Lee <> writes:

    XL> been a while i coded in perl...
    XL> how to check if a file exists with case sensitiveness on os x?

    XL> a quick google search says call system ls & grep. That seems too slow.
    XL> I have some over 10 thousand files to check.

    sorry we don't help self declared delusional geniuses who denigrate perl
    in every insane cross post they make. my suggestion is to rtfm. or do it
    in assembler. or write your own filesystem and plug it in.

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Free Perl Training --- http://perlhunter.com/college.html ---------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
     
    Uri Guttman, Apr 5, 2009
    #2
    1. Advertising

  3. Re: check file exists with case sensitive on a case insensitive file system

    Xah Lee <> wrote:


    > been a while i coded in perl...



    Thank you.


    > how to check if a file exists with case sensitiveness on os x?
    >
    > a quick google search says call system ls & grep. That seems too slow.
    > I have some over 10 thousand files to check.



    Bummer Dude.


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Apr 5, 2009
    #3
  4. Re: check file exists with case sensitive on a case insensitive file system

    Xah Lee <> wrote in news:a92be1ae-a933-49e3-9ea6-
    :

    > been a while i coded in perl...
    >
    > how to check if a file exists with case sensitiveness on os x?
    >
    > a quick google search says call system ls & grep. That seems too slow.
    > I have some over 10 thousand files to check.


    How is that too slow?

    On Windows XP:

    C:\Temp\test> dir
    ....
    38077 File(s) 466 bytes

    C:\Temp\test> cat check.bat
    ls | grep %1

    C:\Temp\test> timethis check ZZCTWD8DIS

    C:\Temp\test> ls | grep ZZCTWD8DIS

    TimeThis : Elapsed Time : 00:00:00.375

    C:\Temp\test> dir ZZCTWD8DIS

    2009/04/05 01:21 PM 0 zzctwd8dis

    Less than half a second for 38,077 files. Of course, if you wanted to
    use this in a loop, it would get very slow very fast.

    Using Perl functions to replace ls and grep increases performance to 6
    or 7 checks per second (against the same set of 38,077 randomly
    generated filenames). I am not providing details because someone of your
    stature should be able to write the appropriate benchmarks with no
    effort.

    Finally, if the contents of the directory are fixed, and you just want
    to check the existence of many files case sensitively, just do it the
    Perl way: Use a hash.

    Sinan

    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://www.rehabitation.com/clpmisc/
     
    A. Sinan Unur, Apr 5, 2009
    #4
  5. Xah Lee

    Guest

    Re: check file exists with case sensitive on a case insensitive file system

    On Sun, 5 Apr 2009 08:18:49 -0700 (PDT), Xah Lee <> wrote:

    >been a while i coded in perl...
    >
    >how to check if a file exists with case sensitiveness on os x?
    >
    >a quick google search says call system ls & grep. That seems too slow.
    >I have some over 10 thousand files to check.
    >
    > Xah
    >? http://xahlee.org/
    >
    >?


    This will do it for you Zah Zah.
    Somewhere in here is the answer to your problem.

    -sln

    -----------------------------------------
    #!/usr/bin/perl

    package SMG;
    use strict;

    use File::stat;
    use File::path;
    use File::Copy;
    use File::Find;
    use File::Spec;
    use sort 'stable';


    require Exporter;
    our @ISA = qw(Exporter);
    our @EXPORT = qw(SafeMerge GetCommonElementsNxM GetCommonElements);
    my $VERSION = 1.00;


    #my $current = sort::current();



    ####################################################################################
    # SAFE MERGE
    # The two major options -
    # 1. Merge 2 directories into 1, preserving/renaming duplicate named files.
    # 2. Compare 2 directories and just rename the duplicates of 1 directory.
    # Also -
    # - The prefix level will be on a per-file monitored basis.
    # - File/Dir names are cached and renamed in the "From" array,
    # - Move is done via os-rename function into the "ToDir".
    # - Remove "From" directory option (after rename).
    # - Duplicate's-renaming-only is done in-place in the "FromDir".
    # - Report output is created in the current directory (for now).
    # - Itteration scheme are within-directory, then cross-directory until no more dups.
    #-----------------------------------------------------------------------------------
    sub SafeMerge ($$$$$$$)
    {
    return 0 if (@_ < 2 || @_ > 7);
    my ($ToDir,$FromDir,$PrefixName,$PrefixLevel,$Exclude,$Duponly,$RmvFrom) = @_;

    $PrefixName = '' unless defined $PrefixName; # Renaming Prefix
    $PrefixLevel = 0 unless defined $PrefixLevel; # Number of times to += prefix

    $Exclude = [] unless defined $Exclude; # Exclude list (must be re)
    $Duponly = 0 unless defined $Duponly; # Flag to just rename dups in "from"
    $RmvFrom = 0 unless defined $RmvFrom; # Flag to remove "from" after operation

    my @To = ();
    my @From = ();
    my @Xfound = ();
    my $dupsize = 0;
    my $passes = 0;

    print "\n\n *******************************\n";
    print "********[ Safe Merge ]*********\n";
    print " *******************************\n";
    print "\nPrefix\t$PrefixName\nLevel\t$PrefixLevel\nChecks ... ";

    $ToDir = File::Spec->canonpath( "$ToDir" );
    $FromDir = File::Spec->canonpath( "$FromDir" );
    #print "$ToDir\n$FromDir";
    if ($ToDir eq $FromDir) { print "\rFrom/To are identical: $ToDir\n"; return 0; }
    if (!-e $ToDir) { print "\rTo directory does not exist: $ToDir\n"; return 0; }
    if (!-e $FromDir) { print "\rFrom directory does not exist: $FromDir\n"; return 0;}

    # Glob seems to be bothered by spaces in dir path names
    my $Sep = "\\".File::Spec->canonpath( "/" );
    {
    my $td = File::Spec->canonpath( "$ToDir/*" );
    my $fd = File::Spec->canonpath( "$FromDir/*" );
    if ($td =~ / /) {
    $td = "'$td'";
    }
    if ($fd =~ / /) {
    $fd = "'$fd'";
    }
    foreach (glob ("$td")) {
    /.+$Sep(.+)$/;
    # /.+\/(.+)$/;
    push (@To, [$1, 0]);
    }
    foreach (glob ("$fd")) {
    /.+$Sep(.+)$/;
    # /.+\/(.+)$/;
    my $ftmp = $1;
    for (@{$Exclude}) {
    if ($ftmp =~ /$_/) {
    # save in @Xfound if $RmvFrom is set (see below)
    print "\rExclude - $ftmp\n";
    push (@Xfound, $ftmp) if ($RmvFrom);
    $ftmp = undef;
    last;
    }
    }
    push (@From, [$ftmp, 0, '']) if defined ($ftmp);
    }
    }

    # Will return if "From" empty. Technically not an error.
    # If it is empty through exclusions (regexp filter), "From" could be rmdir'd below.
    if (@From == 0) {
    print "\rEmpty dir: $FromDir\n";
    return 0;
    }
    print "OK\n";


    ## ============================================================================
    ## Iterate until there are no more dups across directories after (D2) rename
    ##
    do {
    my $Dupref = GetCommonElementsNxM_HashMethod(\@To,\@From, 1, 0);
    $dupsize = @{$Dupref};
    $passes++;

    print "\nP A S S \# $passes\n---------------------\n";
    print "Found $dupsize dups:\n";

    #**********************************
    # Check for cross-directory dups
    # -------------
    if ($dupsize)
    {
    my $errflg = 0;
    print "\n<DIR> $FromDir\n\n";
    for my $ndx (@{$Dupref})
    {
    my $Fromref = $From[$ndx];
    my $pname = $PrefixName;

    print "\t$Fromref->[0] ... $PrefixName";

    #*****************************************
    # Rename and check for in-directory dups
    # ----------
    while (1)
    {
    # Level check
    if ($Fromref->[1] >= $PrefixLevel)
    {
    print "$Fromref->[0] <<level $Fromref->[1] exceeded>>\n";
    # Clear the name in From so this file doesn't get moved
    $Fromref->[0] = "";
    $errflg = 1;
    last;
    }
    my $stmp = uc($pname.$Fromref->[0]);
    my $found = 0;
    for (@From) { if (uc($_->[0]) eq $stmp) { $found = 1; last; } }

    if (!$found) {
    # OK to rename! Dup not found in "From"
    # Change the name in "From" to this new one
    # so it gets moved later. Increment its level of prefix
    # ---------------------
    print "$Fromref->[0]";
    print "\n";
    if ($Fromref->[1] == 0) { $Fromref->[2] = $Fromref->[0]; }
    $Fromref->[0] = $pname.$Fromref->[0];
    $Fromref->[1] += 1;
    last;
    }
    else {
    # File exist with that name, add another level of prefix.
    # ---------------------
    print "$PrefixName";
    $pname = $pname.$PrefixName;
    if ($Fromref->[1] == 0) { $Fromref->[2] = $Fromref->[0]; }
    $Fromref->[1] += 1;
    }
    }
    }
    if ($errflg) {
    print "!!! Can't rename some files, raise the level or change the prefix.\n";
    #return 0;
    }
    }
    } while ($dupsize > 0);
    ##
    ## End cross-dir iteration
    ## =============================


    #*************************************************************
    # Check for rename of duplicates in D2 (only) without moving
    # ------------
    if ($Duponly)
    {
    my $rentried = 0;
    my $renok = 0;
    print "\nRename only ... ";
    for (@From) {
    my $res = 0;
    if ($_->[1] > 0 && $_->[1] < $PrefixLevel) {
    $rentried++;
    my $fr = File::Spec->canonpath( "$FromDir/$_->[2]" );
    my $fr2 = File::Spec->canonpath( "$FromDir/$_->[0]" );
    if (!($res = rename ("$fr", "$fr2"))) {
    print "Rename error:\t$_->[2] ... $_->[0]\n";
    } else {
    $renok++; }
    }
    }
    if ($rentried) { print "$renok out of $rentried files OK\n"; }
    else { print "No duplicates found\n"; }
    return 1;
    }

    #**********************
    # Move "From" into "To"
    # -------------------
    #print @From."@From\n";
    my $sep = File::Spec->canonpath( "/" );
    print "\nMoving:\t$FromDir$sep* to $ToDir$sep\n";
    my $movcnt = 0;

    for (@From) {
    my $res = 0;
    if ($_->[1] > 0 && $_->[1] < $PrefixLevel) {
    my $fr = File::Spec->canonpath( "$FromDir/$_->[2]" );
    my $to = File::Spec->canonpath( "$ToDir/$_->[0]" );
    $res = File::Copy::move ("$fr", "$to");
    if ($res == 0) { print "Move error:\t$_->[2] ... $to\n"; }
    }
    if ($_->[1] == 0 && $_->[0] gt '') {
    my $fr = File::Spec->canonpath( "$FromDir/$_->[0]" );
    my $to = File::Spec->canonpath( "$ToDir/$_->[0]" );
    $res = File::Copy::move ("$fr", "$to");
    if ($res == 0) { print "Move error:\t$fr\n" }
    }
    $movcnt++ if ($res);
    }
    my $notmoved = @From - $movcnt;
    print "Moved:\t$movcnt out of ".@From." files.\n";

    #***************************************
    # Check if "FromDir" is to be deleted
    # --------------
    if ($RmvFrom) {
    print "Remove:\t$FromDir ... ";
    if ($notmoved > 0) {
    print "not deleted, contains $notmoved file(s) that couldn't be moved!\n";
    }
    else {
    # check if @Xfound has values (and their not directories)
    # if so, delete these first before trying to remove the From directory
    # for now, don't want to "unlink" a directory, and we're not doing a tree here
    # ----------------
    for (@Xfound) {
    my $fr = File::Spec->canonpath( "$FromDir/$_" );
    unlink ("$fr") if (!-d $fr) ; }
    my $fr = File::Spec->canonpath( "$FromDir" );
    if (!rmdir "$fr") {print "$!\n" } else {print "OK\n"; }
    }
    }
    return 1;
    }


    #######################################################
    # Get Common Elements (from two N-dimensioned Array's)
    # IN - Refs to the NxN arrays to compare,
    # sort flag and the compare field.
    # OUT - Ndx's into Right_Array of matching elements
    # ---------------------------------------------------
    # Notes -
    # 1. Elements are assumed textual and case insensitive
    # 2. Ignores in-array duplicates
    # 3. Sort will be done if sort flag > 0
    #
    sub GetCommonElementsNxM($$$$)
    {
    my ($A_Left,$A_Right,$Srtflg,$Fld) = @_;
    $Srtflg = 0 unless defined $Srtflg;
    $Fld = 0 unless defined $Fld;
    # my @Dup = ();
    my @Ndx = ();

    if ($Srtflg > 0) {
    @{$A_Left} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Left};
    @{$A_Right} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Right};
    } else {print "==> Common Elements : Not sorting arrays\n";}

    my $rpos = 0;
    my $rend = @{$A_Right};
    my $cnt = 0;
    my $llast = undef;
    my $rlast = undef;
    foreach my $left_element (@{$A_Left})
    {
    next if (uc($left_element->[$Fld]) eq uc($llast->[$Fld]));

    $rpos += $cnt;
    $cnt = 0;
    foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
    {
    last if (uc($left_element->[$Fld]) lt uc($right_element->[$Fld]));
    $cnt++;
    next if (uc($right_element->[$Fld]) eq uc($rlast->[$Fld]));
    if (uc($left_element->[$Fld]) eq uc($right_element->[$Fld]))
    {
    # push (@Dup, $right_element->[$Fld]); # the string
    push (@Ndx, $rpos+$cnt-1); # the index into R_Array
    last;
    }
    $rlast = $right_element;
    }
    $llast = $left_element;
    last if ($rpos >= $rend);
    }
    # return (\@Dup);
    return (\@Ndx);
    }

    sub GetCommonElementsNxM_HashMethod($$$$)
    {
    my ($A_Left,$A_Right,$Srtflg,$Fld) = @_;
    $Srtflg = 0 unless defined $Srtflg;
    $Fld = 0 unless defined $Fld;

    my ($element,%count);
    #my @Dup = ();
    my @Ndx = ();
    my $idx = 0;

    foreach $element (@{$A_Left}) {
    $count{uc $element->[$Fld]} = 1;
    }
    foreach $element (@{$A_Right}) {
    if ($count{uc $element->[$Fld]} == 1) {
    #push @Dup, $element->[$Fld];
    push @Ndx, $idx;
    $count{$element->[$Fld]}++;
    }
    $idx++;
    }
    # return (\@Dup);
    return (\@Ndx);
    }


    #######################################################
    # Get Common Elements from single Array's
    # IN - Refs to the Nx1 arrays to compare, sort flag
    # OUT - Ndx's into Right_Array of matching elements
    # ---------------------------------------------------
    # Notes -
    # 1. Elements are assumed textual and case insensitive
    # 2. Ignores in-array duplicates
    # 3. Sort will be done if sort flag > 0
    #######################################################
    sub GetCommonElements($$$)
    {
    my ($A_Left,$A_Right,$Srtflg) = @_;
    $Srtflg = 0 unless defined $Srtflg;
    # my @Dup = ();
    my @Ndx = ();

    if ($Srtflg > 0) {
    @{$A_Left} = sort {uc($a) cmp uc($b)} @{$A_Left};
    @{$A_Right} = sort {uc($a) cmp uc($b)} @{$A_Right};
    } else {print "==> Common Elements : Not sorting arrays\n";}

    my $rpos = 0;
    my $rend = @{$A_Right};
    my $cnt = 0;
    my $llast = '';
    my $rlast = '';
    foreach my $left_element (@{$A_Left})
    {
    next if (uc($left_element) eq uc($llast));

    $rpos += $cnt;
    $cnt = 0;
    foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
    {
    last if (uc($left_element) lt uc($right_element));
    $cnt++;
    next if (uc($right_element) eq uc($rlast));
    if (uc($left_element) eq uc($right_element))
    {
    # push (@Dup, $right_element); # the string
    push (@Ndx, $rpos+$cnt-1); # the index into R_Array
    last;
    }
    $rlast = $right_element;
    }
    $llast = $left_element;
    last if ($rpos >= $rend);
    }
    # return (\@Dup);
    return (\@Ndx);
    }

    sub GetCommonElements_HashMethod($$$$)
    {
    my ($A_Left,$A_Right,$Srtflg) = @_;
    $Srtflg = 0 unless defined $Srtflg;

    my ($element,%count);
    #my @Dup = ();
    my @Ndx = ();
    my $idx = 0;

    foreach $element (@{$A_Left}) {
    $count{uc $element} = 1;
    }
    foreach $element (@{$A_Right}) {
    if ($count{uc $element} == 1) {
    #push @Dup, $element;
    push @Ndx, $idx;
    $count{$element}++;
    }
    $idx++;
    }
    # return (\@Dup);
    return (\@Ndx);
    }

    1;
     
    , Apr 6, 2009
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. N.V.Dev
    Replies:
    10
    Views:
    10,095
    Does It Matter
    Aug 10, 2004
  2. Replies:
    1
    Views:
    2,556
    Mark P
    Apr 6, 2007
  3. Stephanie

    case sensitive / insensitive string equality

    Stephanie, Oct 3, 2005, in forum: ASP General
    Replies:
    2
    Views:
    194
    Steven Burn
    Oct 3, 2005
  4. Larry Martell

    Case insensitive exists()?

    Larry Martell, Jan 23, 2014, in forum: Python
    Replies:
    9
    Views:
    98
    Larry Martell
    Jan 23, 2014
  5. MRAB
    Replies:
    0
    Views:
    104
Loading...

Share This Page