Case insensitive file search

Discussion in 'C Programming' started by henrik.sorensen@balcab.ch, Nov 2, 2006.

  1. Guest

    Hi List,

    I am looking for a way to do a case insensitive search for file names.

    Anybody have some hints ?

    thanks
    Henrik

    pl1gcc.sourceforge.net
     
    , Nov 2, 2006
    #1

  2. wrote:

    > Hi List,
    >
    > I am looking for a way to do a case insensitive search for file names.
    >
    > Anybody have some hints ?


    I don't know who List is but I'll attempt a reply.
    For every file name you read, convert it to all lower case
    using tolower() and then do an ordinary search.
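
    A minimal sketch of that idea, assuming single-byte file names and the
    current C locale (tolower() knows nothing about multi-byte encodings such
    as UTF-8). The function names lowercase_in_place and equal_ignoring_case
    are made up for illustration:

```c
#include <ctype.h>

/* Lower-case a string in place and return it. The cast to unsigned char
   matters: passing a negative char value to tolower() is undefined. */
char *lowercase_in_place(char *s)
{
    for (char *p = s; *p != '\0'; ++p)
        *p = (char)tolower((unsigned char)*p);
    return s;
}

/* Compare two names without modifying either one. Returns 1 if they are
   equal once case is ignored, 0 otherwise. */
int equal_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;   /* both strings must end at the same point */
}
```

    The second function avoids modifying the name read from the directory,
    which is handy if the original spelling is still needed to open the file.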
     
    Spiros Bousbouras, Nov 2, 2006
    #2

  3. Ian Collins Guest

    wrote:
    > Hi List,
    >
    > I am looking for a way to do a case insensitive search for file names.
    >
    > Anybody have some hints ?
    >

    Not really a C question, but why don't you just read the filename and
    convert it to lower or upper case?

    --
    Ian Collins.
     
    Ian Collins, Nov 2, 2006
    #3
  4. Guest

    Ian Collins wrote:

    > wrote:
    >> Hi List,
    >>
    >> I am looking for a way to do a case insensitive search for file names.
    >>
    >> Anybody have some hints ?
    >>

    > Not really a C question, but why don't you just read the filename and
    > convert it to lower or upper case?
    >

    ok I should have explained a bit more.

    I am writing a PL/I frontend for gcc. (pl1gcc.sourceforge.net)
    When processing %INCLUDE statements, that just reads a file from the
    filesystem, and places the text in the source program, I am faced with the
    following problem.
    Syntax
    %INCLUDE filename ;

    Traditionally PL/I is case insensitive.
    So when my scanner/parser searches for the filename to include, it can
    happen that the filename is in uppercase, but the actual file on the
    filesystem is in mixed case.

    Also I am trying to avoid scanning the whole directory for each file I have
    to include.

    The scanner is written using flex, and the parser is using bison, and all
    the necessary help functions are written in C.

    thanks
    Henrik
     
    , Nov 2, 2006
    #4
  5. wrote:

    > Ian Collins wrote:
    >
    > > wrote:
    > >> Hi List,
    > >>
    > >> I am looking for a way to do a case insensitive search for file names.
    > >>
    > >> Anybody have some hints ?
    > >>

    > > Not really a C question, but why don't you just read the filename and
    > > convert it to lower or upper case?
    > >

    > ok I should have explained a bit more.
    >
    > I am writing a PL/I frontend for gcc. (pl1gcc.sourceforge.net)
    > When processing %INCLUDE statements, that just reads a file from the
    > filesystem, and places the text in the source program, I am faced with the
    > following problem.
    > Syntax
    > %INCLUDE filename ;


    Is %INCLUDE something related to PL/I ?

    > Traditionally PL/I is case insensitive.
    > So when my scanner/parser searches for the filename to include, it can
    > happen that the filename is in uppercase, but the actual file on the
    > filesystem is in mixed case.
    >
    > Also I am trying to avoid scanning the whole directory for each file I have
    > to include.


    Scan the whole directory once, turn each file name into
    all lower case as you read it, and put it into memory.
    Then for every file you want to include check if its name
    appears in what you have stored in memory.

    There may be some way which avoids reading the whole
    directory but this would depend on the way your platform
    allows you to read directories. In any case it would make
    things more complicated, so unless the directories involved
    are really huge I don't think it would be worth the trouble. And
    of course there's always the possibility that one of the included
    files appears at the end of the directory, so you would still
    need to search the whole thing no matter which method you
    use.
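
    The caching approach described above could look roughly like this on a
    POSIX system. It is only a sketch: all the names (name_cache, cache_add,
    cache_load, cache_find and so on) are invented here, and a sorted array
    searched with bsearch() is just one possible choice of data structure.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* One entry: the lower-cased key plus the name as spelled on disk. */
struct name_entry { char *key; char *actual; };
struct name_cache { struct name_entry *entries; size_t count, capacity; };

/* Return a malloc'd copy of s, lower-cased when fold is nonzero. */
char *copy_name(const char *s, int fold)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    for (size_t i = 0; copy != NULL && i < n; ++i)
        copy[i] = fold ? (char)tolower((unsigned char)s[i]) : s[i];
    return copy;
}

int compare_entries(const void *a, const void *b)
{
    const struct name_entry *x = a, *y = b;
    return strcmp(x->key, y->key);
}

/* Append one name. Returns 0 on success, -1 if memory runs out. */
int cache_add(struct name_cache *c, const char *name)
{
    if (c->count == c->capacity) {
        size_t cap = c->capacity ? c->capacity * 2 : 16;
        struct name_entry *grown = realloc(c->entries, cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        c->entries = grown;
        c->capacity = cap;
    }
    struct name_entry e = { copy_name(name, 1), copy_name(name, 0) };
    if (e.key == NULL || e.actual == NULL) {
        free(e.key);
        free(e.actual);
        return -1;
    }
    c->entries[c->count++] = e;
    return 0;
}

/* Sort by key so that cache_find() can use bsearch(). */
void cache_sort(struct name_cache *c)
{
    if (c->count > 1)
        qsort(c->entries, c->count, sizeof *c->entries, compare_entries);
}

void cache_free(struct name_cache *c)
{
    for (size_t i = 0; i < c->count; ++i) {
        free(c->entries[i].key);
        free(c->entries[i].actual);
    }
    free(c->entries);
    c->entries = NULL;
    c->count = c->capacity = 0;
}

/* Scan dir_path once and fill the cache. Returns 0 on success, -1 on error. */
int cache_load(struct name_cache *c, const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    struct dirent *de;
    int status = 0;

    c->entries = NULL;
    c->count = c->capacity = 0;
    if (dir == NULL)
        return -1;
    while (status == 0 && (de = readdir(dir)) != NULL)
        status = cache_add(c, de->d_name);
    closedir(dir);
    if (status == 0)
        cache_sort(c);
    else
        cache_free(c);
    return status;
}

/* Return the on-disk spelling of wanted, or NULL if nothing matches. */
const char *cache_find(const struct name_cache *c, const char *wanted)
{
    struct name_entry probe = { copy_name(wanted, 1), NULL };
    struct name_entry *hit = NULL;

    if (probe.key != NULL && c->count > 0)
        hit = bsearch(&probe, c->entries, c->count, sizeof probe, compare_entries);
    free(probe.key);
    return hit ? hit->actual : NULL;
}
```

    Call cache_load() once per include directory, then cache_find() for each
    file to include. If two names in the directory differ only in case,
    bsearch() returns one of them without saying which, so that situation
    needs separate handling.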
     
    Spiros Bousbouras, Nov 2, 2006
    #5
  6. In article <c5295$454a69a9$5448c618$>,
    <> wrote:

    >I am looking for a way to do a case insensitive search for file names.


    >Anybody have some hints ?


    You can't do that in standard C, as standard C gives no mechanisms
    to search for any file. The best you can do in standard C is to
    attempt to open a file and see if you succeed or not -- and
    if you do succeed, there is no way to tell if you are looking at
    the same file as another or a different file. Standard C places
    no interpretation upon filenames (other than that they are null
    terminated), so even if you know one filename, you cannot guess
    from it which other filenames might be valid.

    Thus, in order to do a search for filenames, you need to use
    implementation-specific system calls or libraries or build in knowledge
    about what filenames look like for your purposes.

    Generally speaking, you will need to find a system extension that
    allows you to examine a directory for filenames. Those extensions
    vary between operating systems and system versions. For example,
    once upon a time in Unix the standard mechanism was to open the
    directory as if it were a file, and then to read the binary contents
    using the built-in knowledge that the first 14 characters out of every
    16 were the null-padded filename (and the last 2 characters were
    a binary encoding of an inode number.) This mechanism isn't
    much used in newer systems -- but really, the method for Windows
    looks a lot different than the method for Unix. Some systems
    provide mechanisms to pass in a prefix or pattern and to get back
    the next matching filename (or all matching filenames); many do not.

    If you are willing to restrict your portability to POSIX and
    some other random systems, you can use opendir() and readdir(),
    but don't count on a filename pattern match routine.
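
    For illustration, a bare-bones directory walk with opendir() and
    readdir() might look like the following. The names for_each_entry and
    count_entry are hypothetical, and the callback interface is just one way
    to arrange it:

```c
#include <dirent.h>
#include <stddef.h>

/* Call visit(name, context) for each entry in dir_path, stopping early if
   visit returns nonzero. Returns -1 if the directory cannot be opened,
   otherwise the last value visit returned (0 if every entry was visited).
   visit should therefore avoid returning -1 itself. */
int for_each_entry(const char *dir_path,
                   int (*visit)(const char *name, void *context),
                   void *context)
{
    DIR *dir = opendir(dir_path);
    struct dirent *entry;
    int result = 0;

    if (dir == NULL)
        return -1;
    while (result == 0 && (entry = readdir(dir)) != NULL)
        result = visit(entry->d_name, context);
    closedir(dir);
    return result;
}

/* Example visitor: count the entries that were seen. */
int count_entry(const char *name, void *context)
{
    size_t *count = context;
    (void)name;        /* the name itself is not needed for counting */
    ++*count;
    return 0;
}
```

    The order in which readdir() returns entries is unspecified, and on most
    systems the list includes "." and "..".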
    --
    I was very young in those days, but I was also rather dim.
    -- Christopher Priest
     
    Walter Roberson, Nov 2, 2006
    #6
  7. Walter Roberson wrote:

    > In article <c5295$454a69a9$5448c618$>,
    > <> wrote:
    >
    > >I am looking for a way to do a case insensitive search for file names.

    >
    > >Anybody have some hints ?


    out_of_topic {
    >
    > If you are willing to restrict your portability to POSIX and
    > some other random systems, you can use opendir() and readdir(),
    > but don't count on a filename pattern match routine.


    What about glob() ?

    }
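
    For what it's worth, glob() has no standard case-insensitive flag, but
    one workaround is to turn each letter of the wanted name into a
    two-character bracket expression. A rough sketch, assuming a system that
    provides glob(); make_nocase_pattern and glob_find_nocase are
    hypothetical names:

```c
#include <ctype.h>
#include <glob.h>
#include <stdlib.h>
#include <string.h>

/* Build a glob pattern that matches name in any letter case, for example
   "ab.c" becomes "[aA][bB].[cC]". Glob metacharacters in the name are
   backslash-escaped. Returns a malloc'd string, or NULL on failure. */
char *make_nocase_pattern(const char *name)
{
    /* Worst case: every character expands to four ("[xX]"). */
    char *pattern = malloc(4 * strlen(name) + 1);
    char *out = pattern;

    if (pattern == NULL)
        return NULL;
    for (; *name != '\0'; ++name) {
        unsigned char c = (unsigned char)*name;
        if (isalpha(c)) {
            *out++ = '[';
            *out++ = (char)tolower(c);
            *out++ = (char)toupper(c);
            *out++ = ']';
        } else {
            if (strchr("*?[]\\", c) != NULL)
                *out++ = '\\';
            *out++ = (char)c;
        }
    }
    *out = '\0';
    return pattern;
}

/* Return a malloc'd copy of the first path in dir that matches name in any
   case, or NULL. dir itself is passed to glob() unescaped, so this sketch
   assumes it contains no glob metacharacters. */
char *glob_find_nocase(const char *dir, const char *name)
{
    char *name_pattern = make_nocase_pattern(name);
    char *full_pattern = NULL, *result = NULL;
    glob_t matches;

    if (name_pattern == NULL)
        return NULL;
    full_pattern = malloc(strlen(dir) + strlen(name_pattern) + 2);
    if (full_pattern != NULL) {
        strcpy(full_pattern, dir);
        strcat(full_pattern, "/");
        strcat(full_pattern, name_pattern);
        /* glob() returns 0 only when at least one path matched. */
        if (glob(full_pattern, 0, NULL, &matches) == 0) {
            result = malloc(strlen(matches.gl_pathv[0]) + 1);
            if (result != NULL)
                strcpy(result, matches.gl_pathv[0]);
            globfree(&matches);
        }
    }
    free(full_pattern);
    free(name_pattern);
    return result;
}
```

    glob() still has to read the directory to find the match, so this saves
    code rather than work.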
     
    Spiros Bousbouras, Nov 2, 2006
    #7
  8. writes:
    > Ian Collins wrote:
    >
    >> wrote:
    >>> Hi List,
    >>>
    >>> I am looking for a way to do a case insensitive search for file names.
    >>>
    >>> Anybody have some hints ?
    >>>

    >> Not really a C question, but why don't you just read the filename and
    >> convert it to lower or upper case?
    >>

    > ok I should have explained a bit more.
    >
    > I am writing a PL/I frontend for gcc. (pl1gcc.sourceforge.net)
    > When processing %INCLUDE statements, that just reads a file from the
    > filesystem, and places the text in the source program, I am faced with the
    > following problem.
    > Syntax
    > %INCLUDE filename ;
    >
    > Traditionally PL/I is case insensitive.
    > So when my scanner/parser searches for the filename to include, it can
    > happen that the filename is in uppercase, but the actual file on the
    > filesystem is in mixed case.
    >
    > Also I am trying to avoid scanning the whole directory for each file I have
    > to include.


    Assuming a file system that allows mixed-case file names, and in which
    case distinctions are significant, I don't see how you can avoid
    scanning the whole directory. You can probably cache the result
    rather than scanning it for each "%INCLUDE". But scanning a directory
    shouldn't be terribly expensive on most systems.

    You'll also need to decide what to do if "filename", "FileName", and
    "FILENAME" all exist. I'm guessing the PL/I standard provides some
    guidance on this.
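
    One way to make that decision explicit is sketched below. The policy is
    purely an assumption for illustration and is not taken from any PL/I
    standard: an exact-case match wins, a single case-insensitive match is
    accepted, and several candidates are reported as ambiguous. The names
    resolve_name and match_kind are invented, and strcasecmp() is POSIX
    rather than standard C. The names[] array stands in for whatever the
    directory scan produced.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp(): POSIX, not standard C */

enum match_kind { MATCH_NONE, MATCH_EXACT, MATCH_UNIQUE, MATCH_AMBIGUOUS };

/* Decide which entry of names[] the name `wanted` refers to.
   On MATCH_EXACT or MATCH_UNIQUE, *chosen points at the selected entry;
   otherwise it is set to NULL. */
enum match_kind resolve_name(const char *wanted,
                             const char *const *names, size_t count,
                             const char **chosen)
{
    size_t folded_matches = 0;

    *chosen = NULL;
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(names[i], wanted) == 0) {
            *chosen = names[i];          /* exact spelling always wins */
            return MATCH_EXACT;
        }
        if (strcasecmp(names[i], wanted) == 0) {
            if (folded_matches++ == 0)
                *chosen = names[i];      /* remember the first candidate */
        }
    }
    if (folded_matches == 0)
        return MATCH_NONE;
    if (folded_matches == 1)
        return MATCH_UNIQUE;
    *chosen = NULL;                      /* two or more candidates */
    return MATCH_AMBIGUOUS;
}
```

    A front end could turn MATCH_AMBIGUOUS into a diagnostic instead of
    silently picking one of the files.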

    But since standard C has no concept of directories, you're probably
    better off asking in comp.unix.programmer.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Nov 2, 2006
    #8
  9. Keith Thompson wrote:

    > writes:
    > > Ian Collins wrote:
    > >
    > >> wrote:
    > >>> Hi List,
    > >>>
    > >>> I am looking for a way to do a case insensitive search for file names.
    > >>>
    > >>> Anybody have some hints ?

    > > Also I am trying to avoid scanning the whole directory for each file I have
    > > to include.

    >
    > Assuming a file system that allows mixed-case file names, and in which
    > case distinctions are significant, I don't see how you can avoid
    > scanning the whole directory.


    You start reading file names from the directory and
    you check if they match against what you want to
    include. If you find a match you can stop there; you
    don't need to read the rest of the directory. This of
    course works only if you know that there aren't file names
    which are repeated with different capitalization, or you
    don't care.
     
    Spiros Bousbouras, Nov 2, 2006
    #9
  10. Guest

    Keith Thompson wrote:

    > But since standard C has no concept of directories, you're probably
    > better off asking in comp.unix.programmer.

    thanks for the tip.
     
    , Nov 2, 2006
    #10
  11. Guest

    Spiros Bousbouras wrote:

    > wrote:
    >
    >> Ian Collins wrote:
    >>
    >> > wrote:

    > Is %INCLUDE something related to PL/I ?

    yes.
    It is similar to C's #include.
    > Scan the whole directory once, turn each file name into
    > all lower case as you read it and put it into memory.
    > Then for every file you want to include check if its name
    > appears in what you have stored in memory.


    good idea.
    this would work, and even bring a nice improvement as well
    thanks
     
    , Nov 2, 2006
    #11
  12. Guest

    Walter Roberson wrote:

    > In article <c5295$454a69a9$5448c618$>,
    > <> wrote:
    >
    >>I am looking for a way to do a case insensitive search for file names.

    >
    >>Anybody have some hints ?

    >
    > You can't do that in standard C, as standard C gives no mechanisms
    > to search for any file. The best you can do in standard C is to
    > attempt to open a file and see if you succeed or not -- and
    > if you do succeed, there is no way to tell if you are looking at
    > the same file as another or a different file. Standard C places
    > no interpretation upon filenames (other than that they are null
    > terminated), so even if you know one filename, you cannot guess
    > from it which other filenames might be valid.


    thanks for explaining this...

    >
    > Thus, in order to do a search for filenames, you need to use
    > implementation-specific system calls or libraries or build in knowledge
    > about what filenames look like for your purposes.
    >

    ....
    > If you are willing to restrict your portability to POSIX and
    > some other random systems, you can use opendir() and readdir(),
    > but don't count on a filename pattern match routine.

    ok I will look into opendir()/readdir()

    thanks
     
    , Nov 2, 2006
    #12
  13. In article <>,
    Spiros Bousbouras <> wrote:
    >Walter Roberson wrote:


    >> If you are willing to restrict your portability to POSIX and
    >> some other random systems, you can use opendir() and readdir(),
    >> but don't count on a filename pattern match routine.


    >What about glob() ?


    glob() is not in POSIX.1-1990 -- it came in with POSIX.2
    --
    There are some ideas so wrong that only a very intelligent person
    could believe in them. -- George Orwell
     
    Walter Roberson, Nov 2, 2006
    #13
  14. cloverman Guest

    On Nov 2, 9:57 pm, wrote:
    > Hi List,
    >
    > I am looking for a way to do a case insensitive search for file names.
    >
    > Anybody have some hints ?
    >
    > thanks
    > Henrik
    >
    > pl1gcc.sourceforge.net


    /* This is an OS-specific question, but I'd like my UNIX-specific code
    criticized here - I'd like to know about any errors. */

    #include <stdio.h>
    #include <ctype.h>
    #include <string.h>
    #include <stdlib.h>   /* malloc() and free(); replaces the non-standard <malloc.h> */

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    #include <unistd.h>
    #include <dirent.h>


    /* Return a malloc'd copy of the first PrefixLength characters of String. */
    static char *
    CopyStringPrefix( const char *String, size_t PrefixLength )
    {
        char *CopyOfString = NULL;

        /* PrefixLength is a size_t and cannot be negative, so only String is checked. */
        if ( NULL == String )
        {
            return NULL;
        }

        CopyOfString = malloc( PrefixLength + 1 );
        if ( NULL == CopyOfString )
        {
            return NULL;
        }

        strncpy( CopyOfString, String, PrefixLength );
        CopyOfString[ PrefixLength ] = '\0';
        return CopyOfString;
    }

    static char *
    CopyString( const char *String )
    {
        if ( NULL == String )
        {
            return NULL;
        }

        return CopyStringPrefix( String, strlen( String ) );
    }

    /* Return a malloc'd copy of the directory part of InFullPath, including
       the trailing '/', or "./" if the path contains no slash. */
    static char *
    CopyDirectory( const char *InFullPath )
    {
        char *DirectoryName = NULL;
        const char *LastSlash = NULL;

        if ( NULL == InFullPath )
        {
            return NULL;
        }

        LastSlash = strrchr( InFullPath, '/' );
        if ( NULL == LastSlash )
        {
            DirectoryName = CopyString( "./" );
        }
        else
        {
            DirectoryName = CopyStringPrefix( InFullPath, LastSlash - InFullPath + 1 );
        }
        return DirectoryName;
    }

    /* Return a malloc'd copy of the part of InFullPath after the last '/'. */
    static char *
    CopyFileName( const char *InFullPath )
    {
        char *FileName = NULL;
        const char *LastSlash = NULL;

        if ( NULL == InFullPath )
        {
            return NULL;
        }

        LastSlash = strrchr( InFullPath, '/' );
        if ( NULL == LastSlash )
        {
            FileName = CopyString( InFullPath );
        }
        else
        {
            FileName = CopyString( LastSlash + 1 );
        }
        return FileName;
    }

    static int
    IsRegularFile( const char *FullPath )
    {
        struct stat FileStatus;

        if ( 0 != stat( FullPath, &FileStatus ) )
        {
            return 0;
        }
        return S_ISREG( FileStatus.st_mode );
    }

    static int
    OpenDirFile( const char *DirPath, const char *FileName, int ReadWriteMode )
    {
        int fd = -1;
        char *NewFullPath = NULL;

        if ( NULL == DirPath || NULL == FileName )
        {
            return -1;
        }

        NewFullPath = malloc( strlen( DirPath ) + strlen( FileName ) + 1 );

        if ( NULL == NewFullPath )
            return -1;

        strcpy( NewFullPath, DirPath );
        strcat( NewFullPath, FileName );

        if ( IsRegularFile( NewFullPath ) )
        {
            ReadWriteMode = O_RDWR;
            fd = open( NewFullPath, ReadWriteMode );
        }
        free( NewFullPath );
        return fd;
    }

    int
    CaseInsensitiveIsEqual( const char *String1, const char *String2 )
    {
        size_t LengthOfString1 = 0, LengthOfString2 = 0, EqualCharCount = 0;

        if ( NULL == String1 || NULL == String2 )
        {
            return String1 == String2;
        }

        LengthOfString1 = strlen( String1 );
        LengthOfString2 = strlen( String2 );
        if ( LengthOfString1 != LengthOfString2 )
        {
            return 0;
        }

        /* tolower() requires a value representable as unsigned char (or EOF),
           so cast each character before passing it. */
        EqualCharCount = 0;
        while ( EqualCharCount < LengthOfString1 &&
                tolower( (unsigned char)String1[EqualCharCount] ) ==
                tolower( (unsigned char)String2[EqualCharCount] ) )
        {
            EqualCharCount++;
        }

        return EqualCharCount == LengthOfString1;
    }

    /* Return a malloc'd copy of the first entry in DirPath whose name equals
       FileName when case is ignored, or NULL if there is none. */
    static char *
    CaseInsensitiveFindFileName( const char *DirPath, const char *FileName )
    {
        char *FoundFileName = NULL;
        struct dirent *DirectoryEntry = NULL;
        DIR *Directory = NULL;

        if ( NULL == DirPath )
        {
            return NULL;
        }

        Directory = opendir( DirPath );
        if ( NULL == Directory )
        {
            return NULL;
        }

        DirectoryEntry = readdir( Directory );
        while ( NULL != DirectoryEntry )
        {
            if ( CaseInsensitiveIsEqual( FileName, DirectoryEntry->d_name ) )
            {
                FoundFileName = CopyString( DirectoryEntry->d_name );
                break;
            }
            DirectoryEntry = readdir( Directory );
        }
        closedir( Directory );
        return FoundFileName;
    }

    int
    CaseInsensitiveFileOpen( /* const */ char *FullPath, int ReadWriteMode )
    {
        char *DirectoryPath = NULL, *FileName = NULL;
        int fd = open( FullPath, O_RDWR );

        if ( fd >= 0 )
        {
            return fd;
        }

        DirectoryPath = CopyDirectory( FullPath );
        if ( NULL == DirectoryPath )
        {
            return -1;
        }

        FileName = CopyFileName( FullPath );
        if ( NULL == FileName )
        {
            free( DirectoryPath );
            return -1;
        }
        else
        {
            char *FoundFileName = CaseInsensitiveFindFileName( DirectoryPath, FileName );

            fd = -1;
            if ( NULL != FoundFileName )
            {
                fd = OpenDirFile( DirectoryPath, FoundFileName, ReadWriteMode );
                free( FoundFileName );
            }
        }

        free( DirectoryPath );
        free( FileName );

        return fd;
    }
     
    cloverman, Nov 3, 2006
    #14
  15. cloverman Guest

    [snip]

    funny how posting my code here has made me examine the code more
    closely

    > CaseInsensitiveFileOpen( /* const */ char *FullPath, int ReadWriteMode

    /* const */ is an unneeded bodge

    the use of O_RDWR as opposed to ReadWriteMode is a horrible bodge -
    O_RDWR should be replaced by ReadWriteMode
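
    With both fixes applied, the whole thing can be condensed quite a bit.
    The following is only a sketch of the same idea, not a drop-in
    replacement: open_nocase is a hypothetical name, strcasecmp() (POSIX,
    from <strings.h>) stands in for CaseInsensitiveIsEqual(), the
    regular-file check is left out, and flags is assumed not to contain
    O_CREAT.

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Open path with the caller's flags. If the file does not exist under that
   exact spelling, retry with directory entries whose names match the final
   path component when case is ignored. Returns a descriptor or -1. */
int open_nocase(const char *path, int flags)
{
    int fd = open(path, flags);
    if (fd >= 0 || errno != ENOENT)
        return fd;

    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;
    size_t prefix_len = slash ? (size_t)(slash - path) + 1 : 0;

    /* Directory part including the trailing '/', or "./" if there is none. */
    char *dir_path = malloc(prefix_len + 3);
    if (dir_path == NULL)
        return -1;
    if (prefix_len > 0) {
        memcpy(dir_path, path, prefix_len);
        dir_path[prefix_len] = '\0';
    } else {
        strcpy(dir_path, "./");
    }

    DIR *dir = opendir(dir_path);
    struct dirent *entry;
    fd = -1;
    while (dir != NULL && fd < 0 && (entry = readdir(dir)) != NULL) {
        if (strcasecmp(entry->d_name, base) != 0)
            continue;
        char *candidate = malloc(strlen(dir_path) + strlen(entry->d_name) + 1);
        if (candidate == NULL)
            break;
        strcpy(candidate, dir_path);
        strcat(candidate, entry->d_name);
        fd = open(candidate, flags);   /* the caller's mode, not O_RDWR */
        free(candidate);
    }
    if (dir != NULL)
        closedir(dir);
    free(dir_path);
    return fd;
}
```

    Only the last path component is matched case-insensitively here; the
    directory part still has to be spelled exactly.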
     
    cloverman, Nov 3, 2006
    #15