compare 2 files

Discussion in 'C++' started by Siemel Naran, Dec 19, 2004.

  1. Siemel Naran

    Siemel Naran Guest

    How to compare if two files are identical? I wrote the following:

    bool comparefiles(const std::string& lhs, const std::string& rhs)
    {
        std::ifstream lhsfile(lhs.c_str());
        std::ifstream rhsfile(rhs.c_str());

        typedef std::istreambuf_iterator<char> istreambuf_iterator;

        return std::equal(
            istreambuf_iterator(lhsfile),
            istreambuf_iterator(),
            istreambuf_iterator(rhsfile)
        );
    }

    But I don't think it will work, because: (1) we only compare the first N
    chars, where N is the number of chars in lhsfile, so if rhsfile has more
    chars the function returns true whenever the first N are equal, which is
    incorrect; (2) the standard says that calling operator* on an end-of-stream
    iterator is undefined (24.5.3.3), so if lhsfile has more chars we will at
    some point call operator* on the rhsfile iterator when it is at EOF, and the
    result is undefined (though I think it should always return EOF).

    So what else can we do?

    I could use the stat function to check if lhsfile and rhsfile have the same
    size, but I want to keep my code ANSI compatible.

    So I came up with the following function, which looks very much like strcmp.


    bool comparefiles(const std::string& lhs, const std::string& rhs)
    {
        using namespace std;
        const streambuf::int_type eof = streambuf::traits_type::eof();

        ifstream lhsfile(lhs.c_str());
        ifstream rhsfile(rhs.c_str());

        streambuf * lhsbuf = lhsfile.rdbuf();
        streambuf * rhsbuf = rhsfile.rdbuf();

        char lhschar, rhschar;
        while (true)
        {
            lhschar = lhsbuf->sbumpc();
            rhschar = rhsbuf->sbumpc();

            if (lhschar == eof && rhschar == eof) return true;
            if (lhschar == eof || rhschar == eof) break;
            if (lhschar != rhschar) break;
        }

        cout << "compare \"" << lhs << "\" and \"" << rhs << "\" failed\n";
        return false;
    }


    Any comments?
    Siemel Naran, Dec 19, 2004
    #1
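A minimal sketch of how the two problems listed above could be avoided while still using istreambuf_iterator: advance both iterators in lockstep and never dereference one that has reached the end. The name same_contents_iter is hypothetical, and opening the files in binary mode and treating an unreadable file as "not equal" are assumptions, not something stated in the post.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: true only if both files open and hold the same bytes.
bool same_contents_iter(const std::string& lhs, const std::string& rhs)
{
    // Assumption: binary mode, so no newline translation hides differences.
    std::ifstream lhsfile(lhs.c_str(), std::ios::binary);
    std::ifstream rhsfile(rhs.c_str(), std::ios::binary);
    if (!lhsfile || !rhsfile) return false;  // assumption: unreadable means "not equal"

    typedef std::istreambuf_iterator<char> iter;
    iter l(lhsfile), r(rhsfile), end;

    // Step both iterators together; neither is dereferenced once it equals end.
    while (l != end && r != end) {
        if (*l != *r) return false;
        ++l;
        ++r;
    }
    // The files match only if both ran out at the same time.
    return l == end && r == end;
}
```

Unlike the three-argument std::equal, this checks the length of both inputs, so a longer rhsfile no longer compares equal to a shorter lhsfile.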

  2. "Siemel Naran" <> wrote in message
    news:fvcxd.1121739$...
    > How to compare if two files are identical? I wrote the following:

    ....
    > So I came up with the following function, which looks very much like
    > strcmp.
    >
    >
    > bool comparefiles(const std::string& lhs, const std::string& rhs)
    > {
    > using namespace std;
    > const streambuf::int_type eof = streambuf::traits_type::eof();
    >
    > ifstream lhsfile(lhs.c_str());
    > ifstream rhsfile(rhs.c_str());
    >
    > streambuf * lhsbuf = lhsfile.rdbuf();
    > streambuf * rhsbuf = rhsfile.rdbuf();

    Since only the stream buffer interface is used, you can directly
    create instances of std::filebuf instead of an ifstream.

    > char lhschar, rhschar;

    These two variables should be of type int_type. char may be unable
    to represent eof (or be equal to eof when it should not, e.g.
    when reading 0xFF on an implementation where char is signed).

    > while (true)
    > {
    > lhschar = lhsbuf->sbumpc();
    > rhschar = rhsbuf->sbumpc();
    >
    > if (lhschar == eof && rhschar == eof) return true;
    > if (lhschar == eof || rhschar == eof) break;
    > if (lhschar != rhschar) break;
    > }

    or, with lhsbuf and rhsbuf declared as std::filebuf objects and
    lhschar and rhschar as int_type:
        do {
            lhschar = lhsbuf.sbumpc();
            rhschar = rhsbuf.sbumpc();
            if( lhschar != rhschar ) return false;
        } while( lhschar != eof );
        return true;


    Cheers,
    Ivan
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
    Ivan Vecerina, Dec 19, 2004
    #2
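Putting the three suggestions from this reply together (std::filebuf instead of ifstream, int_type instead of char, and the do/while loop) might look like the sketch below. The name comparefiles_filebuf is hypothetical, and the binary open mode and the "return false if a file cannot be opened" policy are assumptions added here, not part of the reply.

```cpp
#include <fstream>
#include <string>

// Hypothetical name; a sketch combining the suggestions above.
bool comparefiles_filebuf(const std::string& lhs, const std::string& rhs)
{
    typedef std::filebuf::int_type int_type;
    const int_type eof = std::filebuf::traits_type::eof();

    std::filebuf lhsbuf, rhsbuf;
    // Assumption: binary mode; open() returns a null pointer on failure.
    if (!lhsbuf.open(lhs.c_str(), std::ios::in | std::ios::binary)) return false;
    if (!rhsbuf.open(rhs.c_str(), std::ios::in | std::ios::binary)) return false;

    // int_type can hold every char value plus a distinct eof value.
    int_type lhschar, rhschar;
    do {
        lhschar = lhsbuf.sbumpc();
        rhschar = rhsbuf.sbumpc();
        // Also catches the case where only one side has reached eof.
        if (lhschar != rhschar) return false;
    } while (lhschar != eof);
    return true;
}
```

Because sbumpc() returns int_type, a byte such as 0xFF stays distinguishable from eof, which is the problem with the char variables in the original loop.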

  3. Howard

    Howard Guest

    I think the first thing I'd do is check if the file sizes are the same. No
    need to read through the files looking for differences if they're different
    sizes. I'm not familiar with how to check file size, but if that's easy
    enough to do, you might want to throw in a check for that equality before
    bothering to check the contents. Just a thought...

    -Howard
    Howard, Dec 20, 2004
    #3
  4. Siemel Naran

    Siemel Naran Guest

    "Howard" <> wrote in message news:J7Dxd.1129358

    > I think the first thing I'd do is check if the file sizes are the same.
    > No need to read through the file looking for differences if they're
    > different sizes. I'm not familiar with how to check file size, but if
    > that's easy enough to do, you might want to throw in a check for that
    > equality before bothering to check the contents. Just a thought...


    This would be the ideal solution, since then I could continue to use
    std::equal as in my original code. However, the standard does not provide a
    way to find the file size without opening the file and scanning to the last
    character. Opening the file and calling file.seekg(0, ios::end) followed by
    file.tellg() is allowed to return 0 rather than the actual byte position,
    though my implementation does in fact return the file size. There is a
    function stat, available on Windows and Linux, but it's not ANSI standard
    (though maybe it should be). I know that boost also has some way to get the
    file size, and I imagine the implementation calls stat on Windows and
    Linux, etc.
    Siemel Naran, Dec 20, 2004
    #4
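For reference, the seek-and-tell approach described in this post can be sketched as follows. The helper name stream_size is hypothetical. As the post says, the standard does not promise that the reported position is a byte count; the sketch assumes an implementation where, for a file opened in binary mode, it is.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: the position reported after seeking to the end of
// the file, or -1 if the file cannot be opened or the seek fails.
// Assumption: in binary mode this position equals the size in bytes,
// which is common in practice but not guaranteed by the standard.
std::streamoff stream_size(const std::string& name)
{
    std::ifstream file(name.c_str(), std::ios::binary);
    if (!file) return -1;

    file.seekg(0, std::ios::end);              // offset 0 relative to the end
    const std::streamoff size = file.tellg();  // tellg, since this is an input stream
    return file ? size : std::streamoff(-1);
}
```

Two files whose stream_size values differ cannot be identical, so the byte-by-byte comparison only needs to run when the values match.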
  5. Jeff Flinn

    Jeff Flinn Guest

    Siemel Naran wrote:
    > "Howard" <> wrote in message news:J7Dxd.1129358
    >
    >> I think the first thing I'd do is check if the file sizes are the
    >> same. No need to read through the file looking for differences if
    >> they're different sizes. I'm not familiar with how to check file
    >> size, but if that's easy enough to do, you might want to throw in a
    >> check for that equality before bothering to check the contents.
    >> Just a thought...

    >
    > This is the ideal solution, then I can continue to use std::equal as
    > in my original code. However, the standard does not provide a way to
    > find the file size without opening it and scanning to the last
    > character. Opening the file, calling file.seekg(0, ios::end) followed
    > by file.tellg() is allowed to return 0 rather than the actual byte
    > position though my implementation does in fact return the file size.
    > There is a function stat, and it's on Windows and Linux, but it's not
    > ANSI standard (though maybe it should be). I know that boost also has
    > some way to get the file size, and


    Yes, at:

    http://www.boost.org/libs/filesystem/doc/operations.htm#file_size


    > I imagine the implementation calls stat on Windows and Linux, etc.


    Windows: GetFileAttributesEx()
    POSIX: stat()

    Jeff Flinn
    Jeff Flinn, Dec 20, 2004
    #5
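As a hedged illustration of the "check the size first, then use std::equal" idea discussed in this thread: the sketch below uses std::filesystem::file_size, the C++17 standard-library counterpart of the Boost function linked above, so it assumes a C++17 compiler, which postdates this thread. The name comparefiles_presized is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <system_error>

// Hypothetical name; assumes C++17 for std::filesystem.
bool comparefiles_presized(const std::string& lhs, const std::string& rhs)
{
    namespace fs = std::filesystem;

    // The error_code overloads report failure without throwing.
    std::error_code lhs_ec, rhs_ec;
    const std::uintmax_t lhs_size = fs::file_size(lhs, lhs_ec);
    const std::uintmax_t rhs_size = fs::file_size(rhs, rhs_ec);
    if (lhs_ec || rhs_ec || lhs_size != rhs_size) return false;

    std::ifstream lhsfile(lhs, std::ios::binary);
    std::ifstream rhsfile(rhs, std::ios::binary);
    if (!lhsfile || !rhsfile) return false;

    // With equal sizes, the three-argument std::equal does not read past
    // the end of rhsfile, provided neither file changes in the meantime.
    typedef std::istreambuf_iterator<char> iter;
    return std::equal(iter(lhsfile), iter(), iter(rhsfile));
}
```

The size check and the read are separate steps, so a file modified between them could still invalidate the comparison; the sketch does not guard against that.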