Cryptographic Hash: File vs Message?

Kenneth Kan · Dec 25, 2008

This is probably not only exclusive to C but I've programmed in C two
functions with the Whirlpool cryptographic hash downloaded from the
official site [http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html].
One is to take a string and return a 512-bit digest and the other is
to take the path to a file and return the file digest. The weird thing
is, with both functions using the same hash, the one taking a message
returns a correct digest while the one taking the file does not (I
checked against various sources). For instance, when I have a file
called "test.txt" with "test files" as its content, the hash for "test
files" is correct but the checksum for the file is not correct
(checked against whirlpooldeep). Is it just C, the file, the hash
algorithm, or what? I'm desperate after hours of research. Here is my
source code in question. Thanks so much in advance!

void NESSIEinit(struct NESSIEstruct * const structpointer);

void NESSIEadd(const unsigned char * const source,
unsigned long sourceBits,
struct NESSIEstruct * const structpointer);

void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_file(char* path) {
u8* buffer[sizeof(u8)*DIGESTBYTES];
char* digest;
FILE* file;
int length;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {
rewind(file);
while(fread(buffer, sizeof(u8), DIGESTBYTES, file) != 0) {
NESSIEadd(buffer, sizeof(u8)*DIGESTBYTES, &w);
}
}

return whirlpool_hex(&w);
}

char* whirlpool_string(char* string) {
struct NESSIEstruct w;
NESSIEinit(&w);

NESSIEadd(string, 8*strlen(string), &w);

return whirlpool_hex(&w);
}

Kenneth Kan · Dec 25, 2008

There is actually one more function preceding the two:

char* whirlpool_hex(struct NESSIEstruct* w) {
u8 digest[DIGESTBYTES];
static char result[128];

NESSIEfinalize(w, digest);

// U8 to Hex
int i, first, second;
for(i=0; i<DIGESTBYTES; i++) {
first = digest[i]/16;
second = digest[i]%16;
result[i*2] = (char)((first<10) ? (first+48) : (first+55));
result[i*2+1] = (char)((second<10) ? (second+48) : (second+55));
}

return result;
}

Moi · Dec 25, 2008

void NESSIEinit(struct NESSIEstruct * const structpointer);

void NESSIEadd(const unsigned char * const source,
unsigned long sourceBits,
struct NESSIEstruct * const structpointer);

void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_file(char* path) {
u8* buffer[sizeof(u8)*DIGESTBYTES];
char* digest;
FILE* file;
int length;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {

size_t cnt;

rewind(file);
while(fread(buffer, sizeof(u8), DIGESTBYTES, file) != 0) {

NESSIEadd(buffer, sizeof(u8)*DIGESTBYTES, &w);

while((cnt=fread(buffer, sizeof(u8), DIGESTBYTES, file)) != 0) {
NESSIEadd(buffer, sizeof(u8)*cnt, &w);

}

NB: sizeof u8 is 1 (given that u8 is an unsigned char)

}

return whirlpool_hex(&w);
}

char* whirlpool_string(char* string) {
struct NESSIEstruct w;
NESSIEinit(&w);

NESSIEadd(string, 8*strlen(string), &w);

I don't think you want the "8*" here

return whirlpool_hex(&w);
}

HTH,
AvK

Ike Naar · Dec 25, 2008

void NESSIEinit(struct NESSIEstruct * const structpointer);

void NESSIEadd(const unsigned char * const source,
unsigned long sourceBits,
struct NESSIEstruct * const structpointer);

void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_file(char* path) {
u8* buffer[sizeof(u8)*DIGESTBYTES];
char* digest;
FILE* file;
int length;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {
rewind(file);
while(fread(buffer, sizeof(u8), DIGESTBYTES, file) != 0) {
NESSIEadd(buffer, sizeof(u8)*DIGESTBYTES, &w);
}
}

return whirlpool_hex(&w);
}

What happens if the file size is not a multiple of DIGESTBYTES?
In that case, the last fread will return a partial buffer, but
you seem to feed the entire buffer to NESSIEadd.

Can you try something like:

size_t bytes_read;
while ((bytes_read = fread(buffer, sizeof(u8), DIGESTBYTES, file)) != 0)
{
NESSIEadd(buffer, sizeof(u8) * bytes_read, &w);
}

Regards,
Ike
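
A minimal sketch of the loop Ike describes, written so it can be tried without the Whirlpool library. Everything named here is hypothetical: `absorb_bits` is a stand-in for `NESSIEadd` that only counts the bits it is handed, and `CHUNK` stands in for `DIGESTBYTES`. The length is passed in bits (8 times the byte count) because the quoted prototype names that parameter `sourceBits`.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK 64  /* stand-in for DIGESTBYTES */

/* Hypothetical stand-in for NESSIEadd: it only counts the bits it receives. */
static void absorb_bits(const unsigned char *src, unsigned long bits,
                        unsigned long *total_bits)
{
    assert(src != NULL && total_bits != NULL);
    *total_bits += bits;
}

/* Feed every byte of an open stream to absorb_bits, using the count that
   fread actually returned. Returns 0 on success, -1 on a read error. */
static int absorb_stream(FILE *fp, unsigned long *total_bits)
{
    unsigned char buffer[CHUNK];
    size_t bytes_read;

    assert(fp != NULL);
    while ((bytes_read = fread(buffer, 1, sizeof buffer, fp)) != 0) {
        /* The final chunk is usually shorter than CHUNK. */
        absorb_bits(buffer, 8UL * (unsigned long)bytes_read, total_bits);
    }
    return ferror(fp) ? -1 : 0;
}
```

For a 100-byte file the loop runs twice, once with 64 bytes and once with 36, so the stand-in sees 800 bits rather than the 1024 it would see if the full buffer size were passed each time.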

osmium · Dec 25, 2008

Kenneth Kan said:
void NESSIEinit(struct NESSIEstruct * const structpointer);

void NESSIEadd(const unsigned char * const source,
unsigned long sourceBits,
struct NESSIEstruct * const structpointer);

void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_file(char* path) {
u8* buffer[sizeof(u8)*DIGESTBYTES];
char* digest;
FILE* file;
int length;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {

Open the file in binary mode.
<snip>

Thad Smith · Dec 25, 2008

Kenneth said:
void NESSIEinit(struct NESSIEstruct * const structpointer);

void NESSIEadd(const unsigned char * const source,
unsigned long sourceBits,
struct NESSIEstruct * const structpointer);

void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_file(char* path) {
u8* buffer[sizeof(u8)*DIGESTBYTES];
char* digest;
FILE* file;
int length;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {
rewind(file);
while(fread(buffer, sizeof(u8), DIGESTBYTES, file) != 0) {
NESSIEadd(buffer, sizeof(u8)*DIGESTBYTES, &w);
}
}

return whirlpool_hex(&w);
}

char* whirlpool_string(char* string) {
struct NESSIEstruct w;
NESSIEinit(&w);

NESSIEadd(string, 8*strlen(string), &w);

return whirlpool_hex(&w);
}

whirlpool_file does not use the actual length of data returned by fread,
which will vary on the final fread.

Also, assuming u8 is a typedef for unsigned char, the declaration for
buffer should be
u8 buffer[DIGESTBYTES];

You may be using the typedef u8 to designate that the object contains
exactly 8 information bits, regardless of the size of unsigned char. If
so, be aware that fread reads a given number of characters, rather than
octets.
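
To make the difference between the two declarations concrete, here is a small sketch. It assumes, as Thad does, that u8 is a typedef for unsigned char, and it takes DIGESTBYTES to be 64, which matches a 512-bit digest. The helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DIGESTBYTES 64      /* assumption: 512-bit digest = 64 bytes */
typedef unsigned char u8;   /* assumption: u8 is unsigned char */

/* Size of the buffer as originally declared: an array of pointers to u8. */
static size_t pointer_array_size(void)
{
    return sizeof(u8 *[sizeof(u8) * DIGESTBYTES]);
}

/* Size of the buffer as Thad suggests: an array of u8. */
static size_t byte_array_size(void)
{
    return sizeof(u8[DIGESTBYTES]);
}

/* Sanity check relating the two sizes. */
static void check_sizes(void)
{
    assert(byte_array_size() == DIGESTBYTES);
    assert(pointer_array_size() == DIGESTBYTES * sizeof(u8 *));
}
```

The pointer array is larger by a factor of sizeof(u8 *), so it wastes space rather than overflowing. The more visible symptom is the type mismatch when it is passed to a function expecting const unsigned char *.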

Barry Schwarz · Dec 25, 2008

This is probably not only exclusive to C but I've programmed in C two
functions with the Whirlpool cryptographic hash downloaded from the
official site [http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html].
One is to take a string and return a 512-bit digest and the other is
to take the path to a file and return the file digest. The weird thing
is, with both functions using the same hash, the one taking a message
returns a correct digest while the one taking the file does not (I
checked against various sources). For instance, when I have a file
called "test.txt" with "test files" as its content, the hash for "test
files" is correct but the checksum for the file is not correct
(checked against whirlpooldeep). Is it just C, the file, the hash

In all probability it is not C but the way you handle the data. It is
a shame you did not show us the actual file contents or how you built
it. You also did not show us how you call either function.

algorithm, or what? I'm desperate after hours of research. Here is my
source code in question. Thanks so much in advance!

void NESSIEinit(struct NESSIEstruct * const structpointer);

void NESSIEadd(const unsigned char * const source,
unsigned long sourceBits,
struct NESSIEstruct * const structpointer);

void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_file(char* path) {
u8* buffer[sizeof(u8)*DIGESTBYTES];

This is an array of pointers so it is probably 4 times larger than you
need. You fill buffer with characters, not addresses. Since you use
the elements of the array only as characters and never as pointers,
this is just bad coding and probably not the source of your problem.
However, it should have generated a diagnostic on the call to
NESSIEadd unless the first parameter happens to be of type void*.

char* digest;

Is this ever used in this function?

FILE* file;
int length;

You probably meant to use this to retain the value returned by fread
as others have suggested.

struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {

You open the file in text mode, consistent with your description of
the contents.

rewind(file);
while(fread(buffer, sizeof(u8), DIGESTBYTES, file) != 0) {
NESSIEadd(buffer, sizeof(u8)*DIGESTBYTES, &w);

Problem 1: You are passing NESSIEadd a length that is larger than the
"useful" contents of buffer. NESSIEadd may process the extra
characters whose value is indeterminate. Unless your system happens
to always initialize automatic variables to the same value, these
values are essentially random and you should not even get the same
result running the program twice.

Problem 2: Text files almost always contain at least one '\n'.
Consequently, you are processing one more "real" character from the
file than you are from the string.

}
}

return whirlpool_hex(&w);
}

char* whirlpool_string(char* string) {
struct NESSIEstruct w;
NESSIEinit(&w);

NESSIEadd(string, 8*strlen(string), &w);

Why when you call NESSIEadd for a file you pass the number of bytes
but when you call it for a string you pass the number of bits? If you
call whirlpool_string for either a pointer to a string literal or to
an array sized to hold the string literal, this may invoke undefined
behavior if NESSIEadd does not stop at the first '\0'.

Barry Schwarz · Dec 25, 2008

There is actually one more function preceding the two:

Please don't top post. Responses and additional data should be
inserted at the appropriate places in your quote of the original
message or at the end.

char* whirlpool_hex(struct NESSIEstruct* w) {
u8 digest[DIGESTBYTES];
static char result[128];

Since result is static, it is initialized to zero before your program
begins. On the first call, you will store the "answer" at the
beginning of the array. If the second call is for a shorter string
than the first call, residual data will be left in the unused
elements. You may want to add a memset before the for loop.

NESSIEfinalize(w, digest);

// U8 to Hex
int i, first, second;
for(i=0; i<DIGESTBYTES; i++) {
first = digest[i]/16;
second = digest[i]%16;
result[i*2] = (char)((first<10) ? (first+48) : (first+55));

It's a shame this won't work on my EBCDIC system. If you would use
first+'0' and first+'A'-10, it would work on any system where A-F are
contiguous.

result[i*2+1] = (char)((second<10) ? (second+48) : (second+55));
}

return result;

Didn't this generate a diagnostic? Your function returns a char*. You
actually return a u8*. You didn't tell us but u8* is usually unsigned
char*. This is a different type for which no implicit conversion is
defined.

}

This is probably not only exclusive to C but I've programmed in C two
functions with the Whirlpool cryptographic hash downloaded from the
official site [http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html].
One is to take a string and return a 512-bit digest and the other is

The value you return does not point to a 512-bit object. The static
array "result" is at least twice as big. But at least digest is now
an array of char instead of an array of pointers.

snip previous code
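
One way to sidestep the character-set question Barry raises is a lookup table, which does not depend on the letters being contiguous at all. The sketch below is an illustration, not the poster's code: `bytes_to_hex` is a made-up name, it writes into a caller-supplied buffer instead of a static one, and it assumes only the low 8 bits of each byte are meaningful. It also appends a terminating '\0', which a 64-byte digest can only fit in a buffer of 129 characters, not 128.

```c
#include <assert.h>
#include <stddef.h>

/* Write the lowercase hex form of src[0..len-1] into dst, followed by '\0'.
   dst must have room for 2*len + 1 characters. */
static void bytes_to_hex(const unsigned char *src, size_t len, char *dst)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++) {
        dst[2 * i]     = digits[(src[i] >> 4) & 0x0F];  /* high nibble */
        dst[2 * i + 1] = digits[src[i] & 0x0F];         /* low nibble */
    }
    dst[2 * len] = '\0';
}
```

Because every call overwrites all 2*len characters and the terminator, no memset is needed between calls as long as the digest length is fixed.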

Tomás Ó hÉilidhe · Dec 25, 2008

Open the file in binary mode.

Then you'll have to address the issue of what to do when \r\n is
encountered.

Kenneth Kan · Dec 25, 2008

I'm sorry that I wasn't aware of the etiquette as I utilize IRC way
more often than Usenet. Anyway, thank you everyone for helping me out.
It finally works and I realized how many mistakes I was making (I was
questioning C not because I meant it; I was just very frustrated after
hours of debugging and to make it more dramatic). And, Barry, thank
you especially since your responses were very detailed and helpful. As
you may tell, I'm not a C programmer. I need the Whirlpool library
(written in C) to connect to my Lisp code (so my call would be from a
Lisp wrapper). In response to Barry, the three functions that I posted
were my code in its entirety as the rest is library code and the whole
thing was then compiled and run in the Lisp environment, so I thought
it would be inappropriate to post here. Anyhow, I learned a lot about
C this time around and for future learners of C, I'll summarize my
lessons here (this is my working code):

#define DIGESTBYTES 64
#define HEX_STRLEN 128

void NESSIEinit(struct NESSIEstruct * const structpointer);
void NESSIEadd(const unsigned char * const source, unsigned long
sourceBits, struct NESSIEstruct * const structpointer);
void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_hex(struct NESSIEstruct* w) {
u8 digest[DIGESTBYTES];
static char result[HEX_STRLEN];
/* reset static variables every time */
memset(result, (char)0, HEX_STRLEN);
NESSIEfinalize(w, digest);

// U8 to Hex in ASCII
int i, first, second;
for(i=0; i<DIGESTBYTES; i++) {
first = digest[i]/16;
second = digest[i]%16;
/* Use character codes to avoid dependence on ASCII numeric code */
result[i*2] = (char)((first<10) ? (first+'0') : (first+'a'-10));
result[i*2+1] = (char)((second<10) ? (second+'0') : (second+'a'-10));
}

return result;
}

char* whirlpool_file(char* path) {
/* array of characters (u8 buffer[]) vs. array of pointers to
characters (u8* buffer[]) */
u8 buffer[DIGESTBYTES];
FILE* file;
size_t bytes_read;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {
rewind(file);
/* check end of file */
while((bytes_read = fread(buffer, sizeof(u8), DIGESTBYTES, file)) != 0) {
/* only use what was read: 8*DIGESTBYTES vs. 8*bytes_read */
NESSIEadd(buffer, 8*bytes_read, &w);
}
return whirlpool_hex(&w);
}

return "\0";
}

char* whirlpool_string(char* string) {
struct NESSIEstruct w;
NESSIEinit(&w);
/* make sure casting is added to avoid diagnostics */
NESSIEadd((u8*)string, 8*strlen(string), &w);

return whirlpool_hex(&w);
}

Now that I learned by reciting all my major mistakes, hopefully others
wouldn't waste as much time when they encounter this for the first
time. Thanks C hackers!

CBFalconer · Dec 25, 2008

Tomás Ó hÉilidhe said:
Then you'll have to address the issue of what to do when \r\n is
encountered.

Why should it be encountered? Have you some knowledge about the
characteristics of his operating system, etc.? Why should either
'\r' or '\n' ever be encountered in a binary file?

Keith Thompson · Dec 26, 2008

CBFalconer said:
Why should it be encountered? Have you some knowledge about the
characteristics of his operating system, etc.? Why should either
'\r' or '\n' ever be encountered in a binary file?

Quoting the original post:

For instance, when I have a file called "test.txt" with "test
files" as its content, the hash for "test files" is correct but
the checksum for the file is not correct (checked against
whirlpooldeep).

The name "test.txt" strongly implies that it's a text file, as does
the content. We don't know that the OP is using Windows, but it's not
unlikely.

To the original poster: hash functions, when applied to files,
typically read the entire file in binary mode. For example, programs
such as "md5sum" and "sha1sum" are commonly available. If you apply a
hash function to the string "test files", you're probably generating a
hash from just those 10 characters. When stored as a line in a text
file, they'll likely be stored as something like "test files\n" or
"test files\r\n"; this depends on the text file format, which depends
on the operating system. If you can store "test files" in a file with
no line terminator, the hash should be the same as for the string
"test files".

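The effect of a trailing line terminator can be seen without Whirlpool itself. The sketch below uses 64-bit FNV-1a purely as a small stand-in hash (it is not cryptographic and has nothing to do with the NESSIE code). The function names are invented for illustration, and the stream is assumed to have been opened in binary mode.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FNV_OFFSET UINT64_C(0xcbf29ce484222325)
#define FNV_PRIME  UINT64_C(0x100000001b3)

/* 64-bit FNV-1a, used here only as a small stand-in for a real hash. */
static uint64_t fnv1a_update(uint64_t h, const unsigned char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Hash the characters of a string, excluding the terminating '\0'. */
static uint64_t hash_string(const char *s)
{
    assert(s != NULL);
    return fnv1a_update(FNV_OFFSET, (const unsigned char *)s, strlen(s));
}

/* Hash the raw bytes of a stream that was opened in binary mode. */
static uint64_t hash_stream(FILE *fp)
{
    unsigned char buf[64];
    size_t n;
    uint64_t h = FNV_OFFSET;

    assert(fp != NULL);
    while ((n = fread(buf, 1, sizeof buf, fp)) != 0)
        h = fnv1a_update(h, buf, n);
    return h;
}
```

A file holding exactly the 10 bytes of "test files" gives the same value as hash_string("test files"). A file holding "test files\n" has 11 bytes and gives a different value, which is the mismatch described above.
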
osmium · Dec 26, 2008

Keith Thompson said:

To the original poster: hash functions, when applied to files,
typically read the entire file in binary mode. For example, programs
such as "md5sum" and "sha1sum" are commonly available. If you apply a
hash function to the string "test files", you're probably generating a
hash from just those 10 characters. When stored as a line in a text
file, they'll likely be stored as something like "test files\n" or
"test files\r\n"; this depends on the text file format, which depends
on the operating system. If you can store "test files" in a file with
no line terminator, the hash should be the same as for the string
"test files".

The OP seems happy so I say let the matter drop. I didn't try to figure out
what he was running on, and if it happened to be *nix no harm would be done
by opening in binary mode, and if he was in Windows it might be helpful. At
the very least, it would make a copy of the *file* - not something else -
available to his program. I am not enthusiastic about operating on a
derivative of a file and calling the result a "cryptographic hash" of the
actual, verbatim, file.

Barry Schwarz · Dec 26, 2008

On Thu, 25 Dec 2008 14:37:10 -0800 (PST), Kenneth Kan

snip

#define DIGESTBYTES 64
#define HEX_STRLEN 128

void NESSIEinit(struct NESSIEstruct * const structpointer);
void NESSIEadd(const unsigned char * const source, unsigned long
sourceBits, struct NESSIEstruct * const structpointer);
void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_hex(struct NESSIEstruct* w) {
u8 digest[DIGESTBYTES];
static char result[HEX_STRLEN];

Since result must be at least twice as large as digest, using
2*DIGESTBYTES (either here or in the #define for HEX_STRLEN) would be
preferable.

/* reset static variables every time */
memset(result, (char)0, HEX_STRLEN);
NESSIEfinalize(w, digest);

// U8 to Hex in ASCII
int i, first, second;
for(i=0; i<DIGESTBYTES; i++) {

Unless NESSIEfinalize is guaranteed to fill every byte of digest, you
are still processing the wrong number of bytes. If it is guaranteed,
then DIGESTBYTES should be defined in a NESSIE header and not your
code.

first = digest[i]/16;
second = digest[i]%16;
/* Use character codes to avoid dependence on ASCII numeric code */
result[i*2] = (char)((first<10) ? (first+'0') : (first+'a'-10));
result[i*2+1] = (char)((second<10) ? (second+'0') : (second+'a'-10));

'a' is fine with me but your original code wanted 'A'.

}

return result;
}

char* whirlpool_file(char* path) {
/* array of characters (u8 buffer[]) vs. array of pointers to
characters (u8* buffer[]) */
u8 buffer[DIGESTBYTES];
FILE* file;
size_t bytes_read;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {
rewind(file);
/* check end of file */
while((bytes_read = fread(buffer, sizeof(u8), DIGESTBYTES, file)) != 0) {

Unless one of the NESSIE functions is smart enough to ignore '\n', you
still will not be able to compare the results from the file with the
results from the string.

Kenneth Kan · Dec 26, 2008

On Thu, 25 Dec 2008 14:37:10 -0800 (PST), Kenneth Kan

snip

#define DIGESTBYTES 64
#define HEX_STRLEN 128

void NESSIEinit(struct NESSIEstruct * const structpointer);
void NESSIEadd(const unsigned char * const source, unsigned long
sourceBits, struct NESSIEstruct * const structpointer);
void NESSIEfinalize(struct NESSIEstruct * const structpointer,
unsigned char * const result);

char* whirlpool_hex(struct NESSIEstruct* w) {
u8 digest[DIGESTBYTES];
static char result[HEX_STRLEN];

Since result must be at least twice as large as digest, using
2*DIGESTBYTES (either here or in the #define for HEX_STRLEN) would be
preferable.

/* reset static variables every time */
memset(result, (char)0, HEX_STRLEN);
NESSIEfinalize(w, digest);

// U8 to Hex in ASCII
int i, first, second;
for(i=0; i<DIGESTBYTES; i++) {

Unless NESSIEfinalize is guaranteed to fill every byte of digest, you
are still processing the wrong number of bytes. If it is guaranteed,
then DIGESTBYTES should be defined in a NESSIE header and not your
code.

It is guaranteed and DIGESTBYTES is indeed defined in the NESSIE
header. I simply brought in the necessary definitions so my code could
be clear. Sorry for the confusion.

first = digest[i]/16;
second = digest[i]%16;
/* Use character codes to avoid dependence on ASCII numeric code */
result[i*2] = (char)((first<10) ? (first+'0') : (first+'a'-10));
result[i*2+1] = (char)((second<10) ? (second+'0') : (second+'a'-10));

'a' is fine with me but your original code wanted 'A'.

}

return result;
}

char* whirlpool_file(char* path) {
/* array of characters (u8 buffer[]) vs. array of pointers to
characters (u8* buffer[]) */
u8 buffer[DIGESTBYTES];
FILE* file;
size_t bytes_read;
struct NESSIEstruct w;
NESSIEinit(&w);

if((file = fopen(path, "r")) != NULL) {
rewind(file);
/* check end of file */
while((bytes_read = fread(buffer, sizeof(u8), DIGESTBYTES, file)) != 0) {

Unless one of the NESSIE functions is smart enough to ignore '\n', you
still will not be able to compare the results from the file with the
results from the string.

My intent was actually to create a generic WHIRLPOOL hash function for
files, just like md5sum, so "\n" would be included. Just as Keith and
osmium pointed out, I want the entire raw content to be hashed, thus I
used u8[] here. The point is, this function now outputs the correct
checksum for both text and binary files, as their hashes from this
code match with whirlpooldeep's hashes of the GnuPG, FUSE, and OpenSSL
GZIP files that I downloaded. By the way, I'm running on an Intel Mac,
so a POSIX-compatible machine.

I'm really amazed at the meticulousness and thoroughness of this
group. Though I think we can let this settle since the code works
wonderfully (efficiency aside). Thanks for all the inputs. I have much
to learn from this place.

Richard Bos · Jan 5, 2009

Tomás Ó hÉilidhe said:
Then you'll have to address the issue of what to do when \r\n is
encountered.

Treat it as two characters, '\r' and '\n', of course. It's a
cryptographic hash; it should be able to distinguish between two files
which differ even in a single character.

Richard
