Trying to implement MD5

Sloede

I don't know whether this is the right place to pose this question,
but I don't know any better group:

When I tried to implement the MD5 algorithm in C, I never got the
expected digests until I looked at the appendix of RFC 1321, where
Rivest gives a reference implementation in C.

The RFC says "append the length of the message (before padding) as two
32-bit words, least significant word first, lsb-first as well". But
Rivest actually saves the length in a 32-bit word and then shifts
it 3 bits to the left!

What I get for a message length of one:

0x00000001 0x00000000

But he gets:

0x00000008 0x00000000

Does anyone know the reason for that and can explain it to me? Or does
anyone have at least a clue?
 
Tom St Denis

Sloede said:
I don't know whether this is the right place to pose this question,
but I don't know any better group:

When I tried to implement the MD5 algorithm in C, I never got the
expected digests until I looked at the appendix of RFC 1321, where
Rivest gives a reference implementation in C.

The RFC says "append the length of the message (before padding) as two
32-bit words, least significant word first, lsb-first as well". But
Rivest actually saves the length in a 32-bit word and then shifts
it 3 bits to the left!

What I get for a message length of one:

0x00000001 0x00000000

But he gets:

0x00000008 0x00000000

Does anyone know the reason for that and can explain it to me? Or does
anyone have at least a clue?

The length is in bits, not bytes. One byte is 8 bits, hence the shift
left by 3.
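
To make that concrete, here is a minimal sketch in C. It is not Rivest's
code, and the helper name md5_encode_length is made up for illustration.
It assumes a message made of whole bytes, converts the byte count to a
bit count, and writes the 64-bit result low-order byte first, which is
the same as two 32-bit words with the low word first:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill the 8-byte MD5 length field for a message
 * of len_bytes whole bytes. The field holds the length in BITS. */
void md5_encode_length(uint64_t len_bytes, unsigned char out[8])
{
    /* bytes -> bits; unsigned arithmetic keeps only the low 64 bits */
    uint64_t len_bits = len_bytes << 3;
    size_t i;

    /* low-order byte first */
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)((len_bits >> (8 * i)) & 0xFFu);
}
```

For a one-byte message this yields 08 00 00 00 00 00 00 00, i.e. the
words 0x00000008 0x00000000 from the question.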

Free MD5 [among many others] source in portable C at

http://libtomcrypt.org

Tom
 
Alex Fraser

Sloede said:
I don't know whether this is the right place to pose this question,
but I don't know any better group:

It isn't. But I don't know anywhere better either.

[snip]
The RFC says "append the length of the message (before padding) as two
32-bit words, least significant word first, lsb-first as well". But
Rivest actually saves the length in a 32-bit word and then shifts
it 3 bits to the left!

It would appear that the length referred to in your quote is measured in
bits not bytes.

Alex
 
Gordon Burditt

When I tried to implement the MD5 algorithm in C, I never got the
expected digests until I looked at the appendix of RFC 1321, where
Rivest gives a reference implementation in C.

The RFC says "append the length of the message (before padding) as two
32-bit words, least significant word first, lsb-first as well". But
Rivest actually saves the length in a 32-bit word and then shifts
it 3 bits to the left!

Doesn't rfc1321 define the length of the message as being *IN BITS*,
not bytes?

Gordon L. Burditt
 
Sloede

Doesn't rfc1321 define the length of the message as being *IN BITS*,
not bytes?

Gordon L. Burditt

Phew, I had simply overlooked that piece of information. Thanks to all
for helping me, even though this isn't actually the right group to ask
in (but no one named a more appropriate one, so it doesn't seem to have
been a very bad choice).

Michael Schlottke
 
August Derleth

Alex said:
It isn't. But I don't know anywhere better either.

comp.programming springs to mind, as this is an algorithm and,
therefore, independent of any specific language. But maybe the
comp.programming folks would see things differently.

comp.lang.c would be a good place to come when you have an
implementation in standard C and it either a) doesn't work, or b) you
want to make it more portable (even standard C leaves enough up to the
implementation to make this a bit of a task if you're inexperienced).
Then the regs here wouldn't focus on your algorithm as much as your
code, but [OT] posts (that is, posts that are knowingly off-topic with
[OT] in their name) eventually crop up in most large threads.

(Need I say that we don't like having non-standard C in comp.lang.c?
There are plenty of places to ask about compiler-specific extensions and
modifications, including header files like conio.h and dos.h, or
unistd.h and wait.h. We usually aren't rude, but we are pretty firm.)
[snip]
The RFC says "append the length of the message (before padding) as two
32-bit words, least significant word first, lsb-first as well". But
Rivest actually saves the length in a 32-bit word and then shifts
it 3 bits to the left!


It would appear that the length referred to in your quote is measured in
bits not bytes.

Everyone else seems to concur with this assessment. It seems natural,
anyway: Who treats bits as a datatype these days? Even in C, the
smallest primitive type is the char, which maps naturally to the concept
of a byte. (CHAR_BIT == 8 for most implementations, and I think it is
guaranteed to be at least 8 for all conforming ones.)
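
For reference, the macro is defined in <limits.h> and is spelled
CHAR_BIT. A small sketch for checking what a given implementation uses
(the function names are made up for illustration):

```c
#include <limits.h>  /* defines CHAR_BIT */

/* Number of bits in a char on this implementation. */
int bits_per_char(void)
{
    return CHAR_BIT;
}

/* 1 if a char is exactly one octet here, 0 otherwise. */
int char_is_octet(void)
{
    return CHAR_BIT == 8;
}
```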
 
Tom St Denis

August Derleth said:
Everyone else seems to concur with this assessment. It seems natural,
anyway: Who treats bits as a datatype these days? Even in C, the
smallest primitive type is the char, which maps naturally to the concept
of a byte. (CHAR_BIT == 8 for most implementations, and I think it is
guaranteed to be at least 8 for all conforming ones.)

Most implementations are octet oriented (e.g. compress an array of 64
"chars") so even on platforms where CHAR_BIT >= 9 the algorithms will work
provided you only put message bits in the lower 8 bits.
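
A sketch of what that masking can look like when the 64-octet block is
decoded into 32-bit words, low-order byte first as RFC 1321 specifies.
The name load_le32 is hypothetical and not taken from any particular
library:

```c
#include <stdint.h>

/* Build one 32-bit word from four octets, low-order octet first.
 * The & 0xFFu keeps only the low 8 bits of each char, so stray high
 * bits on a platform with CHAR_BIT > 8 cannot leak into the word. */
uint32_t load_le32(const unsigned char *p)
{
    return  (uint32_t)(p[0] & 0xFFu)
         | ((uint32_t)(p[1] & 0xFFu) << 8)
         | ((uint32_t)(p[2] & 0xFFu) << 16)
         | ((uint32_t)(p[3] & 0xFFu) << 24);
}
```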

Technically, though, MD5 is bit oriented [it can hash a message of any
length in bits]. However, that's not really useful, since most media
can't transmit or store anything but whole bytes anyway.
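
For the usual whole-byte case, the single "1" bit and the zero bits that
RFC 1321 appends become a 0x80 byte followed by zero bytes, until the
length is 56 modulo 64 bytes. A sketch of the pad-length calculation,
following the same arithmetic as the RFC's reference code (the function
name md5_pad_len is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Number of padding bytes (0x80, then zeros) to append before the
 * 8-byte length field. Padding is always added, so the result is
 * between 1 and 64. */
size_t md5_pad_len(uint64_t len_bytes)
{
    size_t index = (size_t)(len_bytes & 0x3Fu);  /* bytes used in the last block */

    return (index < 56) ? (56 - index) : (120 - index);
}
```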

Tom
 
