Bit Manipulation for a Pure Ruby SHA1 implementation

Jason Larsen · Oct 12, 2010

I'm trying to write a pure Ruby implementation of SHA1. In order to do
so, I need to be able to directly manipulate bits. I know about the
bitwise operators in Ruby, but I'm having issues with the way that
Ruby wraps things up in classes. Basically I've run into two problems.

PROBLEM 1: I need to pad the message by appending a 1-bit, and then a
certain number of 0s. When appending numbers (e.g. 0b1) to the end of
by string, Ruby wraps each number in its Fixnum class making it 1 byte
long: So basically adding 0b1 appends 0b0000_0001, which is 7 zeros
that shouldn't be there. The fix I've found is I to tack 0x80 on the
end, which concats 0b1000_0000. The downside is that I have to keep
track of my zeros in batches of bytes (which at this point my
implementation handles alright), but I'd like to know how to do it bit
by bit if possible.

I've seen the String.unpack and Array.pack used for binary data.
However, I can only get the bits out in one glob str.unpack("B*"), or
str.unpack("B8"*string.length), which still just gives me a bunch of
stringified 8bit binary values. Seems like there should be a better
way.

PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Any help would be greatly appreciated. Thanks in advance.

Josh Cheek · Oct 12, 2010

[Note: parts of this message were removed to make it a legal post.]

I'm trying to write a pure Ruby implementation of SHA1. In order to do
so, I need to be able to directly manipulate bits. I know about the
bitwise operators in Ruby, but I'm having issues with the way that
Ruby wraps things up in classes. Basically I've run into two problems.

PROBLEM 1: I need to pad the message by appending a 1-bit, and then a
certain number of 0s. When appending numbers (e.g. 0b1) to the end of
by string, Ruby wraps each number in its Fixnum class making it 1 byte
long: So basically adding 0b1 appends 0b0000_0001, which is 7 zeros
that shouldn't be there. The fix I've found is I to tack 0x80 on the
end, which concats 0b1000_0000. The downside is that I have to keep
track of my zeros in batches of bytes (which at this point my
implementation handles alright), but I'd like to know how to do it bit
by bit if possible.

I've seen the String.unpack and Array.pack used for binary data.
However, I can only get the bits out in one glob str.unpack("B*"), or
str.unpack("B8"*string.length), which still just gives me a bunch of
stringified 8bit binary values. Seems like there should be a better
way.

PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Any help would be greatly appreciated. Thanks in advance.

Hi, if you just need a portable SHA1 implementation, there is already one in
the standard library.
http://ruby-doc.org/stdlib/libdoc/digest/rdoc/index.html

As for your problems, I'm finding myself really confused by your
terminology. For example, when you say "adding a one-bit and k zeros" does
that mean num + 2**k, Or is num <<= 1 ; num += 1 ; num <<= k or something
else even?

It might be more helpful if you show "this is what I have, this is what I
want, this is how I tried to get there"

Jarsen · Oct 12, 2010

[Note: parts of this message were removed to make it a legal post.]

I'm trying to write a pure Ruby implementation of SHA1. In order to do
so, I need to be able to directly manipulate bits. I know about the
bitwise operators in Ruby, but I'm having issues with the way that
Ruby wraps things up in classes. Basically I've run into two problems.

Click to expand...

PROBLEM 1: I need to pad the message by appending a 1-bit, and then a
certain number of 0s. When appending numbers (e.g. 0b1) to the end of
by string, Ruby wraps each number in its Fixnum class making it 1 byte
long: So basically adding 0b1 appends 0b0000_0001, which is 7 zeros
that shouldn't be there. The fix I've found is I to tack 0x80 on the
end, which concats 0b1000_0000. The downside is that I have to keep
track of my zeros in batches of bytes (which at this point my
implementation handles alright), but I'd like to know how to do it bit
by bit if possible.

Click to expand...

I've seen the String.unpack and Array.pack used for binary data.
However, I can only get the bits out in one glob str.unpack("B*"), or
str.unpack("B8"*string.length), which still just gives me a bunch of
stringified 8bit binary values. Seems like there should be a better
way.

Click to expand...

PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Click to expand...

Any help would be greatly appreciated. Thanks in advance.

Click to expand...

Hi, if you just need a portable SHA1 implementation, there is already onein
the standard library.http://ruby-doc.org/stdlib/libdoc/digest/rdoc/index.html

As for your problems, I'm finding myself really confused by your
terminology. For example, when you say "adding a one-bit and k zeros" does
that mean num + 2**k, Or is num <<= 1 ; num += 1 ; num <<= k or something
else even?

It might be more helpful if you show "this is what I have, this is what I
want, this is how I tried to get there"

Sorry, looking back on that I even confused myself. I am aware of the
SHA1 implementation in the standard library, I'm just trying to write
one myself using pure Ruby.

This is what I have so far for padding:

def SHA1 message
k = 448-((message.bitsize % block_bitsize) + 1)
m = message.unpack("B*")[0].to_i(2) << (k+65)
padding = 1
padding <<= k+64
padding |= message.bitsize # then add last 64 bit block (message
length) which is the len
m |= padding # add the padding to the message
#... split into 512-bit blocks
#... do SHA1 operations
end

Where bitsize is defined as:
class String; def bitsize; bytesize * 8; end; end

The method so far seems to be giving me the correct value for m.
However, my problem is with the leading zeros. For example, if the
message is "abc, message.unpack("B*")[0] returns
"011000010110001001100011". When I call to_i(2) on that number, the
leading 0 bit will disappear - which makes sense because to 01 an 1
are both the same number.

However, when I begin to perform sha1 operations on those blocks I
will need that leading 0 back; in the example where message = "abc",
m.to_s(2) or will only give me a 511 digit long output, when I need a
512 bit block. So I need to figure out how to keep those leading zeros
in place so I have the correct block size. Hopefully this makes more
sense.

Jarsen · Oct 12, 2010

[Note: parts of this message were removed to make it a legal post.]

Click to expand...

Hi, if you just need a portable SHA1 implementation, there is already one in
the standard library.http://ruby-doc.org/stdlib/libdoc/digest/rdoc/index.html

Click to expand...

As for your problems, I'm finding myself really confused by your
terminology. For example, when you say "adding a one-bit and k zeros" does
that mean num + 2**k, Or is num <<= 1 ; num += 1 ; num <<= k or something
else even?

Click to expand...

It might be more helpful if you show "this is what I have, this is whatI
want, this is how I tried to get there"

Click to expand...

Sorry, looking back on that I even confused myself. I am aware of the
SHA1 implementation in the standard library, I'm just trying to write
one myself using pure Ruby.

This is what I have so far for padding:

def SHA1 message
k = 448-((message.bitsize % block_bitsize) + 1)

# EDIT
# I think I need this here actually for certain cases
k+=512 if k < 0

m = message.unpack("B*")[0].to_i(2) << (k+65)
padding = 1
padding <<= k+64
padding |= message.bitsize # then add last 64 bit block (message
length) which is the len
m |= padding # add the padding to the message
#... split into 512-bit blocks
#... do SHA1 operations
end

Where bitsize is defined as:
class String; def bitsize; bytesize * 8; end; end

The method so far seems to be giving me the correct value for m.
However, my problem is with the leading zeros. For example, if the
message is "abc, message.unpack("B*")[0] returns
"011000010110001001100011". When I call to_i(2) on that number, the
leading 0 bit will disappear - which makes sense because to 01 an 1
are both the same number.

However, when I begin to perform sha1 operations on those blocks I
will need that leading 0 back; in the example where message = "abc",
m.to_s(2) or will only give me a 511 digit long output, when I need a
512 bit block. So I need to figure out how to keep those leading zeros
in place so I have the correct block size. Hopefully this makes more
sense.

Brian Candler · Oct 14, 2010

Jason Larsen wrote in post #949355:

PROBLEM 1: I need to pad the message by appending a 1-bit, and then a
certain number of 0s. When appending numbers (e.g. 0b1) to the end of
by string, Ruby wraps each number in its Fixnum class making it 1 byte
long: So basically adding 0b1 appends 0b0000_0001, which is 7 zeros
that shouldn't be there. The fix I've found is I to tack 0x80 on the
end, which concats 0b1000_0000. The downside is that I have to keep
track of my zeros in batches of bytes (which at this point my
implementation handles alright), but I'd like to know how to do it bit
by bit if possible.

A String in ruby is a collection of bytes, not bits.

I think you'll find that as long as your input to your sha1 algorithm is
also a String, and thus a multiple of 8 bits, everything will work fine.

The way the sha1 algorithm is defined it can be implemented for an input
which is not a multiple of 8 bits, but as you've found, it's not easy to
represent that anyway

Therefore, if you want your implementation to
work with strange numbers of bits, you'll first need to choose an input
representation.

If I were you, I'd be happy to limit my implementation to inputs which
are whole bytes. (After all, even a file on the filesystem is a whole
number of bytes)

PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Don't worry about Fixnum versus Bignum, that's internal to the
implementation and mostly hidden. Even on a 32-bit machine you can make
Integers which are 64-bits and above.

Unfortunately, Array#pack("Q") works differently on little-endian and
big-endian machines:

irb(main):001:0> [255].pack("Q")
=> "\377\000\000\000\000\000\000\000"

So for maximum portability I'd do two instances of "N", i.e.

n = 0x0123456789abcdef => 81985529216486895
[n >> 32, n].pack("NN")

Click to expand...

=> "\001#Eg\211\253\315\357"

which you can check is correct like this:

[n >> 32, n].pack("NN").unpack("H*")

Click to expand...

=> ["0123456789abcdef"]

Regards,

Brian.

ruby + Qt + Windows + 64 bit	0	Jan 24, 2014
ruby on pure 64-bit computer	2	Nov 21, 2006
[ANN] Packet : A Pure Ruby Library for Event Driven NetworkProgramming	10	Nov 20, 2007
Pure Ruby zlib?	4	Dec 10, 2007
LiveAST: a pure Ruby 1.9.2 library obtaining live abstract syntaxtrees	12	Feb 18, 2011
[BOUNTY] Pure Ruby Zlib::GzipWriter	42	Dec 15, 2008
compiling "pure" ruby + installing gpref for msys	0	May 31, 2008
64-bit integers where the implementation supports max 32-bit ints	37	Aug 5, 2013

Bit Manipulation for a Pure Ruby SHA1 implementation

Jason Larsen

Josh Cheek

Jarsen

Jarsen

Brian Candler

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads