Getting a nice string representation of a MessageDigest

zerg

MessageDigest.toString produces an awkward and bulky mess with square
brackets and other awkward symbols inside.

I'd like to know of a simple way to get a nicer, preferably purely
alphanumeric string representation from a MessageDigest, but one which
doesn't lose information (i.e. if two MessageDigests are different,
their representations will be different).

Is the best way to get the byte array out and then render it in hex or
similarly?
 
zerg

Christian said:
Use a Base32 encoder to convert the byte array into a readable String.

Thanks, both of you. But I just went ahead and rolled my own, which just
turns the byte array into a hex string like 3abf91d83a81890ef.
 
Tom Anderson

zerg said:
MessageDigest.toString produces an awkward and bulky mess with square
brackets and other awkward symbols inside.

I'd like to know of a simple way to get a nicer, preferably purely
alphanumeric string representation from a MessageDigest, but one which
doesn't lose information (i.e. if two MessageDigests are different, their
representations will be different).

Is the best way to get the byte array out and then render it in hex or
similarly?

I don't think there's a bytes-to-hex converter in the standard library,
but it's not exactly aerospace engineering:

private static final String HEX = "0123456789abcdef" ;

static String toHexString(byte[] buf) {
    char[] chars = new char[buf.length * 2] ;
    for (int i = 0 ; i < buf.length ; ++i) {
        // high nibble first, then low nibble
        chars[2 * i] = HEX.charAt((buf[i] >> 4) & 0xf) ;
        chars[(2 * i) + 1] = HEX.charAt(buf[i] & 0xf) ;
    }
    return new String(chars) ;
}

static String toHexString(MessageDigest hash) {
    return toHexString(hash.digest()) ;
}

tom
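
For comparison, the same output can be had from standard-library pieces alone by going through BigInteger and String.format. This is only a minimal sketch and not something posted in the thread: the class name DigestHex and the helper md5Hex are made up for illustration, and MD5 is used only because it appears elsewhere in this discussion.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestHex {

    // Renders a byte array as lowercase hex, two digits per byte.
    public static String toHex(byte[] bytes) {
        if (bytes.length == 0) {
            return "";
        }
        // Signum 1 treats the bytes as an unsigned magnitude; the zero-padded
        // width restores leading zero digits that BigInteger would drop.
        return String.format("%0" + (bytes.length * 2) + "x", new BigInteger(1, bytes));
    }

    // Hypothetical convenience wrapper: MD5 of a UTF-8 string, as hex.
    public static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return toHex(md.digest(text.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // Assumption: the running JVM ships an MD5 provider.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc")); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```

The explicit width matters: without it, a digest whose first byte is below 0x10 would come out one or more digits short.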
 
zerg

Tom said:
MessageDigest.toString produces an awkward and bulky mess with square
brackets and other awkward symbols inside.

I'd like to know of a simple way to get a nicer, preferably purely
alphanumeric string representation from a MessageDigest, but one which
doesn't lose information (i.e. if two MessageDigests are different,
their representations will be different).

Is the best way to get the byte array out and then render it in hex or
similarly?

I don't think there's a bytes-to-hex converter in the standard library,
but it's not exactly aerospace engineering:

private static final String HEX = "0123456789abcdef" ;
static String toHexString(byte[] buf) {
char[] chars = new char[buf.length * 2] ;
for (int i = 0 ; i < buf.length ; ++i) {
chars[2 * i] = HEX.charAt((buf[i] >> 4) & 0xf) ;
chars[(2 * i) + 1] = HEX.charAt(buf[i] & 0xf) ;
}
return new String(chars) ;
}

static String toHexString(MessageDigest hash) {
return toHexString(hash.digest()) ;
}


Pretty close to what I came up with:

private static final char[] HEX_DIGITS = {'0', '1', '2', '3', '4', '5',
'6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

private static String getHashString (byte[] bytes) {
StringBuilder sb = new StringBuilder(bytes.length*2);
for (byte b : bytes) {
sb.append(HEX_DIGITS[b >> 4]);
sb.append(HEX_DIGITS[b & 0xf]);
}
return sb.toString();
}

Output resembles 09f911029d74e35bd84156c5635688c0 (example is for a
128-bit byte array -- 16 bytes).
 
zerg

zerg said:
private static final char[] HEX_DIGITS = {'0', '1', '2', '3', '4', '5',
'6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

private static String getHashString (byte[] bytes) {
StringBuilder sb = new StringBuilder(bytes.length*2);
for (byte b : bytes) {
sb.append(HEX_DIGITS[b >> 4]);
sb.append(HEX_DIGITS[b & 0xf]);
}
return sb.toString();
}

Amend that to use (b >> 4) & 0xf in the first append, or it winds up
with problems from sign-extension and being expanded to an int. With
that change, the code works. The version I posted above was a
pre-debugging copy of the code from somewhere, instead of the current
version, which was not readily to hand at the time of my earlier post.

The current version was tested and produced correct output.

In its entirety:

private static final char[] HEX_DIGITS = {'0', '1', '2', '3', '4', '5',
    '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

private static String getHashString (byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length*2);
    for (byte b : bytes) {
        sb.append(HEX_DIGITS[(b >> 4) & 0xf]);
        sb.append(HEX_DIGITS[b & 0xf]);
    }
    return sb.toString();
}
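
To make the sign-extension problem concrete, here is a small sketch of what happens to a byte with its top bit set. The class and method names are hypothetical and exist only for this illustration.

```java
public class SignExtensionDemo {

    // The pre-debugging expression: the byte is widened to an int with its
    // sign preserved, so the shifted value can be negative.
    public static int unmaskedHighNibble(byte b) {
        return b >> 4;
    }

    // The corrected expression: masking keeps only the low four bits.
    public static int maskedHighNibble(byte b) {
        return (b >> 4) & 0xf;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x9f; // -97 as a signed byte
        System.out.println(unmaskedHighNibble(b)); // -7
        System.out.println(maskedHighNibble(b));   // 9
    }
}
```

Used as an array index, the unmasked value of -7 would throw an ArrayIndexOutOfBoundsException, while the masked value of 9 selects the expected digit.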

(I'm now finding Thunderbird's spellchecker annoying. The red underlines
on getHashString and the like have me thinking I've got typos, imports
to add, or something similar. And hitting ctrl-shift-I in Thunderbird
has ... unfortunate effects.)
 
Mark Space

Matt said:
Is the best way to get the byte array out and then render it in hex or
similarly?

This works for me. It's reliable to and from the binary representation.

public static String hash (byte [] raw) throws Exception {
MessageDigest md = MessageDigest.getInstance ("MD5");
md.update(raw);
byte [] md5hash = md.digest();

BASE64Encoder encoder = new BASE64Encoder ();


Funnily enough I was just working on something similar just yesterday.
While this "isn't rocket science" it seems that it might come up often
enough that asking Sun to add it to the base API would be a good idea.

The basic problem is that the Arrays.toString( byte[] ) method isn't
flexible enough. The solution I came up with was to use String.format
to format each element. This allows some neat tricks with
Arrays.toString(Object[]) as well.

To use this code, call toFormattedString() with the appropriate format
string.

ArrayUtils.toFormattedString( "%02X", bytes, null );

will give the result the OP is asking for, where "bytes" is an array of
bytes. Use "%02x" for lower case hex digits.

You can put hex digits into a string just like Arrays.toString() does:

ArrayUtils.toFormattedString( "%02X", bytes );

or you can have full control over the entire thing, specifying what to
use in place of the comma and brackets too.

ArrayUtils.toFormattedString( "%02X", bytes, "-", "(", ")" );

I'm in the process of turning this code into something resembling a
general purpose library, so things are kinda in flux. Let's see if I
can cut and paste this without introducing syntax errors. I'm adding
support for a locale string and all other primitives, plus a proper
Javadoc. Ugh, the typing....



package local.utils;

public class ArrayUtils
{

    private ArrayUtils()
    {
    }

    public static String toFormattedString( byte[] bytes, String format )
    {
        return toFormattedString( bytes, format, ", ", "[", "]" );
    }

    public static String toFormattedString( String format, byte[] bytes,
            String separator )
    {
        if( separator == null ) {
            return toFormattedString( bytes, format, null, null, null );
        }
        else {
            return toFormattedString( bytes, format, separator, "[", "]" );
        }
    }

    public static String toFormattedString( byte[] bytes, String format,
            String separator, String leader, String trailer )
    {
        int sepLen = separator != null ? separator.length() : 0;
        // take a SWAG at the length for the StringBuilder
        StringBuilder sb = new StringBuilder( bytes.length *
                (format.length() +
                sepLen) );
        if( leader != null ) {
            sb.append( leader );
        }
        if( separator != null ) {
            for( int i = 0; i < bytes.length - 1; i++ ) {
                sb.append( String.format( format, bytes[i] ) );
                sb.append( separator );
            }
        }
        else {
            for( int i = 0; i < bytes.length - 1; i++ ) {
                sb.append( String.format( format, bytes[i] ) );
            }
        }
        sb.append( String.format( format, bytes[bytes.length - 1] ) );
        if( trailer != null ) {
            sb.append( trailer );
        }
        return sb.toString();
    }
}
 
Mark Space

Mark said:
To use this code, call toFormattedString() with the appropriate format
string.

ArrayUtils.toFormattedString( "%02X", bytes, null );

will give the result the OP is asking for, where "bytes" is an array of
bytes. Use "%02x" for lower case hex digits.

Oops, there are two versions of the code right now. Use this call to
match the version I posted:

ArrayUtils.toFormattedString( bytes, "%02X", null );

Similarly with the other calls: array first, then format string.
 
zerg

Mark said:
public static String toFormattedString( byte[] bytes, String format,
String separator, String leader, String trailer )
{
int sepLen = separator != null ? separator.length() : 0;
// take a SWAG at the length for the StringBuilder
StringBuilder sb = new StringBuilder( bytes.length *
(format.length() +
sepLen) );

"%02X" is four, but the length of the formatted byte ends up 2 (for
example, "4F"). Since the format for numeric types will generally begin
with a %, then have a character for every output digit, then a d or an x
or similarly, it might be better to use format.length() - 2.
if( separator != null ) {
for( int i = 0; i < bytes.length - 1; i++ ) {
sb.append( String.format( format, bytes[i] ) );
sb.append( separator );
}
}
else {
for( int i = 0; i < bytes.length - 1; i++ ) {
sb.append( String.format( format, bytes[i] ) );
}
}
sb.append( String.format( format, bytes[bytes.length - 1] ) );


This is going to detonate under your feet with an
ArrayIndexOutOfBoundsException if an empty array is ever passed in. The
loop will do nothing, but the final line will try to grab bytes[-1] in
this case.

Add an "if (bytes.length > 0) {" and "}" around that line and it should
work.
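
An alternative to guarding the last line is to emit the separator before every element except the first, so that an empty array never touches an index at all. The following is a sketch of that idea rather than Mark's actual code; the class name FormattedBytes is made up, and the parameter order simply mirrors the five-argument method quoted above.

```java
public class FormattedBytes {

    // Formats each byte with the given format string, joining the results
    // with an optional separator and wrapping them in an optional leader and
    // trailer. A null separator, leader, or trailer is skipped.
    public static String toFormattedString(byte[] bytes, String format,
            String separator, String leader, String trailer) {
        StringBuilder sb = new StringBuilder();
        if (leader != null) {
            sb.append(leader);
        }
        for (int i = 0; i < bytes.length; i++) {
            // The separator goes between elements, so skip it before the first.
            if (i > 0 && separator != null) {
                sb.append(separator);
            }
            sb.append(String.format(format, bytes[i]));
        }
        if (trailer != null) {
            sb.append(trailer);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {0x4f, 0x0a, (byte) 0xff};
        System.out.println(toFormattedString(data, "%02X", "-", "(", ")")); // (4F-0A-FF)
        System.out.println(toFormattedString(new byte[0], "%02X", ", ", "[", "]")); // []
    }
}
```

With this shape the empty-array case falls out of the loop condition, at the cost of one extra comparison per element.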
 
zerg

zerg said:
Mark said:
public static String toFormattedString( byte[] bytes, String format,
String separator, String leader, String trailer )
{
int sepLen = separator != null ? separator.length() : 0;
// take a SWAG at the length for the StringBuilder
StringBuilder sb = new StringBuilder( bytes.length *
(format.length() +
sepLen) );

"%02X" is four, but the length of the formatted byte ends up 2 (for
example, "4F"). Since the format for numeric types will generally begin
with a %, then have a character for every output digit, then a d or an x
or similarly, it might be better to use format.length() - 2.

And clamp it above zero (say, at 2). Otherwise calling this with an
empty format string may call new StringBuilder(-1) or new
StringBuilder(-2). Though that should probably throw some exception
anyway, a more helpful one would be better. An explicit test for illegal
format strings and explicit throw would be best, come to think of it.
 
Mark Space

zerg said:
"%02X" is four, but the length of the formatted byte ends up 2 (for
example, "4F"). Since the format for numeric types will generally begin
with a %, then have a character for every output digit, then a d or an x
or similarly, it might be better to use format.length() - 2.

And a string of "%x" will result in more digits, especially for an int
or long. This test was primarily intended to catch extra characters
added into the format string: "\n This one is = to %x". There are
perhaps some improvements I can make here though....
This is going to detonate under your feet with an
ArrayIndexOutOfBoundsException if an empty array is ever passed in. The
loop will do nothing, but the final line will try to grab bytes[-1] in
this case.

Add an "if (bytes.length > 0) {" and "}" around that line and it should
work.

Good point. I hadn't got around to generating unit tests yet, I'll have
to be careful with arrays of length 0.
 
Mark Space

zerg said:
This was in the byte array method specifically.

And your comment was also for "%02X" specifically. But the format string
can actually take a wide range of values. Bringing up ints was probably
a bad choice of counter examples on my part, but consider a format
string like "(%d)" or similar. %d will result in 1 to 4 characters, and
subtracting 2 from the length of the format string will result in a
gross underestimate of the length of the output. A format string of
just "%d" would result in an estimate of 0 for the output length if you
subtract 2 from the length of the format string.

My idea was to try to restrict the StringBuilder object to zero or one
doubling, without letting the estimate go too far over the final value.
I think "%02X" is pretty much the worst case as far as overestimating,
and the heuristic only doubles the value. Which is effectively what one
doubling of the StringBuilder would do, if the last doubling of its
internal buffer happened right at the end of a string.

I don't want to get into parsing the format string just to estimate the
length of the StringBuilder object.

Short answer: for bytes at least I think the current heuristic isn't too
bad. For other types of parameters I might adjust it slightly. But
probably up, not down.
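
The trade-off can be checked with a little arithmetic. Below is a rough sketch that compares the posted heuristic against the length actually produced for two format strings; the class and method names are hypothetical, and leader and trailer are left out to keep the comparison simple.

```java
import java.util.Arrays;

public class CapacityEstimate {

    // The heuristic from the posted code: element count times
    // (format length + separator length).
    public static int estimate(int count, String format, String separator) {
        int sepLen = separator != null ? separator.length() : 0;
        return count * (format.length() + sepLen);
    }

    // The length actually produced when each byte is formatted and joined.
    public static int actualLength(byte[] bytes, String format, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0 && separator != null) {
                sb.append(separator);
            }
            sb.append(String.format(format, bytes[i]));
        }
        return sb.length();
    }

    public static void main(String[] args) {
        byte[] sample = new byte[16];
        Arrays.fill(sample, (byte) -128); // widest decimal rendering of a byte
        // "%02X": estimate 64, actual 32 (overestimate by a factor of two)
        System.out.println(estimate(16, "%02X", null) + " vs " + actualLength(sample, "%02X", null));
        // "%d": estimate 32, actual 64 (one doubling needed in the worst case)
        System.out.println(estimate(16, "%d", null) + " vs " + actualLength(sample, "%d", null));
    }
}
```

Subtracting 2 from the format length, as suggested earlier in the thread, would make the "%02X" estimate exact but drop the "%d" estimate to zero.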
 
