Greatly differing processing times between Java and Linux when calculating hashes?

qwertmonkey

~
I have been noticing large differences (> 25%) in the processing time of
hashes between Java and the implementations in the OS, which cannot be attributed
to block size (I experimented with it as well). What I consistently found is that Java
is much faster for sha512 and sha384, but for sha256, sha1 and md5 the Linux
tools are much faster ...
~
So I grabbed two relatively large media files from YouTube:
~
$ ls -l DQfUaXLk_sw.mp4
-rw-r--r-- 1 knoppix knoppix 628588285 Apr 1 18:09 DQfUaXLk_sw.mp4

$ ls -l 0buBJlPo9us.flv
-rw-r--r-- 1 knoppix knoppix 475965918 Aug 31 2010 0buBJlPo9us.flv
~
I quickly coded a test:
~
import java.util.Iterator;
import java.util.Set;
import java.util.HashSet;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import java.security.MessageDigest;
import java.security.Security;
import java.security.Provider;
import java.security.NoSuchAlgorithmException;

// __
public class CheckSum00Test{
// __
private static final String aLnSep = System.getProperty("line.separator");
// __
private static final int _BLCK_SZ = 512;

// __
private static final String[] aSecKeys = new String[]{
"MessageDigest.", "Alg.Alias.MessageDigest."
};
// __
private static final String[] aSecPrvdrs = getProviders();
// __
private static final String[] getProviders() throws SecurityException{
String[] aArSec = null;
// __
Set<String> SSPrvdrs = new HashSet<String>();
boolean IsPrvdr;
Iterator Itr;
int iPrvdrs = Integer.MIN_VALUE, iPrvdrPrfxL = aSecKeys.length, iSSz;
String aSecKey;
Provider[] SecPrvdr = Security.getProviders();
boolean IsSystemPrvdrs = ((SecPrvdr != null) &&
((iPrvdrs = SecPrvdr.length) > 0));
if(IsSystemPrvdrs){
for(int i = 0; (i < iPrvdrs); ++i){
Itr = SecPrvdr[i].keySet().iterator();
while(Itr.hasNext()){
String aKey = (String)Itr.next();
IsPrvdr = false;
for(int j = 0; (j < iPrvdrPrfxL) && !IsPrvdr; ++j){
IsPrvdr = aKey.startsWith(aSecKeys[j]);
// __
if(IsPrvdr){
aSecKey = aKey.split(" ")[0];
if(aSecKey != null){
aSecKey = aSecKey.trim();
aSecKey = aSecKey.substring(aSecKeys[j].length());
if(aSecKey.length() > 0){ SSPrvdrs.add(aSecKey); }
}// (aSecKey != null)
}// (IsPrvdr)
}// j [0, iPrvdrPrfxL)
}// (Itr.hasNext()){
}// i [0, iPrvdrs)
// __
IsSystemPrvdrs = ((iSSz = SSPrvdrs.size()) > 0);
if(IsSystemPrvdrs){
aArSec = new String[iSSz];
SSPrvdrs.toArray(aArSec);
}
}
// __
if(!IsSystemPrvdrs){
String aX = "FATAL ERROR: java implementation does NOT seem to include Security Providers!" + aLnSep;
aX += "// __ java.version: " + System.getProperty("java.version") + aLnSep;
// ...
aX += "~";
throw new SecurityException(aLnSep + "// __ " + aX + aLnSep);
}
// __
return(aArSec);
}

// __
private static String getErrSecAlgos(){
StringBuilder aBldr = new StringBuilder();
aBldr.append(aLnSep + "// __ The " + aSecPrvdrs.length + " checksum algorithms that could be used are: " + aLnSep + aLnSep);
for(int i = 0; (i < aSecPrvdrs.length); ++i){ aBldr.append((i + 1) + ": " + aSecPrvdrs[i] + aLnSep); }
return(aBldr.toString());
}

// __
public static void main(String[] args){
String aKNm = "CheckSum00Test";
if((args != null) && (args.length == 2)){
try{
long lTm00 = System.currentTimeMillis();
// __
MessageDigest md = MessageDigest.getInstance(args[0]);
// __
FileInputStream fis = new FileInputStream(args[1]);
byte[] dataBytes = new byte[_BLCK_SZ];
int nread = 0;
while ((nread = fis.read(dataBytes)) != -1) { md.update(dataBytes, 0, nread); }
fis.close();
byte[] mdbytes = md.digest();
long lTm02 = System.currentTimeMillis();
// __ byte2hex
StringBuilder aBldr = new StringBuilder();
for (int i = 0; i < mdbytes.length; i++) {
aBldr.append(Integer.toString((mdbytes[i] & 0xff) + 0x100, 16).substring(1));
}
System.err.println("// __ " + args[0] + " encrypting \"" + args[1]
+ "\":\"" + aBldr.toString() + "\" took: " +(lTm02 - lTm00) + " (ms)");
}catch(FileNotFoundException FlNtFndX){ FlNtFndX.printStackTrace(System.err); }
catch(IOException IOX){ IOX.printStackTrace(System.err); }
catch(NoSuchAlgorithmException NSekAlgoX){
// System.err.println(getErrSecAlgos());
NSekAlgoX.printStackTrace(System.err);
}
}// ((args != null) && (args.length == 2))
else{
System.err.println(aLnSep + "// __ usage: java " + aKNm + " <sum algorithm> <input file>" + aLnSep);
System.err.println(getErrSecAlgos());
}
}
}
~
Then I ran:
~
date; time java CheckSum00Test SHA-512 <input_file>; date;
date; time sha512sum -b <input_file>; date;

date; time java CheckSum00Test SHA-384 <input_file>; date;
date; time sha384sum -b <input_file>; date;

date; time java CheckSum00Test SHA-256 <input_file>; date;
date; time sha256sum -b <input_file>; date;

date; time java CheckSum00Test SHA-1 <input_file>; date;
date; time sha1sum -b <input_file>; date;

date; time java CheckSum00Test MD5 <input_file>; date;
date; time md5sum -b <input_file>; date;
~
What do you think is going on here?
~
thanks
lbrtchx
 
Roedy Green

On Sun, 9 Sep 2012 02:43:43 +0000 (UTC),
(e-mail address removed) wrote, quoted or indirectly quoted
someone who said :

I think you will find the function is handled by native methods. So it
has little to do with the difference between Java and C. Factors:
1. how optimised the code is for your CPU
2. how smart the C compiler is.
3. how big your blocks are. You have to traverse the "blood-brain"
barrier for each call.
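
A minimal sketch for checking factor 3, assuming the only goal is to see whether the read buffer size changes the wall-clock time. The class name `BufferSizeDemo`, the helper `hexDigest` and the three buffer sizes are illustrative choices, not taken from the original test program. It hashes the same file once per buffer size and prints the elapsed time for each pass.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class BufferSizeDemo {
    /** Streams `in` through the named digest, reading bufferSize bytes at a time. */
    static String hexDigest(InputStream in, String algorithm, int bufferSize) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] buffer = new byte[bufferSize];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
            // Convert the digest bytes to lowercase hex, two characters per byte.
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java BufferSizeDemo <algorithm> <input file>");
            return;
        }
        // Illustrative sizes: the 512 bytes of the original test plus two larger ones.
        int[] sizes = {512, 8 * 1024, 64 * 1024};
        for (int size : sizes) {
            long start = System.nanoTime();
            String hex;
            try (InputStream in = Files.newInputStream(Paths.get(args[1]))) {
                hex = hexDigest(in, args[0], size);
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(size + " bytes/read: " + ms + " ms  " + hex);
        }
    }
}
```

Note that the first pass also pays for a cold file cache and for JIT warm-up, so the order of the sizes can skew the numbers; running the program twice or shuffling the sizes gives a fairer picture.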
 
markspace

$ ls -l DQfUaXLk_sw.mp4
-rw-r--r-- 1 knoppix knoppix 628588285 Apr 1 18:09 DQfUaXLk_sw.mp4

$ ls -l 0buBJlPo9us.flv
-rw-r--r-- 1 knoppix knoppix 475965918 Aug 31 2010 0buBJlPo9us.flv

I think we could use some URLs here. I doubt I'd be able to find those
files by name.

date; time java CheckSum00Test MD5 <input_file>; date;
date; time md5sum -b <input_file>; date;
~
What do you think is going on here?
~


Well for starters I think your constant use of ~ as a line separator is
pretty annoying. And secondly you posted your homework file with
<input_file> instead of the actual file name, and no results either, so
it's pretty hard to say what the result you obtained was, let alone
guess at a cause.
 
markspace

I am generally not fussy about posting style....
However, I've stayed out of this discussion because of two problems,
fragmented threads and articles with unreadable formatting.


I feel the same way. I don't like harping on trivialities or oddities
of posting style or phrasing, but in this case it really is making it
harder for me to read. I just noticed the fragmented posts too.

OP: You may not be very concerned about this, but your posts seem to be
making it harder than needed for folks to give you additional input.
I'm just making you aware of the issue, just in case you were not aware
before. It would help us if your posts were not fragmented in our
newsreaders (Google groups is a poor UI and known to be broken), and the
~ thing is really aggravating.
 
Robert Klemme

What do you think is going on here?

As far as I could extract from that code (which I find rather hard to
read, btw.), you are measuring digest calculation and IO together. What
measures did you take to ensure there are no effects from OS buffering?
Also, why are you comparing apples (Java) and oranges (Linux), at
least in the subject? Are you aware that the JVM has some startup time
which can be significant when measuring run-once applications?
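
One way to separate these effects, sketched under the assumption that only the digest computation itself is of interest: hash an in-memory buffer inside an already running JVM, do a few untimed warm-up passes, and report the best of several timed passes. The class name `DigestOnlyTiming`, the 64 MiB zero-filled buffer and the warm-up and run counts are all illustrative, not from the original post.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestOnlyTiming {
    /** Hashes data once and returns the elapsed wall-clock time in nanoseconds. */
    static long timeOnce(String algorithm, byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            long start = System.nanoTime();
            md.update(data);
            md.digest();
            return System.nanoTime() - start;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
    }

    /** Runs untimed warm-up passes, then returns the fastest of `runs` timed passes. */
    static long bestOf(String algorithm, byte[] data, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            timeOnce(algorithm, data); // result discarded; lets the JIT compile the hot path
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            best = Math.min(best, timeOnce(algorithm, data));
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumption: a zero-filled in-memory buffer stands in for the file contents,
        // so neither disk IO nor the OS page cache is part of the measurement.
        byte[] data = new byte[64 * 1024 * 1024];
        String[] algorithms = {"MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512"};
        for (String algorithm : algorithms) {
            long nanos = bestOf(algorithm, data, 3, 5);
            System.out.println(algorithm + ": " + (nanos / 1_000_000) + " ms for "
                    + (data.length >> 20) + " MiB");
        }
    }
}
```

Because JVM startup and file reading are excluded here, these figures are not directly comparable with `time sha512sum`; they only indicate how much of the Java wall-clock time is the hash itself.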

And, why the heck, are there still people around who write a return like
a method call - with brackets?

....

robert
 
