If you need to calculate a hash of a file or a message, Java already has an API that can do that for you: java.security.MessageDigest. It isn’t perfect, but it is really easy to use and supports the most popular hash algorithms, MD5 and SHA1 among them. Without further ado, check out the following code snippet, which reads in a file and calculates both hashes for it:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.util.Formatter;

public class HashFunctionTest {

    public static String calculateHash(MessageDigest algorithm,
            String fileName) throws Exception {

        // try-with-resources closes the whole stream chain, even on error
        try (DigestInputStream dis = new DigestInputStream(
                new BufferedInputStream(new FileInputStream(fileName)),
                algorithm)) {

            // read the file in chunks; every byte read also updates the hash
            byte[] buffer = new byte[8192];
            while (dis.read(buffer) != -1) {
                // nothing to do, the stream feeds the digest as a side effect
            }
        }

        // get the hash value as byte array
        byte[] hash = algorithm.digest();

        return byteArray2Hex(hash);
    }

    private static String byteArray2Hex(byte[] hash) {
        Formatter formatter = new Formatter();
        for (byte b : hash) {
            formatter.format("%02x", b);
        }
        return formatter.toString();
    }

    public static void main(String[] args) throws Exception {
        String fileName = "javablogging.png";

        // "SHA-1" is the standard algorithm name; "SHA1" is accepted as an alias
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        MessageDigest md5  = MessageDigest.getInstance("MD5");

        System.out.println(calculateHash(sha1, fileName));
        System.out.println(calculateHash(md5, fileName));
    }
}

You are not forced to read the input through DigestInputStream: the API of the MessageDigest class also lets you pass in any array of bytes through its update() and digest() methods. Unfortunately, there is no method that reads in a whole input stream for you; you have to do the reading on your own.
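As a minimal sketch of both options, the following example hashes an in-memory byte array with a single digest() call and hashes an arbitrary InputStream by feeding chunks to update(). The class name ByteArrayHashExample, the helper names hashBytes, hashStream and toHex, and the 8192-byte buffer size are choices made for this illustration only, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, named for this illustration only.
class ByteArrayHashExample {

    // Hash a byte array that is already in memory with a single call.
    static byte[] hashBytes(String algorithmName, byte[] data)
            throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithmName);
        return digest.digest(data);
    }

    // Hash any InputStream by doing the reading ourselves:
    // read a chunk, pass exactly the bytes read to update(), repeat.
    static byte[] hashStream(String algorithmName, InputStream in)
            throws NoSuchAlgorithmException, IOException {
        MessageDigest digest = MessageDigest.getInstance(algorithmName);
        byte[] buffer = new byte[8192]; // buffer size is an arbitrary choice
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        return digest.digest();
    }

    // Simple (not fast) hex conversion, enough for printing the result.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] message = "abc".getBytes(StandardCharsets.UTF_8);

        // prints 900150983cd24fb0d6963f7d28e17f72
        System.out.println(toHex(hashBytes("MD5", message)));

        // prints a9993e364706816aba3e25717850c26c9cd0d89d
        try (InputStream in = new ByteArrayInputStream(message)) {
            System.out.println(toHex(hashStream("SHA-1", in)));
        }
    }
}
```

The stream variant works with any InputStream, so the same loop can be used for files, sockets or resources loaded from the classpath.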

MessageDigest is designed to work on raw bytes and unfortunately does not support printing out the hash as a human-readable String. As you can see in the code above, we use the Formatter class to do that; it’s a readable but slow solution. If you need something faster, you can check the following post showing many different ways to do byte[] to hex String conversion.
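One commonly used faster alternative is a lookup table that maps each half-byte to its hex digit and writes straight into a char array. The sketch below shows that idea; it is not necessarily one of the variants from the post mentioned above, and the class name HexEncoder is made up for this example.

```java
// Hypothetical helper class, named for this illustration only.
class HexEncoder {

    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Convert each byte to two hex characters using table lookups.
    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int value = bytes[i] & 0xFF;              // treat the byte as unsigned
            out[2 * i]     = HEX_DIGITS[value >>> 4];  // high four bits
            out[2 * i + 1] = HEX_DIGITS[value & 0x0F]; // low four bits
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // prints 00ff1a
        System.out.println(toHex(new byte[] {0x00, (byte) 0xff, 0x1a}));
    }
}
```

This avoids parsing a format string for every byte and allocates only the output array and the resulting String.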

If you’re interested, check out the following post on www.javamex.com, which compares the different hash functions that you can use.