A checksum hash is a fixed-length sequence of characters computed by applying a hashing algorithm to the given content. It is a one-way digest, not an encryption: the original content cannot be recovered from it. In this Java hashing tutorial, we will learn to generate the checksum hash for files.
1. Why would we want to generate the hash for a file?
Any serious file provider publishes a checksum for its downloadable files. The checksum lets us confirm that the file we downloaded arrived complete and unaltered.
A checksum acts as proof of a file's validity: if the file is corrupted, its checksum changes, which tells us that the file is no longer the same file or that it was damaged during the transfer.
We can also use the checksum of a file to detect changes made by a third party, e.g. in license files. We provide licenses to clients, which they may upload to their server. By comparing the checksum of the uploaded file with the one we recorded when the license was created, we can verify that the license file has not been modified since.
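The verification step described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name ChecksumVerifier and the helper names sha256, fromHex and matches are hypothetical, and it assumes the expected checksum is available as a SHA-256 hex string.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class for illustration only
public class ChecksumVerifier {

    // Computes the SHA-256 digest of a file, reading it in 8 KB chunks
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return digest.digest();
    }

    // Converts a hex string such as "0a1b" to bytes;
    // assumes an even number of valid hex digits
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Returns true when the file's SHA-256 digest equals the expected hex checksum
    static boolean matches(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return MessageDigest.isEqual(sha256(file), fromHex(expectedHex.trim()));
    }
}
```

A call such as ChecksumVerifier.matches(path, publishedChecksum) returns false as soon as a single byte of the file differs from the content the checksum was computed for.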
Read More : Java MD5, SHA, PBKDF2, BCrypt and SCrypt Examples
2. How to generate a checksum hash for a file
To create the checksum for a file, we read its content in fixed-size chunks, feed each chunk to the message digest, and then convert the resulting hash bytes to a hexadecimal string, as the function given below does.
This function takes two arguments:
- The message digest algorithm’s implementation
- A file for which checksum needs to be generated
// Required imports
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.MessageDigest;

private static String getFileChecksum(MessageDigest digest, File file) throws IOException
{
    // try-with-resources closes the stream even if reading fails
    try (FileInputStream fis = new FileInputStream(file))
    {
        // Buffer used to read the file in chunks
        byte[] byteArray = new byte[1024];
        int bytesCount;

        // Read file data and feed it to the message digest
        while ((bytesCount = fis.read(byteArray)) != -1) {
            digest.update(byteArray, 0, bytesCount);
        }
    }

    // Get the raw hash bytes
    byte[] bytes = digest.digest();

    // Convert each byte to a two-digit hexadecimal string
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < bytes.length; i++) {
        sb.append(Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
    }

    // Return the complete hash
    return sb.toString();
}
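As an alternative, the same chunked read can be sketched with java.security.DigestInputStream, which updates the digest as a side effect of reading. This is only one possible variant under stated assumptions: the class name StreamingChecksum and the method name checksum are hypothetical, and it takes the algorithm name as a string instead of a MessageDigest instance.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class for illustration only
public class StreamingChecksum {

    // Hashes a file with the named algorithm (e.g. "MD5", "SHA-256")
    // and returns the digest as a lowercase hexadecimal string
    static String checksum(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), digest)) {
            byte[] buffer = new byte[8192];
            // Every read through the DigestInputStream updates the digest,
            // so the loop body has nothing else to do
            while (in.read(buffer) != -1) {
                // keep reading until end of file
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Because the result depends only on the bytes read, two files with identical content produce the same checksum regardless of their names or locations.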
Example 1: Generate MD5 Hash for a File in Java
// Create checksum for this file
File file = new File("c:/temp/testOut.txt");

// Use the MD5 algorithm
MessageDigest md5Digest = MessageDigest.getInstance("MD5");

// Get the checksum
String checksum = getFileChecksum(md5Digest, file);

// Print the checksum
System.out.println(checksum);
Example 2: Generate SHA-256 Hash for a File in Java
// Use the SHA-256 algorithm
MessageDigest shaDigest = MessageDigest.getInstance("SHA-256");

// SHA-256 checksum of the same file as in Example 1
String shaChecksum = getFileChecksum(shaDigest, file);
Drop me a comment if something needs more explanation.
Happy Learning !!
ashish
plz provide code for how to get file from hash
hasardeur
You can’t. The SHA-family and MD5 are one way functions. It is not possible to extract the input from the output. Actually that is the precise function of those algorithms – it is what we want them to do.
Shyam
Really wonderful, thanks for your articles. Looking forward to more useful technical articles from you.
Duong
I wonder whether it works OK with a 1 or 2 GB file?
Shakir Ansari
It took almost 30 seconds for a file almost 3.00 GB
Qazi Jalil
I did MD5 hashing and it works correctly: it shows the same value if I run it over the same file at the same location. But if I change the location of the file, its hash changes. I don't know why this happens and I really need to fix this bug. Kindly help me out.
Lokesh Gupta
I don’t think it should happen. Can you please post your code which is creating problem.
Anil
Could you please tell me the exact situation where we have to use this checksum ?
Lokesh Gupta
Read this hashing guide.
suseelan john
Sir, I used this concept on the same doc file saved under 2 different names, but it returns two different values. Why?
Lokesh Gupta
Please elaborate.