A checksum hash is a fixed-length sequence of characters computed by applying a hash algorithm to user-provided content. It is a digest of the content, not an encrypted copy of it, so the original content cannot be recovered from it. In this post, we will learn to generate the checksum hash for files.
1. Why would we want to generate a checksum hash for a file?
Any serious file provider publishes a checksum alongside its downloadable files. The checksum lets us confirm that the file we downloaded arrived intact: we compute the checksum of our local copy and compare it with the published one. If the file is corrupted in transfer for any reason, its checksum changes, which tells us that this is not the same file.
You can also use a checksum to detect changes made to a file by a third party, e.g. license files. You issue licenses to clients, which they may later upload to your server. You can compare the checksum of the uploaded file with the checksum you recorded when the license was created to verify that the file has not been modified since.
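The license check described above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name `LicenseVerifier`, the helper names `checksumHex` and `matches`, the sample license text, and the choice of SHA-256 are all assumptions made for the example. It also assumes the recorded checksum is stored somewhere the client cannot change, such as your own server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper class, named for illustration only
class LicenseVerifier {

    // Computes the digest of the file's content and returns it as lowercase hex
    static String checksumHex(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns true when the file's current checksum equals the recorded one
    static boolean matches(Path file, String expectedHex, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        String actual = checksumHex(file, algorithm);
        String expected = expectedHex.toLowerCase(Locale.ROOT);
        return MessageDigest.isEqual(
                actual.getBytes(StandardCharsets.US_ASCII),
                expected.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws Exception {
        // Sample license content, made up for this example
        Path license = Files.createTempFile("license", ".txt");
        Files.write(license, "customer=acme;seats=5".getBytes(StandardCharsets.UTF_8));

        // Checksum recorded when the license is issued
        String recorded = checksumHex(license, "SHA-256");
        System.out.println(matches(license, recorded, "SHA-256")); // prints true

        // Simulate a client editing the license before uploading it
        Files.write(license, "customer=acme;seats=500".getBytes(StandardCharsets.UTF_8));
        System.out.println(matches(license, recorded, "SHA-256")); // prints false

        Files.delete(license);
    }
}
```

The comparison uses only the file's content, so renaming or moving the license file does not affect the result.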
Read More : MD5, SHA, PBKDF2, BCrypt examples
2. How to generate a checksum hash for a file
To create a checksum for a file, read the file's content in chunks, feed each chunk to a message digest, and then convert the resulting digest bytes to a hexadecimal string, as the function below does.
This function takes two arguments:
- The message digest algorithm’s implementation
- A file for which checksum needs to be generated
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.MessageDigest;

private static String getFileChecksum(MessageDigest digest, File file) throws IOException {
    // Open the file; try-with-resources closes the stream even if reading fails
    try (FileInputStream fis = new FileInputStream(file)) {
        // Buffer used to read the file in chunks
        byte[] byteArray = new byte[1024];
        int bytesCount;

        // Read file data and feed each chunk to the message digest
        while ((bytesCount = fis.read(byteArray)) != -1) {
            digest.update(byteArray, 0, bytesCount);
        }
    }

    // Get the hash's bytes
    byte[] bytes = digest.digest();

    // Convert each raw byte to two hexadecimal characters
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < bytes.length; i++) {
        sb.append(Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
    }

    // Return the complete hash
    return sb.toString();
}
You can use the above function as below to generate an MD5 file checksum:
// Create checksum for this file
File file = new File("c:/temp/testOut.txt");

// Use MD5 algorithm
MessageDigest md5Digest = MessageDigest.getInstance("MD5");

// Get the checksum
String checksum = getFileChecksum(md5Digest, file);

// See checksum
System.out.println(checksum);
To generate a SHA-1 file checksum, use the function as below:
// Use SHA-1 algorithm
MessageDigest shaDigest = MessageDigest.getInstance("SHA-1");

// SHA-1 checksum
String shaChecksum = getFileChecksum(shaDigest, file);
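If you need both the MD5 and the SHA-1 checksum of the same file, calling the function twice reads the file twice. One possible variation, sketched below, updates several digests while reading the file once. The class name `MultiChecksum` and the methods `checksums` and `toHex` are hypothetical names introduced for this example and are not part of the function shown earlier.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named for illustration only
class MultiChecksum {

    // Reads the file once and returns one hex checksum per requested algorithm
    static Map<String, String> checksums(Path file, String... algorithms)
            throws IOException, NoSuchAlgorithmException {
        Map<String, MessageDigest> digests = new LinkedHashMap<>();
        for (String algorithm : algorithms) {
            digests.put(algorithm, MessageDigest.getInstance(algorithm));
        }

        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                // Feed the same chunk to every digest
                for (MessageDigest digest : digests.values()) {
                    digest.update(buffer, 0, read);
                }
            }
        }

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, MessageDigest> entry : digests.entrySet()) {
            result.put(entry.getKey(), toHex(entry.getValue().digest()));
        }
        return result;
    }

    // Converts raw digest bytes to a lowercase hexadecimal string
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(Integer.toString((b & 0xff) + 0x100, 16).substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Temporary sample file, created only for this demonstration
        Path file = Files.createTempFile("checksum-demo", ".txt");
        Files.write(file, "abc".getBytes(StandardCharsets.UTF_8));

        System.out.println(checksums(file, "MD5", "SHA-1"));

        Files.delete(file);
    }
}
```

For large files this roughly halves the disk reads compared with two separate passes, at the cost of a slightly more involved helper.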
Drop me a comment if something needs more explanation.
Happy Learning !!
Shyam
Really wonderful, thanks for your articles. Looking forward to more useful technical articles from you.
Duong
I wonder whether it works OK with a 1 or 2 GB file?
Shakir Ansari
It took almost 30 seconds for a file of almost 3.00 GB.
Qazi Jalil
I did MD5 hashing and it works correctly: it shows the same value if I run it again over the same file at the same location. But if I change the location of the file, its hash changes. I don’t know why this happens, and I really need to fix this bug. Kindly help me out.
Lokesh Gupta
I don’t think that should happen. Can you please post the code that is causing the problem?
Anil
Could you please tell me the exact situation where we have to use this checksum ?
Lokesh
Read this hashing guide.
suseelan john
Sir, I used this concept on two copies of the same doc file that have different names, but it returns two different values. Why?
Lokesh
Please elaborate.