A checksum hash is a fixed-length sequence of characters obtained by applying a hash algorithm to the given content. It is a digest, not an encryption: the original content cannot be recovered from it. In this Java hashing tutorial, we will learn to generate the checksum hash for files.
1. Why Generate a File’s Checksum?
Any serious file provider publishes checksums for its downloadable files. A checksum lets us confirm that the file we downloaded arrived intact.
A checksum acts as a fingerprint of a file's content. If the file is corrupted or altered during transfer, its checksum changes, which tells us that the file we hold is not the file that was published.
We can also use a file's checksum to detect changes made by third parties, for example in license files. We provide licenses to clients, which they may upload to their servers. By recomputing the file's checksum and comparing it with the value recorded at creation time, we can verify that the license file has not been modified.
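The cross-verification described above can be sketched as follows. This is a minimal illustration, assuming the publisher supplies a SHA-256 checksum as a hex string; the class and method names (ChecksumVerifier, sha256Hex, matches) are hypothetical and not part of any library.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the names are illustrative only.
class ChecksumVerifier {

    // Computes the SHA-256 digest of the file content as 64 lowercase hex characters.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(file));
        // %064x pads with leading zeros so the string is always 64 characters long
        return String.format("%064x", new BigInteger(1, hash));
    }

    // Returns true when the file's checksum equals the expected value.
    // Assumption: the expected checksum is hex text; case and surrounding whitespace are ignored.
    static boolean matches(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return sha256Hex(file).equalsIgnoreCase(expectedHex.trim());
    }
}
```

A call such as ChecksumVerifier.matches(licenseFile, storedChecksum) would then return false if even a single byte of the file has changed since the checksum was recorded.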
Read More : Java MD5, SHA, PBKDF2, BCrypt and SCrypt Examples
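To create a checksum for a file, we need to read the file's content and then generate the hash for it using one of the following methods. Both approaches support several digest algorithms, so the same code works for others such as SHA-1 or SHA-512 by changing the algorithm name (MessageDigest) or the hash function (Guava). Note that keyed algorithms such as HmacMD5 are not message digests; in the JDK they are provided by javax.crypto.Mac and require a secret key.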
2. Generate File Checksum with MessageDigest
The MessageDigest class provides applications with the functionality of a message digest algorithm, such as MD5 or SHA-256. Its getInstance() method returns a MessageDigest object that implements the specified digest algorithm, or throws NoSuchAlgorithmException if no provider supports that name.
Example 1: Generate MD5 Hash for a File in Java
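Path filePath = Path.of("c:/temp/testOut.txt");
// filePath is already a Path, so pass it directly (Paths.get() expects a String or URI)
byte[] data = Files.readAllBytes(filePath);
byte[] hash = MessageDigest.getInstance("MD5").digest(data);
// Pad to 32 hex characters; BigInteger.toString(16) alone drops leading zeros
String checksum = String.format("%032x", new BigInteger(1, hash));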
Example 2: Generate SHA-256 Hash for a File in Java
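Path filePath = Path.of("c:/temp/testOut.txt");
// filePath is already a Path, so pass it directly (Paths.get() expects a String or URI)
byte[] data = Files.readAllBytes(filePath);
byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
// Pad to 64 hex characters; BigInteger.toString(16) alone drops leading zeros
String checksum = String.format("%064x", new BigInteger(1, hash));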
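The examples above load the whole file into memory with Files.readAllBytes(), which can be a problem for files of several gigabytes. One possible alternative, shown here only as a sketch, is to feed the file to MessageDigest.update() in chunks. The class name StreamingChecksum and the 8 KB buffer size are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the name is illustrative, not from any library.
class StreamingChecksum {

    // Reads the file in fixed-size chunks so memory use does not grow with file size.
    static String checksum(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192]; // assumed buffer size; tune as needed
            int read;
            while ((read = in.read(buffer)) != -1) {
                // Only the bytes actually read in this pass are added to the digest
                digest.update(buffer, 0, read);
            }
        }
        byte[] hash = digest.digest();
        // Two hex characters per digest byte, left-padded with zeros
        return String.format("%0" + (hash.length * 2) + "x", new BigInteger(1, hash));
    }
}
```

For example, StreamingChecksum.checksum(Path.of("c:/temp/testOut.txt"), "SHA-256") should produce the same value as the in-memory version, since both digest exactly the same bytes.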
3. Generate File Checksum with Guava
In Google Guava, the ByteSource.hash() method hashes the contents of the source with the hash function passed as its argument.
Start with adding the latest version of Guava to the project’s classpath.
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>31.1-jre</version>
</dependency>
Now we can use the hash() function as follows.
Example 1: Generate MD5 Hash for a File in Java
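File file = new File("c:/temp/test.txt");
ByteSource byteSource = com.google.common.io.Files.asByteSource(file);
// Hashing.md5() is marked deprecated in Guava because MD5 is not collision-resistant;
// use it only where MD5 is required for compatibility
HashCode hc = byteSource.hash(Hashing.md5());
// HashCode.toString() returns the hash as lowercase hexadecimal
String checksum = hc.toString();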
Example 2: Generate SHA-256 Hash for a File in Java
File file = new File("c:/temp/test.txt");
ByteSource byteSource = com.google.common.io.Files.asByteSource(file);
HashCode hc = byteSource.hash(Hashing.sha256());
String checksum = hc.toString();
Drop me a comment if something needs more explanation.
Happy Learning !!
Please provide code for how to get a file from its hash.
You can't. The SHA family and MD5 are one-way functions: it is not possible to recover the input from the output. That is precisely what we want these algorithms to do.
If you achieve that, consider yourself a Nobel Prize winner.
Really wonderful, thanks for your articles. Looking forward to more useful technical articles from you.
I wonder, does it work OK with a file of 1 or 2 GB?
It took almost 30 seconds for a file of almost 3 GB.
I did MD5 hashing and it works correctly: it shows the same value if I run it on the same file at the same location. But if I change the location of the file, its hash changes. I don't know why this happens and I really need to fix this bug. Kindly help me out.
I don't think that should happen. Can you please post the code that is causing the problem?
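For what it's worth, the digest is computed only from the bytes read from the file, so the path and file name should not influence it. A quick sanity check could look like the sketch below; the class and method names (SameContentCheck, sameChecksumAfterCopy) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the names are illustrative only.
class SameContentCheck {

    // MD5 digest of the file content; the path itself is never hashed.
    static byte[] md5(Path file) throws IOException, NoSuchAlgorithmException {
        return MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
    }

    // Copies the file to a different directory under a different name,
    // then compares the digests of the original and the copy.
    static boolean sameChecksumAfterCopy(Path original)
            throws IOException, NoSuchAlgorithmException {
        Path otherDir = Files.createTempDirectory("checksum-demo");
        Path copy = otherDir.resolve("renamed-copy.bin");
        Files.copy(original, copy, StandardCopyOption.REPLACE_EXISTING);
        return MessageDigest.isEqual(md5(original), md5(copy));
    }
}
```

If this returns true for your file but your own code still produces different values, one possible cause is that the code hashes the path string rather than the file content, or that the file's bytes are altered when it is moved (for example by line-ending conversion).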
Could you please tell me the exact situation where we have to use this checksum ?
Read this hashing guide.
Sir, I used this approach on two copies of the same doc file with different names, but it returns two different values. Why?
Please elaborate.