This example shows Java code that uncompresses and extracts files from a zip archive using the java.util.zip package.
The example opens a zip file and traverses its entries in much the same way as walking a directory tree. If we find a directory entry, we create a new directory. If we find a file entry, we write the decompressed file.
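The traversal described above can be sketched as a small listing routine before any extraction is attempted. This is a minimal sketch, not the extraction code itself: the class name ZipListing and the method describeEntries are hypothetical names chosen for illustration, and the default path c:/temp/data.zip is an assumption.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class used only for illustration.
class ZipListing {

    // Returns one line per entry: "DIR <name>" for directories, "FILE <name>" for files.
    static List<String> describeEntries(Path zipPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String kind = entry.isDirectory() ? "DIR" : "FILE";
                lines.add(kind + " " + entry.getName());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default path; pass another archive as the first argument.
        Path zipPath = Path.of(args.length > 0 ? args[0] : "c:/temp/data.zip");
        describeEntries(zipPath).forEach(System.out::println);
    }
}
```

Listing the entries first is a cheap way to see whether an archive contains explicit directory entries or only file entries with nested names.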
1. Unzip File with java.util.zip.ZipFile
The example uses ZipFile to open the archive and reads all the ZipEntry instances one by one. It then uses Files.copy() to write each file to the filesystem.
The example performs the following steps:
- It creates a new folder where uncompressed files will be copied. The folder name is taken from the zip file name without extension. For example, if we unzip the data.zip file, it will be extracted into the data folder in the same location. The ZipFile object represents the .zip file and is used to access its information.
- The program iterates over all entries in the zip and checks whether each one is a directory or a file. The ZipEntry class represents an entry in the zip file, either a file or a directory. Each ZipEntry instance has the name and the compressed and uncompressed size information; the input stream of the uncompressed bytes is obtained from the ZipFile with getInputStream(entry).
- If the ZipEntry is a directory, we create a new directory inside the target directory data; otherwise we extract the file into that location.
- Using Files.copy(), we read the uncompressed file from the zip and copy it to the target path.
- We keep doing this until all entries are processed.
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import org.apache.commons.io.FilenameUtils;

public class UnzipExample
{
  public static void main(String[] args)
  {
    Path zipFile = Path.of("c:/temp/data.zip");
    unzipFile(zipFile);
  }

  private static void unzipFile(Path filePathToUnzip) {
    Path parentDir = filePathToUnzip.getParent();
    String fileName = filePathToUnzip.toFile().getName();
    Path targetDir = parentDir.resolve(FilenameUtils.removeExtension(fileName));

    //Open the file
    try (ZipFile zip = new ZipFile(filePathToUnzip.toFile())) {
      Enumeration<? extends ZipEntry> entries = zip.entries();

      //We will unzip files in this folder
      if (!targetDir.toFile().isDirectory()
          && !targetDir.toFile().mkdirs()) {
        throw new IOException("failed to create directory " + targetDir);
      }

      //Iterate over entries
      while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        File f = targetDir.resolve(entry.getName()).toFile();

        //If directory then create a new directory in uncompressed folder
        if (entry.isDirectory()) {
          if (!f.isDirectory() && !f.mkdirs()) {
            throw new IOException("failed to create directory " + f);
          }
        }
        //Else create the parent directories if needed and write the file
        else {
          File parent = f.getParentFile();
          if (!parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("failed to create directory " + parent);
          }
          try (InputStream in = zip.getInputStream(entry)) {
            Files.copy(in, f.toPath());
          }
        }
      }
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}
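Entry names come from the archive itself, so a name such as ../../evil.sh can resolve to a path outside the target folder. The following is a minimal sketch of one common guard: resolve the name, normalize it, and reject anything that no longer starts with the target directory. The class name SafeEntryPath and the method resolveInside are hypothetical names introduced for illustration only.

```java
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper class used only for illustration.
class SafeEntryPath {

    // Resolves entryName against targetDir and rejects names that escape it.
    static Path resolveInside(Path targetDir, String entryName) throws IOException {
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IOException("Entry is outside of the target directory: " + entryName);
        }
        return resolved;
    }
}
```

Under these assumptions, the extraction loop could obtain its File with resolveInside(targetDir, entry.getName()).toFile() instead of resolving the entry name directly.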
2. Unzip File using Apache Commons Compress
The overall steps for unzipping a file with the commons-compress library are similar to those described in the first section. Apart from a few minor differences, only the class names change.
Begin by importing the latest version of commons-compress from the Maven repository.
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-compress</artifactId>
  <version>1.21</version>
</dependency>
Next, we will rewrite the logic using the compress APIs. The major changes are the new classes ArchiveEntry, ArchiveStreamFactory and ArchiveInputStream.
- Instances of ArchiveEntry provide metadata about the individual archive entries.
- ArchiveInputStream helps in reading various archive formats such as .zip or tar; a gzip-compressed archive such as .tar.gz is first wrapped in a compressor stream. For example, to read a .tar.gz file we can create an instance of TarArchiveInputStream in the following way.
try (InputStream fi = Files.newInputStream(Paths.get("my.tar.gz"));
     InputStream bi = new BufferedInputStream(fi);
     InputStream gzi = new GzipCompressorInputStream(bi);
     ArchiveInputStream o = new TarArchiveInputStream(gzi)) {
}
Let's create a simple program, similar to the first example, that extracts a given compressed file into the same directory.
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.ArchiveException;
import org.apache.commons.compress.archivers.ArchiveInputStream;
import org.apache.commons.compress.archivers.ArchiveStreamFactory;
import org.apache.commons.compress.utils.IOUtils;
import org.apache.commons.io.FilenameUtils;
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class UnzipWithCommonCompress {
  public static void main(String[] args) {
    Path zipFile = Path.of("c:/temp/data.zip");
    extractZip(zipFile);
  }

  private static void extractZip(Path zipFilePath) {
    Path parentDir = zipFilePath.getParent();
    String fileName = zipFilePath.toFile().getName();
    Path targetDir = parentDir.resolve(FilenameUtils.removeExtension(fileName));

    ArchiveStreamFactory archiveStreamFactory = new ArchiveStreamFactory();

    try (InputStream inputStream = Files.newInputStream(zipFilePath);
         ArchiveInputStream archiveInputStream = archiveStreamFactory
             .createArchiveInputStream(ArchiveStreamFactory.ZIP, inputStream)) {

      ArchiveEntry archiveEntry = null;
      //Read entries one by one until the stream is exhausted
      while ((archiveEntry = archiveInputStream.getNextEntry()) != null) {
        Path path = Paths.get(targetDir.toString(), archiveEntry.getName());
        File file = path.toFile();
        if (archiveEntry.isDirectory()) {
          if (!file.isDirectory() && !file.mkdirs()) {
            throw new IOException("failed to create directory " + file);
          }
        } else {
          File parent = file.getParentFile();
          if (!parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("failed to create directory " + parent);
          }
          //Copy the bytes of the current entry to the target file
          try (OutputStream outputStream = Files.newOutputStream(path)) {
            IOUtils.copy(archiveInputStream, outputStream);
          }
        }
      }
    } catch (IOException | ArchiveException e) {
      throw new RuntimeException(e);
    }
  }
}
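For comparison, the same entry-by-entry streaming loop can be written with the JDK's own java.util.zip.ZipInputStream, without an extra dependency. This is a minimal sketch under stated assumptions: the class name StreamingUnzip and the method extract are hypothetical, existing files are overwritten, and entry names that resolve outside the target directory are rejected.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class used only for illustration.
class StreamingUnzip {

    // Extracts every entry of zipFilePath below targetDir.
    static void extract(Path zipFilePath, Path targetDir) throws IOException {
        Path base = targetDir.toAbsolutePath().normalize();
        try (InputStream in = Files.newInputStream(zipFilePath);
             ZipInputStream zipIn = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                Path path = base.resolve(entry.getName()).normalize();
                // Reject names such as "../x" that would land outside the target.
                if (!path.startsWith(base)) {
                    throw new IOException("Entry is outside of the target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(path);
                } else {
                    // Create parents here: archives need not list a directory before its files.
                    Files.createDirectories(path.getParent());
                    // Reads up to the end of the current entry and leaves zipIn open.
                    Files.copy(zipIn, path, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

Files.createDirectories() does nothing when the directory already exists, so the loop does not depend on the order in which directory and file entries appear in the archive.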
Happy Learning !!
This code is insecure and vulnerable to the Zip Slip vulnerability.
https://security.snyk.io/research/zip-slip-vulnerability
This code works absolutely perfectly.
But with a huge file, it takes a lot of time. Is there any way to increase the speed?
Hi,
We have code that decompresses zip files.
Is there any way to capture the file names?
Example: I have
1.zip
  2.zip
    3.zip
      1.pdf
    4.zip
      1.txt
Can we capture the paths as \1.zip\2.zip\3.zip\1.pdf
and \1.zip\2.zip\4.zip\1.txt
for the archives above,
so that we know which file belongs to which archive?
Excellent Article…
This article saved me a ton of time.
Hi, I am running into an issue where I have multiple levels of directories within a zip file. It seems that in some cases the code doesn’t work correctly because I am trying to unzip a file whose directory has not been created yet. The above code seems to have an expectation that the entry that represents the directory will come before any entries for files within that directory, but that does not seem to be the case. Any ideas?
Hi, if I have files inside a directory, how can I unzip them? Can I use another while loop inside the directory, or does it work with your code?
Excellent contribution. Thanks for the article.