Java does not provide native support for reading or writing CSV files, so we will use Super CSV to read CSV files and write a new CSV file in Java.
1. Maven Dependencies
Let’s start by listing the Maven dependencies needed to use Super CSV in our project.
<dependencies>
    <dependency>
        <groupId>net.sf.supercsv</groupId>
        <artifactId>super-csv</artifactId>
        <version>2.4.0</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.4</version>
    </dependency>
</dependencies>
If you are using a Gradle build, use these coordinates instead.
'net.sf.supercsv:super-csv:2.4.0'
'org.slf4j:slf4j-api:1.7.4'
2. Core Classes
Let’s go through the main classes we need to know about while working with Super CSV for reading or writing CSV files.
2.1. ICsvBeanReader and CsvBeanReader
ICsvBeanReader (interface) and CsvBeanReader (implementing class) are used to read CSV files. The reader instantiates a bean for every row and maps each column to a field on the bean.
The bean to populate can be either a class or an interface. If a class is used, it must be a valid Java bean, i.e. it must have a default no-argument constructor and getter/setter methods. An interface may also be used if it defines getters/setters; in that case a proxy object that implements the interface will be created.
2.2. ICsvBeanWriter and CsvBeanWriter
ICsvBeanWriter (interface) and CsvBeanWriter (implementing class) are used to write CSV files. The writer maps each field on the bean to a column in the CSV file (using the supplied name mapping).
2.3. CellProcessor
CellProcessor instances process a value read from the CSV file before it is set on the Java bean class/interface (or, when writing, before the value is written to the file). For example, we may want to convert a value to a Date object, or run a regex validation over the values.
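As a rough illustration of how processors are chained, the sketch below parses single in-memory rows with three processors. The class name CellProcessorSketch, the helper parseRow(), the sample rows, and the dd/MM/yyyy date format are all assumptions made for this example; only the processor and reader classes come from Super CSV.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;

import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.ParseDate;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvListReader;
import org.supercsv.io.ICsvListReader;
import org.supercsv.prefs.CsvPreference;

public class CellProcessorSketch {

    // Hypothetical helper: parses one CSV line held in memory.
    static List<Object> parseRow(String csvLine) throws IOException {
        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(new ParseInt()),               // id: required, converted to Integer
            new NotNull(),                             // name: required, kept as String
            new Optional(new ParseDate("dd/MM/yyyy"))  // joined: converted to Date, empty stays null
        };
        try (ICsvListReader reader =
                new CsvListReader(new StringReader(csvLine), CsvPreference.STANDARD_PREFERENCE)) {
            return reader.read(processors);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parseRow("10001,Lokesh,25/12/2015"));
        System.out.println(parseRow("10002,John,"));
    }
}
```

Each processor wraps the next one in the chain, so NotNull runs first and only then hands the value to ParseInt. Optional works the other way around: it lets an empty column through as null and skips ParseDate for it.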
2.4. CsvPreference
Before reading or writing CSV files, we must supply the reader/writer with some preferences. Essentially, this means configuring the quote character, the delimiter and the end-of-line symbols used by the CSV file. For example, CsvPreference.STANDARD_PREFERENCE means:
Quote character = "
Delimiter character = ,
End of line symbols = \r\n
We can also create our own preferences. For example, if the file were pipe-delimited, we could use the following:
private static final CsvPreference PIPE_DELIMITED = new CsvPreference.Builder('"', '|', "\n").build();
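To see the custom preference in action, here is a minimal sketch that reads a pipe-delimited document from a String rather than a file. The class name PipeDelimitedSketch, the helper firstDataRow() and the sample content are hypothetical; the PIPE_DELIMITED constant is the one defined above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;

import org.supercsv.io.CsvListReader;
import org.supercsv.io.ICsvListReader;
import org.supercsv.prefs.CsvPreference;

public class PipeDelimitedSketch {

    // Quote character ", delimiter |, end-of-line symbols \n
    static final CsvPreference PIPE_DELIMITED = new CsvPreference.Builder('"', '|', "\n").build();

    // Hypothetical helper: skips the header row and returns the first data row.
    static List<String> firstDataRow(String content) throws IOException {
        try (ICsvListReader reader = new CsvListReader(new StringReader(content), PIPE_DELIMITED)) {
            reader.getHeader(true);   // consume the header row
            return reader.read();     // columns of the next row, or null at end of input
        }
    }

    public static void main(String[] args) throws IOException {
        final String content = "CustomerId|CustomerName|Country\n10001|Lokesh|India\n";
        System.out.println(firstDataRow(content)); // prints [10001, Lokesh, India]
    }
}
```

The same preference object can be passed to any of the readers and writers discussed in this article.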
3. Reading a CSV File
Now let’s see an example of reading a CSV file using the classes described above. I will read the data.csv file given below:
CustomerId,CustomerName,Country,PinCode,Email
10001,Lokesh,India,110001,abc@gmail.com
10002,John,USA,220002,def@gmail.com
10003,Blue,France,330003,ghi@gmail.com
10004,Reddy,Germany,440004,abc@gmail.com
10005,Kumar,India,110001,def@gmail.com
10006,Paul,USA,220002,ghi@gmail.com
10007,Grimm,France,330003,abc@gmail.com
10008,WhoAmI,Germany,440004,def@gmail.com
10009,Bharat,India,110001,ghi@gmail.com
10010,Rocky,USA,220002,abc@gmail.com
10011,Voella,France,330003,def@gmail.com
10012,Gruber,Germany,440004,ghi@gmail.com
10013,Satty,India,110001,abc@gmail.com
10014,Bean,USA,220002,def@gmail.com
10015,Krish,France,330003,ghi@gmail.com
And we will be populating instances of Customer.java with the values from the above file.
public class Customer
{
    private Integer CustomerId;
    private String CustomerName;
    private String Country;
    private Long PinCode;
    private String Email;

    //Setters, getters, constructors, toString()
}
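For readers who want the complete class, the members elided by the comment above could look like the following sketch. It is an assumption about the shape of the class: the all-argument constructor is inferred from the writing example later in this article, and the toString() format from the program outputs shown below. The field names are kept exactly as in the listing above.

```java
public class Customer {

    // Field names mirror the article's listing (and the CSV header names).
    private Integer CustomerId;
    private String CustomerName;
    private String Country;
    private Long PinCode;
    private String Email;

    // A no-argument constructor is needed so the bean can be instantiated by the reader.
    public Customer() {
    }

    public Customer(Integer customerId, String customerName, String country, Long pinCode, String email) {
        this.CustomerId = customerId;
        this.CustomerName = customerName;
        this.Country = country;
        this.PinCode = pinCode;
        this.Email = email;
    }

    public Integer getCustomerId() { return CustomerId; }
    public void setCustomerId(Integer customerId) { this.CustomerId = customerId; }

    public String getCustomerName() { return CustomerName; }
    public void setCustomerName(String customerName) { this.CustomerName = customerName; }

    public String getCountry() { return Country; }
    public void setCountry(String country) { this.Country = country; }

    public Long getPinCode() { return PinCode; }
    public void setPinCode(Long pinCode) { this.PinCode = pinCode; }

    public String getEmail() { return Email; }
    public void setEmail(String email) { this.Email = email; }

    @Override
    public String toString() {
        return "Customer [CustomerId=" + CustomerId + ", CustomerName=" + CustomerName
                + ", Country=" + Country + ", PinCode=" + PinCode + ", Email=" + Email + "]";
    }
}
```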
Now look at the CSV file: the first row contains the column names. They should match up exactly with the bean’s field names, and the bean must have the appropriate setter defined for each field (for example, setCustomerId() for the CustomerId column).
If the header doesn’t match (or there is no header), then we can simply define our own name mapping array. [I have commented out that line in the code below, but you may take the hint.]
import java.io.FileReader;
import java.io.IOException;
import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.ParseLong;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.constraint.StrRegEx;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.prefs.CsvPreference;
public class ReadCSVFileExample {

    static final String CSV_FILENAME = "data.csv";

    public static void main(String[] args) throws IOException
    {
        try(ICsvBeanReader beanReader = new CsvBeanReader(new FileReader(CSV_FILENAME), CsvPreference.STANDARD_PREFERENCE))
        {
            // the header elements are used to map the values to the bean
            final String[] headers = beanReader.getHeader(true);
            //final String[] headers = new String[]{"CustomerId","CustomerName","Country","PinCode","Email"};
            final CellProcessor[] processors = getProcessors();

            Customer customer;
            while ((customer = beanReader.read(Customer.class, headers, processors)) != null) {
                System.out.println(customer);
            }
        }
    }

    /**
     * Sets up the processors used for the examples.
     */
    private static CellProcessor[] getProcessors() {
        final String emailRegex = "[a-z0-9\\._]+@[a-z0-9\\.]+";
        StrRegEx.registerMessage(emailRegex, "must be a valid email address");

        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(new ParseInt()), // CustomerId
            new NotNull(), // CustomerName
            new NotNull(), // Country
            new Optional(new ParseLong()), // PinCode
            new StrRegEx(emailRegex) // Email
        };
        return processors;
    }
}
Program Output.
Customer [CustomerId=10001, CustomerName=Lokesh, Country=India, PinCode=110001, Email=abc@gmail.com]
Customer [CustomerId=10002, CustomerName=John, Country=USA, PinCode=220002, Email=def@gmail.com]
Customer [CustomerId=10003, CustomerName=Blue, Country=France, PinCode=330003, Email=ghi@gmail.com]
//... So on
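The email pattern used in getProcessors() is intentionally simple, and it is worth knowing what it accepts. The sketch below checks a few values with plain java.util.regex and no Super CSV involved. It assumes that StrRegEx requires the whole column value to match the pattern; the class name EmailRegexSketch and the sample addresses are made up for the illustration.

```java
import java.util.regex.Pattern;

public class EmailRegexSketch {

    // Same pattern string as in getProcessors() above.
    static final Pattern EMAIL = Pattern.compile("[a-z0-9\\._]+@[a-z0-9\\.]+");

    // Whole-string match (assumed to mirror how StrRegEx applies the pattern).
    static boolean accepts(String value) {
        return EMAIL.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("abc@gmail.com"));      // true
        System.out.println(accepts("John.Doe@gmail.com")); // false: uppercase letters are not in the character class
        System.out.println(accepts("a-b@gmail.com"));      // false: '-' is not in the character class
        System.out.println(accepts("abc@gmail"));          // true: the pattern does not require a top-level domain
    }
}
```

If the real data contains uppercase letters or hyphens in email addresses, the pattern would need to be widened before being passed to StrRegEx.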
4. Partially Reading a CSV File
Partial reading allows us to ignore columns when reading CSV files by simply setting the appropriate header columns to null. For example, in the code below, I have decided NOT to read the PinCode column.
final String[] headers = new String[]{"CustomerId", "CustomerName", "Country", null, "Email"};
Now if we run the above program again, the pin code value will not be populated as shown in the following output.
Customer [CustomerId=10001, CustomerName=Lokesh, Country=India, PinCode=null, Email=abc@gmail.com]
Customer [CustomerId=10002, CustomerName=John, Country=USA, PinCode=null, Email=def@gmail.com]
Customer [CustomerId=10003, CustomerName=Blue, Country=France, PinCode=null, Email=ghi@gmail.com]
//... So on
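Partial reading combines naturally with a custom name mapping. The following self-contained sketch reads from a String whose header names do not match the bean at all, skips the middle column, and also passes null as that column's processor so the value is not processed. The Contact bean, the class name PartialReadSketch and the sample content are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.io.StringReader;

import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.prefs.CsvPreference;

public class PartialReadSketch {

    // Hypothetical bean: public, with an implicit no-argument constructor and setters.
    public static class Contact {
        private Integer customerId;
        private Long pinCode;
        private String email;

        public Integer getCustomerId() { return customerId; }
        public void setCustomerId(Integer customerId) { this.customerId = customerId; }
        public Long getPinCode() { return pinCode; }
        public void setPinCode(Long pinCode) { this.pinCode = pinCode; }
        public String getEmail() { return email; }
        public void setEmail(String email) { this.email = email; }
    }

    static Contact readFirst(String csv) throws IOException {
        try (ICsvBeanReader reader =
                new CsvBeanReader(new StringReader(csv), CsvPreference.STANDARD_PREFERENCE)) {
            reader.getHeader(true); // consume the header row; its names are not used for mapping
            final String[] nameMapping = { "CustomerId", null, "Email" };   // null = ignore the column
            final CellProcessor[] processors = { new NotNull(new ParseInt()), null, new NotNull() };
            return reader.read(Contact.class, nameMapping, processors);
        }
    }

    public static void main(String[] args) throws IOException {
        final Contact c = readFirst("id,pin,mail\n10001,110001,abc@gmail.com\n");
        System.out.println(c.getCustomerId() + " " + c.getPinCode() + " " + c.getEmail());
    }
}
```

Note that the mapping and processor arrays still have one entry per column in the file; only the entries for ignored columns are null.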
5. Reading a CSV File in Key-value Pairs
To read key-value pairs, we need to use CsvMapReader. It allows us to retrieve each column by name from the resulting Map, though we’ll have to cast each value to its appropriate type.
import java.io.FileReader;
import java.io.IOException;
import java.util.Map;
import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.ParseLong;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.constraint.StrRegEx;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvMapReader;
import org.supercsv.io.ICsvMapReader;
import org.supercsv.prefs.CsvPreference;
public class ReadCSVFileInKeyValuePairs {

    static final String CSV_FILENAME = "data.csv";

    public static void main(String[] args) throws IOException
    {
        try(ICsvMapReader mapReader = new CsvMapReader(new FileReader(CSV_FILENAME), CsvPreference.STANDARD_PREFERENCE))
        {
            //First row contains the header names
            final String[] headers = mapReader.getHeader(true);
            final CellProcessor[] processors = getProcessors();

            Map<String, Object> fieldsInCurrentRow;
            while ((fieldsInCurrentRow = mapReader.read(headers, processors)) != null) {
                System.out.println(fieldsInCurrentRow);
            }
        }
    }

    /**
     * Sets up the processors used for the examples.
     */
    private static CellProcessor[] getProcessors() {
        final String emailRegex = "[a-z0-9\\._]+@[a-z0-9\\.]+";
        StrRegEx.registerMessage(emailRegex, "must be a valid email address");

        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(new ParseInt()), // CustomerId
            new NotNull(), // CustomerName
            new NotNull(), // Country
            new Optional(new ParseLong()), // PinCode
            new StrRegEx(emailRegex) // Email
        };
        return processors;
    }
}
Program Output.
{Country=India, CustomerId=10001, CustomerName=Lokesh, Email=abc@gmail.com, PinCode=110001}
{Country=USA, CustomerId=10002, CustomerName=John, Email=def@gmail.com, PinCode=220002}
{Country=France, CustomerId=10003, CustomerName=Blue, Email=ghi@gmail.com, PinCode=330003}
//... So on
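The casting mentioned at the start of this section might look like the sketch below. It reads from an in-memory String so it can run on its own; the class name MapReaderCastSketch, the helper describeFirstRow() and the three-column sample are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;

import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.ParseLong;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvMapReader;
import org.supercsv.io.ICsvMapReader;
import org.supercsv.prefs.CsvPreference;

public class MapReaderCastSketch {

    // Hypothetical helper: reads the first data row and pulls typed values out of the map.
    static String describeFirstRow(String csv) throws IOException {
        try (ICsvMapReader reader =
                new CsvMapReader(new StringReader(csv), CsvPreference.STANDARD_PREFERENCE)) {
            final String[] headers = reader.getHeader(true);
            final CellProcessor[] processors = {
                new NotNull(new ParseInt()),   // CustomerId -> Integer
                new NotNull(),                 // CustomerName -> String
                new Optional(new ParseLong())  // PinCode -> Long, or null when empty
            };
            final Map<String, Object> row = reader.read(headers, processors);

            // The map holds Objects, so each value is cast to the type its processor produced.
            final Integer id = (Integer) row.get("CustomerId");
            final String name = (String) row.get("CustomerName");
            final Long pinCode = (Long) row.get("PinCode");
            return name + "#" + id + " pin=" + pinCode;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describeFirstRow("CustomerId,CustomerName,PinCode\n10001,Lokesh,110001\n"));
    }
}
```

The casts are only safe because they agree with the processors: change ParseInt to ParseLong and the Integer cast will fail at runtime with a ClassCastException.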
6. Reading a CSV File with Arbitrary Number of Columns
Some CSV files don’t conform to RFC4180 and have a different number of columns on each row. If you have such a CSV file, you will need to use CsvListReader, as it’s the only reader supporting it.
Reading such files is tricky, as we do not know the number of columns in any row in advance. So we read all the columns of a row into a List and then, based on the size of the list, decide how to handle the values that were read.
Let’s modify data.csv and remove some data from it at random.
CustomerId,CustomerName,Country,PinCode,Email
10001,Lokesh,India,110001,abc@gmail.com
10002,John,USA
10003,Blue,France,330003
Let’s read this CSV file.
import java.io.FileReader;
import java.io.IOException;
import java.util.List;
import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.ParseLong;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.constraint.StrRegEx;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvListReader;
import org.supercsv.io.ICsvListReader;
import org.supercsv.prefs.CsvPreference;
public class ReadCSVFileWithArbitraryNumberOfColumns {

    static final String CSV_FILENAME = "data.csv";

    public static void main(String[] args) throws IOException
    {
        try(ICsvListReader listReader = new CsvListReader(new FileReader(CSV_FILENAME), CsvPreference.STANDARD_PREFERENCE))
        {
            //First row contains the header names - though we don't need them at runtime
            @SuppressWarnings("unused")
            final String[] headers = listReader.getHeader(true);

            List<String> fieldsInCurrentRow;
            while ((fieldsInCurrentRow = listReader.read()) != null) {
                //Chosen afresh for every row, so a row never reuses the previous row's processors
                final CellProcessor[] processors;
                if(fieldsInCurrentRow.size() == 5){
                    processors = getFiveColumnProcessors();
                }else if(fieldsInCurrentRow.size() == 4) {
                    processors = getFourColumnProcessors();
                }else if(fieldsInCurrentRow.size() == 3) {
                    processors = getThreeColumnProcessors();
                }else{
                    //No processors for this column count: skip the row (or create more processors)
                    System.out.println(String.format("rowNo=%s skipped, unexpected column count=%s", listReader.getRowNumber(), fieldsInCurrentRow.size()));
                    continue;
                }
                final List<Object> formattedFields = listReader.executeProcessors(processors);
                System.out.println(String.format("rowNo=%s, customerList=%s", listReader.getRowNumber(), formattedFields));
            }
        }
    }

    private static CellProcessor[] getFiveColumnProcessors() {
        final String emailRegex = "[a-z0-9\\._]+@[a-z0-9\\.]+";
        StrRegEx.registerMessage(emailRegex, "must be a valid email address");

        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(new ParseInt()), // CustomerId
            new NotNull(), // CustomerName
            new NotNull(), // Country
            new Optional(new ParseLong()), // PinCode
            new StrRegEx(emailRegex) // Email
        };
        return processors;
    }

    private static CellProcessor[] getFourColumnProcessors() {
        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(new ParseInt()), // CustomerId
            new NotNull(), // CustomerName
            new NotNull(), // Country
            new Optional(new ParseLong()) // PinCode
        };
        return processors;
    }

    private static CellProcessor[] getThreeColumnProcessors() {
        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(new ParseInt()), // CustomerId
            new NotNull(), // CustomerName
            new NotNull() // Country
        };
        return processors;
    }
}
Program Output.
rowNo=2, customerList=[10001, Lokesh, India, 110001, abc@gmail.com]
rowNo=3, customerList=[10002, John, USA]
rowNo=4, customerList=[10003, Blue, France, 330003]
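When the number of row shapes grows, the if/else chain can be replaced by a lookup table keyed by column count. The sketch below is one possible variation, not part of the original example: the class name ColumnCountDispatchSketch, the helper readAll() and the decision to silently skip rows with an unregistered column count are all assumptions. It reads from a String so that it runs without a file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.ParseLong;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvListReader;
import org.supercsv.io.ICsvListReader;
import org.supercsv.prefs.CsvPreference;

public class ColumnCountDispatchSketch {

    static List<List<Object>> readAll(String csv) throws IOException {
        // Processor arrays keyed by the number of columns they can handle.
        final Map<Integer, CellProcessor[]> byColumnCount = new HashMap<>();
        byColumnCount.put(3, new CellProcessor[] {
            new NotNull(new ParseInt()), new NotNull(), new NotNull() });
        byColumnCount.put(4, new CellProcessor[] {
            new NotNull(new ParseInt()), new NotNull(), new NotNull(), new Optional(new ParseLong()) });

        final List<List<Object>> rows = new ArrayList<>();
        try (ICsvListReader reader =
                new CsvListReader(new StringReader(csv), CsvPreference.STANDARD_PREFERENCE)) {
            reader.getHeader(true); // skip the header row
            while (reader.read() != null) {
                // length() reports the number of columns in the row that was just read
                final CellProcessor[] processors = byColumnCount.get(reader.length());
                if (processors == null) {
                    continue; // no processors registered for this column count: skip the row
                }
                rows.add(reader.executeProcessors(processors));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        final String csv = "CustomerId,CustomerName,Country,PinCode,Email\n"
                + "10002,John,USA\n"
                + "10003,Blue,France,330003\n";
        System.out.println(readAll(csv));
    }
}
```

Supporting a new row shape then only requires one more entry in the map.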
7. Writing to a New CSV File
Writing a CSV file is as simple as reading one. Create a CsvBeanWriter instance, define the headers and processors, and write the beans. It will generate the CSV file with data values populated from the beans.
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.ParseInt;
import org.supercsv.cellprocessor.ParseLong;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.constraint.StrRegEx;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.ICsvBeanWriter;
import org.supercsv.prefs.CsvPreference;
public class WriteCSVFileExample
{
    //Watch out: an exception thrown in this static block surfaces as
    //Exception in thread "main" java.lang.ExceptionInInitializerError
    private static List<Customer> customers = new ArrayList<Customer>();

    static
    {
        customers.add(new Customer(1, "Lokesh", "India", 12345L, "howtodoinjava@gmail.com"));
        customers.add(new Customer(2, "Mukesh", "India", 34234L, "mukesh@gmail.com"));
        customers.add(new Customer(3, "Paul", "USA", 52345345L, "paul@gmail.com"));
    }

    private static CellProcessor[] getProcessors()
    {
        final String emailRegex = "[a-z0-9\\._]+@[a-z0-9\\.]+";
        StrRegEx.registerMessage(emailRegex, "must be a valid email address");

        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(new ParseInt()), // CustomerId
            new NotNull(), // CustomerName
            new NotNull(), // Country
            new Optional(new ParseLong()), // PinCode
            new StrRegEx(emailRegex) // Email
        };
        return processors;
    }

    public static void main(String[] args)
    {
        //try-with-resources closes the writer even if the constructor or a write fails
        try (ICsvBeanWriter beanWriter = new CsvBeanWriter(new FileWriter("temp.csv"), CsvPreference.STANDARD_PREFERENCE))
        {
            final String[] header = new String[] { "CustomerId", "CustomerName", "Country", "PinCode", "Email" };
            final CellProcessor[] processors = getProcessors();

            // write the header
            beanWriter.writeHeader(header);

            // write the beans data
            for (Customer c : customers) {
                beanWriter.write(c, header, processors);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
The output of the above program will be written to the file temp.csv as below:
CustomerId,CustomerName,Country,PinCode,Email
1,Lokesh,India,12345,howtodoinjava@gmail.com
2,Mukesh,India,34234,mukesh@gmail.com
3,Paul,USA,52345345,paul@gmail.com
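For quick experiments or unit tests it can be handy to write to a String instead of a file. The sketch below does that with a StringWriter. The Contact bean, the class name WriteToStringSketch and the helper toCsv() are hypothetical. It also uses plain NotNull processors, on the assumption that constraint checks are usually enough when the bean already holds typed values.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.ICsvBeanWriter;
import org.supercsv.prefs.CsvPreference;

public class WriteToStringSketch {

    // Hypothetical bean: the writer only needs public getters.
    public static class Contact {
        private final Integer customerId;
        private final String email;

        public Contact(Integer customerId, String email) {
            this.customerId = customerId;
            this.email = email;
        }

        public Integer getCustomerId() { return customerId; }
        public String getEmail() { return email; }
    }

    static String toCsv(List<Contact> contacts) throws IOException {
        final StringWriter out = new StringWriter();
        try (ICsvBeanWriter writer = new CsvBeanWriter(out, CsvPreference.STANDARD_PREFERENCE)) {
            final String[] header = { "CustomerId", "Email" };
            final CellProcessor[] processors = { new NotNull(), new NotNull() };

            writer.writeHeader(header);
            for (Contact c : contacts) {
                writer.write(c, header, processors);
            }
        } // closing the writer flushes everything into the StringWriter
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(toCsv(Arrays.asList(
                new Contact(1, "howtodoinjava@gmail.com"),
                new Contact(2, "mukesh@gmail.com"))));
    }
}
```

The returned text is read only after the try block has closed the writer, so nothing is left sitting in its buffer.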
That’s all for the simple use cases and examples of using Super CSV for reading and writing CSV files in various ways.
Drop me your questions in the comments section.
Happy Learning !!