Spring Batch provides a FlatFileItemReader that we can use to read data from flat files, including CSV files. Here’s an example of how to configure and use FlatFileItemReader to read data from a CSV file in a Spring Batch job.
1. CSV File and Model
For demo purposes, we will be using the following CSV file:
Lokesh,Gupta,41,true
Brian,Schultz,42,false
John,Cena,43,true
Albert,Pinto,44,false
Then we need to create a domain object to represent the data.
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Person {
String firstName;
String lastName;
Integer age;
Boolean active;
}
2. Configuring FlatFileItemReader
The org.springframework.batch.item.file.FlatFileItemReader consists of two main components:
- A Spring Resource that represents the file to be read
- An implementation of the LineMapper interface (analogous to RowMapper in Spring JDBC). When reading a flat file, each line is passed to the LineMapper as a String to parse.
The LineMapper internally consists of a LineTokenizer and a FieldSetMapper. The LineTokenizer implementation parses the line into a FieldSet (similar to the columns in a database row). The FieldSetMapper then maps the FieldSet to a domain object.
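Conceptually, the flow is: raw line → LineTokenizer → FieldSet → FieldSetMapper → domain object. The following plain-Java sketch mimics that split of responsibilities without the Spring API (the class, record, and method names here are illustrative only):

```java
import java.util.List;

public class LineMappingSketch {

// Stand-in for the domain object used in this article.
record Person(String firstName, String lastName, Integer age, Boolean active) {}

// "LineTokenizer" step: split the raw line into fields (a FieldSet analogue).
static List<String> tokenize(String line) {
return List.of(line.split(","));
}

// "FieldSetMapper" step: convert the tokenized fields into a domain object.
static Person mapFieldSet(List<String> fields) {
return new Person(
fields.get(0),
fields.get(1),
Integer.valueOf(fields.get(2)),
Boolean.valueOf(fields.get(3)));
}

public static void main(String[] args) {
Person p = mapFieldSet(tokenize("Lokesh,Gupta,41,true"));
System.out.println(p.firstName() + " " + p.age());
}
}
```

Spring's DefaultLineMapper wires these two steps together for you, as shown in the configuration later in this article.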

2.1. Delimited Files (CSV Files)
In delimited files, a character acts as a divider between the fields of a record. After splitting each record on the delimiter, the resulting columns are mapped to the POJO fields. The default delimiter is a comma.
An ItemReader for a delimited flat file can be built using the FlatFileItemReaderBuilder.
@Bean
@StepScope
public FlatFileItemReader<Person> personItemReader() {
return new FlatFileItemReaderBuilder<Person>()
.name("personItemReader")
.delimited()
.names("firstName", "lastName", "age", "active")
.targetType(Person.class)
.resource(csvFile) // a Resource pointing to the input CSV file
.build();
}
If we want to configure a different delimiter, we can define the custom DelimitedLineTokenizer bean.
@Bean
public DelimitedLineTokenizer tokenizer() {
var tokenizer = new DelimitedLineTokenizer();
tokenizer.setDelimiter("#"); // Specify a different delimiter. Default is comma.
tokenizer.setNames("firstName", "lastName", "age", "active");
return tokenizer;
}
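Alternatively, when using the builder API, the delimiter can be set directly on the delimited() step, avoiding the separate tokenizer bean (a sketch; csvFile is the injected Resource as in the earlier example):

```java
@Bean
@StepScope
public FlatFileItemReader<Person> personItemReader() {
return new FlatFileItemReaderBuilder<Person>()
.name("personItemReader")
.delimited()
.delimiter("#") // same effect as the custom DelimitedLineTokenizer bean
.names("firstName", "lastName", "age", "active")
.targetType(Person.class)
.resource(csvFile)
.build();
}
```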
2.2. Fixed-Width Files
When working on legacy mainframe systems, we may encounter fixed-width files due to the way COBOL and other such technologies declare their storage.
In the absence of a delimiter (or any other metadata), we have to rely on the length of each field in the file. Consider the following fixed-width file:
Lokesh    Gupta     41  true
Brian     Schultz   42  false
John      Cena      43  true
Albert    Pinto     44  false
In the above file, the lengths of the fields are:
firstName | 10
lastName | 10
age | 4
active | 5
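Spring Batch's Range positions are 1-based and inclusive, so extracting a field is essentially a substring of the line followed by a trim. A plain-Java sketch of the idea (not Spring's actual tokenizer; the extract helper is hypothetical):

```java
public class FixedWidthSketch {

// Extracts a field using 1-based, inclusive positions, like Spring Batch's Range.
static String extract(String line, int start, int end) {
// Clamp so a short trailing field does not throw.
int from = Math.min(start - 1, line.length());
int to = Math.min(end, line.length());
return line.substring(from, to).trim();
}

public static void main(String[] args) {
String line = "Lokesh    Gupta     41  true";
System.out.println(extract(line, 1, 10));  // firstName
System.out.println(extract(line, 11, 20)); // lastName
System.out.println(extract(line, 21, 24)); // age
System.out.println(extract(line, 25, 29)); // active
}
}
```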
The equivalent fixed-width FlatFileItemReader can be built using the .fixedLength() and .columns() methods, specifying the range of each field.
@Bean
@StepScope
public FlatFileItemReader<Person> personItemReaderFixedWidth() {
return new FlatFileItemReaderBuilder<Person>()
.name("personItemReader")
.fixedLength()
.columns(new Range(1, 10), new Range(11, 20), new Range(21, 24), new Range(25, 29))
.names("firstName", "lastName", "age", "active")
.targetType(Person.class)
.resource(csvFile)
.build();
}
2.3. FieldSetMapper
By default, Spring Batch uses BeanWrapperFieldSetMapper, which is a FieldSetMapper implementation based on a fuzzy search of bean property paths. It makes a good guess to match the column names with the field names in the POJO class. For example, the BeanWrapperFieldSetMapper will call Person#setFirstName, Person#setLastName, and so on, based on the names of the columns configured in the LineTokenizer.
If the column names differ significantly from the POJO field names, or the field structure does not map one-to-one, we can define our own implementation of FieldSetMapper.
public class PersonFieldSetMapper implements FieldSetMapper<Person> {
public Person mapFieldSet(FieldSet fieldSet) {
Person person = new Person();
person.setFirstName(fieldSet.readString("firstName"));
person.setLastName(fieldSet.readString("lastName"));
person.setAge(fieldSet.readInt("age"));
person.setActive(fieldSet.readBoolean("active"));
return person;
}
}
And then inject this PersonFieldSetMapper into FlatFileItemReaderBuilder as follows:
@Bean
@StepScope
public FlatFileItemReader<Person> personItemReader() {
return new FlatFileItemReaderBuilder<Person>()
.name("personItemReader")
.delimited()
.names("firstName", "lastName", "age", "active")
.fieldSetMapper(new PersonFieldSetMapper())
.resource(csvFile)
.build();
}
3. Read CSV with FlatFileItemReader
In the following configuration, the FlatFileItemReader is configured to read a CSV file. The DelimitedLineTokenizer is used to specify the column names, and the BeanWrapperFieldSetMapper is used to map each line to a Person object.
We’ll need to customize the ItemProcessor and ItemWriter beans according to the business logic and data destination. This configuration writes data to the database.
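For example, a simple processor bean could be registered as follows (the trimming logic is purely illustrative; substitute your own business rules):

```java
@Bean
public ItemProcessor<Person, Person> processor() {
// ItemProcessor is a functional interface, so a lambda works.
return person -> {
person.setFirstName(person.getFirstName().trim());
person.setLastName(person.getLastName().trim());
return person;
};
}
```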
Finally, create a Job that includes the Steps.
import com.howtodoinjava.demo.batch.jobs.csvToDb.model.Person;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.LineMapper;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.LineTokenizer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import org.springframework.transaction.PlatformTransactionManager;
import javax.sql.DataSource;
@Configuration
public class CsvToDatabaseJob {
public static final Logger logger = LoggerFactory.getLogger(CsvToDatabaseJob.class);
private static final String INSERT_QUERY = """
insert into person (first_name, last_name, age, is_active)
values (:firstName,:lastName,:age,:active)""";
private final JobRepository jobRepository;
public CsvToDatabaseJob(JobRepository jobRepository) {
this.jobRepository = jobRepository;
}
@Value("classpath:csv/inputData.csv")
private Resource inputFeed;
@Bean(name="insertIntoDbFromCsvJob")
public Job insertIntoDbFromCsvJob(Step step1) {
var name = "Persons Import Job";
var builder = new JobBuilder(name, jobRepository);
return builder.start(step1).build();
}
@Bean
public Step step1(ItemReader<Person> reader,
ItemWriter<Person> writer,
ItemProcessor<Person, Person> processor,
PlatformTransactionManager txManager) {
var name = "INSERT CSV RECORDS To DB Step";
var builder = new StepBuilder(name, jobRepository);
return builder
.<Person, Person>chunk(5, txManager) // commit interval of 5 records
.reader(reader)
.processor(processor)
.writer(writer)
.build();
}
@Bean
public FlatFileItemReader<Person> reader(
LineMapper<Person> lineMapper) {
var itemReader = new FlatFileItemReader<Person>();
itemReader.setLineMapper(lineMapper);
itemReader.setResource(inputFeed);
return itemReader;
}
@Bean
public DefaultLineMapper<Person> lineMapper(LineTokenizer tokenizer,
FieldSetMapper<Person> fieldSetMapper) {
var lineMapper = new DefaultLineMapper<Person>();
lineMapper.setLineTokenizer(tokenizer);
lineMapper.setFieldSetMapper(fieldSetMapper);
return lineMapper;
}
@Bean
public BeanWrapperFieldSetMapper<Person> fieldSetMapper() {
var fieldSetMapper = new BeanWrapperFieldSetMapper<Person>();
fieldSetMapper.setTargetType(Person.class);
return fieldSetMapper;
}
@Bean
public DelimitedLineTokenizer tokenizer() {
var tokenizer = new DelimitedLineTokenizer();
tokenizer.setDelimiter(",");
tokenizer.setNames("firstName", "lastName", "age", "active");
return tokenizer;
}
@Bean
public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
var provider = new BeanPropertyItemSqlParameterSourceProvider<Person>();
var itemWriter = new JdbcBatchItemWriter<Person>();
itemWriter.setDataSource(dataSource);
itemWriter.setSql(INSERT_QUERY);
itemWriter.setItemSqlParameterSourceProvider(provider);
return itemWriter;
}
}
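Note that INSERT_QUERY assumes a person table already exists. With the H2 dependency used in this demo, a schema.sql along these lines would satisfy it (the column types are an assumption):

```sql
CREATE TABLE person (
first_name VARCHAR(100),
last_name VARCHAR(100),
age INT,
is_active BOOLEAN
);
```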
If the above configuration seems like a lot, you can merge the DefaultLineMapper, DelimitedLineTokenizer, and BeanWrapperFieldSetMapper into the FlatFileItemReader bean itself.
@Bean
public FlatFileItemReader<Person> reader() {
FlatFileItemReader<Person> reader = new FlatFileItemReader<>();
reader.setResource(inputFile);
reader.setLineMapper(new DefaultLineMapper<Person>() {{
setLineTokenizer(new DelimitedLineTokenizer() {{
setNames("firstName", "lastName", "age", "active");
}});
setFieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
setTargetType(Person.class);
}});
}});
return reader;
}
4. Demo
4.1. Maven
Make sure you have the following dependencies in the project:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-quartz</artifactId>
</dependency>
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<scope>runtime</scope>
</dependency>
4.2. Run the Application
Now run the application, and watch out for the console logs.
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ApplicationContext;
@SpringBootApplication
public class BatchProcessingApplication implements CommandLineRunner {
private final JobLauncher jobLauncher;
private final ApplicationContext applicationContext;
public BatchProcessingApplication(JobLauncher jobLauncher, ApplicationContext applicationContext) {
this.jobLauncher = jobLauncher;
this.applicationContext = applicationContext;
}
public static void main(String[] args) {
SpringApplication.run(BatchProcessingApplication.class, args);
}
@Override
public void run(String... args) throws Exception {
Job job = (Job) applicationContext.getBean("insertIntoDbFromCsvJob");
JobParameters jobParameters = new JobParametersBuilder()
.addString("JobID", String.valueOf(System.currentTimeMillis()))
.toJobParameters();
var jobExecution = jobLauncher.run(job, jobParameters);
// Re-read the status on each iteration so the loop can actually terminate.
while (jobExecution.getStatus().isRunning()) {
System.out.println("Still running...");
Thread.sleep(5000L);
}
}
}
The program output:
2023-11-29T14:32:54.612+05:30  INFO 24044 --- [main] o.s.b.c.l.support.SimpleJobLauncher      : Job: [SimpleJob: [name=Persons Import Job]] launched with the following parameters: [{'JobID':'{value=1701248574579, type=class java.lang.String, identifying=true}'}]
2023-11-29T14:32:54.631+05:30  INFO 24044 --- [main] o.s.batch.core.job.SimpleStepHandler     : Executing step: [INSERT CSV RECORDS To DB Step]
2023-11-29T14:32:54.647+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : Reading a new Person Record
2023-11-29T14:32:54.662+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : New Person record read : Person(firstName=Lokesh, lastName=Gupta, age=41, active=true)
2023-11-29T14:32:54.664+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : Reading a new Person Record
2023-11-29T14:32:54.665+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : New Person record read : Person(firstName=Brian, lastName=Schultz, age=42, active=false)
2023-11-29T14:32:54.665+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : Reading a new Person Record
2023-11-29T14:32:54.665+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : New Person record read : Person(firstName=John, lastName=Cena, age=43, active=true)
2023-11-29T14:32:54.666+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : Reading a new Person Record
2023-11-29T14:32:54.666+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : New Person record read : Person(firstName=Albert, lastName=Pinto, age=44, active=false)
2023-11-29T14:32:54.666+05:30  INFO 24044 --- [main] c.h.d.b.j.c.l.PersonItemReadListener     : Reading a new Person Record
2023-11-29T14:32:54.676+05:30  INFO 24044 --- [main] o.s.batch.core.step.AbstractStep         : Step: [INSERT CSV RECORDS To DB Step] executed in 44ms
2023-11-29T14:32:54.679+05:30  INFO 24044 --- [main] .j.c.l.JobCompletionNotificationListener : JOB FINISHED !!
Drop me your questions in the comments section.
Happy Learning !!