Spring Batch MultiResourceItemReader – Read Multiple CSV Files Example

Learn to read multiple CSV files from the filesystem or the resources folder using the MultiResourceItemReader class. These files may have a header as the first row, so do not forget to skip the first line.

Table of Contents

Project Structure
Read CSV files with MultiResourceItemReader
Write rows to console
Maven Dependency
Demo

Project Structure

In this project, we will –

  1. Read 3 CSV files matching input/inputData*.csv.
  2. Write data to console.
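The configuration in this article selects its input with the wildcard pattern inputData*.csv. As a rough way to check which file names such a pattern would pick up, here is a minimal sketch using the JDK's own glob matcher. It is an illustration only: Spring resolves resource patterns with its own Ant-style matcher, and the class name, method name, and sample file names are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Illustration only: uses the JDK glob matcher, not Spring's resource resolver.
class InputPatternSketch {

    // Returns true if the bare file name matches the glob "inputData*.csv".
    static boolean matchesInputPattern(String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:inputData*.csv");
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        // Hypothetical file names, used only to exercise the pattern.
        System.out.println(matchesInputPattern("inputData1.csv"));
        System.out.println(matchesInputPattern("employees.csv"));
    }
}
```

If no file matches the pattern, the reader has nothing to read, so checking the names first can help when the job finishes without printing any rows.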

Read CSV files with MultiResourceItemReader

Use MultiResourceItemReader to read lines from several CSV files. It reads items from multiple resources sequentially and delegates the actual reading of each resource to another reader, here a FlatFileItemReader.

@Value("input/inputData*.csv")
private Resource[] inputResources;

@Bean
public Job readCSVFilesJob() {
	return jobBuilderFactory
			.get("readCSVFilesJob")
			.incrementer(new RunIdIncrementer())
			.start(step1())
			.build();
}

@Bean
public Step step1() {
	return stepBuilderFactory.get("step1").<Employee, Employee>chunk(5)
			.reader(multiResourceItemReader())
			.writer(writer())
			.build();
}

@Bean
public MultiResourceItemReader<Employee> multiResourceItemReader() 
{
	MultiResourceItemReader<Employee> resourceItemReader = new MultiResourceItemReader<Employee>();
	resourceItemReader.setResources(inputResources);
	resourceItemReader.setDelegate(reader());
	return resourceItemReader;
}

@Bean
public FlatFileItemReader<Employee> reader() 
{
	//Create reader instance
	FlatFileItemReader<Employee> reader = new FlatFileItemReader<Employee>();
	
	//Set the number of lines to skip. Use it if the file has header rows.
	reader.setLinesToSkip(1); 	
	
	//Configure how each line will be parsed and mapped to different values
	reader.setLineMapper(new DefaultLineMapper<Employee>() {
		{
			//3 columns in each row
			setLineTokenizer(new DelimitedLineTokenizer() {
				{
					setNames(new String[] { "id", "firstName", "lastName" });
				}
			});
			//Set values in Employee class
			setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {
				{
					setTargetType(Employee.class);
				}
			});
		}
	});
	return reader;
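}

The Employee model class has one field per CSV column.

public class Employee {

	String id;
	String firstName;
	String lastName;

	//public setter and getter methods

	//toString() format inferred from the console output in the demo
	@Override
	public String toString() {
		return "Employee [id=" + id + ", firstName=" + firstName + ", lastName=" + lastName + "]";
	}
}

Each of the three input CSV files starts with a header row.

id,firstName,lastName
1,Lokesh,Gupta
2,Amit,Mishra
3,Pankaj,Kumar
4,David,Miller

id,firstName,lastName
5,Ramesh,Gupta
6,Vineet,Mishra
7,Amit,Kumar
8,Dav,Miller

id,firstName,lastName
9,Vikas,Kumar
10,Pratek,Mishra
11,Brian,Kumar
12,David,Cena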
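To make the idea of reading several resources one after another more concrete, here is a minimal plain-Java sketch of the same flow: go through each file in order, skip its header line, split every remaining line on commas, and collect the rows. It is not how Spring Batch implements MultiResourceItemReader. The class names are hypothetical, and the "files" are in-memory lists of lines so the example runs without any setup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; not Spring Batch's implementation.
class SequentialCsvSketch {

    // Hypothetical holder for one parsed row (stands in for the Employee bean).
    static class Row {
        final String id;
        final String firstName;
        final String lastName;

        Row(String id, String firstName, String lastName) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
        }

        @Override
        public String toString() {
            return "Row [id=" + id + ", firstName=" + firstName + ", lastName=" + lastName + "]";
        }
    }

    // Reads every "file" in order and skips the first linesToSkip lines of each one.
    static List<Row> readAll(List<List<String>> files, int linesToSkip) {
        List<Row> rows = new ArrayList<>();
        for (List<String> file : files) {
            for (int i = linesToSkip; i < file.size(); i++) {
                // Three comma-separated columns per line: id, firstName, lastName
                String[] tokens = file.get(i).split(",");
                rows.add(new Row(tokens[0], tokens[1], tokens[2]));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("id,firstName,lastName", "1,Lokesh,Gupta", "2,Amit,Mishra"),
                Arrays.asList("id,firstName,lastName", "5,Ramesh,Gupta"));
        for (Row row : readAll(files, 1)) {
            System.out.println(row);
        }
    }
}
```

The header is skipped once per file, not once overall, which is why setLinesToSkip(1) is configured on the delegate reader in the configuration above.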

Write rows to console

Create a ConsoleItemWriter class that implements the ItemWriter interface.

import java.util.List;

import org.springframework.batch.item.ItemWriter;

public class ConsoleItemWriter<T> implements ItemWriter<T> { 
    @Override 
    public void write(List<? extends T> items) throws Exception { 
        for (T item : items) { 
            System.out.println(item); 
        } 
    } 
}

Use ConsoleItemWriter as the writer.

@Bean
public ConsoleItemWriter<Employee> writer() 
{
	return new ConsoleItemWriter<Employee>();
}

Maven Dependency

The project uses the following dependencies.

<project xmlns="http://maven.apache.org/POM/4.0.0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.howtodoinjava</groupId>
	<artifactId>App</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<packaging>jar</packaging>

	<name>App</name>
	<url>http://maven.apache.org</url>

	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>2.0.3.RELEASE</version>
	</parent>

	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
	</properties>

	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-batch</artifactId>
		</dependency>
		<dependency>
			<groupId>com.h2database</groupId>
			<artifactId>h2</artifactId>
			<scope>runtime</scope>
		</dependency>
	</dependencies>

	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
			</plugin>
		</plugins>
	</build>

	<repositories>
		<repository>
			<id>repository.spring.release</id>
			<name>Spring GA Repository</name>
			<url>http://repo.spring.io/release</url>
		</repository>
	</repositories>
</project>

Demo

Before running the application, look at the complete code of BatchConfig.java.

package com.howtodoinjava.demo.config;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;

import com.howtodoinjava.demo.model.Employee;

@Configuration
@EnableBatchProcessing
public class BatchConfig 
{
	@Autowired
	private JobBuilderFactory jobBuilderFactory;
	
	@Autowired
	private StepBuilderFactory stepBuilderFactory;

	@Value("input/inputData*.csv")
	private Resource[] inputResources;

	@Bean
	public Job readCSVFilesJob() {
		return jobBuilderFactory
				.get("readCSVFilesJob")
				.incrementer(new RunIdIncrementer())
				.start(step1())
				.build();
	}

	@Bean
	public Step step1() {
		return stepBuilderFactory.get("step1").<Employee, Employee>chunk(5)
				.reader(multiResourceItemReader())
				.writer(writer())
				.build();
	}

	@Bean
	public MultiResourceItemReader<Employee> multiResourceItemReader() 
	{
		MultiResourceItemReader<Employee> resourceItemReader = new MultiResourceItemReader<Employee>();
		resourceItemReader.setResources(inputResources);
		resourceItemReader.setDelegate(reader());
		return resourceItemReader;
	}

	@Bean
	public FlatFileItemReader<Employee> reader() 
	{
		//Create reader instance
		FlatFileItemReader<Employee> reader = new FlatFileItemReader<Employee>();
		
		//Set the number of lines to skip. Use it if the file has header rows.
		reader.setLinesToSkip(1); 	
		
		//Configure how each line will be parsed and mapped to different values
		reader.setLineMapper(new DefaultLineMapper<Employee>() {
			{
				//3 columns in each row
				setLineTokenizer(new DelimitedLineTokenizer() {
					{
						setNames(new String[] { "id", "firstName", "lastName" });
					}
				});
				//Set values in Employee class
				setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {
					{
						setTargetType(Employee.class);
					}
				});
			}
		});
		return reader;
	}
	
	@Bean
	public ConsoleItemWriter<Employee> writer() 
	{
		return new ConsoleItemWriter<Employee>();
	}
}

App.java schedules the job and launches it with a unique JobID parameter on each run.

package com.howtodoinjava.demo;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;

@SpringBootApplication
@EnableScheduling
public class App
{
    @Autowired
    JobLauncher jobLauncher;
     
    @Autowired
    Job job;
     
    public static void main(String[] args)
    {
        SpringApplication.run(App.class, args);
    }
     
    @Scheduled(cron = "0 */1 * * * ?")
    public void perform() throws Exception
    {
        JobParameters params = new JobParametersBuilder()
                .addString("JobID", String.valueOf(System.currentTimeMillis()))
                .toJobParameters();
        jobLauncher.run(job, params);
    }
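}

In application.properties, disable the automatic job start so that only the scheduler launches the job.

#Disable batch job's auto start 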
spring.batch.job.enabled=false

spring.main.banner-mode=off

Run the application

Run the application as a Spring Boot application and watch the console. The batch job starts at the beginning of each minute, reads the input files, and prints the values it read to the console.

2018-07-10 15:32:26 INFO  - Starting App on XYZ with PID 4596 (C:\Users\user\workspace\App\target\classes started by zkpkhua in C:\Users\user\workspace\App)
2018-07-10 15:32:26 INFO  - No active profile set, falling back to default profiles: default
2018-07-10 15:32:27 INFO  - Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@3c9d0b9d: startup date [Tue Jul 10 15:32:27 IST 2018]; root of context hierarchy
2018-07-10 15:32:28 INFO  - HikariPool-1 - Starting...
2018-07-10 15:32:29 INFO  - HikariPool-1 - Start completed.
2018-07-10 15:32:29 INFO  - No database type set, using meta data indicating: H2
2018-07-10 15:32:29 INFO  - No TaskExecutor has been set, defaulting to synchronous executor.
2018-07-10 15:32:29 INFO  - Executing SQL script from class path resource [org/springframework/batch/core/schema-h2.sql]
2018-07-10 15:32:29 INFO  - Executed SQL script from class path resource [org/springframework/batch/core/schema-h2.sql] in 68 ms.
2018-07-10 15:32:30 INFO  - Registering beans for JMX exposure on startup
2018-07-10 15:32:30 INFO  - Bean with name 'dataSource' has been autodetected for JMX exposure
2018-07-10 15:32:30 INFO  - Located MBean 'dataSource': registering with JMX server as MBean [com.zaxxer.hikari:name=dataSource,type=HikariDataSource]
2018-07-10 15:32:30 INFO  - No TaskScheduler/ScheduledExecutorService bean found for scheduled processing
2018-07-10 15:32:30 INFO  - Started App in 4.036 seconds (JVM running for 4.827)

2018-07-10 15:33:00 INFO  - Job: [SimpleJob: [name=readCSVFilesJob]] launched with the following parameters: [{JobID=1531216980005}]

2018-07-10 15:33:00 INFO  - Executing step: [step1]

Employee [id=1, firstName=Lokesh, lastName=Gupta]
Employee [id=2, firstName=Amit, lastName=Mishra]
Employee [id=3, firstName=Pankaj, lastName=Kumar]
Employee [id=4, firstName=David, lastName=Miller]
Employee [id=5, firstName=Ramesh, lastName=Gupta]
Employee [id=6, firstName=Vineet, lastName=Mishra]
Employee [id=7, firstName=Amit, lastName=Kumar]
Employee [id=8, firstName=Dav, lastName=Miller]
Employee [id=9, firstName=Vikas, lastName=Kumar]
Employee [id=10, firstName=Pratek, lastName=Mishra]
Employee [id=11, firstName=Brian, lastName=Kumar]
Employee [id=12, firstName=David, lastName=Cena]

2018-07-10 15:33:00 INFO  - Job: [SimpleJob: [name=readCSVFilesJob]] completed with the following parameters: [{JobID=1531216980005}] and the following status: [COMPLETED]
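The launch at 15:33:00 in the log follows from the cron expression 0 */1 * * * ?, which fires at second 0 of every minute, while the application finished starting at 15:32:30. The arithmetic can be sketched in plain Java as below. The class and method names are hypothetical, and the sketch does not use Spring's scheduler; it only mirrors what "the start of the next minute" means.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Illustration only: mimics "fire at second 0 of every minute" without Spring.
class NextMinuteSketch {

    // The next run after `now` is the start of the following minute.
    static LocalDateTime nextRun(LocalDateTime now) {
        return now.truncatedTo(ChronoUnit.MINUTES).plusMinutes(1);
    }

    public static void main(String[] args) {
        // Startup time taken from the log output above.
        LocalDateTime started = LocalDateTime.of(2018, 7, 10, 15, 32, 30);
        System.out.println(nextRun(started)); // prints 2018-07-10T15:33
    }
}
```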

Drop me your questions in the comments section.

Happy Learning !!


6 thoughts on “Spring Batch MultiResourceItemReader – Read Multiple CSV Files Example”

  1. I am getting an error of "no resource to read, Set strict = True". Any idea how to fix it? The path and folder are correct.

  2. How can we get the name of each file that is read by MultiResourceItemReader?
    How can we compare the name of each file against a regex?
    How can we process only the files that satisfy the regex and move the others to an error folder without processing them?

  3. Thanks for sharing!

    What about repeated data? Let's say that file1 and file2 have duplicated data. How would you make the comparisons and discard the records that are repeated?

  4. Nice Post !!
    Do you have an explanation of how to integrate a file with multiple record types, like this:

    AAAA,Warren,Q,Darrow,8272 4th Street,New York,IL,76091
    BBBBB,1165965,2011-01-22 00:13:29,51.43
    CCCC,Ann,V,Gates,9247 Infinite Loop Drive,Hollywood,NE,37612
    BBBBB,Erica,I,Jobs,8875 Farnam Street,Aurora,IL,36314
    AAAAA,8116369,2011-01-21 20:40:52,-14.83
    DDDDD,8116369,2011-01-21 15:50:17,-45.45


HowToDoInJava

A blog about Java and related technologies, the best practices, algorithms, and interview questions.