Spring Batch MultiResourceItemReader Example

Learn to read multiple CSV files from the filesystem or the resources folder using the MultiResourceItemReader class. If the files have a header row, remember to configure the reader to skip the first line of each file.

1. CSV Files and Model

Consider a scenario in which we have multiple CSV files with the same layout and we want to read and process all of them in a single step, one file after another.

# inputData1.csv

id,firstName,lastName
1,Lokesh,Gupta
2,Amit,Mishra
3,Pankaj,Kumar
4,David,Miller

# inputData2.csv

id,firstName,lastName
5,Ramesh,Gupta
6,Vineet,Mishra
7,Amit,Kumar
8,Dav,Miller

# inputData3.csv

id,firstName,lastName
9,Vikas,Kumar
10,Pratek,Mishra
11,Brian,Kumar
12,David,Cena

The model class to represent the record is:

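public class Employee {
  private String id;
  private String firstName;
  private String lastName;

  public String getId() { return id; }
  public void setId(String id) { this.id = id; }
  public String getFirstName() { return firstName; }
  public void setFirstName(String firstName) { this.firstName = firstName; }
  public String getLastName() { return lastName; }
  public void setLastName(String lastName) { this.lastName = lastName; }
  @Override
  public String toString() {
    return "Employee [id=" + id + ", firstName=" + firstName + ", lastName=" + lastName + "]";
  }
}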

In the demo, we will read all three CSV files matching 'input/inputData*.csv' and log the Employee records to the console.

2. The MultiResourceItemReader

2.1. Introduction

The Spring Batch MultiResourceItemReader is a special-purpose reader that is used in batch processing scenarios to read items from multiple resources, typically files, and process them in a batch.

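Each resource is treated as a separate input, and the reader delegates the reading of each resource to a specified delegate ItemReader. The resources are read one after another: when the delegate reaches the end of one resource, the reader points it at the next. The step therefore sees a single continuous stream of items, which it processes in chunks.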
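
The delegation described above can be pictured with a small plain-Java sketch. It is a conceptual illustration only, not Spring Batch's implementation: the class name SequentialMultiSourceReader and the in-memory map of file names to lines are assumptions made for the example. It also mirrors two behaviors of the real reader: resources are ordered by file name by default, and the header is skipped once per file because the delegate is reopened for every resource.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual illustration only; this is not Spring Batch's implementation.
class SequentialMultiSourceReader {

  private final Iterator<List<String>> sources; // one entry per "file"
  private final int linesToSkip;                // header lines in each file
  private Iterator<String> current = Collections.emptyIterator();

  SequentialMultiSourceReader(Map<String, List<String>> files, int linesToSkip) {
    // TreeMap orders the entries by file name
    this.sources = new TreeMap<>(files).values().iterator();
    this.linesToSkip = linesToSkip;
  }

  // Returns the next data line, or null once every source is exhausted
  String read() {
    while (!current.hasNext()) {
      if (!sources.hasNext()) {
        return null;
      }
      // Move on to the next file and skip its header lines
      current = sources.next().stream().skip(linesToSkip).iterator();
    }
    return current.next();
  }
}
```

Calling read() repeatedly returns the data lines of the first file, then those of the second, and so on, and finally null, which is also how an ItemReader signals the end of the input.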

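We should use MultiResourceItemReader when the input data is spread across multiple files of the same format and we want a single step to read all of them. Note that the reader processes the files sequentially, not concurrently. If the files must be processed in parallel, consider partitioning the step instead (for example, with MultiResourcePartitioner).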

2.2. Configuration

Here’s a basic example of how we can configure and use MultiResourceItemReader in a Spring Batch job:

@Value("input/inputData*.csv")
private Resource[] inputResources;
 
@Bean
public Job readCSVFilesJob() {
  return jobBuilderFactory
      .get("readCSVFilesJob")
      .incrementer(new RunIdIncrementer())
      .start(step1())
      .build();
}
 
@Bean
public Step step1() {
  return stepBuilderFactory.get("step1").<Employee, Employee>chunk(5)
      .reader(multiResourceItemReader())
      .writer(writer())
      .build();
}
 
@Bean
public MultiResourceItemReader<Employee> multiResourceItemReader() 
{
  MultiResourceItemReader<Employee> resourceItemReader = new MultiResourceItemReader<Employee>();
  resourceItemReader.setResources(inputResources);
  resourceItemReader.setDelegate(reader());
  return resourceItemReader;
}
 
@Bean
public FlatFileItemReader<Employee> reader() 
{
  //Create reader instance
  FlatFileItemReader<Employee> reader = new FlatFileItemReader<Employee>();
   
  //Set the number of lines to skip. Use it if the file has header rows.
  reader.setLinesToSkip(1);   
   
  //Configure how each line will be parsed and mapped to different values
  reader.setLineMapper(new DefaultLineMapper<Employee>() {
    {
      //3 columns in each row
      setLineTokenizer(new DelimitedLineTokenizer() {
        {
          setNames(new String[] { "id", "firstName", "lastName" });
        }
      });
      //Set values in Employee class
      setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {
        {
          setTargetType(Employee.class);
        }
      });
    }
  });
  return reader;
}
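
As an alternative to the setter-based configuration above, the same two readers can probably be expressed more compactly with the builder classes that ship with Spring Batch 4 and later. The following is a sketch under assumptions, not a drop-in replacement: the class name ReaderBuilderConfig and the reader names passed to name() are made up for the example, and Employee is assumed to be in the same package.

```java
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.builder.MultiResourceItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;

// Hypothetical configuration class; use it instead of, not alongside,
// the setter-based reader beans.
@Configuration
public class ReaderBuilderConfig {

  @Value("input/inputData*.csv")
  private Resource[] inputResources;

  @Bean
  public MultiResourceItemReader<Employee> multiResourceItemReader() {
    return new MultiResourceItemReaderBuilder<Employee>()
        .name("multiResourceItemReader") // assumed name, used for restart state
        .resources(inputResources)
        .delegate(reader())
        .build();
  }

  @Bean
  public FlatFileItemReader<Employee> reader() {
    // No resource is set here; MultiResourceItemReader assigns one per file
    return new FlatFileItemReaderBuilder<Employee>()
        .name("employeeReader") // assumed name
        .linesToSkip(1)         // skip the header row of each file
        .delimited()
        .names(new String[] { "id", "firstName", "lastName" })
        .targetType(Employee.class)
        .build();
  }
}
```

The builders apply the same defaults as the anonymous-subclass version (comma delimiter, bean-property mapping), so the choice between the two styles is mostly a matter of readability.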

3. Writer

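To keep the demo focused, we are creating an ItemWriter that simply logs the records to the console. The bean method is named writer() so that it matches the call in the step configuration.

@Bean
public ItemWriter<Employee> writer() {
  // 'log' is assumed to be an SLF4J logger declared in the configuration class
  return items -> {
    for (Employee item : items) {
      // Log each item
      log.info("Writing item: {}", item);
    }
  };
}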

4. Maven

Make sure you have the following dependencies in the project:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-quartz</artifactId>
</dependency>
<dependency>
  <groupId>com.h2database</groupId>
  <artifactId>h2</artifactId>
  <scope>runtime</scope>
</dependency>

5. Demo

Now run the application and watch the console logs.

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ApplicationContext;

@SpringBootApplication
public class BatchProcessingApplication implements CommandLineRunner {

  private final JobLauncher jobLauncher;
  private final ApplicationContext applicationContext;

  public BatchProcessingApplication(JobLauncher jobLauncher, ApplicationContext applicationContext) {
    this.jobLauncher = jobLauncher;
    this.applicationContext = applicationContext;
  }

  public static void main(String[] args) {
    SpringApplication.run(BatchProcessingApplication.class, args);
  }

  @Override
  public void run(String... args) throws Exception {

    Job job = (Job) applicationContext.getBean("readCSVFilesJob");

    JobParameters jobParameters = new JobParametersBuilder()
        .addString("JobID", String.valueOf(System.currentTimeMillis()))
        .toJobParameters();

    var jobExecution = jobLauncher.run(job, jobParameters);

    // Re-check the execution on every iteration; a status captured once would never change
    while (jobExecution.isRunning()) {
      System.out.println("Still running...");
      Thread.sleep(5000L);
    }
  }
}

The program output:

2018-07-10 15:32:26 INFO  - Starting App on XYZ with PID 4596 (C:\Users\user\workspace\App\target\classes started by zkpkhua in C:\Users\user\workspace\App)
2018-07-10 15:32:26 INFO  - No active profile set, falling back to default profiles: default
2018-07-10 15:32:27 INFO  - Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@3c9d0b9d: startup date [Tue Jul 10 15:32:27 IST 2018]; root of context hierarchy
2018-07-10 15:32:28 INFO  - HikariPool-1 - Starting...
2018-07-10 15:32:29 INFO  - HikariPool-1 - Start completed.
2018-07-10 15:32:29 INFO  - No database type set, using meta data indicating: H2
2018-07-10 15:32:29 INFO  - No TaskExecutor has been set, defaulting to synchronous executor.
2018-07-10 15:32:29 INFO  - Executing SQL script from class path resource [org/springframework/batch/core/schema-h2.sql]
2018-07-10 15:32:29 INFO  - Executed SQL script from class path resource [org/springframework/batch/core/schema-h2.sql] in 68 ms.
2018-07-10 15:32:30 INFO  - Registering beans for JMX exposure on startup
2018-07-10 15:32:30 INFO  - Bean with name 'dataSource' has been autodetected for JMX exposure
2018-07-10 15:32:30 INFO  - Located MBean 'dataSource': registering with JMX server as MBean [com.zaxxer.hikari:name=dataSource,type=HikariDataSource]
2018-07-10 15:32:30 INFO  - No TaskScheduler/ScheduledExecutorService bean found for scheduled processing
2018-07-10 15:32:30 INFO  - Started App in 4.036 seconds (JVM running for 4.827)
 
2018-07-10 15:33:00 INFO  - Job: [SimpleJob: [name=readCSVFilesJob]] launched with the following parameters: [{JobID=1531216980005}]
 
2018-07-10 15:33:00 INFO  - Executing step: [step1]
 
Employee [id=1, firstName=Lokesh, lastName=Gupta]
Employee [id=2, firstName=Amit, lastName=Mishra]
Employee [id=3, firstName=Pankaj, lastName=Kumar]
Employee [id=4, firstName=David, lastName=Miller]
Employee [id=5, firstName=Ramesh, lastName=Gupta]
Employee [id=6, firstName=Vineet, lastName=Mishra]
Employee [id=7, firstName=Amit, lastName=Kumar]
Employee [id=8, firstName=Dav, lastName=Miller]
Employee [id=9, firstName=Vikas, lastName=Kumar]
Employee [id=10, firstName=Pratek, lastName=Mishra]
Employee [id=11, firstName=Brian, lastName=Kumar]
Employee [id=12, firstName=David, lastName=Cena]
 
2018-07-10 15:33:00 INFO  - Job: [SimpleJob: [name=readCSVFilesJob]] completed with the following parameters: [{JobID=1531216980005}] and the following status: [COMPLETED]

Drop me your questions in the comments section.

Happy Learning !!
