Spring Batch Delete Files After Processing

In Spring Batch jobs, a clean approach to deleting flat files after they have been read or processed is to create a separate Tasklet and execute it as the last step of the job, once processing is complete.

1. Tasklet for Deleting Processed Files or Records

In Spring Batch, a Tasklet is meant to perform a single task within a step. A Job can have multiple steps, and each step should perform only one defined task. The framework invokes the Tasklet's execute() method during batch processing and keeps calling it until it returns RepeatStatus.FINISHED.

The following Tasklet deletes all the CSV files it is given (in this example, the files in c:/temp/input/) at the end of the job. It is added as the last step of the batch job so that it can remove the files after the records have been processed.

import java.io.File;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.UnexpectedJobExecutionException;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.core.io.Resource;
import org.springframework.util.Assert;

public class FileDeletingTasklet implements Tasklet, InitializingBean {

    // Files to delete, injected through setResources()
    private Resource[] resources;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {

        for (Resource r : resources) {
            File file = r.getFile();
            boolean deleted = file.delete();
            if (!deleted) {
                // Fail the step so that the job does not silently leave files behind
                throw new UnexpectedJobExecutionException("Could not delete file " + file.getPath());
            }
            System.out.println("Deleted file :: " + file.getPath());
        }
        return RepeatStatus.FINISHED;
    }

    public void setResources(Resource[] resources) {
        this.resources = resources;
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(resources, "resources must be set");
    }
}
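
File.delete() only returns false on failure and gives no reason. As a possible variation, the sketch below uses java.nio.file.Files.delete(), which throws an exception describing why a file could not be removed. The class name NioFileDeleter and the method deleteAll() are hypothetical names used only for illustration; they are not part of Spring Batch or of the tasklet above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper (an assumption for illustration, not a Spring Batch class).
public class NioFileDeleter {

    // Deletes every given path and returns how many files were removed.
    // Stops at the first failure and lets the exception propagate.
    public static int deleteAll(List<Path> files) throws IOException {
        int deleted = 0;
        for (Path file : files) {
            // Files.delete throws NoSuchFileException if the file is missing,
            // or another IOException that describes the cause of the failure.
            Files.delete(file);
            deleted++;
        }
        return deleted;
    }
}
```

Inside execute(), each Resource could be converted with r.getFile().toPath() before being passed to such a helper. Any exception thrown from execute() causes the step to fail.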

Feel free to modify the logic inside FileDeletingTasklet, for example to move the files to an archive location instead of deleting them.
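
As a rough sketch of the archiving idea, the helper below moves a processed file into an archive directory instead of deleting it. FileArchiver, moveToArchive() and the archive directory are assumptions made for this example; they do not appear in the original project.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (an assumption for illustration, not a Spring Batch class).
public class FileArchiver {

    // Moves the file into archiveDir, keeping its file name,
    // and returns the new location.
    public static Path moveToArchive(Path file, Path archiveDir) throws IOException {
        // Create the archive directory if it does not exist yet
        Files.createDirectories(archiveDir);
        Path target = archiveDir.resolve(file.getFileName());
        // Overwrite an older archived copy with the same name, if any
        return Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

In the tasklet, the call to file.delete() could be replaced with a call such as FileArchiver.moveToArchive(file.toPath(), archiveDir). Note that REPLACE_EXISTING overwrites an earlier archived file with the same name, so a timestamp in the target name may be preferable if every run must be kept.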

2. How to use FileDeletingTasklet

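Create a Step, step2() in this example, that runs after the main step and executes the FileDeletingTasklet. Because step2() is chained with next(), it runs only after step1() completes successfully, so the input files are kept if processing fails. Inside the execute() method, we can delete files, records, or rows, according to project needs.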

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import com.howtodoinjava.demo.model.Employee;
 
@Configuration
@EnableBatchProcessing
public class BatchConfig
{
  @Autowired
  private JobBuilderFactory jobBuilderFactory;
 
  @Autowired
  private StepBuilderFactory stepBuilderFactory;
 
  @Value("file:c:/temp/input/inputData*.csv")
  private Resource[] inputResources;
 
  @Bean
  public Job readCSVFilesJob() {
    return jobBuilderFactory
        .get("readCSVFilesJob")
        .incrementer(new RunIdIncrementer())
        .start(step1())
        .next(step2())
        .build();
  }
 
  @Bean
  public Step step1() {
    return stepBuilderFactory.get("step1").<Employee, Employee>chunk(5)
        .reader(multiResourceItemReader())
        .writer(writer())
        .build();
  }
   
  @Bean
  public Step step2() {
    FileDeletingTasklet task = new FileDeletingTasklet();
    task.setResources(inputResources);
    return stepBuilderFactory.get("step2")
        .tasklet(task)
        .build();
  }
 
  @Bean
  public MultiResourceItemReader<Employee> multiResourceItemReader()
  {
    MultiResourceItemReader<Employee> resourceItemReader = new MultiResourceItemReader<Employee>();
    resourceItemReader.setResources(inputResources);
    resourceItemReader.setDelegate(reader());
    return resourceItemReader;
  }
 
  @Bean
  public FlatFileItemReader<Employee> reader()
  {
    // Create reader instance
    FlatFileItemReader<Employee> reader = new FlatFileItemReader<Employee>();
    // Set number of lines to skip. Use it if the file has header rows.
    reader.setLinesToSkip(1);
    // Configure how each line will be parsed and mapped to different values
    reader.setLineMapper(new DefaultLineMapper<Employee>() {
      {
        // 3 columns in each row
        setLineTokenizer(new DelimitedLineTokenizer() {
          {
            setNames(new String[] { "id", "firstName", "lastName" });
          }
        });
        // Set values in Employee class
        setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {
          {
            setTargetType(Employee.class);
          }
        });
      }
    });
    return reader;
  }
 
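  // ConsoleItemWriter is a custom ItemWriter (not shown in this article) that prints each item to the console
  @Bean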
  public ConsoleItemWriter<Employee> writer()
  {
    return new ConsoleItemWriter<Employee>();
  }
}

3. Demo

Now run the application and watch the logs.

2018-07-11 12:30:00 INFO  - Job: [SimpleJob: [name=readCSVFilesJob]] launched with the following parameters: [{JobID=1531292400004}]

2018-07-11 12:30:00 INFO  - Executing step: [step1]

Employee [id=1, firstName=Lokesh, lastName=Gupta]
Employee [id=2, firstName=Amit, lastName=Mishra]
Employee [id=3, firstName=Pankaj, lastName=Kumar]
Employee [id=4, firstName=David, lastName=Miller]
Employee [id=5, firstName=Ramesh, lastName=Gupta]
Employee [id=6, firstName=Vineet, lastName=Mishra]
Employee [id=7, firstName=Amit, lastName=Kumar]
Employee [id=8, firstName=Dav, lastName=Miller]
Employee [id=9, firstName=Vikas, lastName=Kumar]
Employee [id=10, firstName=Pratek, lastName=Mishra]
Employee [id=11, firstName=Brian, lastName=Kumar]
Employee [id=12, firstName=David, lastName=Cena]

2018-07-11 12:30:00 INFO  - Executing step: [step2]

Deleted file :: c:\temp\input\inputData1.csv
Deleted file :: c:\temp\input\inputData2.csv
Deleted file :: c:\temp\input\inputData3.csv

2018-07-11 12:30:00 INFO  - Job: [SimpleJob: [name=readCSVFilesJob]] completed with the following parameters: [{JobID=1531292400004}] and the following status: [COMPLETED]

Drop me your questions in the comments section.

Happy Learning !!
