In Spring Batch jobs, the recommended way to delete flat files after they have been read and processed is to create a separate Tasklet and run it as the last step of the job, once processing is complete.
1. Tasklet for Deleting Processed Files or Records
In Spring Batch, a Tasklet performs a single task within a step. A Job can have multiple steps, and each step should perform only one well-defined task. The framework invokes the Tasklet's execute() method when the step runs.
The following Tasklet deletes every file passed to it, which in this example means the CSV input files in c:/temp/input/. It is added as the last step of the batch job so that it removes the files only after all records have been processed.
import java.io.File;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.UnexpectedJobExecutionException;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.core.io.Resource;
import org.springframework.util.Assert;
public class FileDeletingTasklet implements Tasklet, InitializingBean {

    // Files to delete, injected through setResources()
    private Resource[] resources;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        for (Resource r : resources) {
            File file = r.getFile();
            boolean deleted = file.delete();
            if (!deleted) {
                // Fail the step so the job does not report success with files left behind
                throw new UnexpectedJobExecutionException("Could not delete file " + file.getPath());
            }
            // Produces the "Deleted file ::" lines shown in the demo output
            System.out.println("Deleted file :: " + file.getPath());
        }
        return RepeatStatus.FINISHED;
    }

    public void setResources(Resource[] resources) {
        this.resources = resources;
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(resources, "resources must be set");
    }
}
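File.delete() only returns a boolean, so the tasklet cannot report why a deletion failed. As a minimal sketch of an alternative, assuming Java 7 or later, the deletion loop could use java.nio.file.Files.delete, which throws an exception that names the file and the cause. The FileCleanup class and its deleteAll method are hypothetical names used for illustration only; they are not part of Spring Batch or of the original example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration; not part of Spring Batch.
final class FileCleanup {

    private FileCleanup() {
    }

    // Deletes every path in the list and returns how many files were removed.
    // Files.delete throws an IOException subclass (for example
    // NoSuchFileException) when a file cannot be deleted, so the first
    // failure stops the loop and propagates to the caller.
    static int deleteAll(List<Path> paths) throws IOException {
        int deleted = 0;
        for (Path path : paths) {
            Files.delete(path);
            deleted++;
        }
        return deleted;
    }
}
```

Inside execute(), each Resource can be converted with r.getFile().toPath() before it is handed to such a helper.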
Feel free to modify the logic inside FileDeletingTasklet, for example to move the files to an archive location instead of deleting them.
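As one possible way to archive instead of delete, the sketch below moves each file into an archive directory using plain Java NIO. The FileArchiver class, the archive method and the choice to overwrite an existing archived file with the same name are all assumptions made for illustration; adapt them to your own archiving rules.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Spring Batch.
final class FileArchiver {

    private FileArchiver() {
    }

    // Moves each file into archiveDir, keeping its file name, and returns
    // the new locations. The archive directory is created if it is missing.
    // Assumption: an existing archived file with the same name is replaced.
    static List<Path> archive(List<Path> files, Path archiveDir) throws IOException {
        Files.createDirectories(archiveDir);
        List<Path> moved = new ArrayList<>();
        for (Path file : files) {
            Path target = archiveDir.resolve(file.getFileName());
            Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
            moved.add(target);
        }
        return moved;
    }
}
```

A tasklet could call this from execute() in place of file.delete(), passing r.getFile().toPath() for each resource.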
2. How to use FileDeletingTasklet
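Create a Step, step2() in this example, that runs after the main step and executes the FileDeletingTasklet. Inside the execute() method we can delete files, records or rows, according to the needs of the project. The ConsoleItemWriter used by step1() is a custom ItemWriter that is not shown here; judging by the demo output, it prints each item to the console.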
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import com.howtodoinjava.demo.model.Employee;
@Configuration
@EnableBatchProcessing
public class BatchConfig
{
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Value("file:c:/temp/input/inputData*.csv")
    private Resource[] inputResources;

    @Bean
    public Job readCSVFilesJob() {
        return jobBuilderFactory
                .get("readCSVFilesJob")
                .incrementer(new RunIdIncrementer())
                .start(step1())
                .next(step2())
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1").<Employee, Employee>chunk(5)
                .reader(multiResourceItemReader())
                .writer(writer())
                .build();
    }

    @Bean
    public Step step2() {
        FileDeletingTasklet task = new FileDeletingTasklet();
        task.setResources(inputResources);
        return stepBuilderFactory.get("step2")
                .tasklet(task)
                .build();
    }

    @Bean
    public MultiResourceItemReader<Employee> multiResourceItemReader()
    {
        MultiResourceItemReader<Employee> resourceItemReader = new MultiResourceItemReader<Employee>();
        resourceItemReader.setResources(inputResources);
        resourceItemReader.setDelegate(reader());
        return resourceItemReader;
    }

    @Bean
    public FlatFileItemReader<Employee> reader()
    {
        // Create reader instance
        FlatFileItemReader<Employee> reader = new FlatFileItemReader<Employee>();

        // Number of lines to skip. Use it if the file has header rows.
        reader.setLinesToSkip(1);

        // 3 columns in each row
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setNames(new String[] { "id", "firstName", "lastName" });

        // Set values in Employee class
        BeanWrapperFieldSetMapper<Employee> fieldSetMapper = new BeanWrapperFieldSetMapper<Employee>();
        fieldSetMapper.setTargetType(Employee.class);

        // Configure how each line will be parsed and mapped to different values
        DefaultLineMapper<Employee> lineMapper = new DefaultLineMapper<Employee>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(fieldSetMapper);
        reader.setLineMapper(lineMapper);

        return reader;
    }

    @Bean
    public ConsoleItemWriter<Employee> writer()
    {
        return new ConsoleItemWriter<Employee>();
    }
}
3. Demo
Now run the application and watch the logs.
2018-07-11 12:30:00 INFO - Job: [SimpleJob: [name=readCSVFilesJob]] launched with the following parameters: [{JobID=1531292400004}]
2018-07-11 12:30:00 INFO - Executing step: [step1]
Employee [id=1, firstName=Lokesh, lastName=Gupta]
Employee [id=2, firstName=Amit, lastName=Mishra]
Employee [id=3, firstName=Pankaj, lastName=Kumar]
Employee [id=4, firstName=David, lastName=Miller]
Employee [id=5, firstName=Ramesh, lastName=Gupta]
Employee [id=6, firstName=Vineet, lastName=Mishra]
Employee [id=7, firstName=Amit, lastName=Kumar]
Employee [id=8, firstName=Dav, lastName=Miller]
Employee [id=9, firstName=Vikas, lastName=Kumar]
Employee [id=10, firstName=Pratek, lastName=Mishra]
Employee [id=11, firstName=Brian, lastName=Kumar]
Employee [id=12, firstName=David, lastName=Cena]
2018-07-11 12:30:00 INFO - Executing step: [step2]
Deleted file :: c:\temp\input\inputData1.csv
Deleted file :: c:\temp\input\inputData2.csv
Deleted file :: c:\temp\input\inputData3.csv
2018-07-11 12:30:00 INFO - Job: [SimpleJob: [name=readCSVFilesJob]] completed with the following parameters: [{JobID=1531292400004}] and the following status: [COMPLETED]
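Besides reading the logs, you may want to confirm that no input files are left once step2 has finished. The sketch below lists the files in a directory that still match a glob pattern such as inputData*.csv, using only the JDK. The RemainingFiles class and its matching method are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Spring Batch.
final class RemainingFiles {

    private RemainingFiles() {
    }

    // Returns the entries of dir whose file name matches the glob pattern,
    // for example "inputData*.csv". An empty list means nothing is left.
    static List<Path> matching(Path dir, String glob) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

After a successful run, calling matching(Paths.get("c:/temp/input"), "inputData*.csv") should return an empty list.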
Drop me your questions in the comments section.
Happy Learning !!