Batch Processing with Spring Batch (4) - ItemReader


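# ItemReader Overview

1. ItemReader is the interface that supplies data to a step.

2. The interface declares a single method, read(). Each call returns one item and advances to the next one. Once the input is exhausted it must return null, because any non-null return value signals that more data remains.

The interface is defined as follows:

```java
public interface ItemReader<T> {
    @Nullable
    T read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException;
}
```

Spring Batch ships with as many as 33 default implementations, covering essentially every common type of data source.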
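The read-until-null contract can be illustrated without the framework. The following is a minimal sketch in plain Java: `SimpleItemReader`, `ListBackedReader` and `ReaderContractDemo` are hypothetical names declared locally for illustration, not Spring Batch classes, and the loop only approximates how a chunk-oriented step consumes a reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Spring Batch's ItemReader, declared locally so the
// sketch compiles with the JDK alone.
interface SimpleItemReader<T> {
    T read() throws Exception; // returns null once the input is exhausted
}

// A reader backed by an in-memory list: each call returns one item and advances.
class ListBackedReader<T> implements SimpleItemReader<T> {
    private final Iterator<T> iterator;

    ListBackedReader(List<T> items) {
        this.iterator = items.iterator();
    }

    @Override
    public T read() {
        return iterator.hasNext() ? iterator.next() : null;
    }
}

class ReaderContractDemo {
    // Keeps calling read() until it returns null, collecting every item on the way.
    static <T> List<T> drain(SimpleItemReader<T> reader) throws Exception {
        List<T> result = new ArrayList<>();
        T item;
        while ((item = reader.read()) != null) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(drain(new ListBackedReader<>(Arrays.asList("a", "b", "c"))));
    }
}
```

A reader that never returns null would keep this loop running forever, which is why the end-of-input signal is part of the contract.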





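# Reading Data from a Database

1. Real applications usually need to read data from a database, page by page. Spring Batch provides the JdbcPagingItemReader class for this.

2. Create the table in the database. The example query below reads from a table named Customer.

The table contains the following data:

[Figure: sample rows in the table]

3. Use JdbcPagingItemReader to read the data from the database: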

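```java
import java.util.HashMap;
import java.util.Map;
import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.batch.item.database.Order;
import org.springframework.batch.item.database.support.MySqlPagingQueryProvider;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Customer is assumed to be a class with a Lombok-style builder; it is not shown in the article.
@Configuration
public class DBJdbcDemoJobConfiguration {
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    @Qualifier("dbJdbcDemoWriter")
    private ItemWriter<? super Customer> dbJdbcDemoWriter;

    @Autowired
    private DataSource dataSource;

    @Bean
    public Job DBJdbcDemoJob() {
        return jobBuilderFactory.get("DBJdbcDemoJob")
                .start(dbJdbcDemoStep())
                .build();
    }

    @Bean
    public Step dbJdbcDemoStep() {
        return stepBuilderFactory.get("dbJdbcDemoStep")
                .<Customer, Customer>chunk(100)
                .reader(dbJdbcDemoReader())
                .writer(dbJdbcDemoWriter)
                .build();
    }

    @Bean
    @StepScope
    public JdbcPagingItemReader<Customer> dbJdbcDemoReader() {
        JdbcPagingItemReader<Customer> reader = new JdbcPagingItemReader<>();

        reader.setDataSource(this.dataSource);
        reader.setFetchSize(100); // read in batches
        // Convert each row that is read into a Customer object
        reader.setRowMapper((rs, rowNum) ->
                Customer.builder().id(rs.getLong("id"))
                        .firstName(rs.getString("firstName"))
                        .lastName(rs.getString("lastName"))
                        .birthdate(rs.getString("birthdate"))
                        .build());

        // Define the SQL statement
        MySqlPagingQueryProvider queryProvider = new MySqlPagingQueryProvider();
        queryProvider.setSelectClause("id, firstName, lastName, birthdate");
        queryProvider.setFromClause("from Customer");
        // Define the sort column (required for paging)
        Map<String, Order> sortKeys = new HashMap<>(1);
        sortKeys.put("id", Order.ASCENDING);
        queryProvider.setSortKeys(sortKeys);

        reader.setQueryProvider(queryProvider);

        return reader;
    }
}
```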

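Output writer:

```java
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component("dbJdbcDemoWriter")
public class DbJdbcDemoWriter implements ItemWriter<Customer> {
    @Override
    public void write(List<? extends Customer> items) throws Exception {
        // Print every customer in the chunk
        for (Customer customer : items) {
            System.out.println(customer);
        }
    }
}
```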
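The sort keys matter because paging of this kind continues each page from the last key value seen instead of counting an offset. The following plain-Java sketch simulates that idea over an in-memory list of ids. It is an illustration only: `KeysetPagingSketch`, `fetchPage` and `readAllPages` are hypothetical names, and the "query" is a stream filter standing in for a statement of the shape `WHERE id > ? ORDER BY id LIMIT ?`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class KeysetPagingSketch {
    // Stand-in for one paged query: rows with id greater than the last id seen,
    // ordered by id, limited to pageSize. A null lastSeenId means "first page".
    static List<Long> fetchPage(List<Long> tableIds, Long lastSeenId, int pageSize) {
        return tableIds.stream()
                .filter(id -> lastSeenId == null || id > lastSeenId)
                .sorted()
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    // Fetches page after page until a page comes back empty.
    static List<List<Long>> readAllPages(List<Long> tableIds, int pageSize) {
        List<List<Long>> pages = new ArrayList<>();
        Long lastSeenId = null;
        while (true) {
            List<Long> page = fetchPage(tableIds, lastSeenId, pageSize);
            if (page.isEmpty()) {
                break;
            }
            pages.add(page);
            lastSeenId = page.get(page.size() - 1); // remember where this page ended
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(readAllPages(Arrays.asList(5L, 1L, 4L, 2L, 3L), 2));
    }
}
```

Because each page starts strictly after the previous page's last key, the sort key should be unique; duplicate key values at a page boundary could otherwise be skipped.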


# Reading Data from CSV/txt Files

Place a CSV file in the project's resources directory. The example below reads customer.csv.

File contents:

![file](https://graph.baidu.com/resource/222620df58b12c167892e01583251119.png)

FlatFileItemReader configuration:

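```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;

@Configuration
public class FlatFileDemoJobConfiguration {
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    @Qualifier("flatFileDemoWriter")
    private ItemWriter<? super Customer> flatFileDemoWriter;

    @Bean
    public Job flatFileDemoJob() {
        return jobBuilderFactory.get("flatFileDemoJob")
                .start(flatFileDemoStep())
                .build();
    }

    @Bean
    public Step flatFileDemoStep() {
        return stepBuilderFactory.get("flatFileDemoStep")
                .<Customer, Customer>chunk(100)
                .reader(flatFileDemoReader())
                .writer(flatFileDemoWriter)
                .build();
    }

    @Bean
    @StepScope
    public FlatFileItemReader<Customer> flatFileDemoReader() {
        FlatFileItemReader<Customer> reader = new FlatFileItemReader<>();
        // File to read
        reader.setResource(new ClassPathResource("customer.csv"));
        // Skip the first line (the header)
        reader.setLinesToSkip(1);

        // Split each line into named fields
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setNames(new String[]{"id", "firstName", "lastName", "birthdate"});

        // Map the parsed fields to a Customer object
        DefaultLineMapper<Customer> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(fieldSet ->
                Customer.builder().id(fieldSet.readLong("id"))
                        .firstName(fieldSet.readString("firstName"))
                        .lastName(fieldSet.readString("lastName"))
                        .birthdate(fieldSet.readString("birthdate"))
                        .build());
        lineMapper.afterPropertiesSet();

        reader.setLineMapper(lineMapper);

        return reader;
    }
}
```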


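Output writer:

```java
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component("flatFileDemoWriter")
public class FlatFileDemoWriter implements ItemWriter<Customer> {
    @Override
    public void write(List<? extends Customer> items) throws Exception {
        // Print every customer in the chunk
        for (Customer customer : items) {
            System.out.println(customer);
        }
    }
}
```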


The output is as follows:

![file](https://graph.baidu.com/resource/222b0a0a173d9936b8ec601583252983.png)
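To make the two stages of line mapping concrete (split a line into named fields, then map the fields to an object), here is a simplified plain-Java sketch. It does not use Spring Batch: `DelimitedLineSketch`, `tokenize` and `toCustomer` are hypothetical names, the sample line is made up, and unlike a real tokenizer the sketch does not handle quoted fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DelimitedLineSketch {
    // Hypothetical value type mirroring the four columns used in the article.
    static final class Customer {
        final long id;
        final String firstName;
        final String lastName;
        final String birthdate;

        Customer(long id, String firstName, String lastName, String birthdate) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
            this.birthdate = birthdate;
        }
    }

    // Stage 1 (the tokenizer's role): split the line and pair each value with its column name.
    static Map<String, String> tokenize(String line, String[] names) {
        String[] values = line.split(",", -1);
        if (values.length != names.length) {
            throw new IllegalArgumentException(
                    "Expected " + names.length + " fields but found " + values.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fields.put(names[i], values[i].trim());
        }
        return fields;
    }

    // Stage 2 (the field set mapper's role): convert the named fields into an object.
    static Customer toCustomer(Map<String, String> fields) {
        return new Customer(
                Long.parseLong(fields.get("id")),
                fields.get("firstName"),
                fields.get("lastName"),
                fields.get("birthdate"));
    }

    public static void main(String[] args) {
        String[] names = {"id", "firstName", "lastName", "birthdate"};
        Customer customer = toCustomer(tokenize("1,Ada,Lovelace,1815-12-10", names));
        System.out.println(customer.id + " " + customer.firstName + " " + customer.lastName);
    }
}
```

Keeping the two stages separate is what allows the same mapping logic to be reused with a different tokenizer, for example one for fixed-width files.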

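## File Reading and Writing with FlatFileItem

What do FlatFileItemReader and FlatFileItemWriter do for us?

1. They read and write in fixed-size portions, which matters most for large files, so developers do not have to manage the underlying file streams.

2. File reads and writes are kept consistent with transactions.

### FlatFileItemReader in Detail

FlatFileItemReader reads files, typically tabular data or plain text. The first two properties below must be set; the last two are optional:

* setResource: the location of the file. Use ClassPathResource (a classpath location) or FileSystemResource (a file system path) to point at the file to read.

* setLineMapper: the line mapping, that is, how a line is converted into an entity object. The example code uses DefaultLineMapper.

* setEncoding: the character encoding used for reading. The default is given here as 'iso-8859-1', but it has differed between Spring Batch versions, so setting it explicitly is the safer choice.

* setStrict: strict mode. When enabled, a missing input file raises an exception and stops the current job. Defaults to true.

Example code: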

```java
@Bean
public FlatFileItemReader<Person> csvItemReader() {
    FlatFileItemReader<Person> csvItemReader = new FlatFileItemReader<>();
    csvItemReader.setResource(new ClassPathResource("data/sample-data.csv"));
    csvItemReader.setLineMapper(new DefaultLineMapper<Person>() {{
        // Split each line on commas and name the columns
        setLineTokenizer(new DelimitedLineTokenizer() {{
            setNames(new String[]{"name", "age"});
        }});
        // Bind the named columns to the matching Person properties
        setFieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
            setTargetType(Person.class);
        }});
    }});
    return csvItemReader;
}
```
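The same reader, including the optional encoding and strict-mode settings, can also be declared with the FlatFileItemReaderBuilder that Spring Batch has offered since version 4. The following is a sketch under assumptions: the `Person` bean with `name` and `age` properties is hypothetical (the article uses it but never shows it), the configuration class name is made up, and the file path is the one from the example above.

```java
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;

@Configuration
public class CsvReaderBuilderConfiguration {

    // Hypothetical bean matching the "name" and "age" columns.
    public static class Person {
        private String name;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    @Bean
    public FlatFileItemReader<Person> csvItemReaderFromBuilder() {
        return new FlatFileItemReaderBuilder<Person>()
                .name("csvItemReaderFromBuilder")   // name used for restart state
                .resource(new ClassPathResource("data/sample-data.csv"))
                .encoding("UTF-8")                  // set explicitly instead of relying on the default
                .strict(true)                       // fail if the file does not exist
                .delimited()                        // comma-delimited tokenizer
                .names(new String[]{"name", "age"})
                .targetType(Person.class)           // bean-property based field mapping
                .build();
    }
}
```

The builder avoids the double-brace initialization used above, which creates an anonymous subclass for every configured object.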


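### FlatFileItemWriter in Detail

FlatFileItemWriter writes files: it streams batches of items to a file. Using it requires understanding the following methods:

- setLineAggregator: the counterpart of FlatFileItemReader's setLineMapper. It aggregates an object's properties into a string. Depending on the aggregator, you can set the delimiter (setDelimiter) and the names of the object properties to extract (setFieldExtractor).

  - LineAggregator is the interface for turning an object into an aggregated string.

  - ExtractorLineAggregator is an abstract class that implements LineAggregator. It uses a FieldExtractor to convert the object's properties into an array, and its subclasses turn that array into a string (doAggregate).

    - DelimitedLineAggregator extends ExtractorLineAggregator. It is the more commonly used option and joins the array with a given delimiter, a comma by default.

    - FormatterLineAggregator extends ExtractorLineAggregator. It formats the array into a string and validates the result against a maximum and minimum length.

  - PassThroughLineAggregator implements LineAggregator. It is the simplest option: the object's toString() return value becomes the line.

  - RecursiveCollectionLineAggregator implements LineAggregator. It iterates over a Collection<T>, joins the elements with the system line separator, and delegates each element to a configurable LineAggregator.

- setResource: the location of the output file, which is also required. The example code uses new ClassPathResource("/data/sample-data.txt"), but real projects more often use new FileSystemResource().

- setEncoding: the output encoding. The default is given here as iso-8859-1 as well; since defaults have differed between Spring Batch versions, set it explicitly rather than relying on the default.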

![file](https://graph.baidu.com/resource/222784a7966e50e2edb1101583297793.png)

Example code:

```java
@Bean
public FlatFileItemWriter<Person> txtItemWriter() {
    FlatFileItemWriter<Person> txtItemWriter = new FlatFileItemWriter<>();
    txtItemWriter.setAppendAllowed(true);
    txtItemWriter.setEncoding("UTF-8");
    txtItemWriter.setResource(new ClassPathResource("/data/sample-data.txt"));
    txtItemWriter.setLineAggregator(new DelimitedLineAggregator<Person>() {{
        // Join the extracted fields with a comma
        setDelimiter(",");
        // Extract the "name" and "age" properties from each Person
        setFieldExtractor(new BeanWrapperFieldExtractor<Person>() {{
            setNames(new String[]{"name", "age"});
        }});
    }});
    return txtItemWriter;
}
```
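The division of labour between field extraction and line aggregation can be sketched in plain Java. This is not the Spring Batch implementation: `LineAggregationSketch` and its methods are hypothetical names that only mimic the roles of a field extractor and of the delimited, formatter and pass-through aggregators, and the `Person` type is made up for the example.

```java
class LineAggregationSketch {
    // Hypothetical item type with the two properties used in the article's example.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public String toString() {
            return "Person(" + name + ", " + age + ")";
        }
    }

    // Role of a field extractor: turn an object into an array of field values.
    static Object[] extract(Person person) {
        return new Object[]{person.name, person.age};
    }

    // Role of a delimited aggregator: join the extracted fields with a delimiter.
    static String delimited(Object[] fields, String delimiter) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(delimiter);
            }
            line.append(fields[i]);
        }
        return line.toString();
    }

    // Role of a formatter aggregator: lay the fields out with a format string,
    // which is how fixed-width output is usually produced.
    static String formatted(Object[] fields, String format) {
        return String.format(format, fields);
    }

    // Role of a pass-through aggregator: use toString() as the line.
    static String passThrough(Object item) {
        return item.toString();
    }

    public static void main(String[] args) {
        Person person = new Person("Tom", 30);
        System.out.println(delimited(extract(person), ","));
        System.out.println(formatted(extract(person), "%-6s%3d"));
        System.out.println(passThrough(person));
    }
}
```

The extract step is shared by the delimited and formatter variants, which mirrors why both extend the same abstract class in the hierarchy described above.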

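# Reading Data from XML Files

1. Use StaxEventItemReader<T> to read XML data.
2. Example: add a customer.xml file to the project and read it.

**The XML file to read**

![file](https://graph.baidu.com/resource/22266ba2119d2ce23921a01583296313.png)

**pom.xml configuration**

```xml
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-oxm</artifactId>
</dependency>
<dependency>
    <groupId>com.thoughtworks.xstream</groupId>
    <artifactId>xstream</artifactId>
    <version>1.4.7</version>
</dependency>
```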

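StaxEventItemReader configuration:

```java
import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.xml.StaxEventItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.oxm.xstream.XStreamMarshaller;

@Configuration
public class XmlFileDemoJobConfiguration {
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    @Qualifier("xmlFileDemoWriter")
    private ItemWriter<? super Customer> xmlFileDemoWriter;

    @Bean
    public Job xmlFileDemoJob() {
        return jobBuilderFactory.get("xmlFileDemoJob")
                .start(xmlFileDemoStep())
                .build();
    }

    @Bean
    public Step xmlFileDemoStep() {
        return stepBuilderFactory.get("xmlFileDemoStep")
                .<Customer, Customer>chunk(10)
                .reader(xmlFileDemoReader())
                .writer(xmlFileDemoWriter)
                .build();
    }

    @Bean
    @StepScope
    public StaxEventItemReader<Customer> xmlFileDemoReader() {
        StaxEventItemReader<Customer> reader = new StaxEventItemReader<>();

        reader.setResource(new ClassPathResource("customer.xml"));
        // Name of the element that is the root of each fragment to process
        reader.setFragmentRootElementName("customer");

        // Target type for each fragment
        Map<String, Class<?>> map = new HashMap<>();
        map.put("customer", Customer.class);

        // Convert the XML fragment into an object
        XStreamMarshaller unMarshaller = new XStreamMarshaller();
        unMarshaller.setAliases(map);
        reader.setUnmarshaller(unMarshaller);

        return reader;
    }
}
```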

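Output writer:

```java
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component("xmlFileDemoWriter")
public class XmlFileDemoWriter implements ItemWriter<Customer> {
    @Override
    public void write(List<? extends Customer> items) throws Exception {
        // Print every customer in the chunk
        for (Customer customer : items) {
            System.out.println(customer);
        }
    }
}
```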


The output is as follows:

![file](https://graph.baidu.com/resource/222ac917e105e5648fe1c01583297500.png)
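What "fragment root element" means can be shown with the StAX API that ships with the JDK. The sketch below is an illustration, not the StaxEventItemReader implementation: `StaxFragmentSketch` and `readFragments` are hypothetical names, the sample XML is made up (the real customer.xml is only available as a screenshot), and each fragment is collected into a map of child element names to text instead of being unmarshalled into an object.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxFragmentSketch {
    // Streams through the document and yields one map per <fragmentRoot> element.
    // Only direct child elements with text content are handled.
    static List<Map<String, String>> readFragments(String xml, String fragmentRoot)
            throws XMLStreamException {
        XMLEventReader events =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<Map<String, String>> fragments = new ArrayList<>();
        Map<String, String> current = null; // fragment being built, if any
        String currentField = null;         // child element being read, if any

        while (events.hasNext()) {
            XMLEvent event = events.nextEvent();
            if (event.isStartElement()) {
                String name = event.asStartElement().getName().getLocalPart();
                if (name.equals(fragmentRoot)) {
                    current = new LinkedHashMap<>();
                } else if (current != null) {
                    currentField = name;
                }
            } else if (event.isCharacters() && current != null && currentField != null) {
                // Text may arrive in several events, so append rather than overwrite.
                current.merge(currentField, event.asCharacters().getData(), String::concat);
            } else if (event.isEndElement()) {
                String name = event.asEndElement().getName().getLocalPart();
                if (name.equals(fragmentRoot) && current != null) {
                    fragments.add(current);
                    current = null;
                }
                currentField = null;
            }
        }
        return fragments;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<customers>\n"
                + "  <customer><id>1</id><firstName>Ada</firstName></customer>\n"
                + "  <customer><id>2</id><firstName>Alan</firstName></customer>\n"
                + "</customers>";
        System.out.println(readFragments(xml, "customer"));
    }
}
```

Because the document is processed as a stream of events, only one fragment needs to be held in memory at a time, which is what makes this style of reading suitable for large XML files.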

## XML File Processing

Processing XML files requires the spring-oxm dependency. Only XML output is covered in detail here; reading works similarly.

XML is written with StaxEventItemWriter, which is used much like FlatFileItemWriter: both have a setResource method. StaxEventItemWriter's default encoding is UTF-8.

* setRootTagName: the name of the root element.

* setMarshaller: the mapping between objects and XML nodes.

Example code:

```java
@Bean
public StaxEventItemWriter<Person> xmlItemWriter() {
    StaxEventItemWriter<Person> xmlItemWriter = new StaxEventItemWriter<>();
    xmlItemWriter.setRootTagName("root");
    xmlItemWriter.setEncoding("UTF-8");
    xmlItemWriter.setResource(new ClassPathResource("/data/sample-data.xml"));
    xmlItemWriter.setMarshaller(new XStreamMarshaller() {{
        // Write each Person as a <person> element
        Map<String, Class<?>> map = new HashMap<>();
        map.put("person", Person.class);
        setAliases(map);
    }});
    return xmlItemWriter;
}
```


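# Reading Data from Multiple Files

1. Reading several files from a given directory in one pass is a very common requirement.

2. MultiResourceItemReader handles this: register the input files with it and set a delegate ItemReader that processes each file.

Example: the project's classpath holds three CSV files whose names start with file, as shown below:

![file](https://graph.baidu.com/resource/222cdef6af54e1131e13b01583300700.png)

MultiResourceItemReader configuration: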

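```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.Resource;

@Configuration
public class MultipleFileDemoJobConfiguration {
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    @Qualifier("multiFileDeWriter")
    private ItemWriter<? super Customer> multiFileDeWriter;

    // Every classpath file whose name matches file*.csv
    @Value("classpath*:/file*.csv")
    private Resource[] inputFiles;

    @Bean
    public Job multipleFileDemoJob() {
        return jobBuilderFactory.get("multipleFileDemoJob")
                .start(multipleFileDemoStep())
                .build();
    }

    @Bean
    public Step multipleFileDemoStep() {
        return stepBuilderFactory.get("multipleFileDemoStep")
                .<Customer, Customer>chunk(50)
                .reader(multipleResourceItemReader())
                .writer(multiFileDeWriter)
                .build();
    }

    private MultiResourceItemReader<Customer> multipleResourceItemReader() {
        MultiResourceItemReader<Customer> reader = new MultiResourceItemReader<>();

        // The delegate reads one file at a time; the resources are processed in turn
        reader.setDelegate(flatFileReader());
        reader.setResources(inputFiles);

        return reader;
    }

    @Bean
    public FlatFileItemReader<Customer> flatFileReader() {
        FlatFileItemReader<Customer> reader = new FlatFileItemReader<>();
        // Placeholder only: MultiResourceItemReader sets the actual resource on its delegate
        reader.setResource(new ClassPathResource("customer.csv"));
        // reader.setLinesToSkip(1);

        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setNames(new String[]{"id", "firstName", "lastName", "birthdate"});

        DefaultLineMapper<Customer> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(fieldSet ->
                Customer.builder().id(fieldSet.readLong("id"))
                        .firstName(fieldSet.readString("firstName"))
                        .lastName(fieldSet.readString("lastName"))
                        .birthdate(fieldSet.readString("birthdate"))
                        .build());
        lineMapper.afterPropertiesSet();

        reader.setLineMapper(lineMapper);

        return reader;
    }
}
```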



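Output writer:

```java
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component("multiFileDeWriter")
public class MultiFileDeWriter implements ItemWriter<Customer> {
    @Override
    public void write(List<? extends Customer> items) throws Exception {
        // Print every customer in the chunk
        for (Customer customer : items) {
            System.out.println(customer);
        }
    }
}
```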

The output is as follows:

![file](https://graph.baidu.com/resource/22239a27970444932658a01583300958.png)
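The delegation idea behind reading several files as one input can be sketched in plain Java. This is a simplified illustration, not the MultiResourceItemReader implementation: `MultiSourceReaderSketch` is a hypothetical name, and each "file" is represented by an in-memory list of lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class MultiSourceReaderSketch {
    private final Iterator<List<String>> sources; // the remaining "files"
    private Iterator<String> currentSource;       // the "file" currently being read

    MultiSourceReaderSketch(List<List<String>> sources) {
        this.sources = sources.iterator();
    }

    // Returns the next line across all sources, or null once every source is exhausted.
    String read() {
        while (currentSource == null || !currentSource.hasNext()) {
            if (!sources.hasNext()) {
                return null; // no more sources: end of input
            }
            currentSource = sources.next().iterator(); // move on to the next source
        }
        return currentSource.next();
    }

    // Convenience helper that drains a fresh reader over the given sources.
    static List<String> readAll(List<List<String>> sources) {
        MultiSourceReaderSketch reader = new MultiSourceReaderSketch(sources);
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.read()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("a", "b"),
                Collections.<String>emptyList(),
                Arrays.asList("c"));
        System.out.println(readAll(files));
    }
}
```

The caller sees a single reader that returns null only once, after the last source, so the step logic does not need to know how many files there are.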
