SpringBoot + MyBatis 百万级流式Excel导出工具

1,105 阅读4分钟

因为业务调整,我们的某个工程也随之迁移。从原来的“你只管大胆写剩下的交给机器”到现在需要注意性能优化,特别是我们的订单导出,一次性百万的量让小机器吃不消了,敢点击导出就敢OOM重启,就要重新弄一个能支持百万级别导出的Excel工具

一、依赖

<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>easyexcel</artifactId>
    <version>3.1.5</version>
</dependency>

二、mysql开启流式导出

1.xml

<select id="ExportData" fetchSize="-2147483648" resultType="java.util.Map">

2.注解

@Options(fetchSize = Integer.MIN_VALUE)

必须要加上 fetchSize="-2147483648" 这个标识才行,为什么呢?

/**
 * We only stream result sets when they are forward-only, read-only, and the
 * fetch size has been set to Integer.MIN_VALUE
 *
 * @return true if this result set should be streamed row at-a-time, rather
 *         than read all at once.
 */
protected boolean createStreamingResultSet() {
    return ((this.resultSetType == java.sql.ResultSet.TYPE_FORWARD_ONLY) &&   (this.resultSetConcurrency == java.sql.ResultSet.CONCUR_READ_ONLY) && (this.fetchSize == Integer.MIN_VALUE));
}

jdbc包里的实现就是这样,只有满足这3个条件才会创建流式的结果集,resultSetType和resultSetConcurrency默认值已经对应的了,但是fetchSize默认值是0,所以我们需要手动添加fetchSize="-2147483648"才行。我们拿到这个 iterator结果集 的时候,也是一条一条遍历并通过游标偏移去获取数据的,性能上会有损耗,对性能有要求就要考虑其他方法了。

三、文件返回接口以及默认实现

/**  
* @description: 需要什么类型的返回就自己实现  
* @author: hinotoyk  
* @created: 2023/11/30  
*/  
public interface FileSender {  
    void send(File file);  
}


 
/**
 * @description: 默认的文件导出实现类,使用response流导出
 * @author: hinotoyk
 * @created: 2023/11/30
 */

@Slf4j
public class ResponseStreamSender implements FileSender {

    private String fileName;
    private HttpServletResponse response;

    public ResponseStreamSender(String fileName, HttpServletResponse response) {
        this.fileName = fileName;
        this.response = response;
    }

    @Override
    public void send(File file) {
        try {
            response.addHeader("Access-Control-Expose-Headers", "Content-Disposition");
            response.setContentType("application/vnd.ms-excel");
            response.setCharacterEncoding("utf-8");
            response.setHeader("Content-disposition", "attachment;filename=" +
                URLEncoder.encode(fileName, "UTF-8") + ".xlsx");
            try (InputStream inputStream = new FileInputStream(file); OutputStream outputStream = response.getOutputStream()) {
                byte[] buffer = new byte[4096];
                int bytesRead = -1;
                while ((bytesRead = inputStream.read(buffer)) != -1) {
                    outputStream.write(buffer, 0, bytesRead);
                }
                outputStream.flush();
            } catch (FileNotFoundException e) {
                log.error(e.getMessage(), e);
            } catch (IOException e) {
                log.error(e.getMessage(), e);
            }
        } catch (UnsupportedEncodingException e) {
            log.error(e.getMessage(), e);
        }
    }
}

三、StreamExportExcelUtils工具实现


import com.alibaba.excel.EasyExcel;
import com.alibaba.excel.ExcelWriter;
import com.alibaba.excel.write.metadata.WriteSheet;
import lombok.extern.slf4j.Slf4j;
import org.apache.ibatis.cursor.Cursor;
import org.apache.ibatis.session.SqlSession;
import org.apache.ibatis.session.SqlSessionFactory;

import java.io.File;
import java.io.IOException;
import java.util.*;
import java.util.function.Consumer;

/**
 * @description:
 * @author: hinotoyk
 * @created: 2023/12/11
 */

@Slf4j
public class StreamExportExcelUtils<T>  {

    /****
     *
     * @param fileName 文件名
     * @param titles 表头 ["名字","年龄"]
     * @param keys 表头对应的key ["name","key"]
     * @param search 搜索条件,可为null
     * @param getDataMethodString 获取数据的方法全限定名
     * @param dataProcessingFunction 可以对读取出来的数据进行再处理,可为null
     * @param batchSize 单次写入Excel的数据量
     * @param sheetCount 单个sheet的总行数
     * @param fileSender 文件发送接口
     */
    public static <T> void  exportExcel(String fileName, String[] titles, String[] keys,
                                       Object search, String getDataMethodString,
                                       Consumer<List<T>> dataProcessingFunction, Long batchSize, Long sheetCount,
                                       FileSender fileSender){
        //当前sheet索引
        int sheetIndex = 0;
        //当前sheet行数
        long currentSheetCount = 0L;
        File tempXlsxFile = null;
        SqlSessionFactory sqlSessionFactory = ApplicationContext.getBean(SqlSessionFactory.class);
        try (SqlSession sqlSession = sqlSessionFactory.openSession()) {
            //写入Excel
            String tempXlsxFileName = UUID.randomUUID().toString().replaceAll("-", "");
            tempXlsxFile = File.createTempFile(tempXlsxFileName, ".xlsx");

            try(ExcelWriter excelWriter = EasyExcel.write(tempXlsxFile).build()){

                //初始化sheet
                WriteSheet writeSheet = EasyExcel.writerSheet(fileName).build();
                writeSheet.setHead((handlerTitle(titles)));
                excelWriter.write(new ArrayList<>(), writeSheet);
                Cursor<T> selectCursor = sqlSession.selectCursor(getDataMethodString, search);
                Iterator<T> iterator = selectCursor.iterator();
                List<T> tempList = new ArrayList<>();
                while (iterator.hasNext()){
                    T next = iterator.next();
                    tempList.add(next);
                    if(tempList.size() >= batchSize){
                        currentSheetCount += tempList.size();

                        if (currentSheetCount >= sheetCount) {
                            //另起一个sheet
                            writeSheet = EasyExcel.writerSheet(fileName + (++sheetIndex)).build();
                            writeSheet.setHead((handlerTitle(titles)));
                            excelWriter.write(new ArrayList<>(), writeSheet);
                            currentSheetCount = 0L;
                        }
                        //数据再处理
                        if(Objects.nonNull(dataProcessingFunction)){
                            dataProcessingFunction.accept(tempList);
                        }
                        //写出Excel
                        excelWriter.write(handlerData(tempList, keys), writeSheet);
                        //避免内存溢出
                        tempList.clear();
                    }
                }

                excelWriter.finish();
                //发送数据
                fileSender.send(tempXlsxFile);

                sqlSession.clearCache();
                selectCursor.close();
            } catch (IOException e) {
                log.error(e.getMessage(),e);
                throw new RuntimeException(e);
            }
        } catch (IOException e) {
            log.error(e.getMessage(),e);
            throw new RuntimeException(e);
        } finally {
            if(tempXlsxFile != null && tempXlsxFile.exists()){
                tempXlsxFile.delete();
            }
        }

    }

    /**
     * 将title处理为easyExcel需要的格式
     *
     * @param titles 导出数据的标题
     * @return 处理后数据
     */
    private static List<List<String>> handlerTitle(String[] titles) {
        List<List<String>> result = new ArrayList<>();
        for (String title : titles) {
            List<String> head = new ArrayList<>();
            head.add(title);
            result.add(head);
        }
        return result;
    }

    /**
     * 把数据处理成key对应的顺序的List集合
     *
     * @param dataList 源数据
     * @param keys 标题对应的字段key
     * @return 处理后数据
     */
    private static <T> List<List<Object>> handlerData(List<T> dataList, String[] keys) {
        List<List<Object>> result = new ArrayList<>();
        if (dataList.isEmpty()) {
            return result;
        }
        T firstElement = dataList.get(0);
        if (firstElement instanceof Map) {
            for (T item : dataList) {
                Map<String, Object> m = (Map) item;
                List<Object> data = new ArrayList<>();
                result.add(data);
                for (String key : keys) {
                    Object o = m.get(key);
                    if (o instanceof Long) {
                        //long过长会变成科学计数法,直接转字符串解决
                        o = String.valueOf(o);
                    }
                    data.add(o);
                }
            }
        } else {
            //todo
            //其他对象转Map
        }
        return result;
    }

}

四、测试&总结

设置不同的堆内存进行测试

代码demo

long start = System.currentTimeMillis();  
StreamExportExcelUtils.exportExcel("用户信息表",titles,keys,  
search,"com.hinotoyk.mapper.UsersMapper.ExportData",  
null,size, 1000000L,new ResponseStreamSender("用户信息表",response));  
long end = System.currentTimeMillis();  
log.info("StreamExportExcelUtils:[{}]ms", (end - start));

测试一

-Xmx204m  
-Xms102m 
注:单次次写入excel2w数据

image.png

image.png

测试二

-Xmx2040m  
-Xms1020m  
注:单次写入excel20w数据

image.png

image.png

总结

目前的最直观的效果就是无论内存大小怎样都能成功导出百万级的数据,内存释放也是合理的。并且同样100w数据量导出,内存大时可以通过加大单批次的写入量极大的减少导出时间。这次写个工具也从头到尾debug了一遍mybatis代理执行jdbc的调用链路,巩固了一遍基础。