Exporting a Binary File to Excel


1. Confirming the requirement

This is a sad story.

Backend: We need to add an Excel export feature.
Me: Then send me the Excel file.
Backend: No, I'll give you JSON and you generate the Excel yourself.
Me: &…%
A while later...
Backend: Actually I can't send you JSON, the data is too large. I'll send you a binary file, plus the conversion rules.
Me: &
&*…&
Me: How big is this binary?
Backend: 30-40 MB.
Me: A binary that size turns into an Excel file of at least 200 MB, which won't even open. Go talk to the client, this can't be done.
Backend: Try it yourself and see how big the Excel from a 30-40 MB binary is, and whether it can be opened.
Me: @#$%… then give me the binary data!
Backend: Mock it yourself.
I had nothing left to say.

2. The experiment

Experiment plan:

  • 1. Generate the binary data
  • 2. Read the binary
  • 3. Convert the binary into objects
  • 4. Convert the objects into Excel

Let's get started.

1. Generating the binary data

By "binary" the backend means a string of 0s and 1s, so we just need to generate a roughly 30 MB string. We'll use Node's fs module to generate the file (the sample below writes decimal digits rather than 0s and 1s, which is close enough for a size test):

var fs = require("fs");
// write to test.txt
var writerStream = fs.createWriteStream('test.txt');
for (var i = 0; i < 5000000; i++) {
  writerStream.write(`${i}`, 'utf8');
}
// mark the write as finished
writerStream.end();
writerStream.on('finish', function () {
  console.log('write finished');
});
// on failure
writerStream.on('error', function () {
  console.log('write failed');
});

The code above writes 33,888,890 bytes, about 34 MB.

2. Reading the binary

We read the file back with fs.readFile:

const fs = require("fs");
let date9 = new Date();
fs.readFile("./test.txt", "utf-8", function (error, data) {
  // check the error before touching data
  if (error) return console.log("failed to read file: " + error.message);
  console.log(`read time: ${new Date() - date9} ms`);
  console.log(`text length: ${data.length}`);
});

Reading the 34 MB file:
read time: 36 ms, text length: 33888890

3. Converting the binary into objects

After repeated confirmation, the conversion rule is: the backend first sends the table header, i.e. the object keys. Every two bytes of the payload encode one value, so if the header has 100 keys, each object consumes 200 bytes. The file read above therefore yields 33888890 / 200 objects. A two-byte chunk is converted to a character with String.fromCharCode().
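This works because String.fromCharCode coerces its argument to a number, so a two-character digit chunk is interpreted as a character code. A quick sanity check:

```javascript
// String.fromCharCode coerces the string "65" to the number 65.
console.log(String.fromCharCode("65")); // "A"
console.log(String.fromCharCode("97")); // "a"
```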
I first wrote the conversion recursively, but it overflowed the stack; at a colleague's suggestion I switched to for loops:

/**
 * @param {string} str  the binary string
 * @param {number} len  bytes needed per object
 */
function conversion(str, len) {
  let date = new Date()
  let a = ""    // the slice for one object
  let data = [] // the resulting array
  let key = [...new Array(100).keys()] // mocked header keys
  // outer loop: slice out the bytes for one object at a time
  for (let i = 0; i < str.length / len; i++) {
    a = str.substring(i * len, (i + 1) * len)
    // a fresh object each iteration; reusing one object would make
    // every array entry point at the same reference
    let obj = {}
    // inner loop: slice out the two bytes for each key
    for (let j = 0; j < a.length / 2; j++) {
      let b = a.substring(j * 2, (j + 1) * 2)
      // convert the two-byte chunk to a character and assign it as the value
      obj[key[j]] = String.fromCharCode(b);
    }
    // push the object into the array
    data.push(obj)
  };
  console.log(`key count: ${key.length}`)
  console.log(`array length: ${data.length}`)
  console.log(`conversion time: ${new Date() - date} ms`)
}

Call the function once the file has been read successfully:

fs.readFile("./test.txt", "utf-8", function (error, data) {
  // check the error before touching data
  if (error) return console.log("failed to read file: " + error.message);
  console.log(`read time: ${new Date() - date9} ms`)
  console.log(`text length: ${data.length}`)
  conversion(data, 200)
});

The results:

Reading the 34 MB file:
read time: 38 ms
text length: 33888890
key count: 100
array length: 169445
conversion time: 951 ms

Now let's write the generated data to a file to see how big it gets. First, in a single write:

  // start the timer before kicking off the write, not inside the callback
  let dates = new Date()
  fs.writeFile('./test.js', JSON.stringify(data), 'utf8', function (err) {
    // if err is null the write succeeded, otherwise it failed
    if (err)
      console.log('write failed: ' + err);
    else
      console.log(`write time: ${new Date() - dates} ms`)
  })

The result:

Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

It crashed: the file was never finished because JavaScript ran out of heap. To see how large the output actually gets, I shrank the input and generated a 6.3 MB binary string instead. That 6.3 MB of binary produced a 26 MB array; the size of the generated array really depends on the size of the header.

4. Generating the Excel sheet from the data

I hadn't done this part before either; I found an article and followed its steps. Install the dependencies:

npm install -S file-saver xlsx
npm install -S script-loader

Create an Excel folder under src, and inside it create Blob.js and Export2Excel.js.
Blob.js:

/* eslint-disable */
/* Blob.js
 * A Blob implementation.
 * 2014-05-27
 *
 * By Eli Grey, http://eligrey.com
 * By Devin Samarin, https://github.com/eboyjr
 * License: X11/MIT
 *   See LICENSE.md
 */

/*global self, unescape */
/*jslint bitwise: true, regexp: true, confusion: true, es5: true, vars: true, white: true,
 plusplus: true */

/*! @source http://purl.eligrey.com/github/Blob.js/blob/master/Blob.js */

(function (view) {
    "use strict";

    view.URL = view.URL || view.webkitURL;

    if (view.Blob && view.URL) {
        try {
            new Blob;
            return;
        } catch (e) {}
    }

    // Internally we use a BlobBuilder implementation to base Blob off of
    // in order to support older browsers that only have BlobBuilder
    var BlobBuilder = view.BlobBuilder || view.WebKitBlobBuilder || view.MozBlobBuilder || (function(view) {
            var
                get_class = function(object) {
                    return Object.prototype.toString.call(object).match(/^\[object\s(.*)\]$/)[1];
                }
                , FakeBlobBuilder = function BlobBuilder() {
                    this.data = [];
                }
                , FakeBlob = function Blob(data, type, encoding) {
                    this.data = data;
                    this.size = data.length;
                    this.type = type;
                    this.encoding = encoding;
                }
                , FBB_proto = FakeBlobBuilder.prototype
                , FB_proto = FakeBlob.prototype
                , FileReaderSync = view.FileReaderSync
                , FileException = function(type) {
                    this.code = this[this.name = type];
                }
                , file_ex_codes = (
                    "NOT_FOUND_ERR SECURITY_ERR ABORT_ERR NOT_READABLE_ERR ENCODING_ERR "
                    + "NO_MODIFICATION_ALLOWED_ERR INVALID_STATE_ERR SYNTAX_ERR"
                ).split(" ")
                , file_ex_code = file_ex_codes.length
                , real_URL = view.URL || view.webkitURL || view
                , real_create_object_URL = real_URL.createObjectURL
                , real_revoke_object_URL = real_URL.revokeObjectURL
                , URL = real_URL
                , btoa = view.btoa
                , atob = view.atob

                , ArrayBuffer = view.ArrayBuffer
                , Uint8Array = view.Uint8Array
                ;
            FakeBlob.fake = FB_proto.fake = true;
            while (file_ex_code--) {
                FileException.prototype[file_ex_codes[file_ex_code]] = file_ex_code + 1;
            }
            if (!real_URL.createObjectURL) {
                URL = view.URL = {};
            }
            URL.createObjectURL = function(blob) {
                var
                    type = blob.type
                    , data_URI_header
                    ;
                if (type === null) {
                    type = "application/octet-stream";
                }
                if (blob instanceof FakeBlob) {
                    data_URI_header = "data:" + type;
                    if (blob.encoding === "base64") {
                        return data_URI_header + ";base64," + blob.data;
                    } else if (blob.encoding === "URI") {
                        return data_URI_header + "," + decodeURIComponent(blob.data);
                    } if (btoa) {
                        return data_URI_header + ";base64," + btoa(blob.data);
                    } else {
                        return data_URI_header + "," + encodeURIComponent(blob.data);
                    }
                } else if (real_create_object_URL) {
                    return real_create_object_URL.call(real_URL, blob);
                }
            };
            URL.revokeObjectURL = function(object_URL) {
                if (object_URL.substring(0, 5) !== "data:" && real_revoke_object_URL) {
                    real_revoke_object_URL.call(real_URL, object_URL);
                }
            };
            FBB_proto.append = function(data/*, endings*/) {
                var bb = this.data;
                // decode data to a binary string
                if (Uint8Array && (data instanceof ArrayBuffer || data instanceof Uint8Array)) {
                    var
                        str = ""
                        , buf = new Uint8Array(data)
                        , i = 0
                        , buf_len = buf.length
                        ;
                    for (; i < buf_len; i++) {
                        str += String.fromCharCode(buf[i]);
                    }
                    bb.push(str);
                } else if (get_class(data) === "Blob" || get_class(data) === "File") {
                    if (FileReaderSync) {
                        var fr = new FileReaderSync;
                        bb.push(fr.readAsBinaryString(data));
                    } else {
                        // async FileReader won't work as BlobBuilder is sync
                        throw new FileException("NOT_READABLE_ERR");
                    }
                } else if (data instanceof FakeBlob) {
                    if (data.encoding === "base64" && atob) {
                        bb.push(atob(data.data));
                    } else if (data.encoding === "URI") {
                        bb.push(decodeURIComponent(data.data));
                    } else if (data.encoding === "raw") {
                        bb.push(data.data);
                    }
                } else {
                    if (typeof data !== "string") {
                        data += ""; // convert unsupported types to strings
                    }
                    // decode UTF-16 to binary string
                    bb.push(unescape(encodeURIComponent(data)));
                }
            };
            FBB_proto.getBlob = function(type) {
                if (!arguments.length) {
                    type = null;
                }
                return new FakeBlob(this.data.join(""), type, "raw");
            };
            FBB_proto.toString = function() {
                return "[object BlobBuilder]";
            };
            FB_proto.slice = function(start, end, type) {
                var args = arguments.length;
                if (args < 3) {
                    type = null;
                }
                return new FakeBlob(
                    this.data.slice(start, args > 1 ? end : this.data.length)
                    , type
                    , this.encoding
                );
            };
            FB_proto.toString = function() {
                return "[object Blob]";
            };
            FB_proto.close = function() {
                this.size = this.data.length = 0;
            };
            return FakeBlobBuilder;
        }(view));

    view.Blob = function Blob(blobParts, options) {
        var type = options ? (options.type || "") : "";
        var builder = new BlobBuilder();
        if (blobParts) {
            for (var i = 0, len = blobParts.length; i < len; i++) {
                builder.append(blobParts[i]);
            }
        }
        return builder.getBlob(type);
    };
}(typeof self !== "undefined" && self || typeof window !== "undefined" && window || this.content || this));


Export2Excel.js:

/* eslint-disable */
require('script-loader!file-saver');
require('./Blob.js');
require('script-loader!xlsx/dist/xlsx.core.min');
function generateArray(table) {
    var out = [];
    var rows = table.querySelectorAll('tr');
    var ranges = [];
    for (var R = 0; R < rows.length; ++R) {
        var outRow = [];
        var row = rows[R];
        var columns = row.querySelectorAll('td');
        for (var C = 0; C < columns.length; ++C) {
            var cell = columns[C];
            var colspan = cell.getAttribute('colspan');
            var rowspan = cell.getAttribute('rowspan');
            var cellValue = cell.innerText;
            if (cellValue !== "" && cellValue == +cellValue) cellValue = +cellValue;

            //Skip ranges
            ranges.forEach(function (range) {
                if (R >= range.s.r && R <= range.e.r && outRow.length >= range.s.c && outRow.length <= range.e.c) {
                    for (var i = 0; i <= range.e.c - range.s.c; ++i) outRow.push(null);
                }
            });

            //Handle Row Span
            if (rowspan || colspan) {
                rowspan = rowspan || 1;
                colspan = colspan || 1;
                ranges.push({s: {r: R, c: outRow.length}, e: {r: R + rowspan - 1, c: outRow.length + colspan - 1}});
            }

            //Handle Value
            outRow.push(cellValue !== "" ? cellValue : null);

            //Handle Colspan
            if (colspan) for (var k = 0; k < colspan - 1; ++k) outRow.push(null);
        }
        out.push(outRow);
    }
    return [out, ranges];
};

function datenum(v, date1904) {
    if (date1904) v += 1462;
    var epoch = Date.parse(v);
    return (epoch - new Date(Date.UTC(1899, 11, 30))) / (24 * 60 * 60 * 1000);
}

function sheet_from_array_of_arrays(data, opts) {
    var ws = {};
    var range = {s: {c: 10000000, r: 10000000}, e: {c: 0, r: 0}};
    for (var R = 0; R != data.length; ++R) {
        for (var C = 0; C != data[R].length; ++C) {
            if (range.s.r > R) range.s.r = R;
            if (range.s.c > C) range.s.c = C;
            if (range.e.r < R) range.e.r = R;
            if (range.e.c < C) range.e.c = C;
            var cell = {v: data[R][C]};
            if (cell.v == null) continue;
            var cell_ref = XLSX.utils.encode_cell({c: C, r: R});

            if (typeof cell.v === 'number') cell.t = 'n';
            else if (typeof cell.v === 'boolean') cell.t = 'b';
            else if (cell.v instanceof Date) {
                cell.t = 'n';
                cell.z = XLSX.SSF._table[14];
                cell.v = datenum(cell.v);
            }
            else cell.t = 's';

            ws[cell_ref] = cell;
        }
    }
    if (range.s.c < 10000000) ws['!ref'] = XLSX.utils.encode_range(range);
    return ws;
}

function Workbook() {
    if (!(this instanceof Workbook)) return new Workbook();
    this.SheetNames = [];
    this.Sheets = {};
}

function s2ab(s) {
    var buf = new ArrayBuffer(s.length);
    var view = new Uint8Array(buf);
    for (var i = 0; i != s.length; ++i) view[i] = s.charCodeAt(i) & 0xFF;
    return buf;
}

export function export_table_to_excel(id) {
    var theTable = document.getElementById(id);
    var oo = generateArray(theTable);
    var ranges = oo[1];

    /* original data */
    var data = oo[0];
    var ws_name = "SheetJS";
    console.log(data);

    var wb = new Workbook(), ws = sheet_from_array_of_arrays(data);

    /* add ranges to worksheet */
    // ws['!cols'] = ['apple', 'banan'];
    ws['!merges'] = ranges;

    /* add worksheet to workbook */
    wb.SheetNames.push(ws_name);
    wb.Sheets[ws_name] = ws;

    var wbout = XLSX.write(wb, {bookType: 'xlsx', bookSST: false, type: 'binary'});

    saveAs(new Blob([s2ab(wbout)], {type: "application/octet-stream"}), "test.xlsx")
}

function formatJson(jsonData) {
    console.log(jsonData)
}
export function export_json_to_excel(th, jsonData, defaultTitle) {

    /* original data */

    var data = jsonData;
    data.unshift(th);
    var ws_name = "SheetJS";

    var wb = new Workbook(), ws = sheet_from_array_of_arrays(data);


    /* add worksheet to workbook */
    wb.SheetNames.push(ws_name);
    wb.Sheets[ws_name] = ws;

    var wbout = XLSX.write(wb, {bookType: 'xlsx', bookSST: false, type: 'binary'});
    var title = defaultTitle || '列表'
    saveAs(new Blob([s2ab(wbout)], {type: "application/octet-stream"}), title + ".xlsx")
}

To use it, call the following two methods:

 export2Excel() {
        require.ensure([], () => {
          const { export_json_to_excel } = require('../../excel/Export2Excel');
          const tHeader = ['商品名称', '货号/条码', '进价', '售价', '会员价赠送积分', 'VIP价', '赠送积分', '进货数量'];
          // tHeader holds the titles for the first row of the sheet
          const filterVal = ['goodsName', 'goodsId', 'goodsImportCost', 'goodsImportPrice', 'vipGiveIntegral', 'vipPrice', 'giveIntegral', 'goodsImportCount'];
          // goodsName, goodsId, ... are properties of the objects in tableData
          const list = this.tableData;  // the tableData from this component's data
          const data = this.formatJson(filterVal, list);
          export_json_to_excel(tHeader, data, '订货单明细');
        })
      },
 formatJson(filterVal, jsonData) {
        return jsonData.map(v => filterVal.map(j => v[j]))
      },
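formatJson turns the array of row objects into the array of arrays that export_json_to_excel expects. A small standalone illustration (the goods values are made up for the example):

```javascript
// Pick the listed keys out of each object, producing one array per row.
function formatJson(filterVal, jsonData) {
  return jsonData.map((v) => filterVal.map((j) => v[j]));
}

const rows = [
  { goodsName: "Tea", goodsId: "A001" },
  { goodsName: "Coffee", goodsId: "A002" },
];
console.log(formatJson(["goodsName", "goodsId"], rows));
// [ [ 'Tea', 'A001' ], [ 'Coffee', 'A002' ] ]
```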

First, let's request a 10 MB binary file and see how large the exported Excel is. The code:

// convert the binary into an array
 conversion(str, len) {
      let date = new Date();
      let a = "";
      let data = [];
      let key = [...new Array(100).keys()];
      for (let i = 0; i < str.length / len; i++) {
        a = str.substring(i * len, (i + 1) * len);
        // a fresh object per row, otherwise every entry shares one reference
        let obj = {};
        for (let j = 0; j < a.length / 2; j++) {
          let b = a.substring(j * 2, (j + 1) * 2);
          obj[key[j]] = String.fromCharCode(b);
        }
        data.push(obj);
      }
      console.log(`array length: ${data.length}`);
      console.log(`conversion time (two for loops): ${new Date() - date} ms`);
      return {
        header: key,
        datas: data,
      };
    },
    // the export method
      exportExcel() {
      let date4 = new Date();
      // file() is this project's request helper; it resolves with the binary string
      file({}, "", "GET").then((res) => {
        console.log(`binary file request time: ${new Date() - date4} ms`);
        const {
          header,
          datas
        } = this.conversion(res, 200);
        let date5 = new Date();
        require.ensure([], () => {
          const {
            export_json_to_excel,
          } = require("../../../Excel/Export2Excel.js");
          const tHeader = header;   // first-row titles of the sheet
          const filterVal = header; // the keys to pick from each object
          const list = datas;
          const data = this.formatJson(filterVal, list);
          export_json_to_excel(tHeader, data, "测试文件");
          // log inside the callback, since require.ensure may resolve asynchronously
          console.log(`Excel generation time: ${new Date() - date5} ms`);
        });
      });
    },
    formatJson(filterVal, jsonData) {
      return jsonData.map((v) => filterVal.map((j) => v[j]));
    },

The test results: a 10 MB binary file produced a 127 MB Excel file, and opening it shows nothing but blank cells. Shrinking the binary to 6 MB and trying again: 6 MB produced a 101 MB Excel file. Opening it:

Experiment results

After repeated experiments, an Excel file generated from more than about 7 MB of binary simply won't open. Since real data is more complex, a conservative estimate is that a single Excel export should come from at most roughly 5 MB of binary; a 30-40 MB binary should be split across multiple Excel files.
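A split along those lines could be sketched as follows (my own sketch: ROWS_PER_FILE is a guessed budget of 5 MB of binary at 200 bytes per row, and the export function is passed in so it can stand in for export_json_to_excel):

```javascript
// Export the rows as several workbooks, each covering at most ROWS_PER_FILE rows.
const ROWS_PER_FILE = Math.floor((5 * 1024 * 1024) / 200); // 26214 rows ≈ 5 MB of binary

function exportInParts(header, rows, baseTitle, exportFn) {
  let part = 0;
  for (let i = 0; i < rows.length; i += ROWS_PER_FILE) {
    part += 1;
    // each slice becomes its own file, e.g. "测试文件-1.xlsx", "测试文件-2.xlsx", ...
    exportFn(header, rows.slice(i, i + ROWS_PER_FILE), `${baseTitle}-${part}`);
  }
  return part;
}

// Stubbed usage: count the parts a 60,000-row dataset would be split into.
const rows = Array.from({ length: 60000 }, (_, i) => [i]);
const parts = exportInParts(["id"], rows, "test", () => {});
console.log(parts); // 3
```

Passing the export function as a parameter keeps the splitting logic testable without actually building workbooks.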