A Complete Guide to Large-File Chunked Uploads: From Principles to Implementation



Preface

In real-world development we frequently need to upload large files: videos, archives, big installers, and so on. Uploading them directly through a plain input[type=file] runs into several problems:

  • Request timeouts: a large file takes long to upload, and the server may time out
  • Memory pressure: reading the whole file into memory at once can crash the browser
  • No resumable uploads: after a network interruption, the upload must restart from scratch
  • No progress feedback: the user has no idea how far along the upload is

Chunked upload is the standard solution to all of these. This article walks you through implementing a complete large-file chunked upload feature from scratch.

The source code is open source: github.com/313258196/l…


I. How Chunked Upload Works

Core idea

┌─────────────────────────────────────────────────────────────────┐
│                      Large file (100 MB)                        │
│  ┌────────┬────────┬────────┬────────┬────────┬────────┐       │
│  │ chunk0 │ chunk1 │ chunk2 │ chunk3 │  ...   │ chunkN │       │
│  │ 5MB    │ 5MB    │ 5MB    │ 5MB    │        │ 5MB    │       │
│  └───┬────┴───┬────┴───┬────┴───┬────┴────────┴───┬────┘       │
└──────┼────────┼────────┼────────┼─────────────────┼────────────┘
       │        │        │        │                 │
       ↓        ↓        ↓        ↓                 ↓
   ┌────────┐ ┌────────┐ ┌────────┐           ┌────────┐
   │upload 0│ │upload 1│ │upload 2│    ...    │upload N│
   └───┬────┘ └───┬────┘ └───┬────┘           └───┬────┘
       │          │          │                    │
       └──────────┴────┬─────┴────────────────────┘
                       ↓
            ┌───────────────────┐
            │ server-side merge │
            └───────────────────┘

Implementation steps

Step | Frontend                                          | Backend
-----|---------------------------------------------------|--------------------------------
1    | Compute the file hash as a unique ID              | -
2    | Slice the file into fixed-size chunks (e.g. 5 MB) | -
3    | Upload chunks sequentially or concurrently        | Receive and store each chunk
4    | Ask the backend to merge                          | Merge all chunks into one file
5    | Receive the final file URL                        | Return the file URL
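The five steps above can be sketched end to end before diving into the details. This is a minimal illustration only: `hashFile`, `uploadChunk`, and `merge` are hypothetical injected functions standing in for the real implementations shown later in this article.

```javascript
// Minimal sketch of the whole pipeline: hash -> slice -> upload -> merge.
// The three I/O steps are injected so the control flow stays visible
// (and testable); their names are illustrative, not a real API.
async function uploadPipeline(file, { hashFile, uploadChunk, merge }, chunkSize = 5 * 1024 * 1024) {
  const fileHash = await hashFile(file)                 // step 1: unique ID
  const chunks = []
  for (let start = 0; start < file.size; start += chunkSize) {
    chunks.push(file.slice(start, Math.min(start + chunkSize, file.size)))  // step 2: fixed-size slices
  }
  for (let i = 0; i < chunks.length; i++) {             // step 3: upload each slice
    await uploadChunk({ fileHash, chunk: chunks[i], chunkIndex: i, total: chunks.length })
  }
  return merge({ fileHash, filename: file.name })       // steps 4-5: merge, get the URL back
}
```

The real implementation below runs step 3 with a concurrency limit rather than strictly one chunk at a time.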

II. Project Quick Start

Project structure

large-file-block-upload/
├── client/           # Frontend Vue 3 project
│   ├── src/
│   │   ├── App.vue
│   │   └── main.js
│   └── package.json
├── server.js         # Backend Node.js server
├── uploads/          # Directory for uploaded files
└── package.json

Getting it running

# 1. Clone the project
git clone https://github.com/313258196/large-file-block-upload.git

# 2. Install dependencies
npm install

# 3. Start (frontend and backend together)
npm run dev

Once started, the backend runs at http://localhost:3132 and the frontend at http://localhost:5173.


III. Frontend Implementation in Detail

1. Core file-slicing code

/**
 * Split a file into multiple Blobs
 * @param {File} file - the file to upload
 * @param {number} chunkSize - size of each chunk, default 5 MB
 * @returns {Array<Blob>} array of chunks
 */
function createFileChunks(file, chunkSize = 5 * 1024 * 1024) {
  const chunks = []
  let start = 0
  
  while (start < file.size) {
    const end = Math.min(start + chunkSize, file.size)
    chunks.push(file.slice(start, end))
    start = end
  }
  
  return chunks
}

2. Computing the file hash (SparkMD5)

import SparkMD5 from 'spark-md5'

/**
 * Compute the hash of a file
 * @param {File} file 
 * @returns {Promise<string>}
 */
function calculateFileHash(file) {
  return new Promise((resolve, reject) => {
    const chunkSize = 2 * 1024 * 1024  // 2MB per chunk
    const chunks = Math.ceil(file.size / chunkSize)
    let currentChunk = 0
    
    const spark = new SparkMD5.ArrayBuffer()
    const fileReader = new FileReader()
    
    fileReader.onload = (e) => {
      spark.append(e.target.result)
      currentChunk++
      
      if (currentChunk < chunks) {
        loadNext()
      } else {
        const hash = spark.end()
        resolve(hash)
      }
    }
    
    fileReader.onerror = reject
    
    function loadNext() {
      const start = currentChunk * chunkSize
      const end = Math.min(start + chunkSize, file.size)
      fileReader.readAsArrayBuffer(file.slice(start, end))
    }
    
    loadNext()
  })
}

3. A complete uploader class

import axios from 'axios'
import SparkMD5 from 'spark-md5'

class ChunkUploader {
  constructor(options = {}) {
    this.chunkSize = options.chunkSize || 5 * 1024 * 1024  // 5 MB
    this.concurrency = options.concurrency || 3            // concurrent requests
    this.uploadUrl = '/upload'
    this.mergeUrl = '/merge'
  }
  
  /**
   * Upload a single file
   * @param {File} file 
   * @param {Function} onProgress - progress callback
   */
  async upload(file, onProgress) {
    // 1. Compute the file hash
    // (calculateHash / createChunks wrap the functions from sections 1 and 2)
    const fileHash = await this.calculateHash(file)
    const ext = file.name.split('.').pop()
    
    // 2. Create the chunks
    const chunks = this.createChunks(file)
    const total = chunks.length
    
    // 3. Prepare the per-chunk upload payloads
    const uploadList = chunks.map((chunk, index) => ({
      fileHash,
      chunk,
      chunkIndex: index,
      total,
      filename: file.name,
      ext
    }))
    
    // 4. Upload the chunks concurrently
    let uploaded = 0
    await this.concurrentUpload(uploadList, () => {
      uploaded++
      onProgress && onProgress(Math.round((uploaded / total) * 100))
    })
    
    // 5. Ask the server to merge
    const result = await this.mergeChunks(fileHash, file.name, ext)
    return result
  }
  
  /**
   * Concurrency-limited upload of all chunks
   */
  async concurrentUpload(uploadList, onSingleComplete) {
    const results = []
    const executing = []
    const concurrency = this.concurrency
    
    for (const item of uploadList) {
      const promise = this.uploadChunk(item).then(() => {
        executing.splice(executing.indexOf(promise), 1)
        onSingleComplete()
      })
      
      results.push(promise)
      executing.push(promise)
      
      if (executing.length >= concurrency) {
        await Promise.race(executing)
      }
    }
    
    await Promise.all(results)
  }
  
  /**
   * Upload a single chunk
   */
  async uploadChunk({ fileHash, chunk, chunkIndex, total, filename, ext }) {
    const formData = new FormData()
    formData.append('file', chunk)
    formData.append('fileHash', fileHash)
    formData.append('chunkIndex', chunkIndex)
    formData.append('total', total)
    formData.append('filename', filename)
    formData.append('ext', ext)
    
    return axios.post(this.uploadUrl, formData, {
      headers: { 'Content-Type': 'multipart/form-data' }
    })
  }
  
  /**
   * Request the server-side merge
   */
  async mergeChunks(fileHash, filename, ext) {
    const { data } = await axios.post(this.mergeUrl, {
      fileHash,
      filename,
      ext
    })
    return data
  }
}

export default ChunkUploader

// Usage example
const uploader = new ChunkUploader({ chunkSize: 5 * 1024 * 1024 })

uploader.upload(file, (progress) => {
  console.log(`Upload progress: ${progress}%`)
}).then(result => {
  console.log('Upload complete:', result.url)
})

IV. Backend Implementation in Detail

1. Project setup

// server.js
const express = require('express')
const multiparty = require('multiparty')
const path = require('path')
const fs = require('fs-extra')
const cors = require('cors')

const app = express()
const PORT = 3132

// Enable CORS
app.use(cors())
app.use(express.json())

// File directories
const UPLOAD_DIR = path.resolve(__dirname, 'uploads')
const TEMP_DIR = path.resolve(__dirname, 'temp')

app.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}`)
})

2. Chunk-receiving endpoint

app.post('/upload', async (req, res) => {
  const form = new multiparty.Form()
  
  form.parse(req, async (err, fields, files) => {
    if (err) {
      return res.status(500).json({ error: err.message })
    }
    
    const fileHash = fields.fileHash[0]
    const chunkIndex = fields.chunkIndex[0]
    
    // Chunk storage directory: temp/{fileHash}/
    const chunkDir = path.resolve(TEMP_DIR, fileHash)
    await fs.ensureDir(chunkDir)
    
    // Move the uploaded chunk into place
    const chunkPath = path.resolve(chunkDir, chunkIndex)
    await fs.move(files.file[0].path, chunkPath)
    
    res.json({ code: 0, message: 'chunk uploaded' })
  })
})

3. Chunk-merging endpoint

app.post('/merge', async (req, res) => {
  const { fileHash, filename, ext } = req.body
  
  const chunkDir = path.resolve(TEMP_DIR, fileHash)
  const finalPath = path.resolve(UPLOAD_DIR, `${fileHash}.${ext}`)
  
  // Must match the chunkSize used on the client,
  // because each chunk's write offset is index * CHUNK_SIZE
  const CHUNK_SIZE = 5 * 1024 * 1024
  
  // Make sure the upload directory and the target file exist
  // (the 'r+' flag below requires an existing file)
  await fs.ensureDir(UPLOAD_DIR)
  await fs.ensureFile(finalPath)
  
  // List all chunks and sort them by index
  const chunkPaths = await fs.readdir(chunkDir)
  chunkPaths.sort((a, b) => Number(a) - Number(b))
  
  // Merge: stream each chunk to its offset in the final file
  await Promise.all(
    chunkPaths.map((chunkName, index) => {
      return new Promise((resolve, reject) => {
        const chunkPath = path.resolve(chunkDir, chunkName)
        const readStream = fs.createReadStream(chunkPath)
        const writeStream = fs.createWriteStream(finalPath, {
          start: index * CHUNK_SIZE,
          flags: 'r+'  // write into the existing file at the given offset
        })
        
        readStream.pipe(writeStream)
        writeStream.on('finish', resolve)
        readStream.on('error', reject)
        writeStream.on('error', reject)
      })
    })
  )
  
  // Remove the temporary chunk directory
  await fs.remove(chunkDir)
  
  // Return the file URL
  const url = `http://localhost:${PORT}/uploads/${fileHash}.${ext}`
  res.json({ code: 0, url })
})

4. Serving the uploaded files

app.use('/uploads', express.static(UPLOAD_DIR))

V. Complete Vue 3 Component

<template>
  <div class="upload-container">
    <div class="upload-area" @click="triggerUpload" @dragover.prevent @drop="handleDrop">
      <input 
        type="file" 
        ref="fileInput" 
        @change="handleFileSelect" 
        multiple 
        style="display: none;"
      />
      <div v-if="!uploading">
        <div class="upload-icon">📁</div>
        <p>Click or drag files here to upload</p>
        <p class="hint">Large files and multi-select supported</p>
      </div>
      <div v-else>
        <el-progress 
          :percentage="totalProgress" 
          :status="uploadStatus"
          style="width: 300px;"
        />
        <p>{{ statusText }}</p>
      </div>
    </div>
    
    <div class="file-list" v-if="fileList.length">
      <div v-for="file in fileList" :key="file.uid" class="file-item">
        <span class="file-name">{{ file.name }}</span>
        <span class="file-size">{{ formatSize(file.size) }}</span>
        <el-progress 
          v-if="file.progress" 
          :percentage="file.progress" 
          style="width: 100px;"
        />
        <span v-if="file.url" class="success">✓ Uploaded</span>
      </div>
    </div>
  </div>
</template>

<script setup>
import { ref, computed } from 'vue'
import { ElProgress, ElMessage } from 'element-plus'
import ChunkUploader from './ChunkUploader'

const fileInput = ref(null)
const fileList = ref([])
const uploading = ref(false)
const totalProgress = ref(0)

const uploader = new ChunkUploader({
  chunkSize: 5 * 1024 * 1024,
  concurrency: 3
})

const uploadStatus = computed(() => {
  if (totalProgress.value < 100) return null
  return 'success'
})

const statusText = computed(() => {
  const completed = fileList.value.filter(f => f.url).length
  return `${completed}/${fileList.value.length} files uploaded`
})

function triggerUpload() {
  fileInput.value.click()
}

function handleFileSelect(e) {
  const files = Array.from(e.target.files)
  addFiles(files)
}

function handleDrop(e) {
  const files = Array.from(e.dataTransfer.files)
  addFiles(files)
}

function addFiles(files) {
  files.forEach(file => {
    fileList.value.push({
      uid: Date.now() + Math.random(),
      name: file.name,
      size: file.size,
      raw: file,
      progress: 0
    })
  })
  
  startUpload()
}

async function startUpload() {
  uploading.value = true
  
  for (const file of fileList.value) {
    if (file.url) continue  // skip files that are already uploaded
    
    try {
      const result = await uploader.upload(file.raw, (progress) => {
        file.progress = progress
        calculateTotalProgress()
      })
      file.url = result.url
    } catch (err) {
      console.error('Upload failed:', err)
      ElMessage.error(`Failed to upload ${file.name}`)
    }
  }
  
  calculateTotalProgress()
  uploading.value = false
  if (fileList.value.every(f => f.url)) {
    ElMessage.success('All files uploaded!')
  }
}

function calculateTotalProgress() {
  const total = fileList.value.reduce((sum, f) => {
    return sum + (f.progress || 0)
  }, 0)
  totalProgress.value = Math.round(total / fileList.value.length)
}

function formatSize(bytes) {
  if (bytes < 1024) return bytes + ' B'
  if (bytes < 1024 * 1024) return (bytes / 1024).toFixed(1) + ' KB'
  return (bytes / (1024 * 1024)).toFixed(1) + ' MB'
}
</script>

<style scoped>
.upload-area {
  border: 2px dashed #409eff;
  border-radius: 8px;
  padding: 40px;
  text-align: center;
  cursor: pointer;
  transition: all 0.3s;
}
.upload-area:hover {
  border-color: #67c23a;
  background: #f0f9ff;
}
</style>

VI. Advanced Features

1. Resumable uploads

How it works: before uploading, ask the server which chunks it already has, and upload only the missing ones.

// Backend: check endpoint
app.post('/check', async (req, res) => {
  const { fileHash } = req.body
  const chunkDir = path.resolve(TEMP_DIR, fileHash)
  
  if (await fs.pathExists(chunkDir)) {
    const chunks = await fs.readdir(chunkDir)
    res.json({ 
      shouldUpload: false,
      uploadedChunks: chunks.map(Number)
    })
  } else {
    res.json({ shouldUpload: true, uploadedChunks: [] })
  }
})
// Frontend: filter out chunks that are already uploaded
async upload(file, onProgress) {
  const fileHash = await this.calculateHash(file)
  
  // Ask which chunks the server already has
  const { data } = await axios.post('/check', { fileHash })
  const uploadedSet = new Set(data.uploadedChunks)
  
  const chunks = this.createChunks(file)
  const uploadList = chunks
    .map((chunk, index) => ({ chunk, chunkIndex: index }))
    .filter(item => !uploadedSet.has(item.chunkIndex))  // skip chunks the server already has
  
  // ... continue with the upload
}

2. Instant upload (file deduplication)

When a file with the same hash already exists, return its URL right away:

// Backend: check inside the merge endpoint
app.post('/merge', async (req, res) => {
  const { fileHash, ext } = req.body
  const finalPath = path.resolve(UPLOAD_DIR, `${fileHash}.${ext}`)
  
  // If the file already exists, return it immediately
  if (await fs.pathExists(finalPath)) {
    const url = `http://localhost:${PORT}/uploads/${fileHash}.${ext}`
    return res.json({ code: 0, url, fromCache: true })
  }
  
  // ... otherwise perform the merge
})

3. Dynamically adjusting concurrency

// Adjust based on the measured network speed (bytes per second)
function adjustConcurrency(speed) {
  if (speed > 1024 * 1024) {  // faster than 1 MB/s
    uploader.concurrency = 5
  } else if (speed > 512 * 1024) {
    uploader.concurrency = 3
  } else {
    uploader.concurrency = 1
  }
}
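adjustConcurrency needs a speed figure from somewhere. One simple way to obtain it (a sketch; this helper is not part of the article's repo) is to time each chunk upload and keep a rolling average in bytes per second:

```javascript
// Rolling upload-speed estimate over the last few completed chunks.
// record(bytes, ms) is called after each chunk upload finishes;
// bytesPerSecond() feeds adjustConcurrency(). Illustrative helper only.
function createSpeedMeter(windowSize = 5) {
  const samples = []  // { bytes, ms } of recent chunk uploads
  return {
    record(bytes, ms) {
      samples.push({ bytes, ms })
      if (samples.length > windowSize) samples.shift()  // keep the window bounded
    },
    bytesPerSecond() {
      const bytes = samples.reduce((sum, s) => sum + s.bytes, 0)
      const ms = samples.reduce((sum, s) => sum + s.ms, 0)
      return ms > 0 ? (bytes / ms) * 1000 : 0
    }
  }
}
```

Wrap each uploadChunk call with Date.now() before and after, record the chunk size and elapsed time, and call adjustConcurrency(meter.bytesPerSecond()) between chunks.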

VII. Running the Project

Start it

npm run dev

The console shows:

  • Frontend: http://localhost:5173
  • Backend: http://localhost:3132

What you get

  • Drag-and-drop or click-to-select upload
  • Multiple files uploaded at once
  • Real-time upload progress
  • Automatic chunking for large files
  • Automatic server-side merging

VIII. Summary

Key techniques

Technique           | Notes
--------------------|--------------------------------------------------
File slicing        | Blob.slice() cuts the file into fixed-size chunks
File identity       | SparkMD5 computes a unique hash
Concurrency control | Promise-based limit keeps the request count in check
Progress tracking   | uploaded chunks / total chunks
Server-side merge   | fs.createReadStream + pipe

Possible extensions

  • ✅ Resumable uploads
  • ✅ Instant upload (deduplication)
  • ✅ Pause/resume
  • ✅ Folder upload (webkitdirectory)
  • ✅ Image/video preview
  • ✅ Retry on failure
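Pause/resume from the list above is not implemented in the article's code. One common approach (a sketch, with `uploadChunk` as an injected function, e.g. an `axios.post` that forwards the `AbortSignal` via its `signal` option) is to abort in-flight requests on pause and skip already-finished chunks on restart:

```javascript
// Pause/resume sketch: in-flight chunk uploads are cancelled through an
// AbortController, and a restart skips chunks that already finished.
// uploadChunk(item, signal) is an injected function -- in a real app, an
// axios.post that passes `signal` so the HTTP request is aborted too.
class PausableUploader {
  constructor(uploadChunk) {
    this.uploadChunk = uploadChunk
    this.controller = null
    this.done = new Set()  // indices of chunks confirmed uploaded
  }
  
  pause() {
    if (this.controller) this.controller.abort()
  }
  
  // Returns true when every chunk is uploaded, false when paused mid-way.
  async start(uploadList) {
    this.controller = new AbortController()
    for (const item of uploadList) {
      if (this.done.has(item.chunkIndex)) continue      // resume: skip finished chunks
      if (this.controller.signal.aborted) return false  // paused: stop scheduling
      await this.uploadChunk(item, this.controller.signal)
      this.done.add(item.chunkIndex)
    }
    return true
  }
}
```

After a page reload, this.done would be rebuilt from the /check endpoint rather than kept in memory, and a production version would also catch the cancellation error that an aborted axios request throws.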

Finally

Source code: github.com/313258196/l…

If this article helped you, a Star ⭐ is much appreciated!

Questions and comments are welcome!