A Brief Look at Web Audio & Video — Video Edition


Introduction

Video editing has been around for a long time, yet articles about doing it on the web are surprisingly scarce. This article covers two parts of the editing workflow for MP4 video: generating frame thumbnails and rendering the video.

Generating thumbnails

Given an MP4 file (if you are curious about the container format itself, see the article mp4实例分析), how does the frontend extract the frames within a given time range? Here are four approaches.

Approach 1: extract frames with multiple video tags

The idea is to first fetch the file and obtain a Blob:

  fetch('/music/file', {
    method: 'get', // note: responseType is not a fetch option; res.blob() does the job
  })
    .then((res) => {
      return res.blob()
    })
    .then((blob) => {
      const blobURL = URL.createObjectURL(blob)
    })

Then, from the video's total duration and the width of the thumbnail strip, work out how many frames are needed (frameCount) and the time point of each frame, and feed them into the corresponding video tags:

const renderFrames = () => {
    // duration, frameCount, blobURL, frameWidth and frameHeight come from the surrounding component
    const frameOffsetTime = duration / frameCount
    return Array(frameCount)
      .fill(0)
      .map((_, index) => (
        <video
          key={index}
          src={`${blobURL}#t=${frameOffsetTime * index}`}
          width={frameWidth}
          height={frameHeight}
        />
      ))
  }
 

This relies on the media-fragment suffix #t=<time> appended to the blobURL, which makes each video element display the frame at that time point. Alternatively, you can simply set currentTime on each of these video tags to achieve the same thing.
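For the currentTime variant, here is a minimal sketch (videoEls is assumed to be the list of rendered video elements, and frameOffsetTime is the per-frame offset computed above):

// Seek each <video> element directly instead of relying on the #t= fragment
videoEls.forEach((video, index) => {
  video.currentTime = frameOffsetTime * index
})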

Approach 2: extract frames with a single video tag

The principle is similar: compute the time offset between frames, keep setting currentTime, and whenever the video's onSeeked event fires, draw that frame onto a canvas.

First set up a hidden video tag and a canvas to draw the thumbnails on:

    <video
      style={{ display: 'none' }}
      ref={videoRef}
      src={filePath}
      onCanPlay={onCanDrawFrame}
      onSeeked={onDrawFrame}
    />
    <canvas ref={canvasRef} />

Then, once the video fires canPlay, set up the canvas:

  const onCanDrawFrame = () => {
    const canvas = canvasRef.current.transferControlToOffscreen()
    canvas.width = framesWidth // total width of the thumbnail strip
    canvas.height = framesHeight // height of the thumbnail strip
    ctxRef.current = canvas.getContext('2d')
    ctxRef.current.clearRect(0, 0, canvas.width, canvas.height)
    videoRef.current.currentTime = 0 // setting currentTime triggers the onSeeked event
  }

  const onDrawFrame = () => {
    if (videoRef.current.currentTime < duration) {
      window.requestAnimationFrame(() => {
        ctxRef.current.drawImage(
          videoRef.current,
          videoRef.current.currentTime * secondWidth, // secondWidth: pixels per second, i.e. where this frame lands on the strip
          0,
          frameWidth,
          frameHeight,
        )
        videoRef.current.currentTime += frameOffsetTime // wait for the next onSeeked, then draw the next frame
      })
    }
  }
  

With this, a single canvas and a single video tag are enough to render the whole thumbnail strip.

Approach 3: using ffmpeg

This one is conceptually simple: a single ffmpeg command does the job, but you have to embed a trimmed-down ffmpeg build in the page, loaded and called via WebAssembly. I won't go into detail here; a working demo worth studying is CcClip. Note that if you use a SharedArrayBuffer build, the page must be served with the cross-origin isolation headers (plenty of articles cover this).
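As a rough illustration, here is a minimal sketch assuming the @ffmpeg/ffmpeg (ffmpeg.wasm) 0.11-style API; the file names and the fps=1 filter are only examples:

import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg'

// SharedArrayBuffer builds require the page to be cross-origin isolated:
//   Cross-Origin-Opener-Policy: same-origin
//   Cross-Origin-Embedder-Policy: require-corp
const ffmpeg = createFFmpeg({ log: true })

async function extractThumbnails(file) {
  await ffmpeg.load()
  // Write the input into ffmpeg's in-memory FS
  ffmpeg.FS('writeFile', 'input.mp4', await fetchFile(file))
  // One thumbnail per second, scaled to 160px wide
  await ffmpeg.run('-i', 'input.mp4', '-vf', 'fps=1,scale=160:-1', 'thumb_%03d.png')
  const data = ffmpeg.FS('readFile', 'thumb_001.png')
  return URL.createObjectURL(new Blob([data.buffer], { type: 'image/png' }))
}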

Approach 4: using the browser's built-in WebCodecs API

Chrome already ships with its own decoding machinery, so can we use it directly? Partly, yes: only a subset is exposed through the WebCodecs API and there is still a long way to go, but decoding video frames works fine. For details see this article (WebCodecs) or the official write-up.
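Since WebCodecs is not available in every browser yet, it is worth feature-detecting before relying on it — a minimal sketch:

// Fall back to one of the approaches above if WebCodecs is missing
if (typeof VideoDecoder === 'undefined') {
  console.warn('WebCodecs is not supported in this browser')
  // e.g. fall back to the <video> tag or ffmpeg.wasm approach
}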

The main flow is as follows:

(flow diagram)

Implementation

The main program just drops in a canvas tag. To make it easy to change the thumbnail dimensions, all we need is for the worker to hand back a usable URL; we create an OffscreenCanvas and transfer it to the worker so frames can be drawn there directly.

import { useEffect, useRef, useState } from 'react'
import MediaWorker from './media.worker.js'

function Main({ filePath }) { // filePath: address of the mp4 file
  const canvasRef = useRef(null)
  const [frameUrl, setFrameUrl] = useState('')
  useEffect(() => {
    const mediaWorker = new MediaWorker()
    const offscreenCanvas = canvasRef.current.transferControlToOffscreen()
    mediaWorker.postMessage(
      {
        md: 'init',
        info: {
          filePath,
          canvas: offscreenCanvas,
        },
      },
      [offscreenCanvas], // the canvas is a Transferable
    )

    mediaWorker.onmessage = ({ data }) => {
      setFrameUrl(data.frameUrl)
    }
  }, [])

  return (
    <div>
      <canvas ref={canvasRef} style={{ display: 'none' }} />
      <div style={{ backgroundImage: `url(${frameUrl})` }}></div>
    </div>
  )
}
  

Worker implementation, media.worker.js:

importScripts('/demuxer_mp4.js')

self.addEventListener('message', ({ data }) => {
  getRangeFrames(data.info)
})

function getRangeFrames({ canvas, filePath }) {
  const demuxer = new MP4Demuxer({
    filePath,
    mediaType: 'video',
  })
  new Promise((resolve, reject) => {
    let mediaInfo = {
      width: 0, // video width
      height: 0, // video height
      frameCount: 0,
      frames: [],
    }
    let hasDecodedVideoFrameLen = 0
    let lastVideoTimestamp = 0
    const videoDecoder = new VideoDecoder({
      async output(frame) {
        const { width, height, frameCount } = mediaInfo
        const timestamp = frame.timestamp
        // Turn the decoded VideoFrame into a renderable bitmap
        const bitmap = await createImageBitmap(frame, 0, 0, width, height)
        frame.close()
        mediaInfo.frames.push({
          bitmap,
          timestamp,
          duration: timestamp - lastVideoTimestamp,
          startTime: lastVideoTimestamp,
          endTime: timestamp,
        })
        lastVideoTimestamp = timestamp
        if (frameCount === ++hasDecodedVideoFrameLen) {
          // all frames decoded
          resolve(mediaInfo)
        }
      },
      error: reject,
    })

    demuxer.decodeTrack({
      mediaType: 'video',
      onConfig({ decodeConfig, info }) {
        mediaInfo = {
          ...mediaInfo,
          ...info,
        }
        videoDecoder.configure(decodeConfig)
      },
      onChunk(chunk, isEnd) {
        videoDecoder.decode(chunk)
        isEnd && videoDecoder.flush()
      },
    })
  })
    .then((mediaInfo) => {
      // mediaInfo.frames now holds every decoded frame; to show a given frame,
      // just draw its bitmap onto the canvas
      const renderFrames = []
      let i = 0
      const frameLen = mediaInfo.frames.length
      while (i < frameLen) {
        // pick (roughly) one frame per second
        renderFrames.push(mediaInfo.frames[i])
        i += Math.max(1, Math.round(mediaInfo.frameRate))
      }

      // renderFrames is what we draw; see the sketch below for one way
      // to turn it into the frameUrl the main thread is waiting for
    })
    .catch(console.error)
}
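The code above stops at collecting renderFrames. One way to close the loop — so the main thread actually receives data.frameUrl — is to draw the picked bitmaps onto the transferred OffscreenCanvas and post back a blob URL. A minimal sketch, with the thumbnail size chosen arbitrarily:

// Continuing inside the .then((mediaInfo) => { ... }) above, after renderFrames is built
const thumbWidth = 120 // arbitrary example size
const thumbHeight = 60
canvas.width = renderFrames.length * thumbWidth
canvas.height = thumbHeight
const ctx = canvas.getContext('2d')
renderFrames.forEach((item, index) => {
  ctx.drawImage(item.bitmap, index * thumbWidth, 0, thumbWidth, thumbHeight)
})
canvas.convertToBlob().then((blob) => {
  self.postMessage({ frameUrl: URL.createObjectURL(blob) })
})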


For demuxer_mp4.js you can refer to the official sample:

importScripts('/static/mp4box.all.min.js')

class MP4Source {
  constructor(uri) {
    this.file = MP4Box.createFile()
    this.file.onError = console.error.bind(console)
    this.file.onReady = this.onReady.bind(this)
    this.file.onSamples = this.onSamples.bind(this)

    fetch(uri).then((response) => {
      const reader = response.body.getReader()
      let offset = 0
      let mp4File = this.file

      function appendBuffers({ done, value }) {
        if (done) {
          mp4File.flush()
          return
        }
        let buf = value.buffer
        buf.fileStart = offset

        offset += buf.byteLength

        mp4File.appendBuffer(buf)

        return reader.read().then(appendBuffers)
      }

      return reader.read().then(appendBuffers)
    })

    this.info = null
    this._info_resolver = null
  }

  onReady(info) {
    // TODO: Generate configuration changes.
    this.info = info

    if (this._info_resolver) {
      this._info_resolver(info)
      this._info_resolver = null
    }
  }

  getInfo() {
    if (this.info) return Promise.resolve(this.info)

    return new Promise((resolver) => {
      this._info_resolver = resolver
    })
  }

  getAvccBox() {
    // TODO: make sure this is coming from the right track.
    return this.file.moov.traks[0].mdia.minf.stbl.stsd.entries[0].avcC
  }

  getAudioSpecificConfig(trakIndex) {
    return this.file.moov.traks[trakIndex].mdia.minf.stbl.stsd.entries[0].esds
      .esd.descs[0].descs[0].data
  }

  selectTrack(track) {
    this.file.setExtractionOptions(track.id)
  }

  start(onSamples) {
    this._onSamples = onSamples
    this.file.start()
  }

  stop() {
    this.file.stop()
  }

  onSamples(track_id, ref, samples) {
    this._onSamples(samples)
  }
}

// Demuxes the first video track of an MP4 file using MP4Box, calling
// `onConfig()` and `onChunk()` with appropriate WebCodecs objects.
class MP4Demuxer {
  constructor({ filePath, mediaType }) {
    this.filePath = filePath
    this.mediaType = mediaType
    this.source = new MP4Source(this.filePath)
  }

  async decodeTrack({ mediaType, onChunk, onConfig }) {
    const info = await this.source.getInfo()
    const track = info[`${mediaType}Tracks`][0]
    this.source.selectTrack(track)
    const config = this.getMediaConfig(info, mediaType, track)
    onConfig(config)

    let hasReadSampleLen = 0
    const ChunkType =
      mediaType === 'video' ? EncodedVideoChunk : EncodedAudioChunk
    this.source.start((samples) => {
      const samplesLen = samples.length
      for (const sample of samples) {
        const type = sample.is_sync ? 'key' : 'delta'
        const pts_us = (sample.cts * 1e6) / sample.timescale
        const duration_us = (sample.duration * 1e6) / sample.timescale
        onChunk(
          new ChunkType({
            type,
            timestamp: pts_us,
            duration: duration_us,
            data: sample.data,
          }),
          ++hasReadSampleLen === config.info.frameCount,
        )
      }
    })
  }

  getMediaConfig(info, mediaType, track) {
    const format = info.mime.replace(/(?=;\s).*/, '')
    if (mediaType === 'audio') {
      return {
        decodeConfig: {
          codec: track.codec,
          sampleRate: track.audio.sample_rate,
          numberOfChannels: track.audio.channel_count,
          description: this.source.getAudioSpecificConfig(
            this.mediaType === 'audio' ? 0 : 1,
          ),
        },
        info: {
          codec: track.codec,
          sampleRate: track.audio.sample_rate,
          numberOfChannels: track.audio.channel_count,
          sampleSize: track.audio.sample_size,
          format,
          frameCount: track.nb_samples,
          frameRate: track.timescale / 1000,
          bitrate: parseInt(track.bitrate),
          duration: parseInt((1e6 * track.duration) / track.timescale),
        },
      }
    } else {
      return {
        decodeConfig: {
          codec: track.codec,
          codedWidth: track.track_width,
          codedHeight: track.track_height,
          description: this.getAvcDescription(this.source.getAvccBox()),
        },
        info: {
          codec: track.codec,
          height: track.track_height,
          width: track.track_width,
          format,
          frameRate: track.timescale / 1000,
          frameCount: track.nb_samples,
          bitrate: parseInt(track.bitrate),
          duration: parseInt(
            (1e6 * track.duration) / track.timescale,
          ),
          displayAspectRatio: +(track.track_width / track.track_height).toFixed(
            3,
          ),
        },
      }
    }
  }

  getAvcDescription(avccBox) {
    const stream = new DataStream(undefined, 0, DataStream.BIG_ENDIAN)
    avccBox.write(stream)
    return new Uint8Array(stream.buffer, 8) // Remove the box header.
  }
}

With that, we can extract and render frames.

Video rendering

Real-time preview while editing can be built on canvas, in two steps: first create a timer, then render the frames that belong to the current time point.

Creating the timer

setTimeout is all you need, but the timing has to be measured against a reference start point:

let playTimer = null // timer handle
let currentTime = 0 // current playback position
let duration = 0 // total video duration
const tracks = [] // track info

const play = () => {
  playTimer && clearTimeout(playTimer)
  const timeStart = Date.now() // wall-clock time when playback starts
  let runTime = 0 // how long we have been playing
  const defaultStart = currentTime // position we started from

  const timeFn = () => {
    // setTimeout drifts, so measure elapsed time against the reference point timeStart
    runTime = (Date.now() - timeStart) / 1000 + defaultStart

    currentTime = runTime > duration ? duration : +runTime.toFixed(2)
    if (runTime <= duration) {
      renderFrame()
      playTimer = setTimeout(timeFn, 10)
    } else {
      // playback finished
      pause()
    }
  }
  timeFn()
}
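pause() is not shown in the snippet above; at a minimum it just clears the timer — a sketch under that assumption:

const pause = () => {
  if (playTimer) {
    clearTimeout(playTimer)
    playTimer = null
  }
}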

Rendering

const renderFrame = () => {
  const canvas = canvasRef.current
  const ctx = canvas.getContext('2d')
  window.requestAnimationFrame(() => {
    tracks.forEach((item) => {
      if (item.startTime <= currentTime && currentTime <= item.endTime) {
        ctx.drawImage(
          item.frame,
          item.clipX, // crop region of the source frame
          item.clipY,
          item.clipWidth,
          item.clipHeight,
          item.x, // where to draw it on the canvas
          item.y,
          item.width,
          item.height,
        )
      }
    })
  })
}

Conclusion

The code above only shows the core pieces and is meant to open up ideas. To go deeper, you will need to dig into how media files are encoded.