前端实现音频倍速(wasm)

1,163 阅读5分钟

背景

工作背景就职于教育类公司,从事直播业务。目前正在寻找一种新的音频播放方式。之前一直采用编解码成FMP4和通过MSE流式播放音频,但是收到用户反馈部分电脑播放不出声音,所以采用一种新的播放方案即通过AudioContext播放pcm实现。为什么要采用sonic?主要原因是之前采用FMP4的方案浏览器帮助我们实现了倍速效果,然而AudioContext播放pcm实现倍速需要自己来实现,所以采用sonic来实现。

编译sonic

一般我们在调试Webassmebly过程中,我们都是先在native层调试通,然后在编译成wasm和js调试。

首先下载sonic库:

git clone github.com/waywardgeek…

编译:

cd sonic/
make
make install

这个时候会生成sonic.a,sonic.o等文件,编写native代码现在native层调试通:

建立wrapper_sonic.c:

/*
 * @Author: xiuquanxu
 * @Company: kaochong
 * @Date: 2020-04-12 16:49:47
 * @LastEditors: xiuquanxu
 * @LastEditTime: 2020-07-24 11:35:18
 */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include "sonic.h"

#define TAG "wrapper_sonic"

typedef struct {
    sonicStream sonic_handle;
    uint8_t *in_data;
    uint8_t *out_data;
} as_sonic_t;

uint8_t* sonicInit(int sampleRate, int numChannels, int maxRate) {
    as_sonic_t *sonicer = (as_sonic_t *)malloc(sizeof(as_sonic_t));
    sonicer->sonic_handle = sonicCreateStream(sampleRate, numChannels);
    sonicer->in_data = (uint8_t *)malloc(sampleRate * numChannels * maxRate);
    sonicer->out_data = (uint8_t *)malloc(sampleRate * numChannels * maxRate);
    sonicSetSpeed(sonicer->sonic_handle, 0.5);
    return (uint8_t *)sonicer;
}

void setSpeed(uint8_t* dec, int speed) {
    as_sonic_t *sonicer = (as_sonic_t *)dec;
    sonicSetSpeed(sonicer->sonic_handle, speed);
}

int pcmHandleRateBySonic(uint8_t* dec, uint8_t* input, int input_len, uint8_t* output, int samples) {
    printf(" pcmHandleRateBySonic input_len\n");
    as_sonic_t *sonicer = (as_sonic_t *)dec;
    printf(" before memcpy\n");
    memcpy(sonicer->in_data, input, input_len);
    int write_res = sonicWriteShortToStream(sonicer->sonic_handle, (short *)sonicer->in_data, samples / 2);
    printf(" after memcpy write_res:%d\n", write_res);
    if (write_res != 1) {
        printf("%s pcmHandleRateBySonic write_res:%d\n", TAG, write_res);
        return write_res;
    }
    int read_res = sonicReadShortFromStream(sonicer->sonic_handle, (short *)sonicer->out_data, samples / 2);
    printf(" read res:%d\n", read_res);
    memcpy(output, sonicer->out_data, read_res * 2);
    return read_res * 2;
}

int destory(uint8_t *dec) {
    as_sonic_t *sonicer = (as_sonic_t *)dec;
    if (sonicer) {
        if (sonicer->sonic_handle) {
            sonicDestroyStream(sonicer->sonic_handle);
        }
        if (sonicer->in_data) {
            free(sonicer->in_data);
        }
        if (sonicer->out_data) {
            free(sonicer->out_data);
        }
        free(sonicer);
    }
    return 0;
}

int test() {
    int sampleRate = 48000;
    int numChannels = 1;
    int maxRate = 10;
    FILE *fin = NULL;
    FILE *fout = NULL;
    fin = fopen("/Users/xuxiuquan/mygithub/like-player-lib/lib-sonic/sonic/webassembly-test/test-pcm-web.pcm", "rb");
    fout = fopen("/Users/xuxiuquan/mygithub/like-player-lib/lib-sonic/sonic/webassembly-test/after-sonic-pcm.pcm", "wb");
    // 初始化sonic
    uint8_t* content = (uint8_t*)malloc(1024 * 1024);
    uint8_t* res = (uint8_t *)malloc(1024 * 1024);
    fread(content, 48000, 10, fin);
    uint8_t* son = sonicInit(sampleRate, numChannels, maxRate);
    int read_res = pcmHandleRateBySonic(son, content, 48000 * 10, res, 48000 * 10);
    printf("test res:%d\n", read_res);
    fwrite(res, read_res, 1, fout);
    return 0;
}

int main() {
    test();
    return 0;
}

test()方法为测试api的方法。这里我们二次封装了sonic的接口,初始化接口:sonicInit,倍速接口:setSpeed,处理pcm数据接口:pcmHandleRateBySonic。

需要注意的是fin和fout输入和输出,fin要写一个你本地的pcm路径,fout要写输出到你本地。

编译wrapper_sonic.c,这里我直接使用了gcc编译(如果对gcc使用不熟悉可以学习一下gcc命令,或者参考我上一篇编译libuv)。

 gcc wrapper_sonic.c -o test.a -I/Users/xuxiuquan/mygithub/like-player-lib/lib-sonic/sonic/ -L/Users/xuxiuquan/mygithub/like-player-lib/lib-sonic/sonic/ -lsonic

编译成功后会输出test.a,我们执行test.a: ./test.a

运行完后会将倍速的pcm输出到你定义的fout那个路径,这时候可以通过vlc播放该pcm(你通过其他方式也可以)。如果你有vlc可以这样播放:

/Applications/VLC.app/Contents/MacOS/VLC --demux=rawaud --rawaud-channels 1 --rawaud-samplerate 44800 /Users/xuxiuquan/Downloads/sonic-web-pcm.pcm(你的fout输出pcm路径)

如果正常倍速播放说明你在native层已经调试成功。

编译wasm

首先你要确保你安装了emsdk(一套将c、c++等编译成wasm的编译工具)。如果你没有安装,地址:

http://webassembly.org.cn/getting-started/developers-guide/

编写编译脚本:webassembly.sh

echo "----------------------------"
echo "start building"
# source /Users/xuxiuquan/github/webassembly/emsdk/emsdk_env.sh
rm -rf ../webassembly
mkdir ../webassembly
EMCC_DEBUG=1 \
EMMAKEN_CFLAGS="-I/usr/local/include -DNOPUS_HAVE_RTCD" \
emcc    ./sonic.c ./wrapper_sonic.c\
        -g2 \
        -s EMULATE_FUNCTION_POINTER_CASTS=1 \
        -s ASSERTIONS=2 \
        -s WASM=1 \
        -s ALLOW_MEMORY_GROWTH=1 \
        -s "MODULARIZE=1" \
        -s "EXPORT_NAME='WebSonic'" \
        -s "BINARYEN_METHOD='native-wasm'" \
        -s "EXPORTED_FUNCTIONS=['_sonicInit', '_pcmHandleRateBySonic', '_setSpeed', '_destory']" \
        -s EXTRA_EXPORTED_RUNTIME_METHODS='["ccall", "cwrap"]' \
        -o ../webassembly/sonic.js \
        -Wall \

具体编译参数可以参考下面链接:

https://emscripten.org/

这里主要看一下EXPORTED_FUNCTIONS这个参数,它代表着你要从wrapper_sonic.c和sonic.c中导出哪些函数,我们只需要导出我们自己封装的函数即可。这里注意三个事情,第一emcc相当于gcc,em++相当于g++,所以选好命令。第二main函数会默认被导出,如果你不需要在native代码中要注释掉。第三导出函数的写法要加一个_,这是emsdk的要求。

执行该脚本: sh webassembly.sh
在webassembly文件夹会生成四个文件,我们主要关心:sonic.js(胶水代码)和sonic.wasm(wasm文件)。

在工程中使用

我们需要把sonic.wasm通过网络请求加载到的二进制传递给胶水代码进行初始化,当初始化成功后会触发onRuntimeInitialized该函数。

所以加载代码如下:

    function loadWasm() {
        var req = new XMLHttpRequest();
        req.responseType = 'arraybuffer';
        req.addEventListener('load', () => {
            var wasmBuffer = req.response;
            <!-- 获取公用的内存区域 -->
            WebSonic['wasmBinary'] = wasmBuffer;
            var wasmDsp = WebSonic({ wasmBinary: WebSonic.wasmBinary });
            <!-- 导出native方法给js使用 -->
            var sonicInit = wasmDsp.cwrap('sonicInit', ['number'], ['number', 'number', 'number']);
            var pcmHandleRate = wasmDsp.cwrap('pcmHandleRateBySonic', ['number'], ['number', 'number', 'number', 'number', 'number']);
            var setSpeed = wasmDsp.cwrap('setSpeed', null, ['number', 'number']);
            <!-- 将方法绑定给全局对象 -->
            ModuleSonic.init = sonicInit;
            ModuleSonic.handleRate = pcmHandleRate;
            ModuleSonic.setSpeed = setSpeed;
            ModuleSonic.wasmDsp = wasmDsp;
        });
        req.open('GET', './webassembly/sonic.wasm');
        req.send();
    }

    function onRuntimeInitialized() {
        <!-- 调用emsdk自动导出的_malloc函数在公共区域申请一段内存 -->
        c_write_ptr = ModuleSonic.wasmDsp._malloc(1024 * 1024);
        c_reader_ptr = ModuleSonic.wasmDsp._malloc(1024 * 1024);
        handle = ModuleSonic.init(48000, 1, 10);
    }

我们利用先用的pcm_player这个库进行播放:

    function start() {
        <!-- pcm_player库 -->
        var player = new PCMPlayer({
            encoding: '16bitInt',
            channels: 1,
            sampleRate: 44800,
            flushingTime: 1000
        });

        const needWriteBuffer = new Uint8Array(buffer.slice(0, 48000 * 10));
        <!-- 将pcm数据buffer写入共用内存中 -->
        ModuleSonic.wasmDsp.HEAPU8.set(needWriteBuffer, c_write_ptr);
        <!-- 调用native接口进行倍速 -->
        var writeRes = ModuleSonic.handleRate(handle, c_write_ptr, 48000 * 10, c_reader_ptr, 48000 * 10);
        console.log("writeRes:", writeRes, c_reader_ptr);
        <!-- 取出倍速后的结果,以short类型,对应js就是Uint16Array -->
        const pcm = ModuleSonic.wasmDsp.HEAP16.slice(c_reader_ptr / 2, (c_reader_ptr + writeRes) / 2);
        <!-- 丢数据给player -->
        player.feed(pcm);
    }

以上代码都有注释可以帮助你清晰理解每句话意思。

通过以上操作即可在网页实现sonic库倍速pcm数据通过AudioContext来播放音频。

总结

  1. 介绍了常用调试的native和wasm编译的方法
  2. 介绍了sonic使用
  3. 介绍了sonic编译成wasm过程
  4. 实现了sonic集成网页播放

注:这种cpu密集运算实际工作中要放在webworker中进行。

github

github.com/this-spring…

后续

后面功能:

  1. 重采样(用于AudioContext大多数只支持44100采样,而我的测试数据是48000所以增加重采样功能)

  2. 在webworker中实践