Exploring Video Cover Extraction with FFmpeg


This article covers some technical details of the videosnap implementation in Snapshot, which replaces the VideoDecoder implementation in Glide to improve decoding efficiency, reduce memory usage, and lower the probability of crashes on high-resolution videos.

Compiling FFmpeg

Build environment

  • FFmpeg version: 4.2.2 or later
  • desktop OS: Ubuntu 20.04
  • NDK version: android-ndk-r21-linux-x86_64 or later

Build steps

For details, see Android之FFmpeg 5.0起航 (FFmpeg 5.0 on Android); its build instructions and scripts have been fully updated. You only need to add some configuration of your own, such as disabling all encoders and disabling the file protocols you do not need.
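As a rough sketch, the extra configuration might look like the following (flag names from FFmpeg's configure script; the exact set depends on which demuxers and decoders your app needs):

```shell
./configure \
    --disable-programs \
    --disable-doc \
    --disable-avdevice \
    --disable-encoders \
    --disable-muxers \
    --disable-protocols \
    --enable-protocol=file \
    --enable-protocol=pipe
```

Run `./configure --help` against your FFmpeg checkout to confirm the available component flags before relying on this list.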

Choosing the IO type

Android layer: input IO options

Android's file management is fragmented: the File and Uri schemes coexist, direct file-path access is being phased out, some Android versions do not fully support the Uri scheme, and on top of that Android 10+ introduces the crippled scoped storage. The IO strategy therefore needs some thought.

Option 1: convert the Uri to a File (not recommended)

  • In earlier versions you could read a file's absolute path from the media database and turn it into a File object, but Android has since blocked this approach (it now fails)
  • Use a file descriptor (already restricted in Android 11)
    /**
     * Note: remember to release the underlying AssetFileDescriptor
     */
    public static File convertUriToFile(Context context, Uri uri) throws FileNotFoundException {
        int fd = context.getContentResolver().openAssetFileDescriptor(uri, "r").getParcelFileDescriptor().getFd();
        File result = null;
        if (fd != -1) {
            String path = "/proc/self/fd/" + fd;
            result = new File(path);
        }
        return result;
    }

Option 2: use a file descriptor

    /**
     * detachFd() takes ownership of the raw fd away from the ParcelFileDescriptor,
     * so closing the AssetFileDescriptor via try-with-resources no longer
     * invalidates the returned fd; the caller must close(2) it when done.
     * Also note that resources opened from assets or raw may return a non-zero
     * afd.getStartOffset(), which must be honored.
     */
    public static int getFd(Context context, Uri uri) {
        try (AssetFileDescriptor afd = context.getContentResolver().openAssetFileDescriptor(uri, "r")) {
            return afd.getParcelFileDescriptor().detachFd();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return -1;
    }

FFmpeg layer: custom IO

The custom read_packet, write_packet, and seek callbacks:

static int read_packet(void *opaque, uint8_t *buf, int buf_size) {
    int fd = *(static_cast<int *>(opaque));
    int ret = read(fd, buf, buf_size);
    if (ret == 0) {
        return AVERROR_EOF;
    }
    return (ret == -1) ? AVERROR(errno) : ret;
}

static int write_packet(void *opaque, uint8_t *buf, int buf_size) {
    int fd = *(static_cast<int *>(opaque));
    int ret = write(fd, buf, buf_size);
    return (ret == -1) ? AVERROR(errno) : ret;
}

static int64_t seek(void *opaque, int64_t offset, int whence) {
    int fd = *(static_cast<int *>(opaque));
    int64_t ret;
    if (whence == AVSEEK_SIZE) {
        // AVSEEK_SIZE asks for the stream size instead of performing a seek.
        struct stat64 st;
        ret = fstat64(fd, &st);
        return ret < 0 ? AVERROR(errno) : (S_ISFIFO(st.st_mode) ? 0 : st.st_size);
    }
    ret = lseek64(fd, offset, whence);
    return ret < 0 ? AVERROR(errno) : ret;
}

Creating the AVIOContext

The buffer passed in must be allocated with av_malloc(), since libavformat may free and replace it internally; write_flag is 0 because we only read:

AVIOContext *avioContext = avio_alloc_context(buffer, bufferSize, 0, &fd, &read_packet,
                                                  &write_packet,
                                                  &seek);

Setting pb and flags

    AVFormatContext *avFormatContext = avformat_alloc_context();
    .....
    avFormatContext->pb = avioContext;
    avFormatContext->flags |= AVFMT_FLAG_CUSTOM_IO;
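
One caveat worth noting: because AVFMT_FLAG_CUSTOM_IO is set, avformat_close_input will not free the custom AVIOContext, so the caller has to clean it up. A minimal sketch, assuming the avFormatContext and avioContext from the snippets above:

```cpp
// Close the demuxer; with AVFMT_FLAG_CUSTOM_IO it leaves pb untouched.
avformat_close_input(&avFormatContext);
// libavformat may have replaced the buffer originally passed in, so free the
// one currently held by the context, then free the context itself.
av_freep(&avioContext->buffer);
avio_context_free(&avioContext);
```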

FFmpeg decoding flow

  1. Initialize the AVIOContext: avio_alloc_context
  2. Initialize the AVFormatContext: avformat_alloc_context
  3. Set the custom IO: avFormatContext->pb = avioContext; avFormatContext->flags |= AVFMT_FLAG_CUSTOM_IO;
  4. Open the input and fill in avFormatContext: avformat_open_input
  5. Read the stream information: avformat_find_stream_info
  6. Find the video stream and its decoder: av_find_best_stream
  7. Allocate the decoder context: avcodec_alloc_context3
  8. Copy the codec parameters obtained in step 6 into the AVCodecContext: avcodec_parameters_to_context
  9. Open the decoder: avcodec_open2
  10. Read the next packet from the stream: av_read_frame
  11. Feed compressed packets to the decoder: avcodec_send_packet
  12. Receive decoded frames from the decoder: avcodec_receive_frame

Loop through steps 10 → 11 → 12 until the data is exhausted.
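The 10 → 11 → 12 loop can be sketched as follows (error handling trimmed; avFormatContext, avCodecContext, and videoStreamIndex are assumed to have been set up in steps 1–9):

```cpp
AVPacket *packet = av_packet_alloc();
AVFrame *frame = av_frame_alloc();
while (av_read_frame(avFormatContext, packet) >= 0) {               // step 10
    if (packet->stream_index == videoStreamIndex &&
        avcodec_send_packet(avCodecContext, packet) == 0) {         // step 11
        while (avcodec_receive_frame(avCodecContext, frame) == 0) { // step 12
            // A decoded AVFrame is available here, e.g. hand it to drawBitmap().
        }
        // AVERROR(EAGAIN) from receive_frame means: feed more packets first.
    }
    av_packet_unref(packet);
}
av_frame_free(&frame);
av_packet_free(&packet);
```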

Filling the AndroidBitmap

The conversion is done as frames come out of the decoder, filling the pixel memory region of the AndroidBitmap. The library uses libyuv for the corresponding cropping and scaling.

  1. Compute the height and width scale ratios hScale and wScale between the source and the output image.
  2. If hScale and wScale are equal, scale directly; otherwise go to the next step.
  3. If the smaller of hScale and wScale is at least 2, scale the data first and then crop; otherwise crop first and then scale.
  4. Finally, convert the YUV data to RGB and fill the target bitmap.

void drawBitmap(AVFrame *frame, int outWidth, int outHeight, uint8_t *data) {
    if (frame->format != AV_PIX_FMT_YUV420P) {
        LOGE("format is not 420 %d", frame->format);
        return;
    }
    if (frame->data[0] == nullptr || frame->data[1] == nullptr || frame->data[2] == nullptr) {
        LOGE("format data is null");
        return;
    }
    int srcW = frame->width;
    int srcH = frame->height;
    if (srcW < 1 || srcH < 1) {
        LOGE("format width %d or height %d not right", srcW, srcH);
        return;
    }
    float wScale = srcW * 1.0F / outWidth;
    float hScale = srcH * 1.0F / outHeight;
    int cropWidth;
    int cropHeight;
    int cropX;
    int cropY;
    bool isCropScale = false;
    bool isScaleCrop = false;
    if (wScale != hScale) {
        // Aspect ratios differ, so a center crop is needed in addition to scaling.
        float s = fmin(wScale, hScale);
        if (s >= 2.0F) {
            srcW = frame->width / s;
            srcH = frame->height / s;
            if (srcH < outHeight) {
                srcH = outHeight;
            }
            if (srcW < outWidth) {
                srcW = outWidth;
            }
            isScaleCrop = true;
            cropWidth = outWidth;
            cropHeight = outHeight;
        } else {
            isCropScale = true;
            if (srcW > srcH) {
                cropWidth = srcH;
                cropHeight = srcH;
            } else {
                cropWidth = srcW;
                cropHeight = srcW;
            }
        }
        cropX = (srcW - cropWidth) / 2;
        cropY = (srcH - cropHeight) / 2;
        // Keep the crop offsets even so the half-resolution chroma planes stay aligned.
        if ((cropX & 0b01) != 0) {
            cropX -= 1;
        }
        if ((cropY & 0b01) != 0) {
            cropY -= 1;
        }
    }

    // Allocate a contiguous I420 buffer (Y + U + V) at the output size.
    int halfWidth = outWidth >> 1;
    int outWH = outWidth * outHeight;
    auto *temp = new uint8_t[outWH * 3 / 2];
    uint8_t *temp_u = temp + outWH;
    uint8_t *temp_v = temp + outWH * 5 / 4;

    if (isCropScale) { // center-crop to the target aspect ratio first, then scale
        int hw = cropWidth >> 1;
        int wh = cropWidth * cropHeight;
        auto *crop = new uint8_t[wh * 3 / 2];
        uint8_t *crop_u = crop + wh;
        uint8_t *crop_v = crop + wh * 5 / 4;
        uint8_t *src_y = frame->data[0] + (frame->linesize[0] * cropY + cropX);
        uint8_t *src_u = frame->data[1] + frame->linesize[1] * (cropY / 2) + (cropX / 2);
        uint8_t *src_v = frame->data[2] + frame->linesize[2] * (cropY / 2) + (cropX / 2);
        // I420Rotate with kRotate0 is used as a plane-wise copy of the cropped region.
        int result = libyuv::I420Rotate(src_y, frame->linesize[0], src_u, frame->linesize[1], src_v,
                                        frame->linesize[2],
                                        crop, cropWidth,
                                        crop_u, hw,
                                        crop_v, hw,
                                        cropWidth, cropHeight, libyuv::kRotate0);
        result = libyuv::I420Scale(
                crop, cropWidth,
                crop_u, hw,
                crop_v, hw,
                cropWidth, cropHeight,
                temp, outWidth,
                temp_u, halfWidth,
                temp_v, halfWidth,
                outWidth, outHeight,
                libyuv::FilterModeEnum::kFilterNone
        );
        delete[]crop;
    } else if (isScaleCrop) { // scale down first, then center-crop
        int hw = srcW >> 1;
        int wh = srcW * srcH;
        auto *scale = new uint8_t[wh * 3 / 2];
        uint8_t *scale_u = scale + wh;
        uint8_t *scale_v = scale + wh * 5 / 4;
        int result = libyuv::I420Scale(
                frame->data[0], frame->linesize[0],
                frame->data[1], frame->linesize[1],
                frame->data[2], frame->linesize[2],
                frame->width, frame->height,
                scale, srcW,
                scale_u, hw,
                scale_v, hw,
                srcW, srcH,
                libyuv::FilterModeEnum::kFilterNone
        );
        uint8_t *src_y = scale + (srcW * cropY + cropX);
        uint8_t *src_u = scale_u + hw * (cropY / 2) + (cropX / 2);
        uint8_t *src_v = scale_v + hw * (cropY / 2) + (cropX / 2);
        result = libyuv::I420Rotate(src_y, srcW, src_u, hw, src_v, hw,
                                    temp, outWidth,
                                    temp_u, halfWidth,
                                    temp_v, halfWidth,
                                    cropWidth, cropHeight, libyuv::kRotate0);
        delete[]scale;
    } else {
        libyuv::I420Scale(
                frame->data[0], frame->linesize[0],
                frame->data[1], frame->linesize[1],
                frame->data[2], frame->linesize[2],
                frame->width, frame->height,
                temp, outWidth,
                temp_u, halfWidth,
                temp_v, halfWidth,
                outWidth, outHeight,
                libyuv::FilterModeEnum::kFilterNone
        );
    }
    int linesize = outWidth * 4;
    // libyuv's "ABGR" is R,G,B,A in memory order, matching Android's ARGB_8888 layout.
    libyuv::I420ToABGR(
            temp, outWidth,//Y
            temp_u, halfWidth,//U
            temp_v, halfWidth,// V
            data, linesize,  // RGBA
            outWidth, outHeight);
    delete[]temp;
}

Known issues

In rare cases, the compiled FFmpeg library crashes in ff_init_vlc_from_lengths; a fix is in progress...

#00 pc 0010e994 /data/app/com.demo-ld--z-yKfIs5fn0oXW5R-A==/lib/arm64/libffmpeg_core.so(ff_init_vlc_from_lengths) [arm64-v8a::]

References