A Quick Look at Format Conversion in AudioKit

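Formats

Four output formats are supported:

["wav", "aif", "caf", "m4a"]

caf is the Core Audio Format, a container that can hold uncompressed audio as well as compressed audio such as AAC.

wav is an uncompressed format.

m4a is a compressed format; the data inside is usually AAC.

MP3 output is not supported. Converting audio to MP3 is usually done with the LAME library.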

Supported input formats:

["wav", "aif", "caf", "m4a",
"mp3", "snd", "au", "sd2",
"aif", "aiff", "aifc", "aac",
"mp4", "m4v", "mov" ]

Files with no extension are accepted as well.

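Audio format conversions generally fall into 4 categories:

  • uncompressed to uncompressed
  • compressed to compressed (MP3 output is not supported)
  • uncompressed to compressed (MP3 output is not supported)
  • compressed to uncompressed

AudioKit folds compressed -> uncompressed and uncompressed (format A) -> uncompressed (format B) into a single method:

func convertToPCM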

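That leaves AudioKit with 3 conversion methods:

  • compressed to compressed

func convertCompressed

  • uncompressed to compressed

func convertAsset

  • any audio (compressed or uncompressed) to uncompressed

func convertToPCM

Detecting compressed audio

Whether a file counts as compressed is decided purely by its file extension: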

    internal func isCompressed(url: URL) -> Bool {
      
        let ext = url.pathExtension.lowercased()
        return (ext == "m4a" || ext == "mp3" || ext == "mp4" || ext == "m4v" || ext == "mpg")
    }
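
To make the three-way split concrete, here is a minimal sketch of how the extension check could select one of the three methods. The `ConversionRoute` enum and the `route(input:output:)` function are hypothetical names introduced for illustration, and the dispatch rule itself (look at the output first, then the input) is an assumption rather than something confirmed by the code shown in this article.

```swift
import Foundation

// Hypothetical enum naming the three conversion methods described above.
enum ConversionRoute {
    case convertCompressed   // compressed -> compressed
    case convertAsset        // uncompressed -> compressed
    case convertToPCM        // anything -> uncompressed
}

// Same extension-based check as isCompressed(url:) above.
func isCompressed(url: URL) -> Bool {
    let ext = url.pathExtension.lowercased()
    return ["m4a", "mp3", "mp4", "m4v", "mpg"].contains(ext)
}

// Assumed dispatch rule: the output format decides first, then the input.
func route(input: URL, output: URL) -> ConversionRoute {
    guard isCompressed(url: output) else { return .convertToPCM }
    return isCompressed(url: input) ? .convertCompressed : .convertAsset
}

let mp3 = URL(fileURLWithPath: "song.mp3")
let wav = URL(fileURLWithPath: "song.wav")
let m4a = URL(fileURLWithPath: "song.m4a")

print(route(input: mp3, output: wav))  // convertToPCM
print(route(input: wav, output: m4a))  // convertAsset
print(route(input: mp3, output: m4a))  // convertCompressed
```

Because the check only looks at the extension, a .caf file that happens to contain AAC would be treated as uncompressed by this rule.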

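Implementation

Any audio (compressed or uncompressed) to uncompressed

This path uses Extended Audio File Services (the ExtAudioFile API).

  • First, configure the formats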

        // Start with empty ASBD (AudioStreamBasicDescription) templates
        // Input ASBD
        var srcFormat = AudioStreamBasicDescription()
        // Output ASBD
        var dstFormat = AudioStreamBasicDescription()
        // ...

        // Read the source file's ASBD into srcFormat
        error = ExtAudioFileGetProperty(inputFile,
                                        kExtAudioFileProperty_FileDataFormat,
                                        &thePropertySize, &srcFormat)

        // ...
        // Configure the output ASBD
        dstFormat.mSampleRate = outputSampleRate
        dstFormat.mFormatID = formatKey
        dstFormat.mChannelsPerFrame = outputChannels
        // Despite its name, outputBitRate is used here as bits per sample
        dstFormat.mBitsPerChannel = outputBitRate
        dstFormat.mBytesPerPacket = outputBytesPerPacket
        dstFormat.mBytesPerFrame = outputBytesPerFrame
        dstFormat.mFramesPerPacket = 1
        dstFormat.mFormatFlags = kLinearPCMFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger

        // AIFF is an uncompressed format that stores PCM samples big-endian
        if format == kAudioFileAIFFType {
            dstFormat.mFormatFlags = dstFormat.mFormatFlags | kLinearPCMFormatFlagIsBigEndian
        }
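
The excerpt above does not show where `outputBytesPerFrame` and `outputBytesPerPacket` come from. For packed, interleaved linear PCM they follow directly from the bit depth and channel count, so a sketch of that arithmetic may help. The `PCMLayout` struct and `pcmLayout` function are hypothetical helpers, not part of AudioKit or Core Audio.

```swift
import Foundation

// Hypothetical value type holding the two derived ASBD byte sizes.
struct PCMLayout: Equatable {
    let bytesPerFrame: UInt32
    let bytesPerPacket: UInt32
}

// For packed, interleaved PCM: one frame holds one sample per channel,
// and one packet holds framesPerPacket frames (1 for linear PCM).
func pcmLayout(bitsPerChannel: UInt32,
               channels: UInt32,
               framesPerPacket: UInt32 = 1) -> PCMLayout {
    let bytesPerFrame = (bitsPerChannel / 8) * channels
    return PCMLayout(bytesPerFrame: bytesPerFrame,
                     bytesPerPacket: bytesPerFrame * framesPerPacket)
}

let stereo16 = pcmLayout(bitsPerChannel: 16, channels: 2)
print(stereo16.bytesPerFrame, stereo16.bytesPerPacket)  // 4 4

let stereo32 = pcmLayout(bitsPerChannel: 32, channels: 2)
print(stereo32.bytesPerFrame, stereo32.bytesPerPacket)  // 8 8
```

With `mFramesPerPacket` set to 1, as in the code above, bytes per packet and bytes per frame are always equal.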
        
  • Then convert the data

A while loop keeps reading until the end of the file, that is, until numFrames comes back as 0.

ExtAudioFileRead reads the data,

and ExtAudioFileWrite writes it.


        var srcBuffer = [UInt8](repeating: 0, count: Int(bufferByteSize))
        var sourceFrameOffset: UInt32 = 0

        srcBuffer.withUnsafeMutableBytes { ptr in
            while true {
                let mBuffer = AudioBuffer(mNumberChannels: srcFormat.mChannelsPerFrame,
                                          mDataByteSize: bufferByteSize,
                                          mData: ptr.baseAddress)

                var fillBufList = AudioBufferList(mNumberBuffers: 1, mBuffers: mBuffer)
                var numFrames: UInt32 = 0

                if dstFormat.mBytesPerFrame > 0 {
                    numFrames = bufferByteSize / dstFormat.mBytesPerFrame
                }
                // Read a chunk of frames
                error = ExtAudioFileRead(inputFile, &numFrames, &fillBufList)
                if error != noErr {
                    completionHandler?(createError(message: "Unable to read input file."))
                    return
                }
                // Zero frames means the end of the file:
                // there is nothing left to write,
                // so leave the while loop
                if numFrames == 0 {
                    error = noErr
                    break
                }

                sourceFrameOffset += numFrames
                // Write the chunk to the output file
                error = ExtAudioFileWrite(outputFile, numFrames, &fillBufList)
                if error != noErr {
                    completionHandler?(createError(message: "Unable to write output file."))
                    return
                }
            }
        }

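The other two conversions could of course be done with ExtAudioFile Services too.

AudioKit does not use ExtAudioFile Services for them, though.

It uses simpler code instead.

Converting between compressed formats takes very little code.

It uses AVAssetExportSession: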


        let asset = AVURLAsset(url: inputURL)
        guard let session = AVAssetExportSession(asset: asset, presetName: AVAssetExportPresetAppleM4A) else { return }

        session.outputURL = outputURL
        session.outputFileType = .m4a
        session.exportAsynchronously {
            completionHandler?(nil)
        }
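
Note that the excerpt above calls the completion handler with nil regardless of how the export ended. A minimal sketch of a variant that inspects `session.status` and forwards `session.error` could look like this. The function name `exportToM4A` and the `FormatConverterSketch` error domain are hypothetical, and this is only one way to surface failures, not necessarily how AudioKit handles them.

```swift
import AVFoundation

// Hypothetical wrapper around the export shown above that reports failures.
func exportToM4A(inputURL: URL,
                 outputURL: URL,
                 completion: @escaping (Error?) -> Void) {
    // Fallback error used when AVFoundation gives no specific reason.
    let fallback = NSError(domain: "FormatConverterSketch", code: 1,
                           userInfo: [NSLocalizedDescriptionKey: "Export did not complete."])

    let asset = AVURLAsset(url: inputURL)
    guard let session = AVAssetExportSession(asset: asset,
                                             presetName: AVAssetExportPresetAppleM4A) else {
        completion(fallback)
        return
    }

    session.outputURL = outputURL
    session.outputFileType = .m4a
    session.exportAsynchronously {
        // The handler runs when the export has finished, failed, or been cancelled.
        if session.status == .completed {
            completion(nil)
        } else {
            completion(session.error ?? fallback)
        }
    }
}
```

One practical reason to check the status: the export session reports a failure when a file already exists at the output URL, which a handler that always passes nil would hide.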

That leaves one case: uncompressed audio to compressed audio

  • Prepare an AVAssetReader and an AVAssetWriter
        let outputFormat = options?.format ?? outputURL.pathExtension.lowercased()

        let asset = AVAsset(url: inputURL)
        do {
            self.reader = try AVAssetReader(asset: asset)
        } catch let err as NSError { return }


        switch outputFormat {
        case "m4a", "mp4":
            format = .m4a
            formatKey = kAudioFormatMPEG4AAC
        case "aif":
            format = .aiff
            formatKey = kAudioFormatLinearPCM
        case "caf":
            format = .caf
            formatKey = kAudioFormatLinearPCM
        case "wav":
            format = .wav
            formatKey = kAudioFormatLinearPCM
        default:
            print("Unsupported output format: \(outputFormat)")
            return
        }

        do {
            self.writer = try AVAssetWriter(outputURL: outputURL, fileType: format)
        } catch let err as NSError { return }
  • Use AVAssetReaderTrackOutput to perform the conversion

        // Create the writer input
        let writerInput = AVAssetWriterInput(mediaType: .audio, outputSettings: outputSettings)
        writer.add(writerInput)

        guard let track = asset.tracks(withMediaType: .audio).first else {
            
            return
        }
        // Create the reader output
        let readerOutput = AVAssetReaderTrackOutput(track: track, outputSettings: nil)
        guard reader.canAdd(readerOutput) else {
            
            return
        }
        reader.add(readerOutput)

        if !writer.startWriting() {
            
            return
        }

        writer.startSession(atSourceTime: CMTime.zero)

        if !reader.startReading() {
            
            return
        }

        let queue = DispatchQueue(label: "com.audiodesigndesk.FormatConverter.convertAsset")

        // Run the format conversion on a background queue
        writerInput.requestMediaDataWhenReady(on: queue, using: {
            var processing = true 

            while writerInput.isReadyForMoreMediaData, processing {
                if reader.status == .reading,
                    let buffer = readerOutput.copyNextSampleBuffer() {
                    writerInput.append(buffer)

                } else {
                    writerInput.markAsFinished()

                    switch reader.status {
                    case .failed:
                        print("Conversion failed with error", reader.error ?? "")
                        writer.cancelWriting()
                    case .cancelled:
                        print("Conversion cancelled")
                    case .completed:
                        // writer.endSession(atSourceTime: asset.duration)
                        writer.finishWriting {
                            
                        }
                    default:
                        break
                    }
                    processing = false
                }
            }
        }) 
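
The `outputSettings` dictionary passed to AVAssetWriterInput is not shown in the excerpt above. As a rough sketch, settings for AAC output might be assembled as follows. The helper name `aacOutputSettings` and its default values (44.1 kHz, stereo, 128 kb/s) are assumptions chosen for illustration; they are not taken from AudioKit.

```swift
import AVFoundation

// Hypothetical helper that builds AVAssetWriterInput settings for AAC output.
// The defaults are illustrative assumptions, not AudioKit's actual values.
func aacOutputSettings(sampleRate: Double = 44_100,
                       channels: Int = 2,
                       bitRate: Int = 128_000) -> [String: Any] {
    return [
        AVFormatIDKey: kAudioFormatMPEG4AAC,   // encode as AAC
        AVSampleRateKey: sampleRate,           // samples per second
        AVNumberOfChannelsKey: channels,       // 2 = stereo
        AVEncoderBitRateKey: bitRate           // target bits per second
    ]
}

let settings = aacOutputSettings()
let sketchInput = AVAssetWriterInput(mediaType: .audio, outputSettings: settings)
print(sketchInput.mediaType == .audio)  // true
```

Because the reader output above is created with `outputSettings: nil`, samples are handed over in their stored format, and the encoding to AAC happens on the writer side according to these settings.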


Results

Compressed to uncompressed: mp3 to wav

Source file: mp3, 7.6 MB, bitrate 320 kb/s

ffprobe -hide_banner -print_format json 1_Le\ Papillon.mp3
{
Input #0, mp3, from '1_Le Papillon.mp3':
  Metadata:
    encoder         : Lavf57.83.100
  Duration: 00:03:08.73, start: 0.025057, bitrate: 320 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
    Metadata:
      encoder         : Lavc57.10
}

Converted wav: 66.6 MB, bitrate 2822 kb/s

ffprobe -hide_banner -print_format json   one.wav
{
Input #0, wav, from 'one.wav':
  Duration: 00:03:08.71, bitrate: 2822 kb/s
    Stream #0:0: Audio: pcm_s32le ([1][0][0][0] / 0x0001), 44100 Hz, 2 channels, s32, 2822 kb/s
}
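
The reported bitrate and file size can be cross-checked from the stream parameters in the ffprobe output (44100 Hz, 2 channels, 32-bit samples, 00:03:08.71). The sketch below does that arithmetic; the function names are hypothetical, and the size estimate ignores the WAV header.

```swift
import Foundation

// PCM bitrate in bits per second.
func pcmBitRate(sampleRate: Int, channels: Int, bitsPerSample: Int) -> Int {
    return sampleRate * channels * bitsPerSample
}

// Approximate audio payload in bytes, ignoring container headers.
func payloadBytes(bitRate: Int, seconds: Double) -> Double {
    return Double(bitRate) / 8 * seconds
}

let bitRate = pcmBitRate(sampleRate: 44_100, channels: 2, bitsPerSample: 32)
print(bitRate)  // 2822400

let seconds = 3 * 60 + 8.71  // 00:03:08.71
let megabytes = payloadBytes(bitRate: bitRate, seconds: seconds) / 1_000_000
print(String(format: "%.1f MB", megabytes))  // 66.6 MB
```

Both figures agree with the ffprobe output. The file is large partly because the output is 32-bit PCM; the same audio at 16 bits per sample would take half the space.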

Uncompressed to compressed: wav to m4a

The input is the output of the previous step.

Output: 3.1 MB, bitrate 129 kb/s

ffprobe -hide_banner -print_format json one.m4a
{
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'one.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 0
    compatible_brands: M4A isommp42
    creation_time   : 2020-12-14T08:05:13.000000Z
    iTunSMPB        :  00000000 00000840 0000004C 00000000007EFB74 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  Duration: 00:03:08.71, start: 0.047891, bitrate: 129 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
    Metadata:
      creation_time   : 2020-12-14T08:05:13.000000Z
      handler_name    : Core Media Audio

}

Compressed to compressed: mp3 to m4a

The input is the original mp3, the same file used as input in the first step.

Output: 6.1 MB, bitrate 256 kb/s

ffprobe -hide_banner -print_format json two.m4a
{
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'two.m4a':
  Metadata:
    major_brand     : M4A 
    minor_version   : 0
    compatible_brands: M4A isommp42
    creation_time   : 2020-12-14T08:44:36.000000Z
    iTunSMPB        :  00000000 00000840 0000004C 00000000007EFB74 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  Duration: 00:03:08.71, start: 0.047891, bitrate: 256 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 254 kb/s (default)
    Metadata:
      creation_time   : 2020-12-14T08:44:36.000000Z
      handler_name    : Core Media Audio

}


github repo