1. Audio/Video Capture Stage
1.1 Video Capture (AVFoundation framework)
- Core class: AVCaptureSession

```swift
let captureSession = AVCaptureSession()
captureSession.sessionPreset = .hd1920x1080 // set capture resolution
```
- Detailed flow:
- Hardware selection:
  - Enumerate camera devices (front/back/multi-camera) with AVCaptureDevice.DiscoverySession
  - Pick a supported pixel format (e.g. 420f/YUV) via AVCaptureDevice.Format
- Input/output configuration:

```swift
let cameraInput = try AVCaptureDeviceInput(device: cameraDevice)
if captureSession.canAddInput(cameraInput) {
    captureSession.addInput(cameraInput)
}

// Video output
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.setSampleBufferDelegate(self, queue: videoQueue)
```
- Frame-rate control:

```swift
try cameraDevice.lockForConfiguration()
cameraDevice.activeVideoMinFrameDuration = CMTime(value: 1, timescale: 30) // 30 fps
cameraDevice.unlockForConfiguration()
```
- Dynamic switching: swap cameras at runtime (wrap the change in AVCaptureSession's beginConfiguration/commitConfiguration)
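The switching step can be sketched like this (a minimal sketch; the helper name and its parameters are illustrative, not from the original article):

```swift
import AVFoundation

// Hypothetical helper: replace the active camera input atomically.
func switchCamera(on session: AVCaptureSession,
                  removing currentInput: AVCaptureDeviceInput,
                  to newDevice: AVCaptureDevice) throws {
    session.beginConfiguration()
    defer { session.commitConfiguration() } // all changes apply together here
    session.removeInput(currentInput)
    let newInput = try AVCaptureDeviceInput(device: newDevice)
    if session.canAddInput(newInput) {
        session.addInput(newInput)
    }
}
```

The begin/commit pair batches the input swap so the session never runs with zero inputs mid-change.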
1.2 Audio Capture (AVAudioEngine framework)
- Core classes: AVAudioEngine + AVAudioInputNode

```swift
let audioEngine = AVAudioEngine()
let inputNode = audioEngine.inputNode
let bus = 0 // input bus
inputNode.installTap(onBus: bus,
                     bufferSize: 1024,
                     format: inputNode.inputFormat(forBus: bus)) { (buffer, time) in
    // Raw PCM samples are available via buffer.floatChannelData
}
audioEngine.prepare()
try audioEngine.start()
```
- Key details:
  - Permissions: check microphone access with AVAudioSession's requestRecordPermission
  - Sample rate: set via AVAudioFormat (e.g. 44.1 kHz)
  - Multi-channel: stereo capture is supported (buffer.format.channelCount)
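The permission check can be sketched as follows (minimal sketch; run it before starting the engine):

```swift
import AVFoundation

// Minimal sketch: ask for microphone access before starting capture.
// (On iOS 17+ Apple steers this toward AVAudioApplication instead.)
AVAudioSession.sharedInstance().requestRecordPermission { granted in
    if granted {
        // safe to call audioEngine.start()
    } else {
        // surface the failure / point the user at Settings
    }
}
```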
2. Preprocessing / Processing Stage
2.1 Video Processing (real-time filters / beauty effects)
- Option 1: Core Image (high-level API, simple filters)

```swift
let filter = CIFilter(name: "CIColorInvert")
filter?.setValue(CIImage(cvPixelBuffer: pixelBuffer), forKey: kCIInputImageKey)
let outputImage = filter?.outputImage
```
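The filtered `outputImage` still has to be rendered somewhere; a minimal sketch that writes it back into a destination pixel buffer (`destinationPixelBuffer` is assumed to be allocated elsewhere, e.g. from a pixel-buffer pool):

```swift
import CoreImage

let context = CIContext() // reuse one context; creating it per frame is expensive
if let output = filter?.outputImage {
    context.render(output, to: destinationPixelBuffer)
}
```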
- Option 2: Metal (GPU-accelerated, high performance)
  - Use MTKView plus custom shaders (.metal files)
  - Core steps:
    - Create a CVMetalTextureCache to turn CMSampleBuffer frames into Metal textures
    - Submit render commands through an MTLCommandQueue
    - Apply custom filters (e.g. Gaussian blur, face detection)
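The texture-cache step above can be sketched as follows (minimal sketch; `device` is an existing MTLDevice, error handling is elided, and `.bgra8Unorm` assumes a BGRA pixel buffer):

```swift
import CoreVideo
import Metal

var textureCache: CVMetalTextureCache?
CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache)

// Wrap a captured CVPixelBuffer as a Metal texture without copying.
func makeTexture(from pixelBuffer: CVPixelBuffer) -> MTLTexture? {
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    var cvTexture: CVMetalTexture?
    CVMetalTextureCacheCreateTextureFromImage(
        kCFAllocatorDefault, textureCache!, pixelBuffer, nil,
        .bgra8Unorm, width, height, 0, &cvTexture)
    return cvTexture.flatMap { CVMetalTextureGetTexture($0) }
}
```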
2.2 Audio Processing (noise suppression / echo cancellation)
- Use AudioUnit (low-latency processing):

```swift
// Create a RemoteIO audio unit
var componentDesc = AudioComponentDescription(
    componentType: kAudioUnitType_Output,
    componentSubType: kAudioUnitSubType_RemoteIO,
    componentManufacturer: kAudioUnitManufacturer_Apple,
    componentFlags: 0,
    componentFlagsMask: 0)
guard let component = AudioComponentFindNext(nil, &componentDesc) else { fatalError() }
var audioUnit: AudioUnit?
AudioComponentInstanceNew(component, &audioUnit)

// Install a callback to process incoming audio buffers
var callbackStruct = AURenderCallbackStruct(inputProc: audioProcessingCallback,
                                            inputProcRefCon: nil)
AudioUnitSetProperty(audioUnit!,
                     kAudioOutputUnitProperty_SetInputCallback,
                     kAudioUnitScope_Global,
                     0,
                     &callbackStruct,
                     UInt32(MemoryLayout<AURenderCallbackStruct>.size))
```
- Typical processing:
  - WebRTC audio module: integrate libwebrtc's AudioProcessingModule (NS, AEC, AGC)
  - Custom algorithms: frequency-domain (FFT) processing with the Accelerate framework
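The Accelerate route can be sketched as follows (minimal sketch using the classic vDSP FFT API; `samples` — 1024 Float PCM values — is assumed to come from the input tap):

```swift
import Accelerate

let log2n = vDSP_Length(10)                        // 2^10 = 1024 samples
let n = 1024
let setup = vDSP_create_fftsetup(log2n, FFTRadix(kFFTRadix2))!

var real = [Float](repeating: 0, count: n / 2)
var imag = [Float](repeating: 0, count: n / 2)
var magnitudes = [Float](repeating: 0, count: n / 2)

real.withUnsafeMutableBufferPointer { re in
    imag.withUnsafeMutableBufferPointer { im in
        var split = DSPSplitComplex(realp: re.baseAddress!, imagp: im.baseAddress!)
        // Pack interleaved samples into split-complex form, then FFT in place
        samples.withUnsafeBufferPointer { src in
            src.baseAddress!.withMemoryRebound(to: DSPComplex.self, capacity: n / 2) {
                vDSP_ctoz($0, 2, &split, 1, vDSP_Length(n / 2))
            }
        }
        vDSP_fft_zrip(setup, &split, 1, log2n, FFTDirection(FFT_FORWARD))
        vDSP_zvmags(&split, 1, &magnitudes, 1, vDSP_Length(n / 2))
    }
}
vDSP_destroy_fftsetup(setup)
```

`magnitudes` then holds the squared magnitude spectrum, the usual input to spectral noise suppression.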
3. Encoding Stage
3.1 Video Encoding (H.264/H.265 hardware encoding)
- VideoToolbox framework:

```swift
// Create the compression session
var compressionSession: VTCompressionSession?
VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: width,
    height: height,
    codecType: kCMVideoCodecType_H264,
    encoderSpecification: nil,
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: videoEncodeCallback,
    refcon: nil,
    compressionSessionOut: &compressionSession)

// Key parameters
VTSessionSetProperty(compressionSession!,
                     key: kVTCompressionPropertyKey_RealTime,
                     value: kCFBooleanTrue)
VTSessionSetProperty(compressionSession!,
                     key: kVTCompressionPropertyKey_ExpectedFrameRate,
                     value: 30 as CFNumber)
```
- Encoder output callback:

```swift
func videoEncodeCallback(_ outputCallbackRefCon: UnsafeMutableRawPointer?,
                         _ sourceFrameRefCon: UnsafeMutableRawPointer?,
                         _ status: OSStatus,
                         _ infoFlags: VTEncodeInfoFlags,
                         _ sampleBuffer: CMSampleBuffer?) {
    // Extract NALU data (SPS/PPS plus IDR/P frames)
    guard let sampleBuffer = sampleBuffer,
          let dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer) else { return }
    let blockBufferLength = CMBlockBufferGetDataLength(dataBuffer)
}
```
3.2 Audio Encoding (AAC/Opus)
- AudioConverter (software encode):

```swift
var converter: AudioConverterRef?
AudioConverterNew(&inFormat, &outFormat, &converter)

// Input/output buffers
var inputBufferList = AudioBufferList()
var outputBuffer = [UInt8](repeating: 0, count: 1024)

// Run the encode
AudioConverterFillComplexBuffer(converter!, inInputDataProc, &inputData,
                                &outputPacketCount, &outputBufferList, nil)
```
- Hardware-assisted alternative: write compressed audio directly with AVAssetWriter (configure the relevant AVAudioSettings keys)
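For the AVAssetWriter route, the audio input's settings dictionary selects the codec; a minimal sketch for AAC (the concrete numbers are illustrative):

```swift
import AVFoundation

let audioSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC,  // AAC codec
    AVSampleRateKey: 44_100,
    AVNumberOfChannelsKey: 2,
    AVEncoderBitRateKey: 128_000
]
let audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioSettings)
```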
4. Transport / Storage Stage
4.1 Real-Time Transport (WebRTC/RTP)
- WebRTC core flow:
- PeerConnection creation:

```swift
let config = RTCConfiguration()
config.iceServers = [RTCIceServer(urlStrings: ["stun:stun.l.google.com:19302"])]
let constraints = RTCMediaConstraints(mandatoryConstraints: nil, optionalConstraints: nil)
let pcFactory = RTCPeerConnectionFactory()
let peerConnection = pcFactory.peerConnection(with: config, constraints: constraints, delegate: nil)
```
- Publish audio/video tracks:

```swift
let videoTrack = pcFactory.videoTrack(with: videoSource, trackId: "video0")
peerConnection.add(videoTrack, streamIds: ["stream1"])
```
- NAT traversal: exchange ICE candidates; configure TURN/STUN servers
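Candidate exchange can be sketched as follows (minimal sketch; `signaling` is a hypothetical transport you provide, the delegate method name is from the WebRTC iOS SDK, and the exact `add` signature varies by SDK version):

```swift
// Inside your RTCPeerConnectionDelegate implementation:
func peerConnection(_ peerConnection: RTCPeerConnection,
                    didGenerate candidate: RTCIceCandidate) {
    signaling.send(candidate: candidate) // deliver to the remote peer out-of-band
}

// When a candidate arrives from the remote peer over signaling:
func handleRemote(candidate: RTCIceCandidate) {
    peerConnection.add(candidate)
}
```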
4.2 Local Storage (MP4/MOV container)
- Write to file with AVAssetWriter:

```swift
let writer = try AVAssetWriter(outputURL: outputURL, fileType: .mp4)
let videoSettings: [String: Any] = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 1280,
    AVVideoHeightKey: 720
]
let videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoSettings)
writer.add(videoInput)

// Append encoded sample buffers
videoInput.append(sampleBuffer)
```
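The writer also needs an explicit session around those `append` calls; a minimal sketch of the lifecycle (`firstSamplePTS` stands in for the PTS of the first buffer):

```swift
writer.startWriting()
writer.startSession(atSourceTime: firstSamplePTS)

// ... append sample buffers while videoInput.isReadyForMoreMediaData is true ...

videoInput.markAsFinished()
writer.finishWriting {
    // the file at outputURL is now complete and playable
}
```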
5. Decoding and Rendering Stage
5.1 Video Decoding (VideoToolbox hardware decode)
- Create the decompression session:

```swift
var callbackRecord = VTDecompressionOutputCallbackRecord(
    decompressionOutputCallback: videoDecodeCallback,
    decompressionOutputRefCon: nil)
var decompressionSession: VTDecompressionSession?
VTDecompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    formatDescription: formatDesc,
    decoderSpecification: nil,
    imageBufferAttributes: nil,
    outputCallback: &callbackRecord,
    decompressionSessionOut: &decompressionSession)
```
- Render to screen:
  - Option 1: display CMSampleBuffers directly with AVSampleBufferDisplayLayer
  - Option 2: render through Metal textures (CVMetalTextureRef)
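Option 1 can be sketched as follows (minimal sketch; `view` is the hosting UIView):

```swift
import AVFoundation

let displayLayer = AVSampleBufferDisplayLayer()
displayLayer.frame = view.bounds
displayLayer.videoGravity = .resizeAspect
view.layer.addSublayer(displayLayer)

// For each decoded CMSampleBuffer:
if displayLayer.isReadyForMoreMediaData {
    displayLayer.enqueue(sampleBuffer)
}
```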
5.2 Audio Playback (AudioQueue/AVAudioEngine)
- Low-latency playback (AudioQueue):

```swift
var audioQueue: AudioQueueRef?
AudioQueueNewOutput(&asbd, audioQueueOutputCallback, nil, nil, nil, 0, &audioQueue)
AudioQueueStart(audioQueue!, nil)

// Fill PCM data into a buffer, then enqueue it
AudioQueueEnqueueBuffer(audioQueue!, buffer, 0, nil)
```
Key Issues and Optimizations
- Synchronization:
  - A/V sync: compute PTS/DTS with CMTime and align timestamps
  - Frame-drop strategy: adjust the video frame rate dynamically so audio stays continuous
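The drop/wait decision can be sketched as a pure function over timestamps in seconds (minimal sketch; the function name and the 40 ms threshold are illustrative):

```swift
enum SyncAction { case render, drop, wait }

// Compare a video frame's PTS against the audio clock and decide what to do.
func syncAction(videoPTS: Double, audioClock: Double,
                threshold: Double = 0.040) -> SyncAction {
    let drift = videoPTS - audioClock
    if drift < -threshold { return .drop }   // video is late: discard the frame
    if drift > threshold { return .wait }    // video is early: hold the frame
    return .render                           // within tolerance: present now
}
```

Audio is used as the master clock because listeners notice audio glitches far more than a dropped video frame.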
- Performance optimization:
  - Encoder parameters: CBR/VBR rate control, GOP length (affects latency)
  - Render thread: keep Metal rendering off the main thread (avoids UI stalls)
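On the VideoToolbox side these knobs map to session properties; a minimal sketch (`compressionSession` is the session from section 3.1; the numbers are illustrative):

```swift
import VideoToolbox

VTSessionSetProperty(compressionSession,
                     key: kVTCompressionPropertyKey_AverageBitRate,
                     value: 2_000_000 as CFNumber)   // ~2 Mbps target bitrate
VTSessionSetProperty(compressionSession,
                     key: kVTCompressionPropertyKey_MaxKeyFrameInterval,
                     value: 60 as CFNumber)          // GOP: keyframe every 60 frames
VTSessionSetProperty(compressionSession,
                     key: kVTCompressionPropertyKey_AllowFrameReordering,
                     value: kCFBooleanFalse)         // no B-frames: lower latency
```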
- Device compatibility:
  - Format checks: detect supported formats via AVCaptureDevice.formats
  - Multi-device adaptation: adjust resolution and bitrate dynamically (e.g. iPhone SE vs. iPhone 15 Pro)
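Format probing can be sketched as follows (minimal sketch; the helper name is illustrative — it picks the highest-resolution format that supports 30 fps):

```swift
import AVFoundation

func bestFormat(for device: AVCaptureDevice, fps: Double = 30) -> AVCaptureDevice.Format? {
    device.formats
        .filter { format in
            // Keep only formats whose frame-rate range covers the target fps
            format.videoSupportedFrameRateRanges.contains { range in
                range.minFrameRate...range.maxFrameRate ~= fps
            }
        }
        .max { a, b in
            // Prefer the largest pixel count
            let da = CMVideoFormatDescriptionGetDimensions(a.formatDescription)
            let db = CMVideoFormatDescriptionGetDimensions(b.formatDescription)
            return da.width * da.height < db.width * db.height
        }
}
```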