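1. Background

AudioTrack is Android's low-level component for audio playback, typically used to play raw PCM data. Some use cases need to know precisely when playback starts and ends, but unlike some third-party players, AudioTrack offers no ready-made onPlayStart or onPlayEnd callbacks, so detecting these events takes some extra work.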

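2. How to implement it

2.1 Detecting the end of playback

Detecting the end of playback is relatively simple, and material on it is easy to find. AudioTrack itself provides a mechanism for this:

1. A marker position, markerInFrames, can be set; when the playback head reaches it, a callback fires once.
2. A notification period, periodInFrames, can be set; a callback fires each time the playback head advances by that many frames, that is, once every periodInFrames.

To detect the end of the audio, markerInFrames alone is enough.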
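
To get a feel for periodInFrames, it helps to convert it into time. The sketch below is plain Java with no Android dependency; the class and method names are made up for illustration, and the 16 kHz sample rate is only an example value.

```java
// Sketch: how often a periodic notification fires for a given period in frames.
// NotificationPeriod and periodMillis are hypothetical names, not part of AudioTrack.
final class NotificationPeriod {

    /** Interval between periodic callbacks, in milliseconds of played audio. */
    static double periodMillis(int periodInFrames, int sampleRateHz) {
        if (periodInFrames <= 0 || sampleRateHz <= 0) {
            throw new IllegalArgumentException("period and sample rate must be positive");
        }
        return periodInFrames * 1000.0 / sampleRateHz;
    }

    public static void main(String[] args) {
        // 1600 frames at 16 kHz: one callback per 100 ms of played audio.
        System.out.println(periodMillis(1600, 16000));
    }
}
```

Read the other way round, a desired callback interval in milliseconds times the sample rate, divided by 1000, gives the periodInFrames to set.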

Source:

public int setNotificationMarkerPosition(int markerInFrames) {
    if (mState == STATE_UNINITIALIZED) {
        return ERROR_INVALID_OPERATION;
    }
    return native_set_marker_pos(markerInFrames);
}

/**
 * Sets the listener the AudioTrack notifies when a previously set marker is reached or
 * for each periodic playback head position update.
 * Notifications will be received in the same thread as the one in which the AudioTrack
 * instance was created.
 * @param listener
 */
public void setPlaybackPositionUpdateListener(OnPlaybackPositionUpdateListener listener) {
    setPlaybackPositionUpdateListener(listener, null);
}

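Example:

fun addEndListener() {
    // The marker is measured in frames, not bytes. mAudioSize is the PCM length in bytes;
    // dividing by 2 assumes 16-bit mono audio (2 bytes per frame).
    mAudioTrack?.notificationMarkerPosition = mAudioSize / 2
    mAudioTrack?.setPlaybackPositionUpdateListener(object :
        AudioTrack.OnPlaybackPositionUpdateListener {
        override fun onMarkerReached(track: AudioTrack?) {
            // Fired once when the playback head reaches the marker, i.e. at the end of the audio.
            Log.d("TAG", "onMarkerReached")
        }

        override fun onPeriodicNotification(track: AudioTrack?) {
            // Only fired if a notification period has been set.
            Log.d("TAG", "onPeriodicNotification")
        }
    })
}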

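Note: notificationMarkerPosition is set to half the audio length here. Why?

Because notificationMarkerPosition is counted in audio frames, not bytes. The audio here is 16-bit PCM, so each sample takes 2 bytes and the marker is mAudioSize / 2. This assumes mono audio; with more channels a frame holds one sample per channel, so the byte length would be divided by 2 times the channel count.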
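
The byte-to-frame conversion behind that division can be written out as a small helper. This is only a sketch in plain Java: FrameMath and bytesToFrames are hypothetical names, and the numbers in main are example values, not taken from a real audio file.

```java
// Sketch: converting a PCM byte length into a frame count.
// One frame holds one sample per channel, so frame size = channels * bytes per sample.
final class FrameMath {

    static int bytesToFrames(int byteCount, int channelCount, int bytesPerSample) {
        int frameSizeBytes = channelCount * bytesPerSample;
        if (byteCount < 0 || frameSizeBytes <= 0) {
            throw new IllegalArgumentException("invalid byte count or frame size");
        }
        return byteCount / frameSizeBytes;
    }

    public static void main(String[] args) {
        // 16-bit mono: 2 bytes per frame, the same as mAudioSize / 2 in the example.
        System.out.println(bytesToFrames(32000, 1, 2));
        // 16-bit stereo: 4 bytes per frame, so the same byte length holds half as many frames.
        System.out.println(bytesToFrames(32000, 2, 2));
    }
}
```

The result is the value that would be passed as the marker position when the marker should sit at the very end of the audio.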

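2.2 Detecting the start of playback

2.2.1 Analysis

Detecting the start of playback is a bit more involved, and I found no documentation covering it. Most code simply treats the first call to write as the start of playback. In most cases this does little harm, but when precise timestamps matter, it always introduces some error.

With no documentation to go on, we look for the answer in the source code.

Audio playback is a streaming process: after data is written to the buffer, the audio system presumably waits until a certain amount has been buffered before it starts playing, otherwise glitches could occur. With that guess in mind, we can search the AudioTrack code.

Keywords: Frame, Threshold, Start

2.2.2 Source code

The search eventually turns up a suspicious return value. Here is its documentation comment: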

/**
 * Returns the streaming start threshold of the <code>AudioTrack</code>.
 * <p> The streaming start threshold is the buffer level that the written audio
 * data must reach for audio streaming to start after {@link #play()} is called.
 * When an <code>AudioTrack</code> is created, the streaming start threshold
 * is the buffer capacity in frames. If the buffer size in frames is reduced
 * by {@link #setBufferSizeInFrames(int)} to a value smaller than the start threshold
 * then that value will be used instead for the streaming start threshold.
 * <p> For compressed streams, the size of a frame is considered to be exactly one byte.
 *
 * @return the current start threshold in frames value. This is
 *         an integer between 1 to the buffer capacity
 *         (see {@link #getBufferCapacityInFrames()}),
 *         and might change if the  output sink changes after track creation.
 * @throws IllegalStateException if the track is not initialized or the
 *         track is not {@link #MODE_STREAM}.
 * @see #setStartThresholdInFrames(int)
 */
public @IntRange (from = 1) int getStartThresholdInFrames() {
    if (mState != STATE_INITIALIZED) {
        throw new IllegalStateException("AudioTrack is not initialized");
    }
    if (mDataLoadMode != MODE_STREAM) {
        throw new IllegalStateException("AudioTrack must be a streaming track");
    }
    return native_getStartThresholdInFrames();
}

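What the comment says: the method returns the streaming start threshold of the AudioTrack, which is the buffer level that written audio data must reach before playback starts after play() is called. When the AudioTrack is created, the threshold equals the buffer capacity in frames. If setBufferSizeInFrames reduces the buffer size below the threshold, the smaller value becomes the threshold. For compressed streams, a frame is considered to be exactly one byte.

So this value is the buffering level: playback really starts only once the written data reaches it. Starting from the first write, we can therefore accumulate the amount written, and when the total reaches this value, the audio has started.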
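
The accumulate-and-compare idea can be sketched independently of Android. In the plain Java below, StartDetector is a hypothetical helper; the threshold would come from getStartThresholdInFrames(), and the frame size in bytes is something the caller is assumed to know from the audio format.

```java
// Sketch: report "playback started" once the written data reaches the start threshold.
// StartDetector is a hypothetical helper, not an Android class.
final class StartDetector {
    private final long thresholdFrames;
    private final int frameSizeBytes;
    private long writtenBytes;
    private boolean started;

    StartDetector(long thresholdFrames, int frameSizeBytes) {
        if (thresholdFrames < 1 || frameSizeBytes < 1) {
            throw new IllegalArgumentException("threshold and frame size must be positive");
        }
        this.thresholdFrames = thresholdFrames;
        this.frameSizeBytes = frameSizeBytes;
    }

    /** Call after each write; returns true exactly once, on the write that reaches the threshold. */
    boolean onBytesWritten(int bytes) {
        if (bytes > 0) {
            writtenBytes += bytes;
        }
        if (!started && writtenBytes / frameSizeBytes >= thresholdFrames) {
            started = true;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        StartDetector detector = new StartDetector(100, 2); // 100 frames, 16-bit mono
        System.out.println(detector.onBytesWritten(100));   // 50 frames so far
        System.out.println(detector.onBytesWritten(100));   // 100 frames: threshold reached
    }
}
```

Comparing in frames rather than bytes keeps the check aligned with the unit the threshold is reported in.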

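One question remains: can a call to write be treated as data actually written? Is it a blocking call or an asynchronous one?

Take the simplest write overload as an example: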

public int write(@NonNull byte[] audioData, int offsetInBytes, int sizeInBytes) {
    return write(audioData, offsetInBytes, sizeInBytes, WRITE_BLOCKING);
}

/**
 * The write mode indicating the write operation will block until all data has been written,
 * to be used as the actual value of the writeMode parameter in
 * {@link #write(byte[], int, int, int)}, {@link #write(short[], int, int, int)},
 * {@link #write(float[], int, int, int)}, {@link #write(ByteBuffer, int, int)}, and
 * {@link #write(ByteBuffer, int, int, long)}.
 */
public final static int WRITE_BLOCKING = 0;

/**
 * The write mode indicating the write operation will return immediately after
 * queuing as much audio data for playback as possible without blocking,
 * to be used as the actual value of the writeMode parameter in
 * {@link #write(ByteBuffer, int, int)}, {@link #write(short[], int, int, int)},
 * {@link #write(float[], int, int, int)}, {@link #write(ByteBuffer, int, int)}, and
 * {@link #write(ByteBuffer, int, int, long)}.
 */
public final static int WRITE_NON_BLOCKING = 1;

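So AudioTrack actually has two write modes, blocking and non-blocking. Counting the data passed to write only works for blocking writes. For non-blocking writes, the marker mechanism from section 2.1 can be used instead: set a small marker position to get an approximate start time.

At this point we know how to detect the start of playback for blocking writes, the most common mode.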
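
The reason plain counting breaks down for non-blocking writes is that write may accept only part of the data and return that smaller count. If one still wanted to count written data in that mode, only the returned value could be added to the total. The sketch below illustrates this in plain Java; PendingChunk and its Sink interface are hypothetical stand-ins, with Sink playing the role of a non-blocking write.

```java
// Sketch: tracking how much of a chunk a non-blocking sink has really accepted.
// PendingChunk and Sink are hypothetical; Sink stands in for a non-blocking write call.
final class PendingChunk {

    /** Returns the number of bytes accepted, or a negative error code. */
    interface Sink {
        int write(byte[] data, int offset, int length);
    }

    private final byte[] data;
    private int offset;

    PendingChunk(byte[] data) {
        this.data = data;
    }

    /** Offers the remaining bytes once; returns the bytes accepted this time (0 on error). */
    int offer(Sink sink) {
        int remaining = data.length - offset;
        if (remaining == 0) {
            return 0;
        }
        int result = sink.write(data, offset, remaining);
        if (result <= 0) {
            return 0; // nothing accepted: keep the offset and try again later
        }
        offset += result;
        return result;
    }

    boolean isDone() {
        return offset == data.length;
    }

    public static void main(String[] args) {
        Sink smallSink = (d, off, len) -> Math.min(len, 3); // accepts at most 3 bytes per call
        PendingChunk chunk = new PendingChunk(new byte[8]);
        while (!chunk.isDone()) {
            System.out.println(chunk.offer(smallSink));
        }
    }
}
```

Only the values returned by offer would be added to the running total, never the full chunk length.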

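2.2.3 Code example

// Call after every audio write; mWriteAudioSize is the total number of bytes written so far.
fun updateStartTime() {
    val threshold = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.S) {
        // The threshold is in frames; multiplying by 2 assumes 16-bit mono (2 bytes per frame).
        mAudioTrack.startThresholdInFrames * 2
    } else {
        // getStartThresholdInFrames() is unavailable before Android 12: fall back to the first write.
        0
    }
    // "Reach" the threshold means >=; the > 0 check keeps the fallback from firing before any write.
    if (!mNotifyStart && mWriteAudioSize > 0 && mWriteAudioSize >= threshold) {
        mNotifyStart = true
        Log.d("TAG", "on audio start")
    }
}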

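Notes:

1. The threshold is only available on Android S (12, API level 31) and above. On older versions the code above falls back to treating the first write as the start.

2. startThresholdInFrames is also measured in frames, so for 16-bit mono audio it is multiplied by 2 to get bytes.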

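3. Further notes

3.1 Glitches during playback

Glitches during playback can have many causes, such as network audio arriving too slowly, or inefficiency caused by locks in the code. A few suggestions:

1. If the audio data comes from the network, put it in a buffer queue and start the actual playback calls only after a certain amount has been buffered.
2. Avoid locks where possible. If locks are needed, avoid wait() where possible, since misuse can easily cause stalls. Prefer data structures with built-in synchronization, such as ConcurrentLinkedQueue.
3. If everything else checks out and some glitches remain, consider raising the priority of the audio writing thread: Process.setThreadPriority(Process.THREAD_PRIORITY_AUDIO)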
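
Suggestions 1 and 2 can be combined into one small lock-free pre-buffer. This is only a sketch under stated assumptions: PreBuffer is a hypothetical class, the threshold in bytes is chosen by the caller, and one network thread is assumed to feed one playback thread.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: queue incoming audio chunks and start playback only once enough is buffered.
// PreBuffer is a hypothetical helper; it uses no locks and no wait().
final class PreBuffer {
    private final ConcurrentLinkedQueue<byte[]> chunks = new ConcurrentLinkedQueue<>();
    private final AtomicLong bufferedBytes = new AtomicLong();
    private final long startThresholdBytes;

    PreBuffer(long startThresholdBytes) {
        this.startThresholdBytes = startThresholdBytes;
    }

    /** Called from the network thread for each received chunk. */
    void offer(byte[] chunk) {
        chunks.add(chunk);
        bufferedBytes.addAndGet(chunk.length);
    }

    /** True while at least the threshold amount is queued; check it before the first poll. */
    boolean readyToStart() {
        return bufferedBytes.get() >= startThresholdBytes;
    }

    /** Called from the playback thread; returns null when the queue is empty. */
    byte[] poll() {
        byte[] chunk = chunks.poll();
        if (chunk != null) {
            bufferedBytes.addAndGet(-chunk.length);
        }
        return chunk;
    }

    public static void main(String[] args) {
        PreBuffer buffer = new PreBuffer(10);
        buffer.offer(new byte[4]);
        System.out.println(buffer.readyToStart()); // only 4 of 10 bytes buffered
        buffer.offer(new byte[6]);
        System.out.println(buffer.readyToStart()); // 10 bytes buffered
    }
}
```

The playback thread would wait for readyToStart() once, then keep polling chunks and writing them to the AudioTrack; how it waits, and what it does when the queue runs dry, is left open here.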

