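AudioRecord is the Android class for recording raw PCM audio data. WebRTC's wrapper around it lives in
org/webrtc/audio/WebRtcAudioRecord.java. Below we walk through how AudioRecord is created and started, how the captured audio data is read, and how it is destroyed.
Creation and initialization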
private int initRecording(int sampleRate, int channels) {
  Logging.d(TAG, "initRecording(sampleRate=" + sampleRate + ", channels=" + channels + ")");
  if (audioRecord != null) {
    reportWebRtcAudioRecordInitError("InitRecording called twice without StopRecording.");
    return -1;
  }
  final int bytesPerFrame = channels * (BITS_PER_SAMPLE / 8);
  final int framesPerBuffer = sampleRate / BUFFERS_PER_SECOND;
  byteBuffer = ByteBuffer.allocateDirect(bytesPerFrame * framesPerBuffer);
  Logging.d(TAG, "byteBuffer.capacity: " + byteBuffer.capacity());
  emptyBytes = new byte[byteBuffer.capacity()];
  // Rather than passing the ByteBuffer with every callback (requiring
  // the potentially expensive GetDirectBufferAddress) we simply have
  // the native class cache the address to the memory once.
  nativeCacheDirectBufferAddress(byteBuffer, nativeAudioRecord);

  // Get the minimum buffer size required for the successful creation of
  // an AudioRecord object, in byte units.
  // Note that this size doesn't guarantee a smooth recording under load.
  final int channelConfig = channelCountToConfiguration(channels);
  int minBufferSize =
      AudioRecord.getMinBufferSize(sampleRate, channelConfig, AudioFormat.ENCODING_PCM_16BIT);
  if (minBufferSize == AudioRecord.ERROR || minBufferSize == AudioRecord.ERROR_BAD_VALUE) {
    reportWebRtcAudioRecordInitError("AudioRecord.getMinBufferSize failed: " + minBufferSize);
    return -1;
  }
  Logging.d(TAG, "AudioRecord.getMinBufferSize: " + minBufferSize);

  // Use a larger buffer size than the minimum required when creating the
  // AudioRecord instance to ensure smooth recording under load. It has been
  // verified that it does not increase the actual recording latency.
  int bufferSizeInBytes = Math.max(BUFFER_SIZE_FACTOR * minBufferSize, byteBuffer.capacity());
  Logging.d(TAG, "bufferSizeInBytes: " + bufferSizeInBytes);
  try {
    audioRecord = new AudioRecord(audioSource, sampleRate, channelConfig,
        AudioFormat.ENCODING_PCM_16BIT, bufferSizeInBytes);
  } catch (IllegalArgumentException e) {
    reportWebRtcAudioRecordInitError("AudioRecord ctor error: " + e.getMessage());
    releaseAudioResources();
    return -1;
  }
  if (audioRecord == null || audioRecord.getState() != AudioRecord.STATE_INITIALIZED) {
    reportWebRtcAudioRecordInitError("Failed to create a new AudioRecord instance");
    releaseAudioResources();
    return -1;
  }
  if (effects != null) {
    effects.enable(audioRecord.getAudioSessionId());
  }
  logMainParameters();
  logMainParametersExtended();
  return framesPerBuffer;
}
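The initialization method does two main things.
- Create the buffer
  - The code that actually consumes the data lives in the native layer, so a Java direct buffer is created here. AudioRecord also offers a read interface that takes a ByteBuffer, and the code that copies the data into that ByteBuffer is native as well, so a direct buffer is the more efficient choice.
  - The capacity of the ByteBuffer is the size of a single read. Android uses a packed (interleaved) format: with multiple channels, the samples of all channels for one sampling instant are stored next to each other, followed by the samples of all channels for the next instant. One frame is the set of samples of all channels for one sampling instant. A single read covers 10 ms of frames, i.e. the sample rate divided by 100 (a number of frames equal to the sample rate corresponds to 1 s of audio, so dividing by 100 gives 10 ms). The capacity of the ByteBuffer is therefore frames × channels × bytes per sample (16-bit PCM means 2 bytes per sample).
  - The nativeCacheDirectBufferAddress JNI function lets the native layer store the ByteBuffer's address up front, so the address does not have to be looked up every time audio data has been read.
- Create the AudioRecord object. The constructor takes several parameters:
  - audioSource: the audio capture mode. The default is VOICE_COMMUNICATION, which is tuned for voice calls and uses hardware AEC (acoustic echo cancellation) where the device provides it.
  - sampleRate: the sample rate.
  - channelConfig: the channel configuration (mono or stereo), derived from the channel count by channelCountToConfiguration.
  - audioFormat: the audio data format. The value used here is AudioFormat.ENCODING_PCM_16BIT, i.e. 16-bit PCM.
  - bufferSize: the size of the buffer the system uses when creating the AudioRecord. The code takes the larger of two values: BUFFER_SIZE_FACTOR (two) times the minimum buffer size returned by AudioRecord.getMinBufferSize, and the capacity of the ByteBuffer used for reading. According to the comment, twice the minimum is used so that capture stays smooth under high system load, and the larger buffer does not increase the capture latency.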
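The sizing rules described above can be checked with a little arithmetic. The following is a minimal sketch in plain Java with no Android dependency; the class name CaptureBufferSizing and its helper methods are made up for illustration, the constants are assumed to match the values described in the text (16 bits per sample, 100 buffers per second, a factor of 2), and the minimum buffer size passed in main is a hypothetical number, not something a real device reported.

```java
import java.nio.ByteBuffer;

final class CaptureBufferSizing {
  // Assumed to mirror the constants referenced by initRecording().
  static final int BITS_PER_SAMPLE = 16;
  static final int BUFFERS_PER_SECOND = 100; // one read every 10 ms
  static final int BUFFER_SIZE_FACTOR = 2;

  // One frame holds one sample for every channel.
  static int bytesPerFrame(int channels) {
    return channels * (BITS_PER_SAMPLE / 8);
  }

  // Number of frames captured in 10 ms at the given sample rate.
  static int framesPerBuffer(int sampleRate) {
    return sampleRate / BUFFERS_PER_SECOND;
  }

  // Capacity of the ByteBuffer used for a single read.
  static int readBufferCapacity(int sampleRate, int channels) {
    return bytesPerFrame(channels) * framesPerBuffer(sampleRate);
  }

  // Size handed to the AudioRecord constructor: the larger of
  // "factor times the platform minimum" and the read buffer capacity.
  static int recordBufferSize(int minBufferSize, int readBufferCapacity) {
    return Math.max(BUFFER_SIZE_FACTOR * minBufferSize, readBufferCapacity);
  }

  public static void main(String[] args) {
    int sampleRate = 48000;
    int channels = 2;
    ByteBuffer byteBuffer = ByteBuffer.allocateDirect(readBufferCapacity(sampleRate, channels));
    System.out.println("read buffer capacity: " + byteBuffer.capacity());

    int hypotheticalMinBufferSize = 3840; // made-up value for illustration
    System.out.println("AudioRecord buffer size: "
        + recordBufferSize(hypotheticalMinBufferSize, byteBuffer.capacity()));
  }
}
```

For 48 kHz stereo this gives 480 frames of 4 bytes each, so a 1920-byte read buffer; for 16 kHz mono it gives 160 frames of 2 bytes each, so 320 bytes.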
Start
private boolean startRecording() {
  Logging.d(TAG, "startRecording");
  assertTrue(audioRecord != null);
  assertTrue(audioThread == null);
  try {
    audioRecord.startRecording();
  } catch (IllegalStateException e) {
    reportWebRtcAudioRecordStartError(AudioRecordStartErrorCode.AUDIO_RECORD_START_EXCEPTION,
        "AudioRecord.startRecording failed: " + e.getMessage());
    return false;
  }
  if (audioRecord.getRecordingState() != AudioRecord.RECORDSTATE_RECORDING) {
    reportWebRtcAudioRecordStartError(
        AudioRecordStartErrorCode.AUDIO_RECORD_START_STATE_MISMATCH,
        "AudioRecord.startRecording failed - incorrect state :"
            + audioRecord.getRecordingState());
    return false;
  }
  audioThread = new AudioRecordThread("AudioRecordJavaThread");
  audioThread.start();
  return true;
}
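This method first starts audioRecord, then checks that it has actually entered the recording state, and finally creates and starts the AudioRecordThread that reads the data.
Reading data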
private class AudioRecordThread extends Thread {
  private volatile boolean keepAlive = true;

  public AudioRecordThread(String name) {
    super(name);
  }

  // TODO(titovartem) make correct fix during webrtc:9175
  @SuppressWarnings("ByteBufferBackingArray")
  @Override
  public void run() {
    Process.setThreadPriority(Process.THREAD_PRIORITY_URGENT_AUDIO);
    Logging.d(TAG, "AudioRecordThread" + WebRtcAudioUtils.getThreadInfo());
    assertTrue(audioRecord.getRecordingState() == AudioRecord.RECORDSTATE_RECORDING);

    long lastTime = System.nanoTime();
    while (keepAlive) {
      int bytesRead = audioRecord.read(byteBuffer, byteBuffer.capacity());
      if (bytesRead == byteBuffer.capacity()) {
        if (microphoneMute) {
          byteBuffer.clear();
          byteBuffer.put(emptyBytes);
        }
        // It's possible we've been shut down during the read, and stopRecording() tried and
        // failed to join this thread. To be a bit safer, try to avoid calling any native methods
        // in case they've been unregistered after stopRecording() returned.
        if (keepAlive) {
          nativeDataIsRecorded(bytesRead, nativeAudioRecord);
        }
        if (audioSamplesReadyCallback != null) {
          // Copy the entire byte buffer array. Assume that the start of the byteBuffer is
          // at index 0.
          byte[] data = Arrays.copyOf(byteBuffer.array(), byteBuffer.capacity());
          audioSamplesReadyCallback.onWebRtcAudioRecordSamplesReady(
              new AudioSamples(audioRecord, data));
        }
      } else {
        String errorMessage = "AudioRecord.read failed: " + bytesRead;
        Logging.e(TAG, errorMessage);
        if (bytesRead == AudioRecord.ERROR_INVALID_OPERATION) {
          keepAlive = false;
          reportWebRtcAudioRecordError(errorMessage);
        }
      }
      if (DEBUG) {
        long nowTime = System.nanoTime();
        long durationInMs = TimeUnit.NANOSECONDS.toMillis((nowTime - lastTime));
        lastTime = nowTime;
        Logging.d(TAG, "bytesRead[" + durationInMs + "] " + bytesRead);
      }
    }

    try {
      if (audioRecord != null) {
        audioRecord.stop();
      }
    } catch (IllegalStateException e) {
      Logging.e(TAG, "AudioRecord.stop failed: " + e.getMessage());
    }
  }

  // Stops the inner thread loop and also calls AudioRecord.stop().
  // Does not block the calling thread.
  public void stopThread() {
    Logging.d(TAG, "stopThread");
    keepAlive = false;
  }
}
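The logic for pulling data out of AudioRecord lives in the run() method of the AudioRecordThread thread.
- When the thread starts, it first raises its own priority to URGENT_AUDIO by calling Process.setThreadPriority.
- It then calls audioRecord.read in a loop, reading the captured data into the ByteBuffer, and calls the nativeDataIsRecorded JNI function to notify the native layer that data has been read and is ready for the next processing step.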
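To make the shape of that loop easier to see in isolation, here is a minimal sketch in plain Java that runs without Android. PcmSource and PcmSink are hypothetical interfaces standing in for audioRecord.read and the nativeDataIsRecorded notification, and fakeSource is a made-up test source. One simplification to be aware of: this sketch stops on any negative return value, whereas the listing above stops only on AudioRecord.ERROR_INVALID_OPERATION.

```java
import java.nio.ByteBuffer;

final class ReadLoopSketch {
  /** Hypothetical stand-in for AudioRecord.read(ByteBuffer, int). */
  interface PcmSource {
    int read(ByteBuffer dst, int sizeInBytes);
  }

  /** Hypothetical stand-in for the nativeDataIsRecorded JNI notification. */
  interface PcmSink {
    void onData(ByteBuffer data, int bytes);
  }

  /** Made-up source: fills the buffer with `value` for `goodReads` reads, then returns -3. */
  static PcmSource fakeSource(int goodReads, byte value) {
    int[] calls = {0};
    return (dst, size) -> {
      if (calls[0]++ >= goodReads) {
        return -3; // same numeric value as AudioRecord.ERROR_INVALID_OPERATION
      }
      for (int i = 0; i < size; i++) {
        dst.put(i, value); // absolute put leaves the buffer position untouched
      }
      return size;
    };
  }

  /** Sums the first `bytes` bytes using absolute reads. */
  static long sum(ByteBuffer data, int bytes) {
    long total = 0;
    for (int i = 0; i < bytes; i++) {
      total += data.get(i);
    }
    return total;
  }

  /** Reads full buffers until the source reports an error; returns the number delivered. */
  static int pump(PcmSource source, PcmSink sink, ByteBuffer buffer, boolean mute) {
    byte[] emptyBytes = new byte[buffer.capacity()];
    int delivered = 0;
    while (true) {
      int bytesRead = source.read(buffer, buffer.capacity());
      if (bytesRead == buffer.capacity()) {
        if (mute) {
          // Overwrite the captured samples with silence before handing them on.
          buffer.clear();
          buffer.put(emptyBytes);
        }
        sink.onData(buffer, bytesRead);
        delivered++;
      } else if (bytesRead < 0) {
        return delivered; // simplified error handling, see the note above
      }
      // A short non-negative read is skipped and the loop tries again.
    }
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocateDirect(8);
    long[] total = {0};
    int delivered =
        pump(fakeSource(3, (byte) 1), (data, bytes) -> total[0] += sum(data, bytes), buffer, false);
    System.out.println(delivered + " buffers delivered, byte sum " + total[0]);
  }
}
```

Only a read that fills the whole buffer is forwarded, which matches the bytesRead == byteBuffer.capacity() check in the listing; with mute enabled the sink still receives buffers at the same cadence, just filled with zeros.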
Stop and destroy
private boolean stopRecording() {
  Logging.d(TAG, "stopRecording");
  assertTrue(audioThread != null);
  audioThread.stopThread();
  if (!ThreadUtils.joinUninterruptibly(audioThread, AUDIO_RECORD_THREAD_JOIN_TIMEOUT_MS)) {
    Logging.e(TAG, "Join of AudioRecordJavaThread timed out");
    WebRtcAudioUtils.logAudioState(TAG);
  }
  audioThread = null;
  if (effects != null) {
    effects.release();
  }
  releaseAudioResources();
  return true;
}
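As shown, this first sets keepAlive, the condition of the AudioRecordThread read loop, to false, and then calls
ThreadUtils.joinUninterruptibly to wait (for at most AUDIO_RECORD_THREAD_JOIN_TIMEOUT_MS) for the AudioRecordThread thread to exit.
One detail is worth pointing out: keepAlive is declared volatile. The variable is written and read on different threads, and volatile guarantees that the new value becomes visible to the reading thread instead of a stale value being used indefinitely.
After leaving the loop, the AudioRecordThread thread calls audioRecord.stop() to stop capturing. Once the thread has exited, audioRecord.release() is called (through releaseAudioResources()) to release the AudioRecord object.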
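The stop pattern itself (a volatile flag, followed by a join with a timeout) can be tried out in plain Java. The sketch below is an illustration under stated assumptions, not the WebRTC implementation: StoppableWorkerSketch, Worker, and stopAndJoin are made-up names, Thread.sleep stands in for the blocking audioRecord.read call, and the standard Thread.join(long) is used in place of WebRTC's ThreadUtils.joinUninterruptibly.

```java
final class StoppableWorkerSketch {
  static final class Worker extends Thread {
    // Written by the stopping thread, read by the worker thread.
    private volatile boolean keepAlive = true;
    private volatile boolean cleanedUp = false;

    Worker(String name) {
      super(name);
    }

    @Override
    public void run() {
      while (keepAlive) {
        try {
          Thread.sleep(1); // stand-in for a blocking read of one buffer
        } catch (InterruptedException e) {
          break;
        }
      }
      // Stand-in for audioRecord.stop(): cleanup runs on the worker itself.
      cleanedUp = true;
    }

    // Asks the loop to end; does not block the caller.
    void stopThread() {
      keepAlive = false;
    }

    boolean cleanedUp() {
      return cleanedUp;
    }
  }

  // Signals the worker, then waits up to timeoutMs for it to finish.
  // Returns true if the worker has exited by the time the wait ends.
  static boolean stopAndJoin(Worker worker, long timeoutMs) {
    worker.stopThread();
    try {
      worker.join(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
    return !worker.isAlive();
  }

  public static void main(String[] args) {
    Worker worker = new Worker("WorkerSketchThread");
    worker.start();
    boolean exited = stopAndJoin(worker, 2000);
    System.out.println("exited=" + exited + ", cleanedUp=" + worker.cleanedUp());
  }
}
```

Without volatile on keepAlive, the Java memory model would not guarantee that the worker ever observes the write made by the stopping thread, so the loop could in principle keep running.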
That is the overall flow of audio capture in the Java layer of WebRTC on Android.
Reference: 《WebRTC 開發實戰》
https://chromium.googlesource.com/external/webrtc/+/HEAD/sdk/android/src/java/org/webrtc/audio/WebRtcAudioRecord.java