Speech Recognition Feature
By pairing speech-to-text recognition with NLU (natural language understanding for human-machine dialogue), you can implement voice-driven page navigation, transfers, wealth-management operations, and similar features, so the whole flow can be completed without any tapping. Speech-to-text uses the speechRecognizer module; NLU relies on a third-party interface and is not covered here.
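As a rough sketch of that flow under stated assumptions: the NluIntent shape and the queryIntent function below are hypothetical stand-ins for the third-party NLU interface (not real APIs), while router.pushUrl is the standard ArkUI page-navigation call. The recognized text is sent to NLU, and the returned intent drives the jump:

import { router } from '@kit.ArkUI'

// Hypothetical NLU result shape; a real third-party NLU interface will differ.
interface NluIntent {
  name: string
  targetPage: string
}

// queryIntent stands in for the third-party NLU call and is not a real API.
async function queryIntent(text: string): Promise<NluIntent> {
  // ... send `text` to the NLU service and parse its response ...
  return { name: 'navigate', targetPage: 'pages/Finance' }
}

// Route on the intent returned for the recognized text.
async function handleUtterance(text: string) {
  const intent = await queryIntent(text)
  if (intent.name === 'navigate') {
    router.pushUrl({ url: intent.targetPage })
  }
  // other intents (transfer, wealth management, ...) would be handled similarly
}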
speechRecognizer
Recognizing speech spoken directly into the device requires the microphone permission (ohos.permission.MICROPHONE) to be granted.
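A minimal sketch of requesting that permission at runtime, assuming the permission is also declared in the module.json5 requestPermissions list and that context is the caller's UIAbility context:

import { abilityAccessCtrl, common, Permissions } from '@kit.AbilityKit'

const MIC_PERMISSION: Permissions = 'ohos.permission.MICROPHONE'

// Ask the user for microphone access before starting recognition.
async function requestMicPermission(context: common.UIAbilityContext): Promise<boolean> {
  const atManager = abilityAccessCtrl.createAtManager()
  const result = await atManager.requestPermissionsFromUser(context, [MIC_PERMISSION])
  // authResults[i] === 0 means the corresponding permission was granted
  return result.authResults[0] === 0
}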
1. Initialize the engine
// Core Speech Kit module import (HarmonyOS API 12 and later)
import { speechRecognizer } from '@kit.CoreSpeechKit';

const asrEngine = await speechRecognizer.createEngine(
  {
    language: 'zh-CN',
    online: 1,
    extraParams: { locate: "CN", recognizerMode: "short" }
  }
)
- language: the language to recognize; currently only Chinese is supported
- online: 1 is the only allowed value and means offline mode; the offline model is downloaded to the device, so speech can be recognized even without a network connection
- extraParams:
  - locate: application region; currently only CN is supported
  - recognizerMode: the length of speech to recognize. short: up to 60 seconds. long: up to 8 hours (a long-mode variant is sketched after this list)
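For example, an engine intended for long-form dictation could be created with recognizerMode set to long, as a sketch based on the parameters above (everything else stays the same):

// Same createEngine call, but configured for long speech (up to 8 hours)
const longAsrEngine = await speechRecognizer.createEngine({
  language: 'zh-CN',
  online: 1,
  extraParams: { locate: "CN", recognizerMode: "long" }
})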
2. Set the speech recognition callbacks
let setListener: speechRecognizer.RecognitionListener = {
  // Called when recognition starts successfully
  onStart(sessionId: string, eventMessage: string) {
  },
  // Event callback
  onEvent(sessionId: string, eventCode: number, eventMessage: string) {
  },
  // Result callback, fired for both intermediate and final results
  onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
  },
  // Called when recognition completes
  onComplete(sessionId: string, eventMessage: string) {
  },
  // Error callback; error codes are returned through this method
  onError(sessionId: string, errorCode: number, errorMessage: string) {
  },
}
asrEngine.setListener(setListener)
Once listening starts, the callbacks above fire in turn, and the recognized text is obtained in the onResult callback. The text delivered to onResult is built up incrementally: for example, if the speech is 夕阳西下, onResult fires four times with the results 夕, 夕阳, 夕阳西 and 夕阳西下. Use result.isFinal to check whether the result is the complete utterance.
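A sketch of how onResult could split partial and final results; only onResult changes, the other callbacks stay as above, and displayPartial and handleFinalText are placeholder app functions, not real APIs:

// Inside the RecognitionListener from step 2: partial text refreshes the UI,
// the final text is forwarded for further handling (e.g. to NLU)
onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
  if (result.isFinal) {
    handleFinalText(result.result)   // complete utterance, e.g. 夕阳西下
  } else {
    displayPartial(result.result)    // intermediate text, e.g. 夕阳西
  }
},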
3. Start listening
let recognizerParams: speechRecognizer.StartParams = {
sessionId: '666888',
audioInfo: {
audioType: 'pcm',
sampleRate: 16000,
soundChannel: 1,
sampleBit: 16
},
extraParams: {
recognitionMode: 0,
maxAudioDuration: 60000
}
}
asrEngine.startListening(recognizerParams);
- sessionId: the session ID; used to identify the session when stopping or cancelling recognition
- audioInfo: these values are currently the only supported ones and cannot be set to anything else; in order they are the audio type ('pcm'), sample rate (16000), number of channels (1), and sample bit depth (16)
- extraParams:
  - recognitionMode: how the audio to recognize is supplied. 0: speak directly into the device with the microphone permission granted. 1: pass the audio stream to be recognized through the writeAudio method (see the sketch after this list)
  - maxAudioDuration: recognition duration. When recognizerMode is short it can be set between 20s and 60s; when long, between 20s and 8h
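As referenced in the recognitionMode bullet, a sketch of the audio-stream variant: start listening with recognitionMode set to 1 and push PCM data through writeAudio. It reuses the asrEngine from step 1, and the pcmChunk buffer is assumed to be filled with 16-bit, 16 kHz, mono PCM data obtained elsewhere in the app:

// Stream-based recognition: the engine does not capture the microphone itself;
// the app writes the audio to be recognized
let streamParams: speechRecognizer.StartParams = {
  sessionId: '666888',
  audioInfo: {
    audioType: 'pcm',
    sampleRate: 16000,
    soundChannel: 1,
    sampleBit: 16
  },
  extraParams: {
    recognitionMode: 1,      // audio is supplied via writeAudio
    maxAudioDuration: 60000
  }
}
asrEngine.startListening(streamParams)

// pcmChunk is assumed to hold raw PCM data matching audioInfo above
const pcmChunk: Uint8Array = new Uint8Array(1280)
asrEngine.writeAudio('666888', pcmChunk)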
4. Complete code
// Core Speech Kit module import (HarmonyOS API 12 and later)
import { speechRecognizer } from '@kit.CoreSpeechKit';

class SpeechRecognizerManager {
private extraParam: Record<string, Object> = {
"locate": "CN", "recognizerMode": "short"
};
private initParamsInfo: speechRecognizer.CreateEngineParams = {
language: 'zh-CN',
online: 1,
extraParams: this.extraParam
};
private asrEngine: speechRecognizer.SpeechRecognitionEngine | null = null
private sessionId: string = "asr" + Date.now()
private static instance: SpeechRecognizerManager
static getInstance() {
if (!SpeechRecognizerManager.instance) {
// Cache the instance so later calls return the same object
SpeechRecognizerManager.instance = new SpeechRecognizerManager()
}
return SpeechRecognizerManager.instance
}
// Private constructor: instances cannot be created with new from outside
private constructor() {
}
// Initialize the engine
private async createEngine() {
if (!this.asrEngine) {
this.asrEngine = await speechRecognizer.createEngine(this.initParamsInfo)
}
}
// Register the speech recognition callbacks
private setListener(callback: (srr: speechRecognizer.SpeechRecognitionResult) => void = () => {
}) {
// Create the callback object
let setListener: speechRecognizer.RecognitionListener = {
// Called when recognition starts successfully
onStart(sessionId: string, eventMessage: string) {
},
// Event callback
onEvent(sessionId: string, eventCode: number, eventMessage: string) {
},
// Result callback, fired for both intermediate and final results
onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
// Only forward the complete utterance
if (result.isFinal) {
callback && callback(result)
}
},
// Called when recognition completes
onComplete(sessionId: string, eventMessage: string) {
},
// Error callback; error codes are returned through this method
onError(sessionId: string, errorCode: number, errorMessage: string) {
},
}
// Register the listener
this.asrEngine?.setListener(setListener);
}
// Start listening for speech
private startListening() {
let recognizerParams: speechRecognizer.StartParams = {
sessionId: this.sessionId,
audioInfo: {
audioType: 'pcm',
sampleRate: 16000,
soundChannel: 1,
sampleBit: 16
},
extraParams: {
recognitionMode: 0,
maxAudioDuration: 60000
}
}
this.asrEngine?.startListening(recognizerParams);
};
// Cancel recognition for the current session
cancel() {
this.asrEngine?.cancel(this.sessionId)
}
// Release the engine
shutDown() {
this.asrEngine?.shutdown()
}
// Check whether the engine is currently recognizing: true = busy, false = idle
isBusy() {
return this.asrEngine?.isBusy()
}
// Entry point: create the engine, register the callbacks, then start listening
speak(callback: (srr: speechRecognizer.SpeechRecognitionResult) => void) {
this.createEngine().then(() => {
this.setListener(callback)
this.startListening()
})
}
}
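A minimal usage sketch in an ArkUI page; the button label and the way the recognized text is displayed are illustrative only:

@Entry
@Component
struct VoicePage {
  @State recognizedText: string = ''

  build() {
    Column() {
      Text(this.recognizedText)
      Button('开始识别')
        .onClick(() => {
          // Start recognition; the callback receives each complete utterance
          SpeechRecognizerManager.getInstance().speak((res) => {
            this.recognizedText = res.result
          })
        })
    }
  }
}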