The Web Speech API is a browser-built-in API for speech recognition and speech synthesis. It lets developers add voice input and voice output to web pages, enabling a more natural and intuitive style of user interaction. The Web Speech API has two main parts:
- Speech Recognition API: converts the user's spoken input into text.
- Speech Synthesis API: converts text into spoken output.
Use cases
- Voice control and navigation: drive page navigation and actions with voice commands, for example controlling devices from a smart-home dashboard.
- Assistive technology: let users with visual or motor impairments operate a page by voice.
- Voice input: speed up text entry in forms and chat applications.
- Spoken feedback: provide audio feedback in education and training applications to enrich the experience.
1. Speech Recognition API
The Speech Recognition API converts the user's spoken input into text. Here is a basic example:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Speech Recognition Example</title>
</head>
<body>
<h1>Speech Recognition Example</h1>
<button id="start-recognition">Start Recognition</button>
<p id="result"></p>
<script>
// Check whether the browser supports the SpeechRecognition API
// (Chrome and some other browsers still expose it with the webkit prefix)
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
if (!SpeechRecognition) {
alert('Your browser does not support Speech Recognition API');
} else {
const recognition = new SpeechRecognition();
recognition.lang = 'en-US'; // Recognition language
recognition.interimResults = false; // Do not return interim (partial) results
recognition.maxAlternatives = 1; // Maximum number of alternative transcripts per result
const startButton = document.getElementById('start-recognition');
const resultParagraph = document.getElementById('result');
startButton.addEventListener('click', () => {
recognition.start();
});
recognition.addEventListener('result', (event) => {
const transcript = event.results[0][0].transcript;
resultParagraph.textContent = `You said: ${transcript}`;
});
recognition.addEventListener('speechend', () => {
recognition.stop();
});
recognition.addEventListener('error', (event) => {
resultParagraph.textContent = `Error occurred in recognition: ${event.error}`;
});
}
</script>
</body>
</html>
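The example above only reacts to the final transcript. For live feedback while the user is still speaking, set `interimResults = true` and assemble the text from all entries in `event.results`. The assembly step is sketched below as a pure function (`collectTranscript` is a hypothetical helper, not part of the API); the mock input mirrors the shape of a `SpeechRecognitionResultList`, so the logic can be exercised outside the browser:

```javascript
// Assemble final and interim text from a SpeechRecognitionResultList-shaped
// value: an array-like of results, each with an isFinal flag and a best
// alternative at index 0 carrying the transcript.
function collectTranscript(results) {
  let finalText = '';
  let interimText = '';
  for (const result of results) {
    if (result.isFinal) {
      finalText += result[0].transcript;
    } else {
      interimText += result[0].transcript;
    }
  }
  return { finalText, interimText };
}

// In the browser you would call it from the result handler:
// recognition.interimResults = true;
// recognition.addEventListener('result', (event) => {
//   const { finalText, interimText } = collectTranscript(event.results);
//   resultParagraph.textContent = finalText + interimText;
// });
```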
2. Speech Synthesis API
The Speech Synthesis API converts text into spoken output. Here is a basic example:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Speech Synthesis Example</title>
</head>
<body>
<h1>Speech Synthesis Example</h1>
<textarea id="text-to-speak" rows="4" cols="50">Hello, how are you?</textarea>
<button id="speak-button">Speak</button>
<script>
const speakButton = document.getElementById('speak-button');
const textToSpeak = document.getElementById('text-to-speak');
speakButton.addEventListener('click', () => {
const utterance = new SpeechSynthesisUtterance(textToSpeak.value);
utterance.lang = 'en-US'; // Voice language
utterance.pitch = 1; // Pitch, 0 to 2 (default 1)
utterance.rate = 1; // Speaking rate, 0.1 to 10 (default 1)
window.speechSynthesis.speak(utterance);
});
</script>
</body>
</html>
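By default the browser picks a voice that matches `utterance.lang`. To choose one explicitly, filter the list returned by `speechSynthesis.getVoices()`. The selection logic is sketched below as a pure function (`pickVoice` is a hypothetical helper operating on plain `{ name, lang }` objects, so it can be tested outside the browser):

```javascript
// Pick the first voice whose BCP 47 tag matches the requested language
// exactly, falling back to a language-prefix match (so a request for
// 'en-AU' can still land on 'en-GB'); return null if nothing matches.
function pickVoice(voices, lang) {
  return (
    voices.find((v) => v.lang === lang) ||
    voices.find((v) => v.lang.startsWith(lang.split('-')[0])) ||
    null
  );
}

// Browser usage — note getVoices() may return an empty array until the
// 'voiceschanged' event has fired:
// window.speechSynthesis.addEventListener('voiceschanged', () => {
//   const voice = pickVoice(window.speechSynthesis.getVoices(), 'en-US');
//   if (voice) utterance.voice = voice;
// });
```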
Combined example
Combining the Speech Recognition API and the Speech Synthesis API yields a simple voice assistant:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Speech Assistant Example</title>
</head>
<body>
<h1>Speech Assistant Example</h1>
<button id="start-assistant">Start Assistant</button>
<p id="assistant-result"></p>
<script>
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
if (!SpeechRecognition) {
alert('Your browser does not support Speech Recognition API');
} else {
const recognition = new SpeechRecognition();
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;
const startButton = document.getElementById('start-assistant');
const resultParagraph = document.getElementById('assistant-result');
startButton.addEventListener('click', () => {
recognition.start();
});
recognition.addEventListener('result', (event) => {
const transcript = event.results[0][0].transcript;
resultParagraph.textContent = `You said: ${transcript}`;
respondToSpeech(transcript);
});
recognition.addEventListener('speechend', () => {
recognition.stop();
});
recognition.addEventListener('error', (event) => {
resultParagraph.textContent = `Error occurred in recognition: ${event.error}`;
});
function respondToSpeech(transcript) {
let response = '';
if (transcript.toLowerCase().includes('hello')) {
response = 'Hello! How can I help you today?';
} else if (transcript.toLowerCase().includes('time')) {
response = `The current time is ${new Date().toLocaleTimeString()}`;
} else {
response = 'Sorry, I did not understand that.';
}
const utterance = new SpeechSynthesisUtterance(response);
utterance.lang = 'en-US';
window.speechSynthesis.speak(utterance);
}
}
</script>
</body>
</html>
Code walkthrough
- Check browser support: first verify that the browser supports the SpeechRecognition and SpeechSynthesis APIs.
- Initialize recognition and synthesis objects: create the SpeechRecognition and SpeechSynthesisUtterance objects.
- Event listeners: add a click listener to the button to start recognition, and add result, speechend, and error listeners on the recognition object to handle transcripts and errors.
- Respond to speech input: build a response string from the recognized transcript and speak it with the SpeechSynthesis API.
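The if/else chain in `respondToSpeech` becomes unwieldy as commands accumulate. One common alternative, sketched here as a suggestion rather than part of the example above (`commands` and `buildResponse` are hypothetical names), is a keyword-to-handler table:

```javascript
// Map keywords to response-producing handlers; the first keyword found
// in the transcript wins, checked in table order.
const commands = [
  { keyword: 'hello', handler: () => 'Hello! How can I help you today?' },
  { keyword: 'time', handler: () => `The current time is ${new Date().toLocaleTimeString()}` },
];

function buildResponse(transcript) {
  const text = transcript.toLowerCase();
  const match = commands.find((c) => text.includes(c.keyword));
  return match ? match.handler() : 'Sorry, I did not understand that.';
}
```

Adding a new command is then a one-line change to the table, and the matching logic can be unit-tested without any speech input.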
With the examples above, a web page can offer basic speech recognition and speech synthesis, letting users interact by voice for a more natural and intuitive experience.