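In a dialogue system, the first step is to run domain classification, intent recognition, and slot extraction on the user's input. Since deep learning matured, intent recognition has mostly relied on deep learning methods, for example a CNN that performs multi-class classification over intents; domain classification works in much the same way. Slot prediction can be framed as a sequence labeling problem. For example, for the sentence “我想听周杰伦的菊花台” ("I want to listen to Jay Chou's Chrysanthemum Terrace"), the labels, one per character, can be defined as “O O O B-singer M-singer E-singer O B-song M-song E-song”. Sequence labeling typically uses CRF, RNN, LSTM, or LSTM+CRF models.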
Link: www.zhihu.com/question/22…
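The character-level tagging scheme above can be sketched in a few lines of Python. This is only an illustration: the helper `spans_to_tags` and the span offsets are hypothetical names and values chosen for this example, and giving a single-character span only a `B-` tag is an assumption of the sketch rather than part of any standard.

```python
from typing import List, Tuple


def spans_to_tags(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Label each character with O, or B-/M-/E-<slot> for annotated spans.

    Each span is (start, end, slot) with `end` exclusive. A single-character
    span receives only a B- tag (an assumption of this sketch).
    """
    tags = ["O"] * len(text)
    for start, end, slot in spans:
        for i in range(start, end):
            if i == start:
                tags[i] = f"B-{slot}"
            elif i == end - 1:
                tags[i] = f"E-{slot}"
            else:
                tags[i] = f"M-{slot}"
    return tags


sentence = "我想听周杰伦的菊花台"
# Hypothetical annotations: 周杰伦 covers characters 3..5, 菊花台 covers 7..9.
annotations = [(3, 6, "singer"), (7, 10, "song")]
tags = spans_to_tags(sentence, annotations)

for char, tag in zip(sentence, tags):
    print(char, tag)
```

The resulting tag list has one entry per character, which is the input/target alignment a CRF or LSTM+CRF tagger would be trained on.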
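The ATIS dataset contains 4,978 training examples and 893 test examples. The text consists of customer-service dialogues, and there are 26 intent classes in total. Each token in a query utterance is aligned with a slot label in IOB format, so the Sentence and Slots shown in the figure above correspond one to one.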
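To make the token-to-label alignment concrete, here is a minimal sketch that groups IOB-tagged tokens back into slot values. The function `extract_slots` is a hypothetical helper, and the sample utterance and labels are written by hand for illustration in the style of ATIS slot names; they are not copied from the dataset. Stray `I-` tags that do not continue an open span are simply ignored, which is a simplification.

```python
from typing import List, Tuple


def extract_slots(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (slot_name, slot_value) pairs."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must be aligned one-to-one")

    spans = []      # each entry is (slot_name, [tokens in the span])
    inside = False  # True while the previous token belonged to an open span
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            spans.append((label[2:], [token]))
            inside = True
        elif label.startswith("I-") and inside and spans[-1][0] == label[2:]:
            spans[-1][1].append(token)
        else:
            # "O" or an I- tag that does not continue the open span
            inside = False
    return [(name, " ".join(words)) for name, words in spans]


# Hand-written illustrative example, not taken from the dataset files.
tokens = "flights from san francisco to minneapolis".split()
labels = "O O B-fromloc.city_name I-fromloc.city_name O B-toloc.city_name".split()

slots = extract_slots(tokens, labels)
print(slots)
```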
##4 PyText in Practice
This section mainly follows the official tutorial "Train Intent-Slot model on ATIS Dataset", with a few small differences.
import sys

import flask
import pytext

# Paths to the training config and the exported model come from the command line
config_file = sys.argv[1]
model_file = sys.argv[2]

config = pytext.load_config(config_file)
predictor = pytext.create_predictor(config, model_file)

app = flask.Flask(__name__)


@app.route('/get_flight_info', methods=['GET', 'POST'])
def get_flight_info():
    text = flask.request.data.decode()

    # Pass the inputs to PyText's prediction API
    result = predictor({"raw_text": text})

    # Results is a list of output blob names and their scores.
    # The blob names are different for joint models vs doc models
    # Since this tutorial is for both, let's check which one we should look at.
    doc_label_scores_prefix = (
        'scores:' if any(r.startswith('scores:') for r in result)
        else 'doc_scores:'
    )

    # For now let's just output the top document label!
    best_doc_label = max(
        (label for label in result if label.startswith(doc_label_scores_prefix)),
        key=lambda label: result[label][0],
    # Strip the doc label prefix here
    )[len(doc_label_scores_prefix):]

    return flask.jsonify({"question": f"Are you asking about {best_doc_label}?"})


# The port is passed as an integer; recent Werkzeug versions reject a string
app.run(host='0.0.0.0', port=8080, debug=True)
Run the service:
python flask_app.py "$CONFIG" exported_model.c2
Then open another terminal and test the service:
Test 1
curl http://localhost:8080/get_flight_info -H "Content-Type: text/plain" -d "I am looking for flights from San Francisco to Minneapolis"
{
  "question": "Are you asking about flight?"
}
Test 2
curl http://localhost:8080/get_flight_info -H "Content-Type: text/plain" -d "How much does a trip to NY cost?"
{
  "question": "Are you asking about airfare?"
}
Test 3
curl http://localhost:8080/get_flight_info -H "Content-Type: text/plain" -d "Which airport should I go to?"
{
  "question": "Are you asking about airport?"
}
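The label-selection step inside the Flask handler can also be exercised without a running server or a trained model. The sketch below pulls that logic into a standalone function; `best_doc_label` as a function and the `fake_result` dictionary are hypothetical, and the scores are made-up numbers that only mimic the shape of output the handler expects (blob name mapped to a sequence of scores), not real model output.

```python
from typing import Mapping, Sequence


def best_doc_label(result: Mapping[str, Sequence[float]]) -> str:
    """Return the document label with the highest first score.

    Mirrors the handler logic: use the 'scores:' prefix if any blob name
    starts with it, otherwise fall back to 'doc_scores:'.
    """
    prefix = (
        "scores:" if any(name.startswith("scores:") for name in result)
        else "doc_scores:"
    )
    candidates = [name for name in result if name.startswith(prefix)]
    if not candidates:
        raise ValueError(f"no blob names start with {prefix!r}")
    best = max(candidates, key=lambda name: result[name][0])
    # Strip the prefix so only the label itself remains
    return best[len(prefix):]


# Made-up scores for illustration only; not produced by a real model.
fake_result = {
    "doc_scores:flight": [-0.1],
    "doc_scores:airfare": [-2.7],
    "doc_scores:airport": [-4.2],
    "word_scores:O": [-0.3],
}
print(best_doc_label(fake_result))
```

Keeping this step in a pure function makes it straightforward to unit test the prefix handling for both joint and document-only models.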