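I recently got a client request to integrate Alibaba Cloud's NLP basic services. They provide basic natural language processing capabilities such as word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, and central word extraction, and can be used in scenarios such as intelligent Q&A, chatbots, public opinion analysis, content recommendation, and e-commerce review analysis.
The official documentation is here: NLP Basic Services 2.0 (NLP基础服务2.0).
The official docs only cover Java and Python, and I could not find a PHP version online, so I wrote one based on the Python example. It may not be very well written (my PHP is half-baked), but I have tested it and it works. It is offered for reference only: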
```php
function curl_url($str){
    // Strip spaces, "/" and "~" from the input text before sending it
    $name = str_replace(' ','',$str);
    $name = str_replace('/','',$name);
    $name = str_replace('~','',$name);
    $urls = "http://alinlp.cn-hangzhou.aliyuncs.com/";
    $AccessKeyId = "LTAI5tDwt***********";
    $AccessKeySecret = "tMkWoAc2deD***************";
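    // Prepare the common parameters and the API-specific parameters.
    // This example calls the general Chinese word segmentation API (GetWsChGeneral).
    // Note: the Signature parameter is not included at this point.
    $param = [
        "Format"=>"json",
        "Version"=>"2020-06-29",
        "AccessKeyId"=>$AccessKeyId,
        "SignatureMethod"=>"HMAC-SHA1",
        // UTC time in ISO 8601 form, e.g. 2020-08-26T14:01:48Z
        "Timestamp"=>gmdate('Y-m-d\TH:i:s\Z'),
        "SignatureVersion"=>"1.0",
        "SignatureNonce"=>uuid(),
        "Text"=>$name,
        "TokenizerId"=>"GENERAL_CHN",
        "Action"=>"GetWsChGeneral",
        "OutType"=>"1",
        "ServiceCode"=>"alinlp"
    ];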
    // Sort the parameters by key
    ksort($param);
    // Turn the key-value pairs into a query string, percent-encoded per RFC 3986
    $StringToSign = http_build_query($param, '', '&', PHP_QUERY_RFC3986);
    // Illustrative result:
    // AccessKeyId=LTXXXXkey&Action=GetWsChGeneral&Format=json&OutType=1&ServiceCode=alinlp&SignatureMethod=HMAC-SHA1&SignatureNonce=5c901f6e-bac9-4f71-96ba-651b838c13d9&SignatureVersion=1.0&Text=%E4%BB%8A%E5%A4%A9%E6%B5%8B%E8%AF%95%E4%B8%80%E4%B8%8B&Timestamp=2020-08-26T14%3A01%3A48Z&TokenizerId=GENERAL_CHN&Version=2020-06-29
    // Percent-encode the whole query string once more
    $StringToSign = rawurlencode($StringToSign);
    // Illustrative result:
    // AccessKeyId%3DLTXXXXkey%26Action%3DGetWsChGeneral%26Format%3Djson%26OutType%3D1%26ServiceCode%3Dalinlp%26SignatureMethod%3DHMAC-SHA1%26SignatureNonce%3D5c901f6e-bac9-4f71-96ba-651b838c13d9%26SignatureVersion%3D1.0%26Text%3D%25E4%25BB%258A%25E5%25A4%25A9%25E6%25B5%258B%25E8%25AF%2595%25E4%25B8%2580%25E4%25B8%258B%26Timestamp%3D2020-08-26T14%253A01%253A48Z%26TokenizerId%3DGENERAL_CHN%26Version%3D2020-06-29
    // Prepend the HTTP method and the encoded "/" path
    $StringToSign = "GET&%2F&" . $StringToSign;
    // The HMAC-SHA1 key is the AccessKeySecret followed by "&"
    $secret = $AccessKeySecret . "&";
    // Compute the HMAC-SHA1 digest (raw bytes) and base64-encode it
    $sig = base64_encode(hash_hmac("sha1",$StringToSign, $secret,true));
    // Add the Signature to the parameters
    $param["Signature"] = $sig;
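    // Send the HTTP request
    $ch = curl_init();
    $urls = $urls.'?'.http_build_query($param);
    curl_setopt($ch, CURLOPT_URL, $urls); // the request URL
    // Turns off TLS certificate verification. It has no effect on this plain-HTTP
    // endpoint; if you switch to HTTPS, leave verification enabled instead.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response as a string instead of printing it
    $ret = curl_exec($ch); // execute the request
    curl_close($ch); // close the handle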
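    $data_g = json_decode($ret, true);
    // At this point we have the segmentation data; convert it to whatever format you need.
    // Below I reshape it to stay compatible with the format I used before.
    $result = array();
    // Guard against a failed request or an error response without a "Data" field
    if (!is_array($data_g) || !isset($data_g['Data'])) {
        return [["word"=>$result]];
    }
    // "Data" is itself a JSON-encoded string, so it has to be decoded a second time
    $getdata = json_decode($data_g['Data'],true);
    $getdata = $getdata['result'] ?? [];
    for($i=0;$i<count($getdata);$i++){
        // Skip "+" tokens
        if($getdata[$i]['word']!='+'){
            $result[] = $getdata[$i]['word'];
        }
    }
    $return_arr = [
        [
            "word"=>$result
        ]
    ];
    return $return_arr;
}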
// Generate a UUID-shaped random string (8-4-4-4-12 hex digits) used as the SignatureNonce
function uuid()
{
    $chars = md5(uniqid(mt_rand(), true));
    $uuid = substr ( $chars, 0, 8 ) . '-'
        . substr ( $chars, 8, 4 ) . '-'
        . substr ( $chars, 12, 4 ) . '-'
        . substr ( $chars, 16, 4 ) . '-'
        . substr ( $chars, 20, 12 );
    return $uuid ;
}
```
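The signing steps in the function above can also be pulled out into a small helper of their own, which makes them easier to test without calling the API. The following is only a minimal sketch: `aliyun_string_to_sign` and `aliyun_sign` are hypothetical names I made up for illustration (they are not part of any Alibaba Cloud SDK), and the sample parameters are invented. It assumes, as I understand the RPC signature rules, that keys and values are percent-encoded per RFC 3986 (space as `%20`, `~` left as is), which is what PHP's `rawurlencode` does. I have not verified this variant against the live service.

```php
<?php
// Hypothetical helpers for illustration only; not from the original post or an official SDK.

/**
 * Build the string-to-sign: METHOD & encoded "/" & encoded canonical query string.
 */
function aliyun_string_to_sign(array $params, string $method = 'GET'): string
{
    // Sort by parameter name, comparing keys as plain strings
    ksort($params, SORT_STRING);

    // Percent-encode every key and value (RFC 3986) and join them as key=value pairs
    $pairs = [];
    foreach ($params as $key => $value) {
        $pairs[] = rawurlencode((string) $key) . '=' . rawurlencode((string) $value);
    }
    $canonical = implode('&', $pairs);

    // Encode the canonical query string once more and prepend method and path
    return $method . '&' . rawurlencode('/') . '&' . rawurlencode($canonical);
}

/**
 * Compute the base64-encoded HMAC-SHA1 signature; the key is the secret followed by "&".
 */
function aliyun_sign(array $params, string $accessKeySecret, string $method = 'GET'): string
{
    $stringToSign = aliyun_string_to_sign($params, $method);
    return base64_encode(hash_hmac('sha1', $stringToSign, $accessKeySecret . '&', true));
}

// Made-up sample parameters, including a space and a "~" in the text
$demo = ['Text' => 'a b~c', 'Action' => 'GetWsChGeneral'];
echo aliyun_string_to_sign($demo), "\n";
```

If this encoding matches what the server expects, stripping spaces and `~` from the input would no longer be needed for the signature itself, though that is an assumption worth checking with a real request before relying on it.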
This is my first post on Juejin (掘金), sharing a little practical know-how from my own work.