[quote="huan666, post:1, topic:987, full:true"]
1 Error Description
1.1 System Environment
Hardware Environment (Ascend/GPU/CPU): Ascend
Software Environment:
-- MindSpore version (source or binary): 1.5.0
-- Python version (e.g., Python 3.7.5): 3.7.6
-- OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 4.15.0-74-generic
-- GCC/Compiler version (if compiled from source):
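1.2 Basic Information
1.2.1 Script
The training script builds a single-operator network around ScatterAdd, which updates the values of the input tensor by addition. The script is as follows: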
import numpy as np
import mindspore
from mindspore import nn, ops, Tensor, Parameter


class Net(nn.Cell):
    def __init__(self, x):
        super(Net, self).__init__()
        self.x = Parameter(x, name="x")
        self.scatter_add = ops.ScatterAdd()

    def construct(self, indices, updates):
        # ScatterAdd adds `updates` into self.x at the rows selected by `indices`
        output = self.scatter_add(self.x, indices, updates)
        return output


input_x = Parameter(Tensor(np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]), mindspore.float32), name="x")
indices = Tensor(np.array([[0, 1], [1, 1]]), mindspore.int32)
updates = Tensor(np.ones([2, 2, 3]), mindspore.float32)

# Request gradients with respect to all network inputs (indices, updates)
grad_op = ops.GradOperation(get_all=True)
gradient_function = grad_op(Net(input_x))
g = gradient_function(indices, updates)  # this call raises the error shown in 1.2.2
print('反向输出:', g)  # label means "backward output"
1.2.2 Error
The error message is as follows:
Traceback (most recent call last):
  File "demo.py", line 17, in <module>
    g = gradient_function(indices, updates)
  File "/lib/python3.7/site-packages/mindspore/common/api.py", line 363, in staging_specialize
    out = _MindsporeFunctionExecutor(func, ms_create_time, input_signature, process_obj)(*args)
  File " /lib/python3.7/site-packages/mindspore/common/api.py", line 62, in wrapper
    results = fn(*arg, **kwargs)
  File " /lib/python3.7/site-packages/mindspore/common/api.py", line 276, in __call__
    phase = self.compile(args_list, self.fn.__name__)
  File " /lib/python3.7/site-packages/mindspore/common/api.py", line 259, in compile
    is_compile = self._graph_executor.compile(self.fn, args_list, phase, True)
RuntimeError: mindspore/ccsrc/pipeline/jit/validator.cc:73 ValidateOperation] Illegal primitive: Primitive ScatterAdd's bprop not defined.
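2 Cause Analysis
The RuntimeError states: Illegal primitive: Primitive ScatterAdd's bprop not defined. This means that the backward propagation (bprop) of ScatterAdd is not defined, yet the script asks for the gradient of a network that contains ScatterAdd. Because the backward pass of this operator has not been implemented, the recommended replacement is ops.TensorScatterAdd, which does support backward computation. TensorScatterAdd performs the same computation as ops.ScatterNdAdd. ScatterNdAdd is an enhanced version of ScatterAdd but, like ScatterAdd, does not currently support backward computation. The difference between the two is that ScatterNdAdd updates the input x in place, whereas TensorScatterAdd returns the result as its output. The operators also take their arguments differently, so when switching to TensorScatterAdd, the indices and updates must be rebuilt to match that operator's requirements.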
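To make the difference in argument conventions concrete, here is a minimal pure-Python sketch of the two index conventions for a 2-D input. The helper names scatter_add_rows and tensor_scatter_add are hypothetical stand-ins written for illustration; they are not MindSpore APIs and only mimic the documented semantics on nested lists, using the input values that appear in the two scripts of this post.

```python
def scatter_add_rows(x, indices, updates):
    """ScatterAdd-style update on a 2-D list (illustrative, not MindSpore).

    Each entry of `indices` selects a whole row of `x`, and `updates` carries
    one full row per index entry. `x` is modified in place.
    """
    for i, index_row in enumerate(indices):
        for j, row in enumerate(index_row):
            for col, value in enumerate(updates[i][j]):
                x[row][col] += value
    return x


def tensor_scatter_add(x, indices, updates):
    """TensorScatterAdd-style update on a 2-D list (illustrative, not MindSpore).

    Each entry of `indices` is a full (row, col) coordinate and `updates`
    holds one scalar per coordinate. `x` is left untouched; a copy is returned.
    """
    out = [list(row) for row in x]
    for (row, col), value in zip(indices, updates):
        out[row][col] += value
    return out


# Inputs of the failing ScatterAdd script: the indices pick rows 0, 1, 1, 1.
x1 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
ones_row = [1.0, 1.0, 1.0]
scatter_add_rows(x1, [[0, 1], [1, 1]], [[ones_row, ones_row], [ones_row, ones_row]])
print(x1)  # [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]]

# Inputs of the TensorScatterAdd example: both updates land on element (0, 0).
x2 = [[-0.1, 0.3, 3.6], [0.4, 0.5, -3.2]]
y2 = tensor_scatter_add(x2, [[0, 0], [0, 0]], [1.0, 2.2])
print(x2[0][0])  # -0.1, the input itself is unchanged
```

With the same 2 x 2 indices array, the first convention reads four row numbers while the second reads two (row, col) coordinates, which is why the shape of updates has to change when the operator is swapped.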
3 Solution
Based on the cause above, the following example computes the backward pass with TensorScatterAdd:
import numpy as np
import mindspore
from mindspore import nn, ops, Tensor, Parameter


class Net(nn.Cell):
    def __init__(self, x):
        super(Net, self).__init__()
        self.x = Parameter(x, name="x")
        self.scatter_add = ops.TensorScatterAdd()

    def construct(self, indices, updates):
        # TensorScatterAdd returns the result as a new tensor instead of updating self.x
        output = self.scatter_add(self.x, indices, updates)
        return output


input_x = Tensor(np.array([[-0.1, 0.3, 3.6], [0.4, 0.5, -3.2]]), mindspore.float32)
# Each row of indices is a full (row, col) coordinate; updates holds one value per coordinate
indices = Tensor(np.array([[0, 0], [0, 0]]), mindspore.int32)
updates = Tensor(np.array([1.0, 2.2]), mindspore.float32)
grad_op = ops.GradOperation(get_all=True)
gradient_function = grad_op(Net(input_x))
g = gradient_function(indices, updates)
print('反向输出:', g)  # label means "backward output"
The script now runs successfully and prints the following (the label 反向输出 means "backward output"):
反向输出: (Tensor(shape=[2, 2], dtype=Int32, value=([[0, 0],[0, 0]]), Tensor(shape=[2], dtype=Float32, value= [ 1.00000000e+00, 1.00000000e+00]))
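The all-ones gradient for updates can be sanity-checked without MindSpore. The sketch below assumes that GradOperation, when no sensitivity is passed in, weights every output element by 1, so the quantity being differentiated is effectively the sum of the output. It reuses a hypothetical pure-Python tensor_scatter_add helper (not a MindSpore API) and estimates the gradient by central differences. The integer indices are not differentiable, which is consistent with the all-zero gradient reported for them.

```python
def tensor_scatter_add(x, indices, updates):
    """Illustrative 2-D TensorScatterAdd on nested lists (not MindSpore)."""
    out = [list(row) for row in x]
    for (row, col), value in zip(indices, updates):
        out[row][col] += value
    return out


def summed_output(updates):
    """Sum of all output elements, i.e. the output weighted by all-ones sensitivity."""
    x = [[-0.1, 0.3, 3.6], [0.4, 0.5, -3.2]]
    indices = [[0, 0], [0, 0]]
    return sum(sum(row) for row in tensor_scatter_add(x, indices, updates))


def numeric_grad(f, point, step=1e-3):
    """Central-difference estimate of the gradient of f at `point`."""
    grads = []
    for k in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[k] += step
        minus[k] -= step
        grads.append((f(plus) - f(minus)) / (2 * step))
    return grads


grad_updates = numeric_grad(summed_output, [1.0, 2.2])
print([round(g, 6) for g in grad_updates])  # [1.0, 1.0]
```

Each update value is added to exactly one output element, so its derivative with respect to the summed output is 1 regardless of where the indices point.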
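4 Summary
Steps for locating this kind of error:
1. Find the line of user code that raised the error: g = gradient_function(indices, updates).
2. Use the keywords in the error log to narrow the scope of the analysis: Illegal primitive: Primitive ScatterAdd's bprop not defined.
3. Based on that error message, ask a question in the MindSpore community or file an issue, so that developers can locate the root cause and provide a solution.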
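Step 2 can be partly automated when many logs need to be scanned. The following is a small sketch using only the Python standard library; the helper name primitive_without_bprop is hypothetical, and the pattern is based solely on the one error message shown in this post, so other MindSpore versions may word the message differently.

```python
import re

# Matches the message from this post; accepts a straight or a typographic apostrophe.
BPROP_ERROR = re.compile(r"Primitive (\w+)['’]s bprop not defined")


def primitive_without_bprop(message):
    """Return the operator name from a 'bprop not defined' message, or None."""
    match = BPROP_ERROR.search(message)
    return match.group(1) if match else None


log_line = (
    "RuntimeError: mindspore/ccsrc/pipeline/jit/validator.cc:73 ValidateOperation] "
    "Illegal primitive: Primitive ScatterAdd's bprop not defined."
)
print(primitive_without_bprop(log_line))  # ScatterAdd
```

The extracted operator name is then the term to search for in the API documentation or to quote when filing an issue.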
5 References
5.1 TensorScatterAdd operator API
5.2 ScatterAdd operator API
5.3 ScatterNdAdd operator API
[/quote]