The relationship between linear indices and logical indices in row-major multidimensional arrays


Please see the original article at zhuanlan.zhihu.com/p/302208545…

Proposition: Suppose an n-dimensional row-major array has shape $shape_0 \times shape_1 \times ... \times shape_{n-1}$, and a linear index $L$ satisfies

$$L = index_0 \times stride_0 + index_1 \times stride_1 + index_2 \times stride_2 + ... + index_{n-1} \times stride_{n-1}$$

where $0 \leq index_k < shape_k$ and $stride_k = \prod_{m=k+1}^{n-1} shape_m$. Then:

$$index_0 = \lfloor \frac{L}{stride_0} \rfloor$$
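As a quick sanity check of the claim, here is a minimal Python sketch that enumerates every logical index of a small array and confirms that floor division recovers $index_0$. The helper name `row_major_strides` and the example shape `(2, 3, 4)` are assumptions chosen for illustration and do not come from the original article; strides are counted in elements, so the last dimension has stride 1.

```python
from itertools import product


def row_major_strides(shape):
    """Element strides of a row-major array: stride_k = prod(shape[k+1:])."""
    strides = [1] * len(shape)
    for k in range(len(shape) - 2, -1, -1):
        strides[k] = strides[k + 1] * shape[k + 1]
    return strides


shape = (2, 3, 4)  # illustrative shape, not from the original article
strides = row_major_strides(shape)  # [12, 4, 1]

# Enumerate every valid logical index and check index_0 == L // stride_0.
for index in product(*(range(s) for s in shape)):
    L = sum(i * s for i, s in zip(index, strides))
    assert index[0] == L // strides[0]
```

A brute-force check like this does not replace the proof; it only shows the statement holding on one small shape.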

Proof:

Suppose $index_0 > \lfloor \frac{L}{stride_0} \rfloor$. Without loss of generality take the smallest such value, $index_0 = \lfloor \frac{L}{stride_0} \rfloor + 1$ (a larger $index_0$ only makes the product larger). Then $index_0 * stride_0 = (\lfloor \frac{L}{stride_0} \rfloor + 1) * stride_0 > L$. Because the remaining terms are all non-negative, this contradicts the premise $L = index_0 \times stride_0 + index_1 \times stride_1 + index_2 \times stride_2 + ... + index_{n-1} \times stride_{n-1}$, so the assumption does not hold.

Suppose $index_0 < \lfloor \frac{L}{stride_0} \rfloor$. Without loss of generality take the largest such value, $index_0 = \lfloor \frac{L}{stride_0} \rfloor - 1$ (a smaller $index_0$ only makes the sum smaller). Then:

$$\begin{aligned} & index_0 * stride_0 + index_1 * stride_1 + index_2 * stride_2 + ... + index_{n-1} * stride_{n-1} \\ &= (\lfloor \frac{L}{stride_0} \rfloor - 1) * stride_0 + index_1 * stride_1 + index_2 * stride_2 + ... + index_{n-1} * stride_{n-1} \end{aligned}$$

Since $(\lfloor \frac{L}{stride_0} \rfloor - 1) * stride_0 \leq L - stride_0$, the expression above satisfies:

$$\begin{aligned} & (\lfloor \frac{L}{stride_0} \rfloor - 1) * stride_0 + index_1 * stride_1 + index_2 * stride_2 + ... + index_{n-1} * stride_{n-1} \\ &\leq L - stride_0 + index_1 * stride_1 + index_2 * stride_2 + ... + index_{n-1} * stride_{n-1} \end{aligned}$$

Suppose $index_1, ..., index_{n-1}$ all take their maximum values, that is, $index_k = shape_k - 1$. For the k-th dimension, since $stride_k = \prod_{m=k+1}^{n-1} shape_m$, we have:

$$\begin{aligned} (shape_k - 1) stride_k &= (shape_k - 1) \prod_{m=k+1}^{n-1} shape_m = \prod_{m=k}^{n-1} shape_m - \prod_{m=k+1}^{n-1} shape_m, \quad k \leq n-2 \\ (shape_k - 1) stride_k &= shape_{n-1} - 1, \quad k = n-1 \end{aligned}$$

Summing the maximum index times the stride over the second dimension and every dimension after it, the sum telescopes:

$$\begin{aligned} \sum_{k=1}^{n-1} (shape_k - 1) stride_k &= \sum_{k=1}^{n-2} (\prod_{m=k}^{n-1} shape_m - \prod_{m=k+1}^{n-1} shape_m) + shape_{n-1} - 1 \\ &= \prod_{m=1}^{n-1} shape_m - 1 < stride_0 \end{aligned}$$
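The bound above can be checked numerically. The following is a minimal Python sketch, assuming an illustrative shape `(2, 3, 4)` and a hypothetical helper `max_tail_offset` (neither appears in the original article); it confirms that the largest offset contributed by dimensions 1 through n-1 is exactly $stride_0 - 1$.

```python
from math import prod


def max_tail_offset(shape):
    """Sum of (shape_k - 1) * stride_k over dimensions k = 1 .. n-1."""
    n = len(shape)
    total = 0
    for k in range(1, n):
        stride_k = prod(shape[k + 1:])  # empty product is 1 for the last dim
        total += (shape[k] - 1) * stride_k
    return total


shape = (2, 3, 4)  # illustrative shape
stride_0 = prod(shape[1:])  # 12
assert max_tail_offset(shape) == stride_0 - 1  # 2*4 + 3*1 = 11
```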

Therefore

$$L - stride_0 + index_1 * stride_1 + index_2 * stride_2 + ... + index_{n-1} * stride_{n-1} \leq L - stride_0 + \prod_{m=1}^{n-1} shape_m - 1 < L - stride_0 + stride_0 = L$$

That is:

$$\begin{aligned} & (\lfloor \frac{L}{stride_0} \rfloor - 1) * stride_0 + index_1 * stride_1 + index_2 * stride_2 + ... + index_{n-1} * stride_{n-1} \\ &\leq L - stride_0 + index_1 * stride_1 + index_2 * stride_2 + ... + index_{n-1} * stride_{n-1} < L \end{aligned}$$

This contradicts the premise, so the assumption does not hold.

This proves the proposition.

With this proposition we can recover the logical index from the linear index. First obtain $index_0$ as $\lfloor \frac{L}{stride_0} \rfloor$, which gives:

$$L = \lfloor \frac{L}{stride_0} \rfloor \times stride_0 + index_1 \times stride_1 + index_2 \times stride_2 + ... + index_{n-1} \times stride_{n-1}$$

Next, let $offset = L - \lfloor \frac{L}{stride_0} \rfloor \times stride_0 = L \% stride_0$. Then:

$$offset = index_1 \times stride_1 + index_2 \times stride_2 + ... + index_{n-1} \times stride_{n-1}$$
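As a worked instance of this first step (the shape $2 \times 3 \times 4$ and the value $L = 17$ are assumed here purely for illustration), the strides are $stride_0 = 12$, $stride_1 = 4$, $stride_2 = 1$, and:

$$\begin{aligned} index_0 &= \lfloor \frac{17}{12} \rfloor = 1 \\ offset &= 17 \% 12 = 5 = index_1 \times 4 + index_2 \times 1 \end{aligned}$$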

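Applying the same method to $offset$, we can obtain $index_1, index_2, ..., index_{n-1}$ in turn. This is especially useful when writing CUDA operators such as cat: each thread computes from its tid the logical index of the element it is responsible for, and then performs the corresponding copy.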
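The repeated divide-and-remainder procedure can be sketched in a few lines. This is a minimal Python illustration rather than the author's implementation: the function name `linear_to_logical` and the example values are hypothetical, and strides are assumed to be in elements for a contiguous row-major layout.

```python
def linear_to_logical(L, shape):
    """Convert a row-major linear index L into a logical index tuple."""
    total = 1
    for s in shape:
        total *= s
    if not 0 <= L < total:
        raise IndexError("linear index out of range")

    index = []
    stride = total
    offset = L
    for s in shape:
        stride //= s                    # stride_k = prod(shape[k+1:])
        index.append(offset // stride)  # index_k = floor(offset / stride_k)
        offset %= stride                # the remainder goes to later dims
    return tuple(index)


print(linear_to_logical(17, (2, 3, 4)))  # (1, 1, 1)
```

In a CUDA kernel the same loop would run once per thread with `L` taken from the thread's global id; how that id is derived depends on the launch configuration and is not shown here.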