This document is relevant for: Inf2, Trn1, Trn2
nki.language.matmul
- nki.language.matmul(x, y, *, transpose_x=False, mask=None, **kwargs)
Computes x @ y, the matrix multiplication of x and y (similar to numpy.matmul).
Note
For optimal performance on hardware, use nki.isa.nc_matmul() or call nki.language.matmul with transpose_x=True. Also use nki.isa.nc_matmul to access low-level features of the Tensor Engine.
Note
Implementation details: nki.language.matmul calls nki.isa.nc_matmul under the hood. nc_matmul is a Neuron-specific implementation of matmul that computes x.T @ y; as a result, matmul(x, y) lowers to nc_matmul(transpose(x), y). To avoid inserting this extra transpose instruction, pass x.T as the first input and set transpose_x=True.
- Parameters:
x – a tile on SBUF (partition dimension <= 128, free dimension <= 128); x’s free dimension must match y’s partition dimension.
y – a tile on SBUF (partition dimension <= 128, free dimension <= 512).
transpose_x – defaults to False. If True, x is treated as already transposed. If False, an additional transpose is inserted to make x’s partition dimension the contraction dimension of the matmul, as the Tensor Engine requires.
mask – (optional) a compile-time constant predicate that controls whether/how this instruction is executed (see NKI API Masking for details).
- Returns:
x @ y, or x.T @ y if transpose_x=True
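The relationship between the two call forms can be sketched in plain Python. This is a reference model only, under the assumption that lists of lists stand in for SBUF tiles: `shape`, `transpose`, and `matmul_reference` are hypothetical helpers written for this illustration, not part of NKI, and nothing here runs on the Tensor Engine. The size limits are the ones listed under Parameters.

```python
# Reference model only: plain Python lists stand in for SBUF tiles.
# The limits below are copied from the Parameters list of this page.
MAX_PARTITION = 128  # partition dimension limit for x and y
MAX_FREE_X = 128     # free dimension limit for x
MAX_FREE_Y = 512     # free dimension limit for y


def shape(tile):
    """Return (partition, free) for a rectangular list-of-lists tile."""
    return len(tile), len(tile[0])


def transpose(tile):
    """Swap the partition and free dimensions of a tile."""
    return [list(col) for col in zip(*tile)]


def matmul_reference(x, y, transpose_x=False):
    """Hypothetical helper that models the documented result, not the kernel."""
    x_p, x_f = shape(x)
    y_p, y_f = shape(y)
    if x_p > MAX_PARTITION or x_f > MAX_FREE_X:
        raise ValueError("x exceeds the documented tile limits")
    if y_p > MAX_PARTITION or y_f > MAX_FREE_Y:
        raise ValueError("y exceeds the documented tile limits")
    # With transpose_x=True, x is treated as already transposed, so the
    # result is x.T @ y; otherwise it is x @ y.
    lhs = transpose(x) if transpose_x else x
    if len(lhs[0]) != y_p:
        raise ValueError("contraction dimension does not match y's partition dimension")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in lhs]


a = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
b = [[1, 0],
     [0, 1],
     [1, 1]]             # 3 x 2

plain = matmul_reference(a, b)                                      # a @ b
pre_transposed = matmul_reference(transpose(a), b, transpose_x=True)  # (a.T).T @ b

print(plain)             # [[4, 5], [10, 11]]
print(pre_transposed)    # [[4, 5], [10, 11]]
```

Both calls produce the same product. According to the implementation note, the difference on hardware is that the second form lets the caller supply x.T directly, so no extra transpose instruction needs to be inserted.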