Functions to accelerate InnerProduct16bLayer in Synet Framework.
Functions
| SIMD_API void * | SimdSynetInnerProduct16bInit (size_t M, size_t N, size_t K, SimdTensorDataType typeA, SimdTensorDataType typeB, SimdTensorDataType typeC, SimdBool transB, SimdBool constB, SimdBool bias, SimdConvolutionActivationType activation) |
| | Initializes a BF16/FP32 inner-product (matrix multiplication) context. |
| SIMD_API size_t | SimdSynetInnerProduct16bInternalBufferSize (const void *context) |
| | Gets the size in bytes of internal storage used by a BF16 inner-product context. |
| SIMD_API size_t | SimdSynetInnerProduct16bExternalBufferSize (const void *context) |
| | Gets the size in bytes of caller-provided temporary buffer for BF16 inner product. |
| SIMD_API const char * | SimdSynetInnerProduct16bInfo (const void *context) |
| | Gets a short description of the selected BF16 inner-product implementation. |
| SIMD_API void | SimdSynetInnerProduct16bSetParams (void *context, const float *weight, const float *bias, const float *params) |
| | Sets weights, bias and activation parameters for BF16 inner product. |
| SIMD_API void | SimdSynetInnerProduct16bForward (void *context, const uint8_t *A, const uint8_t *B, uint8_t *buf, uint8_t *C) |
| | Performs BF16/FP32 inner-product forward propagation. |
Detailed Description
Functions to accelerate InnerProduct16bLayer in Synet Framework.
Function Documentation
◆ SimdSynetInnerProduct16bInit()
void * SimdSynetInnerProduct16bInit (size_t M, size_t N, size_t K, SimdTensorDataType typeA, SimdTensorDataType typeB, SimdTensorDataType typeC, SimdBool transB, SimdBool constB, SimdBool bias, SimdConvolutionActivationType activation)
Initializes a BF16/FP32 inner-product (matrix multiplication) context.
The context computes C = A*B with FP32 accumulation, optionally adds bias and applies activation. A, B and C can be FP32 or BF16 according to typeA, typeB and typeC:
    for(i = 0; i < M; ++i)
        for(j = 0; j < N; ++j)
        {
            sum = bias ? bias[j] : 0;
            for(k = 0; k < K; ++k)
                sum += A[i, k] * (transB ? B[j, k] : B[k, j]);
            C[i, j] = ConvertToTypeC(Activate(sum, activation, params));
        }
When constB is SimdTrue, matrix B must be supplied to SimdSynetInnerProduct16bSetParams and is converted or reordered into internal storage.
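The pseudocode above can be written out as a plain scalar reference, which is handy for checking results on small matrices. This is only a sketch under stated assumptions: it works on FP32 data throughout (no BF16 conversion), it assumes dense row-major storage, and the names `inner_product_ref` and `relu` are hypothetical helpers rather than part of the Simd API. ReLU stands in for the generic `Activate(sum, activation, params)` step.

```c
#include <stddef.h>

/* Hypothetical helper: ReLU used here in place of Activate(sum, activation, params). */
static float relu(float x)
{
    return x > 0.0f ? x : 0.0f;
}

/* Scalar FP32 reference for C = relu(A * B + bias).
   A is M x K, C is M x N, both row-major (assumption).
   B is read as N x K when transB is non-zero, otherwise as K x N.
   bias may be NULL, in which case every sum starts at zero. */
static void inner_product_ref(size_t M, size_t N, size_t K,
    const float *A, const float *B, int transB,
    const float *bias, float *C)
{
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            float sum = bias ? bias[j] : 0.0f;
            for (size_t k = 0; k < K; ++k)
                sum += A[i * K + k] * (transB ? B[j * K + k] : B[k * N + j]);
            C[i * N + j] = relu(sum);
        }
    }
}
```

Calling it once with a K*N matrix and `transB = 0`, and once with the transposed N*K copy and `transB = 1`, should give the same C, which is a quick way to confirm the meaning of the transB flag.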
- Parameters
- [in] M - the height of matrices A and C.
- [in] N - the width of matrices B and C.
- [in] K - the width of matrix A and the height of matrix B.
- [in] typeA - the type of matrix A. It can be FP32 or BF16.
- [in] typeB - the type of matrix B. It can be FP32 or BF16.
- [in] typeC - the type of matrix C. It can be FP32 or BF16.
- [in] transB - a flag indicating that B is stored as N*K instead of K*N.
- [in] constB - a flag indicating that matrix B is constant and can be set once.
- [in] bias - a flag to add bias to output matrix C.
- [in] activation - the type of activation function applied after the inner product.
- Returns
- a pointer to the BF16 inner product context, or NULL on error. The context must be released with function SimdRelease. This pointer is used in functions SimdSynetInnerProduct16bInternalBufferSize, SimdSynetInnerProduct16bExternalBufferSize, SimdSynetInnerProduct16bInfo, SimdSynetInnerProduct16bSetParams and SimdSynetInnerProduct16bForward.
◆ SimdSynetInnerProduct16bInternalBufferSize()
size_t SimdSynetInnerProduct16bInternalBufferSize (const void *context)
Gets the size in bytes of internal storage used by a BF16 inner-product context.
The returned value reports internal temporary storage, reordered constant weights, copied bias and copied activation parameters.
- Parameters
- [in] context - a pointer to BF16 inner product context. It must be created by function SimdSynetInnerProduct16bInit and released by function SimdRelease.
- Returns
- the number of bytes used by internal buffers.
◆ SimdSynetInnerProduct16bExternalBufferSize()
size_t SimdSynetInnerProduct16bExternalBufferSize (const void *context)
Gets the size in bytes of caller-provided temporary buffer for BF16 inner product.
The returned value depends on matrix types and implementation. It covers temporary BF16 copies of FP32 inputs, packed non-constant B matrices, FP32 accumulation buffers and optional post-processing buffers. It can be used to allocate the buf argument of SimdSynetInnerProduct16bForward.
- Parameters
- [in] context - a pointer to BF16 inner product context. It must be created by function SimdSynetInnerProduct16bInit and released by function SimdRelease.
- Returns
- the number of bytes required for the external temporary buffer.
◆ SimdSynetInnerProduct16bInfo()
const char * SimdSynetInnerProduct16bInfo (const void *context)
Gets a short description of the selected BF16 inner-product implementation.
The returned string contains the implementation extension, algorithm name and parameter summary. The returned pointer is owned by the context and remains valid until the next call of this function for the same context or until the context is released.
- Parameters
- [in] context - a pointer to BF16 inner product context. It must be created by function SimdSynetInnerProduct16bInit and released by function SimdRelease.
- Returns
- a string describing the internal implementation of the BF16 inner product algorithm.
◆ SimdSynetInnerProduct16bSetParams()
void SimdSynetInnerProduct16bSetParams (void *context, const float *weight, const float *bias, const float *params)
Sets weights, bias and activation parameters for BF16 inner product.
This function must be called before SimdSynetInnerProduct16bForward. If constB was SimdTrue during initialization, weight provides matrix B in FP32 form and the implementation converts it to BF16 and may reorder it into internal storage. Bias is copied to an internal FP32 array; when bias is NULL, zeros are used. Activation parameters are copied or expanded to the internal FP32 array according to SimdConvolutionActivationType.
- Parameters
- [in,out] context - a pointer to BF16 inner product context. It must be created by function SimdSynetInnerProduct16bInit and released by function SimdRelease.
- [in] weight - a pointer to FP32 matrix B weights. Can be NULL only when B is not constant.
- [in] bias - a pointer to FP32 bias array with N elements. Can be NULL.
- [in] params - a pointer to FP32 parameters of activation function (see SimdConvolutionActivationType). Can be NULL when activation does not require parameters.
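To make the FP32-to-BF16 weight conversion and the NULL-bias behaviour described above more concrete, here is a minimal sketch in plain C. It is not the library's code. BF16 is taken to be the upper 16 bits of an IEEE 754 binary32 value; the round-to-nearest-even rounding mode is an assumption made for illustration, and the helper names (`float_to_bf16`, `bf16_to_float`, `weights_to_bf16`, `bias_or_zeros`) are hypothetical. Any reordering of weights into an implementation-specific layout is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FP32 -> BF16 with round-to-nearest-even on the 16 discarded mantissa bits
   (rounding mode is an assumption of this sketch). */
static uint16_t float_to_bf16(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof(bits));
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)          /* NaN: keep it a (quiet) NaN */
        return (uint16_t)((bits >> 16) | 0x0040u);
    bits += 0x7FFFu + ((bits >> 16) & 1u);           /* ties go to the even result */
    return (uint16_t)(bits >> 16);
}

/* BF16 -> FP32 is exact: the 16 stored bits become the upper half of the float. */
static float bf16_to_float(uint16_t value)
{
    uint32_t bits = (uint32_t)value << 16;
    float result;
    memcpy(&result, &bits, sizeof(result));
    return result;
}

/* Hypothetical helper: convert a dense FP32 weight array to BF16 element by element. */
static void weights_to_bf16(const float *src, size_t count, uint16_t *dst)
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = float_to_bf16(src[i]);
}

/* Hypothetical helper: copy bias to an FP32 array, or fill it with zeros when bias is NULL. */
static void bias_or_zeros(const float *bias, size_t N, float *dst)
{
    for (size_t j = 0; j < N; ++j)
        dst[j] = bias ? bias[j] : 0.0f;
}
```

Because BF16 keeps only 8 bits of significand precision, converting a weight to BF16 and back with `bf16_to_float` generally changes its value slightly; comparing the round-tripped weights with the originals gives a rough idea of the quantization error to expect.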
◆ SimdSynetInnerProduct16bForward()
void SimdSynetInnerProduct16bForward (void *context, const uint8_t *A, const uint8_t *B, uint8_t *buf, uint8_t *C)
Performs BF16/FP32 inner-product forward propagation.
The function converts FP32 A or B inputs to BF16 when requested by the context, uses BF16 inputs directly otherwise, accumulates the matrix product in FP32, adds bias, applies activation and writes FP32 or BF16 output according to typeC.
- Parameters
- [in] context - a pointer to BF16 inner product context. It must be created by function SimdSynetInnerProduct16bInit and released by function SimdRelease.
- [in] A - a pointer to A matrix. Actual element type is defined by typeA in initialization.
- [in] B - a pointer to B matrix. Can be NULL if B is constant; in that case B must be set by function SimdSynetInnerProduct16bSetParams. Actual element type is defined by typeB in initialization for non-constant B.
- [out] buf - a pointer to external temporary byte buffer. The required size is determined by function SimdSynetInnerProduct16bExternalBufferSize. Can be NULL, in which case an internal buffer is used.
- [out] C - a pointer to output matrix. Actual element type is defined by typeC in initialization.
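Since A, B and C are passed as byte pointers while their element type is chosen at initialization, the caller has to size each matrix in bytes. The following sketch shows that arithmetic under the assumption of dense storage with no row padding. `ElemType`, `elem_size` and `matrix_bytes` are hypothetical names used only for illustration; they stand in for SimdTensorDataType and are not part of the Simd API.

```c
#include <stddef.h>

/* Hypothetical stand-in for the FP32 / BF16 choices of typeA, typeB and typeC. */
typedef enum { ELEM_FP32, ELEM_BF16 } ElemType;

/* FP32 elements occupy 4 bytes, BF16 elements occupy 2 bytes. */
static size_t elem_size(ElemType type)
{
    return type == ELEM_FP32 ? 4 : 2;
}

/* Byte size of a dense rows x cols matrix (assumes no padding between rows). */
static size_t matrix_bytes(size_t rows, size_t cols, ElemType type)
{
    return rows * cols * elem_size(type);
}
```

With these helpers, A would need `matrix_bytes(M, K, typeA)` bytes, a non-constant B `matrix_bytes(K, N, typeB)` bytes, and C `matrix_bytes(M, N, typeC)` bytes; the B size is the same whether it is stored as K*N or N*K.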