Functions to accelerate InnerProductLayer in Synet Framework.
Functions
| SIMD_API void * | SimdSynetInnerProduct32fInit (size_t M, size_t N, size_t K, SimdBool transB, SimdBool constB, SimdBool bias, SimdConvolutionActivationType activation) |
| Initializes an FP32 inner-product (matrix multiplication) context. |
| SIMD_API size_t | SimdSynetInnerProduct32fInternalBufferSize (const void *context) |
| Gets the size of internal storage used by an FP32 inner-product context. |
| SIMD_API size_t | SimdSynetInnerProduct32fExternalBufferSize (const void *context) |
| Gets the size of caller-provided temporary buffer for FP32 inner product. |
| SIMD_API void | SimdSynetInnerProduct32fSetParams (void *context, const float *weight, SimdBool *internal, const float *bias, const float *params) |
| Sets weights, bias and activation parameters for FP32 inner product. |
| SIMD_API void | SimdSynetInnerProduct32fForward (void *context, const float *A, const float *B, float *buf, float *C) |
| Performs FP32 inner-product forward propagation. |
| SIMD_API void | SimdSynetInnerProductLayerForward (const float *src, const float *weight, const float *bias, size_t count, size_t size, float *dst) |
| Performs FP32 forward propagation of a single inner-product layer. |
| SIMD_API void | SimdSynetInnerProduct8i (size_t M, size_t N, size_t K, const uint8_t *src, const int8_t *weight, int32_t *dst, SimdSynetCompatibilityType compatibility) |
| Performs UINT8-by-INT8 inner product with INT32 output. |
Detailed Description
Functions to accelerate InnerProductLayer in Synet Framework.
Function Documentation
◆ SimdSynetInnerProduct32fInit()
void * SimdSynetInnerProduct32fInit(size_t M, size_t N, size_t K, SimdBool transB, SimdBool constB, SimdBool bias, SimdConvolutionActivationType activation)
Initializes an FP32 inner-product (matrix multiplication) context.
The context computes C = A*B, optionally adds bias and applies activation:
for(i = 0; i < M; ++i)
    for(j = 0; j < N; ++j)
    {
        sum = bias ? bias[j] : 0;
        for(k = 0; k < K; ++k)
            sum += A[i*K + k] * (transB ? B[j*K + k] : B[k*N + j]);
        C[i*N + j] = Activate(sum, activation, params);
    }
When constB is SimdTrue, matrix B must be supplied to SimdSynetInnerProduct32fSetParams and can be reordered or cached inside the context.
- Parameters
-
[in] M - the height of matrices A and C.
[in] N - the width of matrices B and C.
[in] K - the width of matrix A and the height of matrix B.
[in] transB - a flag indicating that B is stored as N*K instead of K*N.
[in] constB - a flag indicating that matrix B is constant and can be set once.
[in] bias - a flag to add bias to output matrix C.
[in] activation - the activation function type applied after the inner product.
- Returns
- a pointer to FP32 inner product context, or NULL on error. The context must be released with function SimdRelease. This pointer is used in functions SimdSynetInnerProduct32fInternalBufferSize, SimdSynetInnerProduct32fExternalBufferSize, SimdSynetInnerProduct32fSetParams and SimdSynetInnerProduct32fForward.
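The computation described by the loop above can be written as a plain scalar reference, which is handy for checking the output of an accelerated context. This is a minimal sketch, not the library's implementation: the names `RefActivation`, `ref_activate` and `ref_inner_product_32f` are hypothetical, and only identity and ReLU activations are modelled here, while SimdConvolutionActivationType covers more variants.

```c
#include <stddef.h>

/* Hypothetical activation selector for this sketch only. */
typedef enum { ACT_IDENTITY, ACT_RELU } RefActivation;

static float ref_activate(float x, RefActivation act)
{
    return (act == ACT_RELU && x < 0.0f) ? 0.0f : x;
}

/* Scalar reference: C[M x N] = act(A[M x K] * B + bias).
   B is read as N x K when transB is nonzero, otherwise as K x N.
   bias may be NULL, in which case no bias is added. */
static void ref_inner_product_32f(size_t M, size_t N, size_t K, int transB,
    const float *A, const float *B, const float *bias,
    RefActivation act, float *C)
{
    for (size_t i = 0; i < M; ++i)
        for (size_t j = 0; j < N; ++j)
        {
            float sum = bias ? bias[j] : 0.0f;
            for (size_t k = 0; k < K; ++k)
                sum += A[i * K + k] * (transB ? B[j * K + k] : B[k * N + j]);
            C[i * N + j] = ref_activate(sum, act);
        }
}
```

An accelerated implementation may sum in a different order, so a comparison against this reference should normally allow a small floating-point tolerance.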
◆ SimdSynetInnerProduct32fInternalBufferSize()
size_t SimdSynetInnerProduct32fInternalBufferSize(const void *context)
Gets the size of internal storage used by an FP32 inner-product context.
The returned value is a number of FP32 elements. It reports implementation-specific storage such as reordered constant weights and copied bias.
- Parameters
-
[in] context - a pointer to FP32 inner product context. It must be created by function SimdSynetInnerProduct32fInit and released by function SimdRelease.
- Returns
- a number of FP32 elements used by internal buffers.
◆ SimdSynetInnerProduct32fExternalBufferSize()
size_t SimdSynetInnerProduct32fExternalBufferSize(const void *context)
Gets the size of caller-provided temporary buffer for FP32 inner product.
The returned value is a number of FP32 elements. The current FP32 implementations do not require an external buffer and return 0, but callers can use this value when allocating the buf argument of SimdSynetInnerProduct32fForward.
- Parameters
-
[in] context - a pointer to FP32 inner product context. It must be created by function SimdSynetInnerProduct32fInit and released by function SimdRelease.
- Returns
- a number of FP32 elements required for external temporary buffer.
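Because the size is reported in FP32 elements rather than bytes, the caller has to multiply by sizeof(float) before allocating. The helper below is a minimal sketch of that conversion; `alloc_fp32_buffer` is a hypothetical name and not part of the library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocates a temporary buffer for `elements` FP32 values.
   Returns NULL when no buffer is needed (elements == 0) or when the
   allocation fails; *ok is 1 in the first case and 0 in the second. */
static float *alloc_fp32_buffer(size_t elements, int *ok)
{
    *ok = 1;
    if (elements == 0)
        return NULL;                 /* nothing to allocate */
    if (elements > SIZE_MAX / sizeof(float))
    {
        *ok = 0;                     /* byte count would overflow size_t */
        return NULL;
    }
    float *buf = (float *)malloc(elements * sizeof(float));
    if (!buf)
        *ok = 0;
    return buf;
}
```

In use, `elements` would be the value returned by SimdSynetInnerProduct32fExternalBufferSize, and the resulting pointer would be passed as the buf argument of SimdSynetInnerProduct32fForward, which accepts NULL.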
◆ SimdSynetInnerProduct32fSetParams()
void SimdSynetInnerProduct32fSetParams(void *context, const float *weight, SimdBool *internal, const float *bias, const float *params)
Sets weights, bias and activation parameters for FP32 inner product.
This function must be called before SimdSynetInnerProduct32fForward. If constB was SimdTrue during initialization, weight provides matrix B and the implementation may reorder and store it internally. If internal is not NULL, SimdTrue means the weights were copied/reordered into the context; SimdFalse means the original weight pointer can be used by later forward calls and must remain valid. Bias and activation parameters are stored or referenced according to the selected implementation.
- Parameters
-
[in,out] context - a pointer to FP32 inner product context. It must be created by function SimdSynetInnerProduct32fInit and released by function SimdRelease.
[in] weight - a pointer to FP32 matrix B weights.
[out] internal - a pointer to a flag receiving weight storage mode. Can be NULL.
[in] bias - a pointer to FP32 bias array with N elements. Can be NULL.
[in] params - a pointer to FP32 parameters of activation function (see SimdConvolutionActivationType). Can be NULL when activation does not require parameters.
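The meaning of the internal flag for the lifetime of the caller's weight array can be illustrated with a toy model. Everything in this sketch is hypothetical (`RefContext`, `ref_set_weights`, and the explicit `copy` argument); the real context decides on its own whether to copy or reorder weights, and its layout is not documented here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a context; not the library's layout. */
typedef struct
{
    const float *weight; /* weights read by later forward calls */
    float *owned;        /* non-NULL when the context holds its own copy */
} RefContext;

/* Stores n weights (n > 0). When copy is nonzero the data is duplicated
   and *internal is set to 1, so the caller may free its array. Otherwise
   the caller's pointer is kept, *internal is set to 0, and the array must
   stay valid. internal may be NULL. Returns 0 on success, -1 on failure. */
static int ref_set_weights(RefContext *ctx, const float *weight, size_t n,
    int copy, int *internal)
{
    free(ctx->owned);
    ctx->owned = NULL;
    if (copy)
    {
        ctx->owned = (float *)malloc(n * sizeof(float));
        if (!ctx->owned)
            return -1;
        memcpy(ctx->owned, weight, n * sizeof(float));
        ctx->weight = ctx->owned;
    }
    else
        ctx->weight = weight;
    if (internal)
        *internal = copy ? 1 : 0;
    return 0;
}
```

A caller that passes NULL for internal cannot tell which mode was chosen, so in that case the safe assumption is that the weight array must outlive the context.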
◆ SimdSynetInnerProduct32fForward()
void SimdSynetInnerProduct32fForward(void *context, const float *A, const float *B, float *buf, float *C)
Performs FP32 inner-product forward propagation.
- Parameters
-
[in] context - a pointer to FP32 inner product context. It must be created by function SimdSynetInnerProduct32fInit and released by function SimdRelease.
[in] A - a pointer to FP32 A matrix with M*K elements.
[in] B - a pointer to FP32 B matrix. Can be NULL if B is constant; in that case B must be set by function SimdSynetInnerProduct32fSetParams.
[out] buf - a pointer to external temporary FP32 buffer. The required number of elements is returned by function SimdSynetInnerProduct32fExternalBufferSize. Can be NULL, in which case an internal buffer is used.
[out] C - a pointer to FP32 output matrix with M*N elements.
◆ SimdSynetInnerProductLayerForward()
void SimdSynetInnerProductLayerForward(const float *src, const float *weight, const float *bias, size_t count, size_t size, float *dst)
Performs FP32 forward propagation of a single inner-product layer.
Algorithm's details:
for(i = 0; i < count; ++i)
{
    dst[i] = (bias ? bias[i] : 0);
    for(j = 0; j < size; ++j)
        dst[i] += src[j]*weight[i*size + j];
}
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the input FP32 array with size elements.
[in] weight - a pointer to FP32 weight coefficients with count*size elements, stored as count rows.
[in] bias - a pointer to FP32 bias coefficients with count elements. Can be NULL.
[in] count - a number of output elements.
[in] size - a number of input elements.
[out] dst - a pointer to the output FP32 array with count elements.
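As a worked example of the algorithm above, the following scalar version mirrors the documented loop. It is a sketch for checking results, not the library's code, and the name `ref_inner_product_layer_forward` is hypothetical.

```c
#include <stddef.h>

/* Scalar reference: dst[i] = bias[i] + dot(src, row i of weight).
   weight holds count rows of size elements each; bias may be NULL. */
static void ref_inner_product_layer_forward(const float *src,
    const float *weight, const float *bias,
    size_t count, size_t size, float *dst)
{
    for (size_t i = 0; i < count; ++i)
    {
        float sum = bias ? bias[i] : 0.0f;
        for (size_t j = 0; j < size; ++j)
            sum += src[j] * weight[i * size + j];
        dst[i] = sum;
    }
}
```

With src = {1, 2, 3}, weight rows {1, 0, 0} and {0, 1, 1}, and bias = {0.5, -1}, the outputs are 1 + 0.5 = 1.5 and (2 + 3) - 1 = 4.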
◆ SimdSynetInnerProduct8i()
void SimdSynetInnerProduct8i(size_t M, size_t N, size_t K, const uint8_t *src, const int8_t *weight, int32_t *dst, SimdSynetCompatibilityType compatibility)
Performs UINT8-by-INT8 inner product with INT32 output.
Algorithm's details:
for (i = 0; i < M; ++i)
{
    for (j = 0; j < N; ++j)
    {
        sum = 0;
        for (k = 0; k < K; ++k)
            sum += (int)src[i*K + k] * (int)weight[j*K + k];
        dst[i*N + j] = sum;
    }
}
When compatibility flags allow overflow-compatible multiplication, adjacent products can be accumulated with 16-bit saturation before being added to the INT32 sum. Use SimdSynetCompatibility8iPrecise to request the precise product accumulation path.
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] M - a batch size, or a number of input rows.
[in] N - an output size, or a number of weight rows.
[in] K - an input size, or a row length.
[in] src - a pointer to the UINT8 input matrix with M*K elements.
[in] weight - a pointer to the INT8 weight matrix with N*K elements, stored by output row.
[out] dst - a pointer to the INT32 output matrix with M*N elements.
[in] compatibility - calculation compatibility flags (see SimdSynetCompatibilityType).
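The difference between the precise path and an overflow-compatible path can be seen on a single row. The sketch below assumes that "adjacent products" means pairs (k, k + 1) whose sum is clamped to the INT16 range before being added to the INT32 accumulator; that pairing is an assumption made for illustration, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Precise dot product of one UINT8 row with one INT8 row. */
static int32_t dot_precise(const uint8_t *src, const int8_t *weight, size_t K)
{
    int32_t sum = 0;
    for (size_t k = 0; k < K; ++k)
        sum += (int32_t)src[k] * (int32_t)weight[k];
    return sum;
}

static int32_t saturate_i16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return v;
}

/* Assumed overflow-compatible path: products are summed in adjacent
   pairs, each pair sum is clamped to INT16, then added to INT32. */
static int32_t dot_pair_saturated(const uint8_t *src, const int8_t *weight, size_t K)
{
    int32_t sum = 0;
    size_t k = 0;
    for (; k + 1 < K; k += 2)
    {
        int32_t pair = (int32_t)src[k] * weight[k]
                     + (int32_t)src[k + 1] * weight[k + 1];
        sum += saturate_i16(pair);
    }
    if (k < K) /* odd tail: a single product always fits in INT16 */
        sum += (int32_t)src[k] * weight[k];
    return sum;
}
```

For src = {255, 255} and weight = {127, 127} the precise result is 2 * 255 * 127 = 64770, while the pair-saturated result is clamped to 32767. For small values such as src = {10, 20, 30} and weight = {1, -2, 3}, both paths give 60.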