Simd Library Documentation.

FP32 merged convolution framework

A framework to accelerate FP32 merged convolution in the Synet Framework.

Functions

SIMD_API void * SimdSynetMergedConvolution32fInit (size_t batch, const SimdConvolutionParameters *convs, size_t count, SimdBool add)
 Initializes an FP32 merged convolution context.
 
SIMD_API size_t SimdSynetMergedConvolution32fExternalBufferSize (const void *context)
 Gets the size of the optional external temporary buffer for FP32 merged convolution.
 
SIMD_API size_t SimdSynetMergedConvolution32fInternalBufferSize (const void *context)
 Gets the size of internal storage used by an FP32 merged convolution context.
 
SIMD_API const char * SimdSynetMergedConvolution32fInfo (const void *context)
 Gets a textual description of the selected FP32 merged convolution implementation.
 
SIMD_API void SimdSynetMergedConvolution32fSetParams (void *context, const float *const *weight, SimdBool *internal, const float *const *bias, const float *const *params)
 Sets weights, biases and activation parameters for FP32 merged convolution.
 
SIMD_API void SimdSynetMergedConvolution32fForward (void *context, const float *src, float *buf, float *dst)
 Performs forward propagation through the fused FP32 convolution sequence.
 

Detailed Description

A framework to accelerate FP32 merged convolution in the Synet Framework.

Function Documentation

◆ SimdSynetMergedConvolution32fInit()

void * SimdSynetMergedConvolution32fInit ( size_t  batch,
const SimdConvolutionParameters *  convs,
size_t  count,
SimdBool  add 
)

Initializes an FP32 merged convolution context.

The context fuses a sequence of two or three NHWC convolutions into one forward call: convolution + depthwise convolution, depthwise convolution + convolution, or convolution + depthwise convolution + convolution. The first and last tensors must be FP32. Supported kernels are 1x1 or 3x3 for ordinary convolutions and 3x3, 5x5 or 7x7 for depthwise convolutions; dilation must be 1 and stride must be 1, 2 or 3. If add is SimdTrue for a three-convolution sequence, the source tensor is added to the final output, so it must have the same shape as the final destination tensor.
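The constraints above can be checked before calling the initializer. The following is a minimal sketch, not the library's own validation: ConvDesc is a hypothetical, trimmed-down stand-in for SimdConvolutionParameters, square kernels and equal strides in both directions are assumed, and a convolution is assumed to be depthwise when group, srcC and dstC are all equal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SimdConvolutionParameters:
   only the fields needed for the documented checks. */
typedef struct {
    size_t srcC, srcH, srcW;  /* input channels, height, width */
    size_t dstC, dstH, dstW;  /* output channels, height, width */
    size_t kernel;            /* square kernel size (assumption) */
    size_t stride;            /* same stride in both directions (assumption) */
    size_t dilation;          /* same dilation in both directions (assumption) */
    size_t group;             /* number of groups */
} ConvDesc;

/* Assumption: depthwise means every channel forms its own group. */
bool is_depthwise(const ConvDesc *c) {
    return c->group == c->srcC && c->srcC == c->dstC;
}

/* Checks kernel, stride and dilation limits for a single convolution. */
bool conv_supported(const ConvDesc *c) {
    if (c->dilation != 1) return false;
    if (c->stride < 1 || c->stride > 3) return false;
    if (is_depthwise(c))
        return c->kernel == 3 || c->kernel == 5 || c->kernel == 7;
    /* Assumption: an ordinary convolution is not grouped. */
    return c->group == 1 && (c->kernel == 1 || c->kernel == 3);
}

/* Returns true when the sequence matches one of the three documented patterns. */
bool sequence_supported(const ConvDesc *convs, size_t count, bool add) {
    if (count != 2 && count != 3) return false;
    for (size_t i = 0; i < count; ++i)
        if (!conv_supported(&convs[i])) return false;

    bool d0 = is_depthwise(&convs[0]);
    bool d1 = is_depthwise(&convs[1]);
    if (count == 2) {
        /* conv + depthwise or depthwise + conv: exactly one is depthwise. */
        if (d0 == d1) return false;
    } else {
        /* conv + depthwise + conv. */
        if (d0 || !d1 || is_depthwise(&convs[2])) return false;
    }

    /* The text describes add only for three-convolution sequences;
       other cases are left unchecked here. */
    if (add && count == 3) {
        const ConvDesc *first = &convs[0];
        const ConvDesc *last = &convs[2];
        if (first->srcC != last->dstC || first->srcH != last->dstH ||
            first->srcW != last->dstW)
            return false;
    }
    return true;
}
```

A caller could run sequence_supported on its own descriptors and skip SimdSynetMergedConvolution32fInit when it returns false; the initializer's NULL return remains the authoritative answer.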

Parameters
[in]batch- a batch size.
[in]convs- an array with convolution parameters in execution order.
[in]count- a number of merged convolutions. It must be 2 or 3.
[in]add- a flag that enables adding the source tensor to the final output tensor.
Returns
a pointer to FP32 merged convolution context. On error it returns NULL. It must be released with function SimdRelease. This pointer is used in functions SimdSynetMergedConvolution32fExternalBufferSize, SimdSynetMergedConvolution32fInternalBufferSize, SimdSynetMergedConvolution32fInfo, SimdSynetMergedConvolution32fSetParams and SimdSynetMergedConvolution32fForward.

◆ SimdSynetMergedConvolution32fExternalBufferSize()

size_t SimdSynetMergedConvolution32fExternalBufferSize ( const void *  context)

Gets the size of the optional external temporary buffer for FP32 merged convolution.

Parameters
[in]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
Returns
a number of FP32 elements required for the external temporary buffer passed to SimdSynetMergedConvolution32fForward.
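Because the returned size is a count of FP32 elements rather than bytes, it has to be multiplied by sizeof(float) before allocation. The sketch below shows one way to do that; alloc_external_buffer is a hypothetical helper, not part of the library, and returning NULL for a zero size is a choice made here because SimdSynetMergedConvolution32fForward accepts a NULL buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a temporary buffer for `elements` floats.
   Returns NULL when nothing is needed, on overflow, or when malloc fails. */
float *alloc_external_buffer(size_t elements) {
    if (elements == 0) return NULL;
    /* Guard against overflow in the element-to-byte conversion. */
    if (elements > SIZE_MAX / sizeof(float)) return NULL;
    return (float *)malloc(elements * sizeof(float));
}
```

The argument would be the value returned by SimdSynetMergedConvolution32fExternalBufferSize, and the caller frees the result with free() once the forward calls are finished.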

◆ SimdSynetMergedConvolution32fInternalBufferSize()

size_t SimdSynetMergedConvolution32fInternalBufferSize ( const void *  context)

Gets the size of internal storage used by an FP32 merged convolution context.

Parameters
[in]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
Returns
a number of FP32 elements stored inside the context (temporary buffer, reordered weights, biases and activation parameters).

◆ SimdSynetMergedConvolution32fInfo()

const char * SimdSynetMergedConvolution32fInfo ( const void *  context)

Gets a textual description of the selected FP32 merged convolution implementation.

Parameters
[in]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
Returns
a zero-terminated string with the selected implementation name.

◆ SimdSynetMergedConvolution32fSetParams()

void SimdSynetMergedConvolution32fSetParams ( void *  context,
const float *const *  weight,
SimdBool *  internal,
const float *const *  bias,
const float *const *  params 
)

Sets weights, biases and activation parameters for FP32 merged convolution.

Parameters
[in,out]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
[in]weight- an array of pointers to FP32 convolution weights. The array size must be equal to the number of merged convolutions.
[out]internal- an array of flags set to SimdTrue when the corresponding weights were reordered and copied to the context, or SimdFalse when they are used directly. The array size must be equal to the number of merged convolutions. Can be NULL.
[in]bias- an array of pointers to FP32 bias arrays, one per convolution. Each pointer can be NULL.
[in]params- an array of pointers to activation parameters (see SimdConvolutionActivationType), one per convolution. Each pointer can be NULL for activations that do not use parameters.
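The weight, bias and params arguments are parallel arrays with one pointer per merged convolution. As an illustration only, the sketch below splits per-layer records into those three arrays. LayerData and split_layer_data are hypothetical names, and treating a NULL weight pointer as an error is an assumption (the text above only states that bias and params pointers can be NULL).

```c
#include <stddef.h>

/* Hypothetical per-layer record kept by the caller. */
typedef struct {
    const float *weight;  /* assumed to be required */
    const float *bias;    /* may be NULL */
    const float *params;  /* may be NULL when the activation has no parameters */
} LayerData;

/* Fills three parallel pointer arrays of length `count`.
   Returns 0 on success, -1 when count or a weight pointer is invalid. */
int split_layer_data(const LayerData *layers, size_t count,
                     const float **weight, const float **bias,
                     const float **params) {
    if (count != 2 && count != 3) return -1;
    for (size_t i = 0; i < count; ++i) {
        if (layers[i].weight == NULL) return -1;
        weight[i] = layers[i].weight;
        bias[i] = layers[i].bias;      /* NULL is passed through unchanged */
        params[i] = layers[i].params;  /* NULL is passed through unchanged */
    }
    return 0;
}
```

The filled arrays can then be passed as the weight, bias and params arguments of SimdSynetMergedConvolution32fSetParams, together with either NULL or a SimdBool array of the same length for internal.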

◆ SimdSynetMergedConvolution32fForward()

void SimdSynetMergedConvolution32fForward ( void *  context,
const float *  src,
float *  buf,
float *  dst 
)

Performs forward propagation through the fused FP32 convolution sequence.

Parameters
[in]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
[in]src- a pointer to the FP32 input tensor with batch*convs[0].srcC*convs[0].srcH*convs[0].srcW elements.
[out]buf- a pointer to an external temporary FP32 buffer. Its size is determined by function SimdSynetMergedConvolution32fExternalBufferSize. Can be NULL, in which case the internal buffer is used.
[out]dst- a pointer to the FP32 output tensor with batch*convs[count - 1].dstC*convs[count - 1].dstH*convs[count - 1].dstW elements.
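The element counts given for src and dst can be computed as in the sketch below, which also includes the usual NHWC offset formula for addressing a single value. ConvShape is a hypothetical stand-in that carries only the shape fields named above, and the helper names are illustrative.

```c
#include <stddef.h>

/* Hypothetical stand-in with only the shape fields of SimdConvolutionParameters. */
typedef struct {
    size_t srcC, srcH, srcW;
    size_t dstC, dstH, dstW;
} ConvShape;

/* Number of floats in the input tensor: shape of the first convolution's source. */
size_t src_elements(size_t batch, const ConvShape *convs) {
    return batch * convs[0].srcC * convs[0].srcH * convs[0].srcW;
}

/* Number of floats in the output tensor: shape of the last convolution's destination. */
size_t dst_elements(size_t batch, const ConvShape *convs, size_t count) {
    const ConvShape *last = &convs[count - 1];
    return batch * last->dstC * last->dstH * last->dstW;
}

/* Offset of element (n, h, w, c) in a dense NHWC tensor with dimensions H, W, C. */
size_t nhwc_offset(size_t n, size_t h, size_t w, size_t c,
                   size_t H, size_t W, size_t C) {
    return ((n * H + h) * W + w) * C + c;
}
```

For example, with batch 2, a first convolution whose source is 16 channels of 8x8 and a last convolution whose destination is 24 channels of 4x4, src holds 2048 floats and dst holds 768 floats.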