Simd Library Release Notes (2024).

December 2, 2024 (version 6.1.144)

Algorithms

New features
  • SSE4.1, AVX2 optimizations of function Yuv444pToRgbaV2.
  • SSE4.1 optimizations of class ImageJpegLoader.
  • isRgb parameter of function Simd::SynetSetInput.
Bug fixing
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemm.
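For readers unfamiliar with the conversion behind Yuv444pToRgbaV2, the following is a minimal scalar sketch of a planar YUV 4:4:4 to interleaved RGBA conversion. It assumes BT.601 limited-range coefficients and a single stride shared by the three planes; the function names, the parameter order, and the choice of coefficients are assumptions made for illustration and are not taken from the library.

```cpp
#include <algorithm>
#include <stddef.h>
#include <stdint.h>

// Clamps to [0, 255] and rounds to the nearest integer.
inline uint8_t ClampToByteSketch(float value)
{
    return uint8_t(std::min(255.0f, std::max(0.0f, value)) + 0.5f);
}

// Converts one pixel. ASSUMPTION: BT.601 limited range (Y in 16..235, U/V centered at 128).
inline void Yuv444pToRgbaPixelSketch(uint8_t y, uint8_t u, uint8_t v, uint8_t alpha, uint8_t* rgba)
{
    const float luma = 1.164f * (float(y) - 16.0f);
    const float cb = float(u) - 128.0f;
    const float cr = float(v) - 128.0f;
    rgba[0] = ClampToByteSketch(luma + 1.596f * cr);              // R
    rgba[1] = ClampToByteSketch(luma - 0.392f * cb - 0.813f * cr); // G
    rgba[2] = ClampToByteSketch(luma + 2.017f * cb);              // B
    rgba[3] = alpha;                                               // A is a constant
}

// Converts a whole image. In 4:4:4 every plane has full resolution,
// so the same (row, col) index addresses Y, U and V.
// Hypothetical signature: all three planes share one stride here.
inline void Yuv444pToRgbaSketch(const uint8_t* y, const uint8_t* u, const uint8_t* v,
    size_t planeStride, size_t width, size_t height, uint8_t alpha,
    uint8_t* rgba, size_t rgbaStride)
{
    for (size_t row = 0; row < height; ++row)
    {
        const size_t offset = row * planeStride;
        uint8_t* dst = rgba + row * rgbaStride;
        for (size_t col = 0; col < width; ++col)
            Yuv444pToRgbaPixelSketch(y[offset + col], u[offset + col], v[offset + col], alpha, dst + 4 * col);
    }
}
```

A SIMD version would process many pixels per instruction, typically in fixed-point rather than floating-point arithmetic; the scalar form above is only meant to show the data flow.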

Python wrapper

New features
  • isRgb parameter of function Simd.SynetSetInput.

November 4, 2024 (version 6.1.143)

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetConvolution16bNhwcDepthwise.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w4 for class SynetConvolution32fNhwcDepthwise.
  • AMX-BF16 kernel DepthwiseConvolution_k7p3d1s1w4 for class SynetMergedConvolution16b.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w6 for class SynetConvolution32fNhwcDepthwise.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w8 for class SynetConvolution32fNhwcDepthwise.
  • AMX-BF16 kernel DepthwiseConvolution_k7p3d1s1w6 for class SynetMergedConvolution16b.
  • AMX-BF16 kernel DepthwiseConvolution_k7p3d1s1w8 for class SynetMergedConvolution16b.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w4 for framework SynetMergedConvolution32f.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w6 for framework SynetMergedConvolution32f.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w8 for framework SynetMergedConvolution32f.
  • AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w8 for class SynetMergedConvolution16b.
  • Base implementation of function Yuv444pToRgbaV2.
Improving
  • AVX-512BW optimizations of function Convolution32fNhwcDepthwiseDefault.
  • AMX-BF16 optimizations of function DepthwiseConvolutionLargePad.
Bug fixing
  • Error in Base implementation of class SynetDeconvolution16bNhwcGemm.
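As background for the depthwise kernels listed above, here is a scalar reference sketch of a depthwise convolution in NHWC layout. It assumes that a suffix such as k7p3d1s1 stands for kernel 7x7, padding 3, dilation 1 and stride 1; that reading of the name, the function name, and the memory layouts are assumptions, and the code is not the library's implementation.

```cpp
#include <stddef.h>
#include <vector>

// Scalar reference sketch; hypothetical name, not the library's kernel.
// ASSUMED layouts: src and dst are [height][width][channels] (NHWC, batch of 1),
// weight is [7][7][channels]. Zero padding of 3 keeps the output size equal to the input.
inline std::vector<float> DepthwiseConvolution7x7Sketch(const std::vector<float>& src,
    const std::vector<float>& weight, size_t height, size_t width, size_t channels)
{
    const ptrdiff_t kernel = 7, pad = 3;
    const ptrdiff_t h = ptrdiff_t(height), w = ptrdiff_t(width), c = ptrdiff_t(channels);
    std::vector<float> dst(height * width * channels, 0.0f);
    for (ptrdiff_t dy = 0; dy < h; ++dy)
    {
        for (ptrdiff_t dx = 0; dx < w; ++dx)
        {
            float* d = dst.data() + (dy * w + dx) * c;
            for (ptrdiff_t ky = 0; ky < kernel; ++ky)
            {
                const ptrdiff_t sy = dy + ky - pad;
                if (sy < 0 || sy >= h)
                    continue; // This kernel row falls into the zero padding.
                for (ptrdiff_t kx = 0; kx < kernel; ++kx)
                {
                    const ptrdiff_t sx = dx + kx - pad;
                    if (sx < 0 || sx >= w)
                        continue; // This kernel column falls into the zero padding.
                    const float* s = src.data() + (sy * w + sx) * c;
                    const float* f = weight.data() + (ky * kernel + kx) * c;
                    // Depthwise: each channel is filtered independently, no cross-channel sum.
                    for (ptrdiff_t ch = 0; ch < c; ++ch)
                        d[ch] += s[ch] * f[ch];
                }
            }
        }
    }
    return dst;
}
```

Because channels are the innermost dimension in NHWC, the inner loop over channels is contiguous in memory, which is what makes this layout convenient for vectorization.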

Test framework

New features
  • Tests for verifying functionality of function SimdYuv444pToRgbaV2.

October 1, 2024 (version 6.1.142)

Algorithms

New features
  • Base implementation of class SynetDeconvolution16bGemm.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetDeconvolution16bNhwcGemm.
  • AMX-BF16 (AVX-512VBMI) optimizations of function DeinterleaveUv.
  • AMX-BF16 (AVX-512VBMI) optimizations of function DeinterleaveBgr.
  • AMX-BF16 (AVX-512VBMI) optimizations of function DeinterleaveBgra.
Improving
  • AVX-512BW optimizations of function ConvolutionDirectNhwcConvolutionBiasActivationDepthwise.
Removing
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution32fBf16NhwcGemm.
  • Base implementation of class SynetConvolution32fBf16Gemm.
  • Parameter 'compatibility' from function SynetConvolution32fInit.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution32fBf16Cdc.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution32fBf16Cd.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution32fBf16Dc.
  • Base implementation of class SynetMergedConvolution32fBf16.
  • Parameter 'compatibility' from function SynetMergedConvolution32fInit.
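The DeinterleaveUv, DeinterleaveBgr and DeinterleaveBgra entries under New features in this release concern splitting a packed image into separate planes. A minimal scalar sketch of the BGR case is given below; the function name and parameter order are hypothetical and chosen for illustration only.

```cpp
#include <stddef.h>
#include <stdint.h>

// Scalar reference sketch (hypothetical name, not the library's signature):
// splits a packed B,G,R,B,G,R,... image into three single-channel planes.
// Strides are in bytes and may exceed the row payload (row padding).
inline void DeinterleaveBgrSketch(const uint8_t* bgr, size_t bgrStride, size_t width, size_t height,
    uint8_t* b, size_t bStride, uint8_t* g, size_t gStride, uint8_t* r, size_t rStride)
{
    for (size_t row = 0; row < height; ++row)
    {
        const uint8_t* src = bgr + row * bgrStride;
        for (size_t col = 0; col < width; ++col)
        {
            b[row * bStride + col] = src[3 * col + 0];
            g[row * gStride + col] = src[3 * col + 1];
            r[row * rStride + col] = src[3 * col + 2];
        }
    }
}
```

A vectorized version replaces the inner loop with byte shuffles or permutes that pick every third byte from a group of loaded registers.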

Test framework

New features
  • Tests for verifying functionality of SynetDeconvolution16b framework.

September 2, 2024 (version 6.1.141)

Algorithms

New features
  • Support of BFloat16 in Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of class ResizerNearest.
Bug fixing
  • Compiler warning in function Simd::LitterCpuCache.
  • Error in AVX-512BW optimizations of class SynetInnerProduct16bGemmNN.

August 19, 2024 (version 6.1.140)

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetRelu16b.
  • API of SynetAdd16b framework.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetAdd16bUniform.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNchwGemm.
Improving
  • AMX-BF16 optimizations of class SynetInnerProduct16bGemmNN.
Bug fixing
  • Error in Base implementation of class SynetMergedConvolution16bCdc.
  • Error in Base implementation of class SynetMergedConvolution16bDc.
  • Error in Base implementation of class SynetInnerProduct16bGemmNN.
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function Float32ToBFloat16.
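Since several entries in these notes concern Float32ToBFloat16 and the 16b (BFloat16) classes, a short sketch of the conversion itself may help. BFloat16 keeps the upper 16 bits of an IEEE-754 float32; the sketch below rounds to nearest, ties to even. The helper names are hypothetical, and whether the library uses exactly this rounding mode is an assumption, not something these notes state.

```cpp
#include <stdint.h>
#include <cstring>

// Hypothetical helper, not the library's code.
// Converts an IEEE-754 float32 to bfloat16 with round-to-nearest-even (assumed rounding mode).
inline uint16_t Float32ToBFloat16Sketch(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits)); // Bit-exact reinterpretation without undefined behavior.
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
        return uint16_t((bits >> 16) | 0x0040u); // NaN: set a mantissa bit so it stays a NaN.
    // Adding 0x7FFF plus the lowest kept bit makes exact ties round to an even mantissa.
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return uint16_t(bits >> 16);
}

// The reverse conversion is exact: the 16 bits become the upper half of a float32.
inline float BFloat16ToFloat32Sketch(uint16_t value)
{
    const uint32_t bits = uint32_t(value) << 16;
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```

Plain truncation (dropping the lower 16 bits) is cheaper but biases results toward zero, which is why a rounding step is commonly used.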

Test framework

New features
  • Tests for verifying functionality of function SynetRelu16b.
  • Tests for verifying functionality of SynetAdd16b framework.

July 1, 2024 (version 6.1.139)

Algorithms

New features
  • API of SynetInnerProduct16b framework.
  • Base implementation of class SynetInnerProduct16bRef.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetInnerProduct16bGemmNN.
Bug fixing
  • Error in AVX-512BF16 optimizations of class SynetConvolution16bNhwcDirect.
  • Error in Base implementation of class SynetConvolution16bNhwcGemm.
  • Error in SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of function Convert16bNhwcDirect.
  • Error in SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of function Reorder16bNhwcDirect.
  • Error in Base implementation of class SynetMergedConvolution16bCdc.
  • Error in Base implementation of class SynetMergedConvolution16bDc.
  • Error in Base implementation of class SynetMergedConvolution16bCd.
  • Error in AMX-BF16 optimizations of class SynetMergedConvolution16bDc.

Test framework

New features
  • Tests for verifying functionality of SynetInnerProduct16b framework.

June 3, 2024 (version 6.1.138)

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcDirect.
  • SimdCpuInfoCurrentFrequency in SimdCpuInfoType enumeration.
  • API of SynetMergedConvolution16b framework.
  • Base implementation of class SynetMergedConvolution16b.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution16bDc.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution16bCd.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution16bCdc.
  • Support of YUV420P format to Simd::Frame.
Improving
  • AVX-512BF16 optimizations of class SynetConvolution16bNhwcGemm.
Bug fixing
  • Errors in Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemm.
  • Error in Base implementation of class SynetMergedConvolution8i.

Test framework

New features
  • -wu command line option to set CPU warm up time in milliseconds.
  • Tests for verifying functionality of SynetMergedConvolution16b framework.

Infrastructure

Bug fixing
  • Errors in build_and_test_gcc section in GitHub Actions script for CMake.

May 2, 2024 (version 6.1.137)

Algorithms

New features
  • AMX-BF16 (AVX-512VBMI) optimizations of function DescrIntCosineDistance.
  • AMX-BF16 (AVX-512VBMI, AMX-INT8) optimizations of function DescrIntCosineDistancesMxNa.
  • AMX-BF16 (AVX-512VBMI, AMX-INT8) optimizations of function DescrIntCosineDistancesMxNp.
  • API of SynetConvolution16b framework.
  • Base implementation of class SynetConvolution16bGemm.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemm.
Improving
  • AVX-512VNNI optimizations of function DescrIntCosineDistance.
  • AVX-512VNNI optimizations of function DescrIntCosineDistancesMxNa.
  • AVX-512VNNI optimizations of function DescrIntCosineDistancesMxNp.
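For context on the DescrIntCosineDistance family above, the sketch below computes a cosine distance between two float vectors using the conventional definition, 1 minus the cosine of the angle between them. It is a generic illustration with a hypothetical name; it does not reproduce the library's descriptor encoding or API.

```cpp
#include <cmath>
#include <stddef.h>

// Generic scalar sketch (hypothetical name): d = 1 - (a . b) / (|a| * |b|).
// Both vectors must be non-zero, otherwise the division is undefined.
inline float CosineDistanceSketch(const float* a, const float* b, size_t size)
{
    float ab = 0.0f, aa = 0.0f, bb = 0.0f;
    for (size_t i = 0; i < size; ++i)
    {
        ab += a[i] * b[i]; // dot product
        aa += a[i] * a[i]; // squared norm of a
        bb += b[i] * b[i]; // squared norm of b
    }
    return 1.0f - ab / std::sqrt(aa * bb);
}
```

The three running sums are independent, so a vectorized version can accumulate them in parallel and reduce them once at the end.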

Test framework

New features
  • Tests for verifying functionality of SynetConvolution16b framework.

April 2, 2024 (version 6.1.136)

Algorithms

New features
  • AMX-BF16 (AVX-512VBMI) optimizations of function ChangeColors.
  • AMX-BF16 (AVX-512VBMI) optimizations of function NormalizeHistogram.
Improving
  • AMX-BF16 optimizations of class SynetConvolution32fBf16NhwcGemm.
Bug fixing
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution32fBf16NhwcGemm.

Test framework

New features
  • Command line parameter to disable testing of some SIMD extensions.
Bug fixing
  • Error in test of function Nv12SaveAsJpegToMemory.

March 1, 2024 (version 6.1.135)

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution32fBf16NhwcGemm.
  • AMX-BF16 optimizations of function Float32ToBFloat16.
  • Support of SimdSynetUnaryOperation32fCos in function SynetUnaryOperation32f.
  • Support of SimdSynetUnaryOperation32fSin in function SynetUnaryOperation32f.
Bug fixing
  • Error in function SimdCpuInfo (wrong AMX-BF16 detection).
  • Error in AVX-512BF16 optimization of function Float32ToBFloat16.
  • Error in AMX initialization in function AmxBf16::SupportedByOS.
  • Crash in function AmxBf16::ConvolutionBf16NhwcConv_2.
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution32fBf16Cdc.
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution32fBf16Cd.
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetMergedConvolution32fBf16Dc.
Removing
  • AVX-512BF16 optimizations of function Float32ToBFloat16.
  • AVX-512BF16 optimizations of SynetConvolution32fBf16Nhwc.
  • AVX-512BF16 optimizations of class SynetMergedConvolution32fBf16Cdc.
  • AVX-512BF16 optimizations of class SynetMergedConvolution32fBf16Cd.
  • AVX-512BF16 optimizations of class SynetMergedConvolution32fBf16Dc.
  • Separate support of AVX-512BF16 extension (it is now supported only together with AMX-BF16).

Test framework

Bug fixing
  • Error in test of SynetMergedConvolution32f framework.

Infrastructure

Removing
  • Avx512Bf16 project for MSVS-2022.
  • Avx512Bf16 project for MSVS-2019.
  • Avx512Bf16 project for MSVS-2015.
  • Avx512Bf16 project for MSVS-2017.
  • Avx512Bf16 project for CMake.

February 1, 2024 (version 6.0.134)

Algorithms

New features
  • SSE4.1 optimizations of ResizerFloatBilinear class.
Improving
  • AVX2 optimizations of ResizerFloatBilinear class (AMD CPU).
  • AVX2 optimizations of ResizerShortBilinear class (AMD CPU).
Bug fixing
  • MSVS compiler bug in file SimdAvx512bwYuvToBgraV2.
  • Linux, GCC-13 - crash in function SimdSynetInnerProduct32fForward.
  • MSVS compiler bug (CMake, Windows for ARM64) with functions Extract4Sums.
  • GCC-9/10 - compiler error in AVX-512BW optimization of function YToGray.
  • GCC-9/10 - compiler error in AVX-512BW optimization of function GrayToY.
Replacing
  • Replace AVX optimizations to AVX2 for function CosineDistance32f.
  • Replace AVX optimizations to AVX2 for function Fill32f.
  • Replace AVX optimizations to AVX2 for ResizerFloatBilinear class.
  • Replace AVX optimizations to AVX2 for function SquaredDifferenceSum32f.
  • Replace AVX optimizations to AVX2 for function SquaredDifferenceKahanSum32f.
  • Replace AVX optimizations to AVX2 for function HogLiteFilterFeatures.
  • Replace AVX optimizations to AVX2 for function HogLiteResizeFeatures.
  • Replace AVX optimizations to AVX2 for function HogLiteCompressFeatures.
  • Replace AVX optimizations to AVX2 for function HogLiteFilterSeparable.
  • Replace AVX optimizations to AVX2 for function NeuralPooling2x2Max2x2.
  • Replace AVX optimizations to AVX2 for function NeuralProductSum.
  • Replace AVX optimizations to AVX2 for function NeuralAddVectorMultipliedByValue.
  • Replace AVX optimizations to AVX2 for function NeuralAddVector.
  • Replace AVX optimizations to AVX2 for function NeuralAddValue.
  • Replace AVX optimizations to AVX2 for function NeuralRoughSigmoid.
  • Replace AVX optimizations to AVX2 for function NeuralRoughSigmoid2.
  • Replace AVX optimizations to AVX2 for function NeuralRoughTanh.
  • Replace AVX optimizations to AVX2 for function NeuralDerivativeRelu.
  • Replace AVX optimizations to AVX2 for function NeuralDerivativeTanh.
  • Replace AVX optimizations to AVX2 for function NeuralDerivativeSigmoid.
  • Replace AVX optimizations to AVX2 for function NeuralUpdateWeights.
  • Replace AVX optimizations to AVX2 for function NeuralAdaptiveGradientUpdate.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution2x2Forward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution3x3Forward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution4x4Forward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution5x5Forward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution2x2Backward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution3x3Backward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution4x4Backward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution5x5Backward.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution2x2Sum.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution3x3Sum.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution4x4Sum.
  • Replace AVX optimizations to AVX2 for function NeuralAddConvolution5x5Sum.
  • Replace AVX optimizations to AVX2 for function NeuralConvolutionForward.
  • Replace AVX optimizations to AVX2 for function SynetAddBias.
  • Replace AVX optimizations to AVX2 for function SynetFusedLayerForward0.
  • Replace AVX optimizations to AVX2 for function SynetFusedLayerForward1.
  • Replace AVX optimizations to AVX2 for function SynetFusedLayerForward2.
  • Replace AVX optimizations to AVX2 for function SynetFusedLayerForward3.
  • Replace AVX optimizations to AVX2 for function SynetFusedLayerForward4.
  • Replace AVX optimizations to AVX2 for function SynetFusedLayerForward8.
  • Replace AVX optimizations to AVX2 for function SynetFusedLayerForward9.
  • Replace AVX optimizations to AVX2 for function SynetPoolingAverage.
  • Replace AVX optimizations to AVX2 for function SynetShuffleLayerForward.
  • Replace AVX optimizations to AVX2 for function SynetHardSigmoid32f.
  • Replace AVX optimizations to AVX2 for function SynetHswish32f.
  • Replace AVX optimizations to AVX2 for function SynetPreluLayerForward.
  • Replace AVX optimizations to AVX2 for function SynetRelu32f.
  • Replace AVX optimizations to AVX2 for function SynetRestrictRange32f.
  • Replace AVX optimizations to AVX2 for function WinogradKernel1x3Block1x4SetFilter.
  • Replace AVX optimizations to AVX2 for function WinogradKernel1x3Block1x4SetInput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel1x3Block1x4SetOutput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel1x5Block1x4SetFilter.
  • Replace AVX optimizations to AVX2 for function WinogradKernel1x5Block1x4SetInput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel1x5Block1x4SetOutput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel2x2Block2x2SetFilter.
  • Replace AVX optimizations to AVX2 for function WinogradKernel2x2Block2x2SetInput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel2x2Block2x2SetOutput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel2x2Block4x4SetFilter.
  • Replace AVX optimizations to AVX2 for function WinogradKernel2x2Block4x4SetInput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel2x2Block4x4SetOutput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block2x2SetFilter.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block2x2SetInput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block2x2SetOutput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block3x3SetFilter.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block3x3SetInput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block3x3SetOutput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block4x4SetFilter.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block4x4SetInput.
  • Replace AVX optimizations to AVX2 for function WinogradKernel3x3Block4x4SetOutput.
  • Replace AVX optimizations to AVX2 for function GemmPackA.
  • Replace AVX optimizations to AVX2 for function GemmPackB.
  • Replace AVX optimizations to AVX2 for function GemmScaleC.
  • Replace AVX optimizations to AVX2 for function SynetScaleLayerForward.
  • Replace AVX optimizations to AVX2 for function SynetInnerProductLayerForward.
  • Replace AVX optimizations to AVX2 for function SynetInnerProduct32fInit.
  • Replace AVX optimizations to AVX2 for function SynetEltwiseLayerForward.
  • Replace AVX optimizations to AVX2 for function SynetDeconvolution32fInit.
  • Replace AVX optimizations to AVX2 for function SynetMergedConvolution32fInit.
  • Replace AVX optimizations to AVX2 for SynetInnerProduct32fGemm class.
  • Replace AVX optimizations to AVX2 for SynetInnerProduct32fProd class.
  • Replace AVX optimizations to AVX2 for SynetDeconvolution32fGemmNN class.
  • Replace AVX optimizations to AVX2 for SynetDeconvolution32fNhwcDirect2x2 class.
  • Replace AVX optimizations to AVX2 for SynetMergedConvolution32fCdc class.
  • Replace AVX optimizations to AVX2 for SynetMergedConvolution32fCd class.
  • Replace AVX optimizations to AVX2 for SynetMergedConvolution32fDc class.
  • Replace AVX optimizations to AVX2 for SynetConvolution32fDepthwiseDotProduct class.
  • Replace AVX optimizations to AVX2 for SynetConvolution32fDirectNchw class.
  • Replace AVX optimizations to AVX2 for SynetConvolution32fDirectNhwc class.
  • Replace AVX optimizations to AVX2 for SynetConvolution32fNhwcDirect class.
  • Replace AVX optimizations to AVX2 for SynetConvolution32fGemmNT class.
  • Replace AVX optimizations to AVX2 for SynetConvolution32fGemmNT class.
  • Replace AVX optimizations to AVX2 for SynetConvolution32fWinograd class.
  • Replace AVX optimizations to AVX2 for function SynetConvolution32fInit.
Removing
  • Base implementation, SSE4.1, AVX, AVX-512BW, NEON, VSX optimizations of function SvmSumLinear.
  • Separate support of AVX extension (it is now supported only together with AVX2).
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function EdgeBackgroundGrowRangeSlow.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function EdgeBackgroundGrowRangeFast.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function EdgeBackgroundIncrementCount.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function EdgeBackgroundAdjustRange.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function EdgeBackgroundAdjustRangeMasked.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function EdgeBackgroundShiftRange.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function EdgeBackgroundShiftRangeMasked.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function InterferenceIncrement.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function InterferenceIncrementMasked.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function InterferenceDecrement.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function InterferenceDecrementMasked.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function SynetFusedLayerForward0.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function SynetFusedLayerForward1.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function SynetFusedLayerForward2.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function SynetFusedLayerForward3.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function SynetFusedLayerForward4.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function SynetFusedLayerForward8.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function SynetFusedLayerForward9.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function BgraToYuv420p.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function BgraToYuv422p.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function BgraToYuv444p.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function BgraToYuva420p.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function BgrToYuv420p.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function BgrToYuv422p.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function BgrToYuv444p.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuva420pToBgra.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function Yuv420pToBgra.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function Yuv422pToBgra.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function Yuv444pToBgra.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function Yuv420pToBgr.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function Yuv422pToBgr.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function Yuv444pToBgr.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuv420pToRgb.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuv422pToRgb.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuv444pToRgb.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON, VMX optimizations of function ResizeBilinear.
  • Function Simd::ResizeAreaGray.
  • Function Simd::ResizeArea.
  • VSX optimizations of function HogDirectionHistograms.
  • VSX optimizations of function NeuralConvert.
  • VSX optimizations of function NeuralProductSum.
  • VSX optimizations of function NeuralRoughSigmoid.
  • VSX optimizations of function SquaredDifferenceSum32f.
  • VSX optimizations of function SquaredDifferenceKahanSum32f.
  • VSX optimizations of function Yuv420pToHue.
  • VSX optimizations of function Yuv444pToHue.
  • Support of VSX (Power7) extension.
  • VMX optimizations of function AbsDifferenceSum.
  • VMX optimizations of function AbsDifferenceSumMasked.
  • VMX optimizations of function AbsDifferenceSums3x3.
  • VMX optimizations of function AbsDifferenceSums3x3Masked.
  • VMX optimizations of function LbpEstimate.
  • VMX optimizations of function FillBgr.
  • VMX optimizations of function FillBgra.
  • VMX optimizations of function AbsGradientSaturatedSum.
  • VMX optimizations of function AlphaBlending.
  • VMX optimizations of function BgraToBayer.
  • VMX optimizations of function BgrToBayer.
  • VMX optimizations of function BgraToBgr.
  • VMX optimizations of function BgraToGray.
  • VMX optimizations of function BgrToBgra.
  • VMX optimizations of function Bgr48pToBgra32.
  • VMX optimizations of function BgrToGray.
  • VMX optimizations of function GaussianBlur3x3.
  • VMX optimizations of function GrayToBgr.
  • VMX optimizations of function GrayToBgra.
  • VMX optimizations of function StretchGray2x2.
  • VMX optimizations of function Binarization.
  • VMX optimizations of function AveragingBinarization.
  • VMX optimizations of function DeinterleaveUv.
  • VMX optimizations of function InterleaveUv.
  • VMX optimizations of function Laplace.
  • VMX optimizations of function LaplaceAbs.
  • VMX optimizations of function LaplaceAbsSum.
  • VMX optimizations of function MeanFilter3x3.
  • VMX optimizations of function Reorder16bit.
  • VMX optimizations of function Reorder32bit.
  • VMX optimizations of function Reorder64bit.
  • VMX optimizations of function ShiftBilinear.
  • VMX optimizations of function ReduceGray2x2.
  • VMX optimizations of function ReduceGray3x3.
  • VMX optimizations of function ReduceGray4x4.
  • VMX optimizations of function ReduceGray5x5.
  • VMX optimizations of function HistogramMasked.
  • VMX optimizations of function AbsSecondDerivativeHistogram.
  • VMX optimizations of function SquaredDifferenceSum.
  • VMX optimizations of function SquaredDifferenceSumMasked.
  • VMX optimizations of function OperationBinary8u.
  • VMX optimizations of function OperationBinary16i.
  • VMX optimizations of function VectorProduct.
  • VMX optimizations of function AddFeatureDifference.
  • VMX optimizations of function MedianFilterRhomb3x3.
  • VMX optimizations of function MedianFilterRhomb5x5.
  • VMX optimizations of function MedianFilterSquare3x3.
  • VMX optimizations of function MedianFilterSquare5x5.
  • VMX optimizations of function SegmentationChangeIndex.
  • VMX optimizations of function SegmentationFillSingleHoles.
  • VMX optimizations of function SegmentationPropagate2x2.
  • VMX optimizations of function SegmentationShrinkRegion.
  • VMX optimizations of function TextureBoostedSaturatedGradient.
  • VMX optimizations of function TextureGetDifferenceSum.
  • VMX optimizations of function TexturePerformCompensation.
  • VMX optimizations of function TextureBoostedUv.
  • VMX optimizations of function ConditionalCount8u.
  • VMX optimizations of function ConditionalCount16i.
  • VMX optimizations of function ConditionalSum.
  • VMX optimizations of function ConditionalSquareSum.
  • VMX optimizations of function ConditionalSquareGradientSum.
  • VMX optimizations of function ConditionalFill.
  • VMX optimizations of function SobelDx.
  • VMX optimizations of function SobelDxAbs.
  • VMX optimizations of function SobelDxAbsSum.
  • VMX optimizations of function SobelDy.
  • VMX optimizations of function SobelDyAbs.
  • VMX optimizations of function SobelDyAbsSum.
  • VMX optimizations of function ContourMetrics.
  • VMX optimizations of function ContourMetricsMasked.
  • VMX optimizations of function ContourAnchors.
  • VMX optimizations of function GetStatistic.
  • VMX optimizations of function GetMoments.
  • VMX optimizations of function GetRowSums.
  • VMX optimizations of function GetColSums.
  • VMX optimizations of function GetAbsDyRowSums.
  • VMX optimizations of function GetAbsDxColSums.
  • VMX optimizations of function ValueSum.
  • VMX optimizations of function SquareSum.
  • VMX optimizations of function CorrelationSum.
  • VMX optimizations of function BackgroundGrowRangeSlow.
  • VMX optimizations of function BackgroundGrowRangeFast.
  • VMX optimizations of function BackgroundIncrementCount.
  • VMX optimizations of function BackgroundAdjustRange.
  • VMX optimizations of function BackgroundAdjustRangeMasked.
  • VMX optimizations of function BackgroundShiftRange.
  • VMX optimizations of function BackgroundShiftRangeMasked.
  • VMX optimizations of function BackgroundInitMask.
  • Support of VMX (Altivec) extension.
  • Support of PowerPC platform.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function NeuralRoughSigmoid.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function NeuralRoughSigmoid2.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function NeuralRoughTanh.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function HogLiteExtractFeatures.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function HogLiteFilterFeatures.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function HogLiteResizeFeatures.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function HogLiteCompressFeatures.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function HogLiteFilterSeparable.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function HogLiteFindMax7x7.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function HogLiteCreateMask.

Python wrapper

New features
  • Include test filters in Python Wrapper Test Framework.
  • Exclude test filters in Python Wrapper Test Framework.
  • Method Simd.Image.CopyToNumpyArray.

Test framework

Removing
  • Tests for verifying functionality of function SvmSumLinear.
  • Tests for verifying functionality of function EdgeBackgroundGrowRangeSlow.
  • Tests for verifying functionality of function EdgeBackgroundGrowRangeFast.
  • Tests for verifying functionality of function EdgeBackgroundIncrementCount.
  • Tests for verifying functionality of function EdgeBackgroundAdjustRange.
  • Tests for verifying functionality of function EdgeBackgroundAdjustRangeMasked.
  • Tests for verifying functionality of function EdgeBackgroundShiftRange.
  • Tests for verifying functionality of function EdgeBackgroundShiftRangeMasked.
  • Tests for verifying functionality of function InterferenceIncrement.
  • Tests for verifying functionality of function InterferenceIncrementMasked.
  • Tests for verifying functionality of function InterferenceDecrement.
  • Tests for verifying functionality of function InterferenceDecrementMasked.
  • Tests for verifying functionality of function SynetFusedLayerForward0.
  • Tests for verifying functionality of function SynetFusedLayerForward1.
  • Tests for verifying functionality of function SynetFusedLayerForward2.
  • Tests for verifying functionality of function SynetFusedLayerForward3.
  • Tests for verifying functionality of function SynetFusedLayerForward4.
  • Tests for verifying functionality of function SynetFusedLayerForward8.
  • Tests for verifying functionality of function SynetFusedLayerForward9.
  • Tests for verifying functionality of function BgraToYuv420p.
  • Tests for verifying functionality of function BgraToYuv422p.
  • Tests for verifying functionality of function BgraToYuv444p.
  • Tests for verifying functionality of function BgraToYuva420p.
  • Tests for verifying functionality of function BgrToYuv420p.
  • Tests for verifying functionality of function BgrToYuv422p.
  • Tests for verifying functionality of function BgrToYuv444p.
  • Tests for verifying functionality of function Yuva420pToBgra.
  • Tests for verifying functionality of function Yuv420pToBgra.
  • Tests for verifying functionality of function Yuv422pToBgra.
  • Tests for verifying functionality of function Yuv444pToBgra.
  • Tests for verifying functionality of function Yuv420pToBgr.
  • Tests for verifying functionality of function Yuv422pToBgr.
  • Tests for verifying functionality of function Yuv444pToBgr.
  • Tests for verifying functionality of function Yuv420pToRgb.
  • Tests for verifying functionality of function Yuv422pToRgb.
  • Tests for verifying functionality of function Yuv444pToRgb.
  • Tests for verifying functionality of function ResizeBilinear.
  • Tests for verifying functionality of function NeuralRoughSigmoid.
  • Tests for verifying functionality of function NeuralRoughSigmoid2.
  • Tests for verifying functionality of function NeuralRoughTanh.
  • Tests for verifying functionality of function HogLiteExtractFeatures.
  • Tests for verifying functionality of function HogLiteFilterFeatures.
  • Tests for verifying functionality of function HogLiteResizeFeatures.
  • Tests for verifying functionality of function HogLiteCompressFeatures.
  • Tests for verifying functionality of function HogLiteFilterSeparable.
  • Tests for verifying functionality of function HogLiteFindMax7x7.
  • Tests for verifying functionality of function HogLiteCreateMask.

Infrastructure

New features
  • SIMD_PYTHON CMake option.
  • build_and_test_gcc section in GitHub Actions script for CMake supports different GCC versions (10, 11, 13).
Improving
  • Parallelization of GitHub Actions script for MSBuild.
  • Parallelization of GitHub Actions script for CMake.
Removing
  • Avx1 project for MSVS-2022.
  • Avx1 project for MSVS-2019.
  • Avx1 project for MSVS-2015.
  • Avx1 project for MSVS-2017.
  • Avx1 project for CMake.
  • Vsx project for MSVS-2022.
  • Vsx project for MSVS-2019.
  • Vsx project for MSVS-2015.
  • Vsx project for MSVS-2017.
  • Vsx project for CMake.
  • Vmx project for MSVS-2022.
  • Vmx project for MSVS-2019.
  • Vmx project for MSVS-2015.
  • Vmx project for MSVS-2017.
  • Vmx project for CMake.
  • ppc project for CMake.
  • build_ppc64_cross section in GitHub Actions script for CMake.

January 3, 2024 (version 5.4.133)

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetNormalizeLayerForwardV4.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetConvolution32fNhwcGroupedBlock1x2.
  • Function ImageSaveToFile chooses the output file format by file extension when the format is undefined.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function GrayToY.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function YToGray.
  • yuvType parameter in Frame structure.
  • Support of SimdSynetUnaryOperation32fCeil in function SynetUnaryOperation32f.
  • Support of SimdSynetUnaryOperation32fFloor in function SynetUnaryOperation32f.
  • Function Simd::Yuva444pToBgra.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuva422pToBgraV2.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuva420pToBgraV2.
  • Function SimdYuva420pToBgra is marked as deprecated.
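
The extension-based format choice noted above for ImageSaveToFile can be pictured with a small standalone sketch. It does not call the library: the helper name choose_format, the table EXTENSION_TO_FORMAT, and the format labels are hypothetical, and the library's real set of supported formats may differ.

```python
import os

# Hypothetical mapping from file extension to a format label.
# The library's actual formats and enumeration names may differ.
EXTENSION_TO_FORMAT = {
    ".pgm": "pgm",
    ".ppm": "ppm",
    ".png": "png",
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
}


def choose_format(path, requested=None):
    """Return the requested format, or infer one from the file extension."""
    if requested is not None:
        # An explicitly requested format always takes priority.
        return requested
    ext = os.path.splitext(path)[1].lower()
    if ext not in EXTENSION_TO_FORMAT:
        raise ValueError("cannot infer image format from %r" % path)
    return EXTENSION_TO_FORMAT[ext]
```

In this sketch the extension is consulted only when no format is given, which mirrors the "if it is undefined" condition in the note; an unknown extension is reported as an error rather than silently mapped to a default.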

Python wrapper

New features
  • Wrapper for enumeration Simd.WarpAffineFlags.
  • Wrapper for function SimdWarpAffineInit.
  • Wrapper for function SimdWarpAffineRun.
  • Function Simd.WarpAffine.
  • Wrapper for function SimdAbsGradientSaturatedSum.
  • Function Simd.AbsGradientSaturatedSum.
  • Wrapper for function SimdBgraToBgr.
  • Method Simd.Image.Copy.
  • Method Simd.Image.Convert.
  • Method Simd.Image.Converted.
  • Wrapper for function SimdBgraToGray.
  • Wrapper for function SimdBgraToRgb.
  • Wrapper for function SimdBgraToRgba.
  • Wrapper for function SimdCopy.
  • Wrapper for function SimdBgrToBgra.
  • Wrapper for function SimdBgrToGray.
  • Wrapper for function SimdBgrToRgb.
  • Wrapper for function SimdRgbToBgra.
  • Wrapper for function SimdRgbToGray.
  • Wrapper for function SimdRgbaToGray.
  • Wrapper for function SimdBgraToYuv420pV2.
  • Wrapper for enumeration Simd.FrameFormat.
  • Class Simd.ImageFrame.
  • Method Simd.ImageFrame.Copy.
  • Method Simd.ImageFrame.Convert.
  • Method Simd.ImageFrame.Converted.
  • Wrapper for function SimdDeinterleaveUv.
  • Wrapper for function SimdInterleaveUv.
  • Wrapper for function SimdBgrToYuv420pV2.
  • Wrapper for function SimdGrayToBgra.
  • Wrapper for function SimdGrayToBgr.
  • Method Simd.Image.Fill.
  • Wrapper for function SimdYuv420pToBgraV2.
  • Wrapper for function SimdYuv420pToBgrV2.
  • Wrapper for function SimdYuv420pToRgbV2.
  • Wrapper for function SimdYToGray.
  • Wrapper for function SimdGrayToY.
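
For readers unfamiliar with the UV plane operations wrapped above (SimdDeinterleaveUv, SimdInterleaveUv), here is a minimal pure-Python sketch of the data layout involved. It does not use the Simd wrapper; the function names interleave_uv and deinterleave_uv are illustrative only, and row strides are ignored for brevity.

```python
def interleave_uv(u, v):
    """Merge separate U and V byte planes into one UVUV... plane."""
    if len(u) != len(v):
        raise ValueError("U and V planes must have the same size")
    uv = bytearray(2 * len(u))
    uv[0::2] = u  # even positions hold U samples
    uv[1::2] = v  # odd positions hold V samples
    return bytes(uv)


def deinterleave_uv(uv):
    """Split an interleaved UVUV... plane into separate U and V planes."""
    if len(uv) % 2 != 0:
        raise ValueError("interleaved plane must have an even size")
    return bytes(uv[0::2]), bytes(uv[1::2])
```

The two helpers are exact inverses of each other, which is the property that conversions between semi-planar and planar YUV layouts rely on.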

Test framework

New features
  • Tests for verifying functionality of function SynetNormalizeLayerForwardV4.
  • Special test for verifying functionality of function Yuv420pToRgbV2.
  • Tests for verifying functionality of function GrayToY.
  • Tests for verifying functionality of function YToGray.
  • Tests for verifying functionality of function Yuva422pToBgraV2.
  • Tests for verifying functionality of function Yuva420pToBgraV2.
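
As an illustration of what a check for GrayToY and YToGray might look like, the sketch below round-trips every gray level. It assumes the usual video-range convention, in which gray 0..255 maps linearly onto Y 16..235; the library's exact constants and rounding are not stated in these notes and may differ. The function names gray_to_y and y_to_gray are hypothetical stand-ins, not the library API.

```python
def gray_to_y(gray):
    """Map full-range gray (0..255) to video-range Y (16..235), an assumed convention."""
    return 16 + (gray * 219 + 127) // 255


def y_to_gray(y):
    """Map video-range Y back to full-range gray, clamping out-of-range input."""
    y = min(max(y, 16), 235)
    return ((y - 16) * 255 + 109) // 219


def max_round_trip_error():
    """Largest absolute difference after gray -> Y -> gray over all 256 levels."""
    return max(abs(y_to_gray(gray_to_y(g)) - g) for g in range(256))
```

Because 256 gray levels are squeezed into 220 Y levels, the round trip cannot be exact; under these assumptions the result stays within one level of the input, so a test of this kind would compare with a tolerance rather than for equality.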

Infrastructure

Bug fixing
  • Error in CMake for ARM platform.