Simd Library Release Notes (2026).

2026 | 2025 | 2024 | 2023 | 2022 | 2021 | 2020 | 2019 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013

May X, 2026 (version 7.0.161)

Algorithms

New features
  • SSE4.1, AVX2, AVX-512BW, NEON optimizations of function BgrToHsv.
  • SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuv444pToHsl.
  • SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Yuv444pToHsv.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetPoolingMax16b.
  • AVX-512BW optimizations of function AlphaBlending2x.
  • AVX2 optimizations of function BgrToBayer.
  • AVX2 optimizations of function BgraToBayer.
  • Base implementation, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV2.
  • NEON optimizations of function BgrToHsl.
  • NEON optimizations of function BgrToLab.
  • NEON optimizations of function GrayToY.
  • NEON optimizations of function YToGray.
  • NEON optimizations of function Yuv444pToRgbaV2.
  • NEON optimizations of function SynetTiledScale2D32f.
  • NEON optimizations of function SynetNormalizeLayerForward.
  • NEON optimizations of function SynetNormalizeLayerForwardV2.
  • NEON optimizations of function SynetNormalizeLayerForwardV3.
  • NEON optimizations of function SynetNormalizeLayerForwardV4.
  • NEON optimizations of function SynetAdd8i.
  • NEON optimizations of function SynetConvert8uTo32f.
  • NEON optimizations of function SynetDequantizeLinear.
  • NEON optimizations of function SynetQuantizeLinear.
  • NEON optimizations of class SynetQuantizedAdd.
  • NEON optimizations of function SynetQuantizedConcatLayerForward.
  • NEON optimizations of function SynetQuantizedPreluLayerForward.
  • NEON optimizations of function SynetQuantizedScaleLayerForward.
  • NEON optimizations of function SynetQuantizedShuffleLayerForward.
  • NEON optimizations of class SynetScale8i.
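
As an illustration of what the color conversions in this list compute, here is a minimal scalar sketch of the textbook BGR to HSV formula. The struct Hsv, the function BgrToHsvPixel and the floating-point output ranges (hue in degrees, saturation and value in [0, 1]) are assumptions made for this example; the library's own pixel layout, scaling and rounding may differ.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical output type for this sketch: hue in degrees [0, 360),
// saturation and value in [0, 1].
struct Hsv
{
    float h, s, v;
};

// Converts one 8-bit BGR pixel with the textbook HSV formula.
inline Hsv BgrToHsvPixel(uint8_t blue, uint8_t green, uint8_t red)
{
    const float b = blue / 255.0f, g = green / 255.0f, r = red / 255.0f;
    const float maxC = std::max({ b, g, r });
    const float minC = std::min({ b, g, r });
    const float delta = maxC - minC;

    Hsv hsv = { 0.0f, 0.0f, maxC }; // Gray pixels keep h = 0 and s = 0.
    if (delta > 0.0f)
    {
        hsv.s = delta / maxC; // maxC > 0 whenever delta > 0.
        if (maxC == r)
            hsv.h = 60.0f * std::fmod((g - b) / delta + 6.0f, 6.0f);
        else if (maxC == g)
            hsv.h = 60.0f * ((b - r) / delta + 2.0f);
        else
            hsv.h = 60.0f * ((r - g) / delta + 4.0f);
    }
    return hsv;
}
```

A SIMD version would typically process many pixels per iteration and replace the branches with masked selects, but the per-pixel arithmetic stays the same.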
Bug fixing
  • Error in NEON optimization of function TransformImage (BGR, T0 transform).
  • Error in AVX-512BW optimizations of function SynetSoftmax16b (MSVS, Debug, Win32).
  • Possible aligned load of unaligned memory in AVX2 optimizations of function AbsDifferenceSums3 (Windows 7, x64, gcc 8.1.0).
  • Possible aligned store in file SimdExtract.h for SSE4.1, AVX2, AVX-512BW.
  • Wrong assert condition in function Simd::Uyvy422ToBgr.
  • Wrong assert condition in function Simd::Uyvy422ToYuv420p.
  • Wrong assert condition in function Simd::Yuv420pToUyvy422.
  • Error in function Base::CpuCacheSize on ARM64 platform.

Test framework

New features
  • GitHub Actions 'Test Python' step in build_and_test_gcc_new in cmake.yml.
  • Tests for verifying functionality of function SynetPoolingMax16b.
Bug fixing
  • Error in Python test ShiftDetectorFunctionsTest.
  • Error in Python test ShiftDetectorClassTest.
  • Error in test SquaredDifferenceSum32f.
  • Error in test NeuralAddConvolution2x2ForwardAutoTest.
  • Error in parsing of 'testThreads' command line option.

Infrastructure

New features
  • Job build_and_test_arm64 in GitHub Actions script for CMake.
  • GitHub Actions workflow test.yml to test the dev branch.

Documentation

Bug fixing
  • Syntax and lexical errors in description of Python wrapper.

    April 1, 2026 (version 7.0.160)

    Algorithms

    New features
    • Possibility to use non-constant B matrix in framework SynetInnerProduct32f.
    • Function SimdSynetInnerProduct32fExternalBufferSize.
    • Parameter 'activation' to function SimdSynetInnerProduct16bInit.
    • Parameter 'params' to function SimdSynetInnerProduct16bSetParams.
    • Base implementation of class SynetGatherElements.
    • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetNormalizeLayerForward16bV2.
    • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetSoftmax16b.
    • Support of HVX extension (Hexagon platform).
    • HVX optimizations of function AbsDifference.
    • HVX optimizations of function AbsDifferenceSum.
    • HVX optimizations of function AbsGradientSaturatedSum.
    • HVX optimizations of function AddFeatureDifference.
    • HVX optimizations of function BgrToGray.
    • HVX optimizations of function BgrToRgb.
    • HVX optimizations of function FillBgra.
    • HVX optimizations of function FillPixel.
    • HVX optimizations of function AbsSecondDerivativeHistogram.
    • HVX optimizations of function HistogramMasked.
    • HVX optimizations of function HistogramConditional.
    • HVX optimizations of function OperationBinary8u.
    • HVX optimizations of function GetStatistic.
    • HVX optimizations of function GetRowSums.
    • HVX optimizations of function GetColSums.
    • HVX optimizations of function GetAbsDyRowSums.
    • HVX optimizations of function GetAbsDxColSums.
    • HVX optimizations of function ValueSum.
    • HVX optimizations of function SquareSum.
    • HVX optimizations of function ValueSquareSum.
    • HVX optimizations of function ValueSquareSums.
    • HVX optimizations of function CorrelationSum.
    • SSE4.1, AVX2, AVX-512BW optimizations of function BgrToHsl.
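
The 16b functions listed above (for example SynetSoftmax16b) operate on 16-bit data, which in the context of the AMX-BF16 entries is taken here to mean BF16. The following is a rough scalar sketch of a numerically stable softmax over one row of BF16 values. The helpers Float32ToBFloat16, BFloat16ToFloat32 and Softmax16bRow are hypothetical names for this example and do not reproduce the library's signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Converts float to BF16 (the upper 16 bits of an IEEE-754 binary32 value)
// with round-to-nearest-even. NaN handling is omitted for brevity.
inline uint16_t Float32ToBFloat16(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<uint16_t>(bits >> 16);
}

// Converts BF16 back to float by restoring the dropped low 16 bits as zeros.
inline float BFloat16ToFloat32(uint16_t value)
{
    const uint32_t bits = static_cast<uint32_t>(value) << 16;
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}

// Softmax over one row of BF16 values; all arithmetic is done in float.
inline void Softmax16bRow(const uint16_t* src, size_t size, uint16_t* dst)
{
    assert(size > 0);
    // Subtracting the row maximum keeps exp() from overflowing.
    float maxValue = BFloat16ToFloat32(src[0]);
    for (size_t i = 1; i < size; ++i)
        maxValue = std::max(maxValue, BFloat16ToFloat32(src[i]));

    std::vector<float> exps(size);
    float sum = 0.0f;
    for (size_t i = 0; i < size; ++i)
    {
        exps[i] = std::exp(BFloat16ToFloat32(src[i]) - maxValue);
        sum += exps[i];
    }
    for (size_t i = 0; i < size; ++i)
        dst[i] = Float32ToBFloat16(exps[i] / sum);
}
```

Because BF16 keeps only 8 bits of significand precision, the outputs of such a routine sum to 1 only approximately.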
    Improving
    • AMX-BF16 optimizations of class SynetMergedConvolution16bCdc.
    • AMX-BF16 optimizations of class SynetMergedConvolution16bCd.
    • AMX-BF16 optimizations of class SynetMergedConvolution16bDc.
    • AMX-BF16 optimizations of class SynetInnerProduct16bGemmNN.
    Bug fixing
    • Error in SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedPreluLayerForward (possible aligned read of unaligned memory).
    • Error in SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedScaleLayerForward (possible aligned read of unaligned memory).
    • Error in SSE4.1 optimizations of class ResizerFloatBilinear (possible aligned read of unaligned memory).
    • Error in SSE4.1 optimizations of class ResizerBf16Bilinear (possible aligned read of unaligned memory).
    • Error in SSE4.1, AVX2 optimizations of class ResizerByteBilinear (possible aligned read of unaligned memory).
    • Error in SSE4.1 optimizations of class ResizerByteBilinear (possible aligned write to unaligned memory).
    • Error in SSE4.1 optimizations of class ResizerFloatBilinear (possible aligned write to unaligned memory).
    • Memory leak in function Simd::Detection::LoadStringXml.
    • Possible crash in function Simd::ImageLoadFromFile.
    • Possible crash in function Simd::Base64Decode.
    • Possible crash in AVX-512BW optimization of function TransformImageRotate270.
    • Performance bug in AVX-512BW optimization of class SynetMergedConvolution32fCdc.
    • Performance bug in AMX-BF16 optimization of class SynetMergedConvolution16bCdc.
    • Compiler error in assert conditions in function Simd::DeinterleaveBgra.
    • Compiler error in assert conditions in function Simd::DeinterleaveRgb.
    • Compiler error in assert conditions in function Simd::DeinterleaveRgba.
    • Compiler error in assert conditions in function Simd::GetObjectMoments.
    • MSVS compiler bug in Base implementation of function BgrToHsl (Release, x64).
    • MSVS compiler bug in Base implementation of function SynetQuantizedScaleLayerForward (Release, Win32).
    Renaming
    • Function SynetSoftmaxLayerForward to SynetSoftmax32f.

    Test framework

    New features
    • Tests for verifying functionality of class SynetGatherElements.
    • Tests for verifying functionality of function SynetNormalizeLayerForward16bV2.
    • Tests for verifying functionality of function SynetSoftmax16b.
    Improving
    • Add thread-safe state to functions Test::Rand and Test::Srand.
    • Add smoothing to function Test::CreateTestImage.
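
One common way to make a random generator's state thread-safe is to give each thread its own generator. The sketch below shows that idea only; the namespace Sketch and its functions State, Srand and Rand are hypothetical and are not the implementation of Test::Rand and Test::Srand.

```cpp
#include <cstdint>
#include <random>

namespace Sketch
{
    // One generator per thread: no locking and no data races between threads.
    inline std::mt19937& State()
    {
        thread_local std::mt19937 generator(1);
        return generator;
    }

    // Reseeds the generator of the calling thread only.
    inline void Srand(uint32_t seed)
    {
        State().seed(seed);
    }

    // Returns a value in [0, 32767], drawn from the calling thread's generator.
    inline int Rand()
    {
        return static_cast<int>(State()() & 0x7FFF);
    }
}
```

With this layout, a test that seeds its own thread gets a reproducible sequence regardless of what other test threads do.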
    Bug fixing
    • Wrong parsing of 'testStatistics' command line option.
    • Wrong parsing of 'testRepeats' command line option.
    • Overly long GitHub Actions test step in msbuild.yml.
    • Overly long GitHub Actions test step in cmake.yml.

    Documentation

    Bug fixing
    • Syntax and lexical errors in project documentation.

    March 3, 2026 (version 6.2.159)

    Algorithms

    New features
    • Base implementation, AMX-BF16 optimizations of class SynetConvolution16bNhwcSpecV2.
    • Support of SimdSynetUnaryOperation32fRound in function SynetUnaryOperation32f.
    • Support of SimdSynetUnaryOperation32fSign in function SynetUnaryOperation32f.
    Bug fixing
    • Error in AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1 (kernel Convolution16bNhwcGemm_Macro32x32).
    • Error in Base implementation of class SynetQuantizedConvolutionNhwcDepthwiseV2 (multithreaded use of SimdSynetQuantizedConvolutionForward).
    • Error in Base implementation of class SynetQuantizedConvolutionNhwcDepthwiseV3 (multithreaded use of SimdSynetQuantizedConvolutionForward).
    • Error in AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1 (Inv2x2, kernel Convolution16bNhwcGemm_MacroNx32, unaligned dstH*dstW).
    • Error in AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1 (Inv2x2, kernel Convolution16bNhwcGemm_MacroNx32, unaligned dstC).
    • Error in SSE4.1, AVX2, AVX-512BW, NEON optimizations of function AbsDifference (wrong alignment check).
    • Error in AVX-512BW optimizations of class SynetConvolution32fGemmNN (case of extra large padding).
    • Error in AVX-512BW optimizations of class ResizerNearest.
    • Error in AMX-BF16 optimizations of class SynetMergedConvolution16bCdc (batch > 1, small input size).
    • Error in functions BodyH, BodyW (file SimdSynetConvParam.h).
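
Alignment-related fixes such as the AbsDifference entry above come down to when the aligned code path may be selected. A typical guard, sketched here with the hypothetical helpers IsAlignedPtr, IsAlignedSize and AllAligned, takes the aligned path only when every pointer and every row stride is a multiple of the vector width; checking the pointers alone is not enough when a stride is unaligned. This illustrates the general idea and is not the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if the address is a multiple of 'align' (which must be a power of two).
inline bool IsAlignedPtr(const void* ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (reinterpret_cast<uintptr_t>(ptr) & (align - 1)) == 0;
}

// True if the size (for example a row stride in bytes) is a multiple of 'align'.
inline bool IsAlignedSize(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size & (align - 1)) == 0;
}

// The aligned path is safe only if both inputs and the output qualify,
// including their strides: an unaligned stride misaligns every second row.
inline bool AllAligned(const uint8_t* a, size_t aStride, const uint8_t* b, size_t bStride,
    uint8_t* c, size_t cStride, size_t align)
{
    return IsAlignedPtr(a, align) && IsAlignedSize(aStride, align)
        && IsAlignedPtr(b, align) && IsAlignedSize(bStride, align)
        && IsAlignedPtr(c, align) && IsAlignedSize(cStride, align);
}
```

When the check fails, the caller falls back to unaligned loads and stores, which are correct for any address.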

    February 3, 2026 (version 6.2.158)

    Algorithms

    New features
    • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MidpointFilterSquare3x3.
    • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MidpointFilterSquare5x5.
    • Base implementation of class SynetConvolution16bNhwcSpecV2.
    • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MinFilterSquare3x3.
    • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MinFilterSquare5x5.
    • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MaxFilterSquare3x3.
    • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MaxFilterSquare5x5.
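
For orientation: a midpoint filter replaces each pixel with the average of the minimum and maximum over its neighbourhood, while min and max filters keep just one of the two. The scalar sketch below applies this to an 8-bit gray image with a 3x3 window. The names ClampIndex and MidpointFilter3x3Gray, the replicated-border handling and the round-half-up averaging are assumptions made for the example and may differ from what the library does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamps a possibly out-of-range coordinate to [0, size - 1].
inline size_t ClampIndex(ptrdiff_t index, size_t size)
{
    if (index < 0)
        return 0;
    if (static_cast<size_t>(index) >= size)
        return size - 1;
    return static_cast<size_t>(index);
}

// Midpoint filter with a 3x3 window on a tightly packed 8-bit gray image.
inline std::vector<uint8_t> MidpointFilter3x3Gray(const std::vector<uint8_t>& src, size_t width, size_t height)
{
    assert(width > 0 && height > 0 && src.size() == width * height);
    std::vector<uint8_t> dst(src.size());
    for (size_t y = 0; y < height; ++y)
    {
        for (size_t x = 0; x < width; ++x)
        {
            uint8_t lo = 255, hi = 0;
            for (int dy = -1; dy <= 1; ++dy)
            {
                // Out-of-image rows and columns are replaced by the nearest valid one.
                const uint8_t* row = src.data() + ClampIndex(static_cast<ptrdiff_t>(y) + dy, height) * width;
                for (int dx = -1; dx <= 1; ++dx)
                {
                    const uint8_t value = row[ClampIndex(static_cast<ptrdiff_t>(x) + dx, width)];
                    lo = std::min(lo, value);
                    hi = std::max(hi, value);
                }
            }
            // Average of window minimum and maximum, rounded half up.
            dst[y * width + x] = static_cast<uint8_t>((lo + hi + 1) / 2);
        }
    }
    return dst;
}
```

Returning lo or hi instead of their average turns the same loop into a min or max filter, and widening the offsets to [-2, 2] gives the 5x5 variants.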
    Improving
    • AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.

    Test framework

    New features
    • Tests for verifying functionality of function MidpointFilterSquare3x3.
    • Tests for verifying functionality of function MidpointFilterSquare5x5.
    • Tests for verifying functionality of function MinFilterSquare3x3.
    • Tests for verifying functionality of function MinFilterSquare5x5.
    • Tests for verifying functionality of function MaxFilterSquare3x3.
    • Tests for verifying functionality of function MaxFilterSquare5x5.

    January 2, 2026 (version 6.2.157)

    Algorithms

    New features
    • Function Simd::Resize for Simd::Frame.
    • Base implementation of function DrawLine.
    • Base implementation of function DrawRectangle.
    • Base implementation of function FontInit.
    • Base implementation of function FontResize.
    • Base implementation of function FontHeight.
    • Base implementation of function FontMeasure.
    • Base implementation of function FontDraw.
    Improving
    • Base implementation, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.
    • AVX-512BW optimizations of function SynetPoolingMax32f (case of SynetPoolingMax32f2DNhwcSolid2x2).
    • AVX-512BW optimizations of function SynetMergedConvolution32f (InputConvolution1x1).
    • AVX-512BW optimizations of function SynetMergedConvolution32f (DepthwiseConvolution_k3p1d1s1w6).
    • Simd::DrawLine uses SimdDrawLine instead of its own implementation.
    • Simd::DrawRectangle uses SimdDrawRectangle instead of its own implementation.
    • Simd::Font uses functions SimdFontInit, SimdFontResize, SimdFontHeight, SimdFontMeasure, SimdFontDraw instead of its own implementation.

    Python wrapper

    New features
    • Function Simd.ResizeFrame.
    • Function Simd.ResizedFrame.
    • Yuv444p member to Simd.FrameFormat enumeration.
    • Method Simd.ImageFrame.Save.
    • Method Simd.ImageFrame.Load.
    • Function Simd.Lib.StretchGray2x2.
    • Function Simd.StretchGray2x2.
    • Function Simd.Lib.BgraToYuv444p.
    • Function Simd.Lib.Yuv444pToRgb.
    • Function Simd.Lib.ReduceGray2x2.
    • Function Simd.ReduceGray2x2.
    • Function Simd.Lib.BgrToYuv444p.
    • Function Simd.Lib.Yuv444pToBgr.
    • Function Simd.Lib.Yuv444pToRgba.
    • Function Simd.Lib.DrawLine.
    • Method Simd.Image.DrawLine.
    • Function Simd.Lib.DrawRectangle.
    • Method Simd.Image.DrawRectangle.
    • Function Simd.Lib.FontInit.
    • Function Simd.Lib.FontResize.
    • Function Simd.Lib.FontHeight.
    • Function Simd.Lib.FontMeasure.
    • Function Simd.Lib.FontDraw.
    • Class Simd.TextFont.
    • Method Simd.Image.DrawFilledRectangle.
    Improving
    • Support of Simd.FrameFormat.Yuv444p in method Simd.ImageFrame.Recreate.
    • Support of Simd.FrameFormat.Yuv444p in method Simd.ImageFrame.Convert.
    Bug fixing
    • Error in method Simd.Frame.Convert.
    Renaming
    • Function Simd.Resize to Simd.ResizeImage.
    • Function Simd.Resized to Simd.ResizedImage.

    Test framework

    New features
    • Tests for verifying functionality of function DrawLine.
    • Tests for verifying functionality of function DrawRectangle.
    Bug fixing
    • Error in method Test::PerformanceMeasurerStorage::Clear.