Simd Library Documentation.

The Detection structure is a high-level C++ wrapper for object detection with HAAR and LBP cascade classifiers.

#include <SimdDetection.hpp>

Data Structures

struct  Object
 The Object structure describes one grouped detection result.
 

Public Types

typedef A< uint8_t > Allocator
 
typedef Simd::View< A > View
 
typedef Simd::Point< ptrdiff_t > Size
 
typedef std::vector< Size > Sizes
 
typedef Simd::Rectangle< ptrdiff_t > Rect
 
typedef std::vector< Rect > Rects
 
typedef int Tag
 
typedef std::vector< Object > Objects
 

Public Member Functions

 Detection ()
 
 ~Detection ()
 
bool LoadStringXml (const std::string &xml, Tag tag=UNDEFINED_OBJECT_TAG)
 
bool Load (const std::string &path, Tag tag=UNDEFINED_OBJECT_TAG)
 
bool Init (const Size &imageSize, double scaleFactor=1.1, const Size &sizeMin=Size(0, 0), const Size &sizeMax=Size(INT_MAX, INT_MAX), const View &roi=View(), ptrdiff_t threadNumber=-1)
 
bool Detect (const View &src, Objects &objects, int groupSizeMin=3, double sizeDifferenceMax=0.2, bool motionMask=false, const Rects &motionRegions=Rects())
 

Static Public Attributes

static const Tag UNDEFINED_OBJECT_TAG = -1
 

Detailed Description

template<template< class > class A>
struct Simd::Detection< A >

The Detection structure is a high-level C++ wrapper for object detection with HAAR and LBP cascade classifiers.

The structure loads one or more OpenCV-format cascades, prepares a scale pyramid for a fixed input image size and then detects objects in images of this size. The input image can be gray or color; color images are converted to gray internally. If several cascades are loaded, every cascade is applied independently and detections are grouped separately for every user tag.

Usage example (face detection in an image):

#include "Simd/SimdDetection.hpp"
#include "Simd/SimdDrawing.hpp"

int main()
{
    typedef Simd::Detection<Simd::Allocator> Detection;

    // Load a gray test image.
    Detection::View image;
    image.Load("../../data/image/face/lena.pgm");

    Detection detection;
    detection.Load("../../data/cascade/haar_face_0.xml");
    detection.Init(image.Size());

    Detection::Objects objects;
    detection.Detect(image, objects);

    // Draw a white frame around every detected face.
    for (size_t i = 0; i < objects.size(); ++i)
        Simd::DrawRectangle(image, objects[i].rect, uint8_t(255));

    image.Save("result.pgm");
    return 0;
}

Usage example (face detection in a video captured by OpenCV):

#include <iostream>
#include <string>
#include "opencv2/opencv.hpp"
#include "opencv2/core/utils/logger.hpp"
#ifndef SIMD_OPENCV_ENABLE
#define SIMD_OPENCV_ENABLE
#endif
#include "Simd/SimdDetection.hpp"
#include "Simd/SimdDrawing.hpp"

int main(int argc, char * argv[])
{
    if (argc < 2)
    {
        std::cout << "You have to set video source! It can be 0 for camera or video file name." << std::endl;
        return 1;
    }
    std::string source = argv[1], output = argc > 2 ? argv[2] : "";

    cv::VideoCapture capture;
    cv::utils::logging::setLogLevel(cv::utils::logging::LOG_LEVEL_ERROR);
    if (source == "0")
        capture.open(0);
    else
        capture.open(source);
    if (!capture.isOpened())
    {
        std::cout << "Can't capture '" << source << "' !" << std::endl;
        return 1;
    }
    int W = (int)capture.get(cv::CAP_PROP_FRAME_WIDTH);
    int H = (int)capture.get(cv::CAP_PROP_FRAME_HEIGHT);

    // The optional second argument is the name of the output video file.
    cv::VideoWriter writer;
    if (output.size())
    {
        writer.open(output, cv::VideoWriter::fourcc('F', 'M', 'P', '4'), capture.get(cv::CAP_PROP_FPS), cv::Size(W, H));
        if (!writer.isOpened())
        {
            std::cout << "Can't open output file '" << output << "' !" << std::endl;
            return 1;
        }
    }

    typedef Simd::Detection<Simd::Allocator> Detection;
    Detection detection;
    detection.Load("../../data/cascade/haar_face_0.xml");
    // Search for faces no smaller than 1/20 of the frame, with a scale step of 1.2.
    detection.Init(Detection::Size(W, H), 1.2, Detection::Size(W, H) / 20);
    Detection::Objects objects;

    const char * WINDOW_NAME = "FaceDetection";
    cv::namedWindow(WINDOW_NAME, 1);
    for (;;)
    {
        cv::Mat frame;
        if (!capture.read(frame))
            break;
        Detection::View image = frame; // a view of the frame data, so drawing changes the frame
        detection.Detect(image, objects);
        for (size_t i = 0; i < objects.size(); ++i)
            Simd::DrawRectangle(image, objects[i].rect, Simd::Pixel::Bgr24(0, 255, 255));
        cv::imshow(WINDOW_NAME, frame);
        if (writer.isOpened())
            writer.write(frame);
        if (cv::waitKey(1) == 27) // press 'Esc' to stop the video
            break;
    }
    return 0;
}
Note
This is a wrapper around the low-level Object Detection API.

Member Typedef Documentation

◆ Allocator

typedef A<uint8_t> Allocator

Allocator used by temporary images and buffers.

◆ View

typedef Simd::View<A> View

Image view type used for input images, ROI masks and internal pyramid levels.

◆ Size

typedef Simd::Point<ptrdiff_t> Size

Two-dimensional size or point type.

◆ Sizes

typedef std::vector<Size> Sizes

Vector of sizes.

◆ Rect

typedef Simd::Rectangle<ptrdiff_t> Rect

Rectangle type used for object bounds and search regions.

◆ Rects

typedef std::vector<Rect> Rects

Vector of rectangles.

◆ Tag

typedef int Tag

User tag type used to identify detections from different cascades.

◆ Objects

typedef std::vector<Object> Objects

Vector of detected objects.

Constructor & Destructor Documentation

◆ Detection()

Detection ( )

Creates a new empty Detection structure. Load at least one cascade and call Init before Detect.

◆ ~Detection()

~Detection ( )

Releases all loaded cascades.

Member Function Documentation

◆ LoadStringXml()

bool LoadStringXml ( const std::string &  xml,
Tag  tag = UNDEFINED_OBJECT_TAG 
)

Loads a classifier cascade from XML text. Supports OpenCV HAAR and LBP cascade types. You can call this function more than once if you want to use several object detectors at the same time.

Note
Tree-based cascades and old cascade formats are not supported.
Parameters
[in] xml - a string containing the cascade XML.
[in] tag - a user-defined tag. This tag is copied into every output Object structure produced by this cascade.
Returns
true if the cascade was loaded successfully, false otherwise.

◆ Load()

bool Load ( const std::string &  path,
Tag  tag = UNDEFINED_OBJECT_TAG 
)

Loads a classifier cascade from an XML file. Supports OpenCV HAAR and LBP cascade types. You can call this function more than once if you want to use several object detectors at the same time.

Note
Tree-based cascades and old cascade formats are not supported.
Parameters
[in] path - a path to the XML cascade file.
[in] tag - a user-defined tag. This tag is copied into every output Object structure produced by this cascade.
Returns
true if the cascade was loaded successfully, false otherwise.
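
Because every loaded cascade carries its own tag, the mixed output of a multi-cascade detection can be separated again by tag. The following is a minimal sketch of that step under stated assumptions: TaggedObject is a hypothetical stand-in for the library's Object structure, and FACE_TAG, EYE_TAG and SplitByTag are illustrative names that are not part of Simd.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for Detection::Object; only the tag matters here.
struct TaggedObject
{
    int left, top, right, bottom;
    int tag;
};

// Hypothetical user tags, as they might be passed to Load() for two cascades.
const int FACE_TAG = 1;
const int EYE_TAG = 2;

// Splits a mixed list of detected objects into one list per tag.
std::map<int, std::vector<TaggedObject>> SplitByTag(const std::vector<TaggedObject> & objects)
{
    std::map<int, std::vector<TaggedObject>> byTag;
    for (const TaggedObject & object : objects)
    {
        assert(object.left <= object.right && object.top <= object.bottom);
        byTag[object.tag].push_back(object);
    }
    return byTag;
}
```

With the real structure the same loop would presumably read the tag field of each element of Objects; check the Object documentation before relying on that.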

◆ Init()

bool Init ( const Size &  imageSize,
double  scaleFactor = 1.1,
const Size &  sizeMin = Size(0, 0),
const Size &  sizeMax = Size(INT_MAX, INT_MAX),
const View &  roi = View(),
ptrdiff_t  threadNumber = -1 
)

Prepares the Detection structure to work with images of the given size.

The function builds internal pyramid levels for all loaded cascades and object sizes that fit into [sizeMin, sizeMax]. It must be called after loading cascades and before Detect. After successful initialization every input image passed to Detect must have the same size as imageSize.
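
To make the interplay of scaleFactor, sizeMin and sizeMax concrete, here is a minimal sketch of how the scales of such a pyramid could be enumerated. It is not the library's implementation: Size2, EnumerateScales and the notion of a cascade training window passed as a parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Detection::Size.
struct Size2
{
    std::ptrdiff_t x, y;
};

// Returns the scales at which a cascade with the given training window
// would be applied. At scale s, objects of size window * s are searched.
std::vector<double> EnumerateScales(Size2 image, Size2 window, double scaleFactor, Size2 sizeMin, Size2 sizeMax)
{
    assert(scaleFactor > 1.0 && window.x > 0 && window.y > 0);
    std::vector<double> scales;
    for (double scale = 1.0; ; scale *= scaleFactor)
    {
        double w = window.x * scale, h = window.y * scale;
        if (w > image.x || h > image.y || w > sizeMax.x || h > sizeMax.y)
            break; // the object no longer fits the image or the upper limit
        if (w >= sizeMin.x && h >= sizeMin.y)
            scales.push_back(scale); // objects smaller than sizeMin are skipped
    }
    return scales;
}
```

In this sketch a scaleFactor closer to 1 yields more entries, and a larger sizeMin drops the smallest scales, which are the ones that cover the most window positions.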

Parameters
[in] imageSize - the size of the input image.
[in] scaleFactor - a multiplier between neighboring pyramid levels. Values close to 1 improve scale precision but create more levels and reduce performance.
[in] sizeMin - the minimal size of detected objects, in input image coordinates. It strongly affects performance.
[in] sizeMax - the maximal size of detected objects, in input image coordinates.
[in] roi - an optional 8-bit mask that defines a Region Of Interest. Non-zero mask pixels allow detection; zero pixels reject it. The mask is applied to the centers of detected objects.
[in] threadNumber - the number of worker threads. Use -1 to choose this number automatically.
Returns
true if initialization succeeded, false otherwise.
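
The rule that the ROI mask is applied to object centers means that a detection overlapping the allowed area can still be rejected when its center falls on a zero pixel. A minimal sketch of such a center test follows; Mask, Box and CenterAllowed are hypothetical names, and the exclusive right and bottom edges are an assumption of this sketch rather than a statement about the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit mask stored row by row (stand-in for an 8-bit gray View).
struct Mask
{
    std::size_t width, height;
    std::vector<std::uint8_t> data; // width * height bytes
};

// Hypothetical rectangle; right and bottom edges are treated as exclusive.
struct Box
{
    std::ptrdiff_t left, top, right, bottom;
};

// Returns true if the center of the box lies on a non-zero mask pixel.
bool CenterAllowed(const Mask & mask, const Box & box)
{
    assert(mask.data.size() == mask.width * mask.height);
    std::ptrdiff_t cx = (box.left + box.right) / 2;
    std::ptrdiff_t cy = (box.top + box.bottom) / 2;
    if (cx < 0 || cy < 0 || cx >= (std::ptrdiff_t)mask.width || cy >= (std::ptrdiff_t)mask.height)
        return false; // centers outside the mask are rejected
    return mask.data[cy * mask.width + cx] != 0;
}
```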

◆ Detect()

bool Detect ( const View &  src,
Objects &  objects,
int  groupSizeMin = 3,
double  sizeDifferenceMax = 0.2,
bool  motionMask = false,
const Rects &  motionRegions = Rects() 
)

Detects objects in the given image.

The function runs all initialized pyramid levels, collects elementary detections and then groups similar rectangles separately for every cascade tag. The output vector is cleared before grouped objects are added.
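
The grouping step can be pictured with the following simplified sketch. It is not the library's algorithm: the similarity test is written in the style of OpenCV's rectangle grouping, clusters are built as connected components, and Hit, Group, Similar and GroupHits are hypothetical names. It only illustrates how groupSizeMin, sizeDifferenceMax and the tag could interact.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Hit // hypothetical elementary detection
{
    int left, top, right, bottom, tag;
};

struct Group // hypothetical grouped result; weight counts the merged hits
{
    int left, top, right, bottom, tag, weight;
};

// Two hits are similar if they share a tag and every edge differs by at most
// sizeDifferenceMax times the mean of the smaller width and smaller height.
bool Similar(const Hit & a, const Hit & b, double sizeDifferenceMax)
{
    double delta = sizeDifferenceMax * 0.5 *
        (std::min(a.right - a.left, b.right - b.left) + std::min(a.bottom - a.top, b.bottom - b.top));
    return a.tag == b.tag &&
        std::abs(a.left - b.left) <= delta && std::abs(a.top - b.top) <= delta &&
        std::abs(a.right - b.right) <= delta && std::abs(a.bottom - b.bottom) <= delta;
}

std::vector<Group> GroupHits(const std::vector<Hit> & hits, int groupSizeMin, double sizeDifferenceMax)
{
    assert(sizeDifferenceMax >= 0.0);
    std::vector<Group> groups;
    if (groupSizeMin <= 0)
        return groups; // follows the documented rule: zero returns no objects
    std::vector<int> label(hits.size(), -1);
    std::vector<Group> sums;
    for (std::size_t i = 0; i < hits.size(); ++i)
    {
        if (label[i] >= 0)
            continue; // already assigned to a cluster
        int id = (int)sums.size();
        sums.push_back(Group{ 0, 0, 0, 0, hits[i].tag, 0 });
        std::vector<std::size_t> stack(1, i);
        label[i] = id;
        while (!stack.empty()) // flood-fill over the similarity relation
        {
            std::size_t current = stack.back();
            stack.pop_back();
            Group & sum = sums[id];
            sum.left += hits[current].left;
            sum.top += hits[current].top;
            sum.right += hits[current].right;
            sum.bottom += hits[current].bottom;
            ++sum.weight;
            for (std::size_t j = 0; j < hits.size(); ++j)
            {
                if (label[j] < 0 && Similar(hits[current], hits[j], sizeDifferenceMax))
                {
                    label[j] = id;
                    stack.push_back(j);
                }
            }
        }
    }
    for (const Group & sum : sums)
    {
        if (sum.weight < groupSizeMin)
            continue; // too few elementary detections support this cluster
        groups.push_back(Group{ sum.left / sum.weight, sum.top / sum.weight,
            sum.right / sum.weight, sum.bottom / sum.weight, sum.tag, sum.weight });
    }
    return groups;
}
```

Raising groupSizeMin in this sketch suppresses isolated hits, which are typical false positives, at the cost of missing objects that were found at only one or two positions.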

Parameters
[in] src - an input image. Its size must be equal to the imageSize passed to Init.
[out] objects - the detected and grouped objects.
[in] groupSizeMin - the minimal number of elementary detections required to keep a grouped object. If this value is zero, no objects are returned.
[in] sizeDifferenceMax - the maximal relative difference between rectangles for which elementary detections are grouped together.
[in] motionMask - a flag that restricts detection to motionRegions in addition to the ROI mask.
[in] motionRegions - rectangles in input image coordinates that dynamically restrict detection. The regions are applied to the centers of detected objects.
Returns
true if detection was performed, false otherwise.
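
The motion restriction described for motionMask and motionRegions can be sketched in the same spirit as a test on the object center. Region and CenterInMotion are hypothetical names; the exclusive right and bottom edges, and the choice to reject everything when the flag is set but the list is empty, are assumptions of this sketch only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical rectangle; right and bottom edges are treated as exclusive.
struct Region
{
    int left, top, right, bottom;
};

// Returns true if the object passes the motion restriction.
bool CenterInMotion(const Region & object, const std::vector<Region> & motionRegions, bool motionMask)
{
    assert(object.left <= object.right && object.top <= object.bottom);
    if (!motionMask)
        return true; // the restriction is disabled
    int cx = (object.left + object.right) / 2;
    int cy = (object.top + object.bottom) / 2;
    for (const Region & region : motionRegions)
    {
        if (cx >= region.left && cx < region.right && cy >= region.top && cy < region.bottom)
            return true; // the center lies inside a motion region
    }
    return false;
}
```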

Field Documentation

◆ UNDEFINED_OBJECT_TAG

static const Tag UNDEFINED_OBJECT_TAG = -1

Default tag assigned to objects when no user tag is specified.