Darknet

Darknet is a state-of-the-art object detector that uses the YOLO (You Only Look Once) framework. It is built on a single-stage algorithm to achieve both speed and accuracy.

YOLOv7 is currently the most accurate and fastest model and has hardware acceleration support on both GPUs and CPUs.

If CUDA is available on your system, darknet will run on your GPU.

note

The darknet component uses the official Darknet implementation when running on a GPU. When running on a CPU, it uses OpenCV's implementation of Darknet.

info

YOLOv7 is the default model used by darknet in all images.

Configuration

Configuration example

darknet:
  object_detector:
    cameras:
      viseron_camera1:
        fps: 1
        scan_on_motion_only: true
        log_all_objects: false
        labels:
          - label: dog
            confidence: 0.7
            trigger_recorder: false
          - label: cat
            confidence: 0.8
        zones:
          - name: zone1
            coordinates:
              - x: 0
                y: 500
              - x: 1920
                y: 500
              - x: 1920
                y: 1080
              - x: 0
                y: 1080
            labels:
              - label: person
                confidence: 0.8
                trigger_recorder: true
        mask:
          - coordinates:
              - x: 400
                y: 200
              - x: 1000
                y: 200
              - x: 1000
                y: 750
              - x: 400
                y: 750

darknet map required

Darknet configuration.

object_detector map required

Object detector domain config.

cameras map required

Camera-specific configuration. All subordinate keys correspond to the camera_identifier of a configured camera.

<CAMERA_IDENTIFIER> map required

Camera identifier. Valid characters are lowercase a-z, numbers and underscores.

fps float (optional, default: 1)

The FPS at which the object detector runs.
Higher values will result in more scanning, which uses more resources.

Lowest value: 0

scan_on_motion_only boolean (optional, default: true)

When set to true and a motion_detector is configured, the object detector will only scan while motion is detected.

labels list (optional)

A list of labels (objects) to track.

label string required

The label to track.

confidence float (optional, default: 0.8)

Lowest confidence allowed for detected objects. The lower the value, the more sensitive the detector will be, and the risk of false positives will increase.

Lowest value: 0

Highest value: 1

height_min float (optional, default: 0)

Minimum height allowed for detected objects, relative to stream height.

Lowest value: 0

Highest value: 1

height_max float (optional, default: 1)

Maximum height allowed for detected objects, relative to stream height.

Lowest value: 0

Highest value: 1

width_min float (optional, default: 0)

Minimum width allowed for detected objects, relative to stream width.

Lowest value: 0

Highest value: 1

width_max float (optional, default: 1)

Maximum width allowed for detected objects, relative to stream width.

Lowest value: 0

Highest value: 1

trigger_recorder boolean (optional, default: true)

If set to true, objects matching this filter will start the recorder.

store boolean (optional, default: true)

If set to true, objects matching this filter will be stored in the database, as well as having a snapshot saved. Labels with trigger_recorder set to true will always be stored when a recording starts, regardless of this setting.

store_interval integer (optional, default: 60)

The interval at which the label should be stored in the database, in seconds. If set to 0, the label will be stored every time it is detected.

require_motion boolean (optional, default: false)

If set to true, the recorder will stop as soon as motion is no longer detected, even if the object still is. This is useful to avoid never-ending recordings of stationary objects, such as a car on a driveway.

max_frame_age float (optional, default: 2)

Drop frames that are older than the given number. Specified in seconds.

Lowest value: 0

log_all_objects boolean (optional, default: false)

When set to true and loglevel is DEBUG, all found objects will be logged, including the ones not tracked by labels.

mask list (optional)

A mask is used to exclude certain areas in the image from object detection.

coordinates list required

List of X and Y coordinates to form a polygon.

Minimum items: 3

x integer required

X-coordinate (horizontal axis).

y integer required

Y-coordinate (vertical axis).

zones list (optional)

Zones are used to define areas in the camera's field of view where you want to look for certain objects (labels).

name string required

Name of the zone. Has to be unique per camera.

coordinates list required

List of X and Y coordinates to form a polygon.

Minimum items: 3

x integer required

X-coordinate (horizontal axis).

y integer required

Y-coordinate (vertical axis).

labels list (optional)

A list of labels (objects) to track.

label string required

The label to track.

confidence float (optional, default: 0.8)

Lowest confidence allowed for detected objects. The lower the value, the more sensitive the detector will be, and the risk of false positives will increase.

Lowest value: 0

Highest value: 1

height_min float (optional, default: 0)

Minimum height allowed for detected objects, relative to stream height.

Lowest value: 0

Highest value: 1

height_max float (optional, default: 1)

Maximum height allowed for detected objects, relative to stream height.

Lowest value: 0

Highest value: 1

width_min float (optional, default: 0)

Minimum width allowed for detected objects, relative to stream width.

Lowest value: 0

Highest value: 1

width_max float (optional, default: 1)

Maximum width allowed for detected objects, relative to stream width.

Lowest value: 0

Highest value: 1

trigger_recorder boolean (optional, default: true)

If set to true, objects matching this filter will start the recorder.

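store boolean (optional, default: true)

If set to true, objects matching this filter will be stored in the database, as well as having a snapshot saved. Labels with trigger_recorder set to true will always be stored when a recording starts, regardless of this setting.

store_interval integer (optional, default: 60)

The interval at which the label should be stored in the database, in seconds. If set to 0, the label will be stored every time it is detected.

require_motion boolean (optional, default: false)

If set to true, the recorder will stop as soon as motion is no longer detected, even if the object still is. This is useful to avoid never-ending recordings of stationary objects, such as a car on a driveway.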

model_path string (optional, default: /detectors/models/darknet/default.weights)

Path to model (YOLO *.weights file).

model_config string (optional, default: /detectors/models/darknet/default.cfg)

Path to config (YOLO *.cfg file).

label_path string (optional, default: /detectors/models/darknet/coco.names)

Path to file containing trained labels.

suppression float (optional, default: 0.4)

Non-maxima suppression, used to remove overlapping detections.
You can read more about how this works here.

Lowest value: 0

Highest value: 1

dnn_backend select (optional)

OpenCV DNN Backend.

Valid values:

null
default
opencv
openvino

dnn_target select (optional)

OpenCV DNN Target.

Valid values:

null
cpu
opencl
opencl_fp16

half_precision boolean (optional, default: false)

Enable/disable half precision accuracy.
If your GPU supports FP16, enabling this might give you a performance increase.
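The model-level options sit directly under object_detector rather than under a camera. The sketch below tightens overlap removal and enables FP16. The values are illustrative assumptions, not recommendations, and the comment on suppression assumes the usual convention in which a lower threshold removes more overlapping boxes.

```yaml
darknet:
  object_detector:
    # Default is 0.4; valid range is 0 to 1.
    # Assumption: a lower value discards more overlapping detections.
    suppression: 0.3
    # Only likely to help if your GPU supports FP16
    half_precision: true
```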

Object detector

An object detector scans an image to identify multiple objects and their position.

tip

Object detectors can be taxing on the system, so it is wise to combine them with a motion detector.

Labels

Labels are used to tell Viseron what objects to look for and keep recordings of. The available labels depend on what detection model you are using.
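As a rough sketch of how a label's recording options can be combined, the example below lets a car start a recording without keeping it open once motion stops, while dogs are only stored and never trigger the recorder. The camera name camera_one and all values are placeholders chosen for this example.

```yaml
darknet:
  object_detector:
    cameras:
      camera_one:  # placeholder camera identifier
        labels:
          - label: car
            confidence: 0.8
            trigger_recorder: true
            # Stop recording once motion ends, even if the car is still detected
            require_motion: true
          - label: dog
            confidence: 0.7
            # Never start a recording for dogs, but keep storing detections
            trigger_recorder: false
            store: true
            # Store this label at 300-second intervals instead of the default 60
            store_interval: 300
```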

The max/min width/height is used to filter out any unreasonably large/small objects to reduce false positives.
Objects can also be filtered out with the use of an optional mask.
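For instance, the size filters could be used to ignore detections that are only a small fraction of the frame or that fill most of it. This is a minimal sketch: camera_one and the thresholds are illustrative assumptions, and each value is a fraction of the stream height or width, as described in the configuration reference.

```yaml
darknet:
  object_detector:
    cameras:
      camera_one:  # placeholder camera identifier
        labels:
          - label: person
            confidence: 0.8
            # Ignore persons shorter than 10% of the stream height
            height_min: 0.1
            # Ignore persons taller than 90% of the stream height
            height_max: 0.9
            # Ignore persons wider than half of the stream width
            width_max: 0.5
```

On a stream that is 1080 pixels high, height_min: 0.1 corresponds to 108 pixels.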

tip

To see the default available labels you can inspect the label_path file.

docker exec -it viseron cat /detectors/models/darknet/coco.names

Zones

Zones are used to define areas in the camera's field of view where you want to look for certain objects (labels).
Say you have a camera facing the sidewalk and have labels set up to record the label person.
This would cause Viseron to start recording people who are walking past the camera on the sidewalk. Not ideal.
To remedy this you define a zone which covers only the area that you are actually interested in, excluding the sidewalk.

darknet:
  object_detector:
    cameras:
      camera_one:
        ...
        zones:
          - name: sidewalk
            coordinates:
              - x: 522
                y: 11
              - x: 729
                y: 275
              - x: 333
                y: 603
              - x: 171
                y: 97
            labels:
              - label: person
                confidence: 0.8
                trigger_recorder: true

tip

See Mask for how to get the coordinates for a zone.

Mask

Masks are used to exclude certain areas in the image from object detection. If a detected object has its lower portion inside of the mask it will be discarded.

The coordinates form a polygon around the masked area.
To easily generate coordinates you can use a tool like image-map.net.
Just upload an image from your camera, choose the Poly shape and start drawing your mask.
Then click Show me the code! and adapt it to the config format.
Coordinates coords="522,11,729,275,333,603,171,97" should be turned into this:

darknet:
  object_detector:
    cameras:
      camera_one:
        ...
        mask:
          - coordinates:
              - x: 522
                y: 11
              - x: 729
                y: 275
              - x: 333
                y: 603
              - x: 171
                y: 97


Pre-trained models

The included models are placed inside the /detectors/models/darknet folder.

Included models:

yolov3-tiny.weights
yolov3.weights
yolov4-tiny.weights
yolov4.weights
yolov7-tiny.weights
yolov7.weights
yolov7x.weights

tip

This GitHub issue explains the models quite well.

To make an educated guess of what model to use, you can reference this image.
It will help you find the perfect trade-off between accuracy and latency.

danger

The image roflcoopter/rpi3-viseron only includes the yolov7-tiny.weights model.

tip

The containers also include the *-tiny.weights models. The tiny models can be used to reduce CPU and RAM usage. If you want to switch to a tiny model, you can change these configuration options:

YOLOv7

darknet:
  object_detector:
    model_path: /detectors/models/darknet/yolov7-tiny.weights
    model_config: /detectors/models/darknet/yolov7-tiny.cfg

YOLOv4

darknet:
  object_detector:
    model_path: /detectors/models/darknet/yolov4-tiny.weights
    model_config: /detectors/models/darknet/yolov4-tiny.cfg

YOLOv3

darknet:
  object_detector:
    model_path: /detectors/models/darknet/yolov3-tiny.weights
    model_config: /detectors/models/darknet/yolov3-tiny.cfg

note

The tiny-models have significantly worse accuracy than their larger counterparts.

Hardware acceleration

Hardware-accelerated object detection is supported on NVIDIA GPUs and Intel CPUs with integrated GPUs. If you don't have a GPU available, darknet will run on the CPU.

NVIDIA GPUs

If your system supports CUDA it is recommended to use the roflcoopter/amd64-cuda-viseron image. It will automatically use CUDA for object detection.

info

When running on CUDA, native Darknet is used.

tip

If you want to force darknet to run on OpenCL even if you have an NVIDIA GPU you can set these config options:

darknet:
  object_detector:
    dnn_backend: opencv
    dnn_target: opencl

Intel CPUs with integrated GPUs

If you are running on an Intel CPU with integrated GPU, you can use the roflcoopter/amd64-viseron image. It will automatically use OpenCV with OpenCL for object detection.

The dnn_backend and dnn_target options control how the model runs.
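As an illustration, the following sketch selects the OpenCV backend with the CPU target, using only values from the dnn_backend and dnn_target lists in the configuration reference. It is assumed here that this combination keeps inference on the CPU. Whether that is desirable depends on your hardware, so treat it as an example and not as a recommended default.

```yaml
darknet:
  object_detector:
    # Both values come from the "Valid values" lists in the reference
    dnn_backend: opencv
    dnn_target: cpu
```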

danger

Since upgrading to Ubuntu 22.04, OpenCV 4.9.0 and OpenVINO 2023.3, the openvino backend is broken and causes segmentation faults. Hopefully this will be resolved in future updates of the libraries.

info

When not running on CUDA, OpenCV's implementation of Darknet is used.

Troubleshooting

To enable debug logging for darknet, add the following to your config.yaml:

/config/config.yaml
logger:
  logs:
    viseron.components.darknet: debug
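
While debugging, it can also help to log every detected object, including labels you are not tracking. The sketch below shows the two settings together, since log_all_objects only has an effect while the log level is DEBUG. The camera name camera_one is a placeholder.

```yaml
logger:
  logs:
    viseron.components.darknet: debug

darknet:
  object_detector:
    cameras:
      camera_one:  # placeholder camera identifier
        # Logs all found objects, not only those listed under labels
        log_all_objects: true
```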
