Several approaches will be used to detect the type of waste deposited in the recycler. The main one will be a neural network running on a server. This network will need to be trained on a custom dataset acquired directly through the recycler's waste detection compartment.

There are two main requirements in this situation:

Given the requirements above, the neural network configuration used for detection will be Alexey’s YOLOv4:

GitHub - AlexeyAB/darknet: YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

YOLO-type networks are focused on detection speed. YOLOv4 is the latest iteration, with interesting results as shown in its whitepaper:

https://user-images.githubusercontent.com/4096485/82835867-f1c62380-9ecd-11ea-9134-1598ed2abc4b.png

The chart shows that with a Tesla V100 GPU, we get almost 70 fps (frames per second) with an AP50 (average precision at an IoU threshold of 0.5) of almost 66%. Other GPU models are also shown, compared on the basis of their TFLOPS.

Even with a GPU that has fewer TFLOPS, the requirements are met with some margin, which confirms the feasibility of using this type of neural network configuration.
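As a rough illustration of this feasibility argument, the sketch below scales a reference frame rate by the ratio of GPU TFLOPS. Linear scaling is a crude assumption (real throughput also depends on memory bandwidth, batch size and input resolution), and the TFLOPS figures and the required frame rate used here are placeholder values, not measurements or project requirements.

```python
def estimate_fps(reference_fps: float, reference_tflops: float,
                 target_tflops: float) -> float:
    """Estimate fps on a target GPU, assuming throughput scales linearly with TFLOPS."""
    if reference_tflops <= 0 or target_tflops <= 0:
        raise ValueError("TFLOPS values must be positive")
    return reference_fps * target_tflops / reference_tflops


def meets_requirement(estimated_fps: float, required_fps: float) -> bool:
    """Check the estimate against a required frame rate."""
    return estimated_fps >= required_fps


if __name__ == "__main__":
    # Reference point read from the chart: roughly 70 fps on a Tesla V100.
    # The 14.0 and 7.0 TFLOPS figures and the 10 fps requirement are placeholders.
    fps = estimate_fps(reference_fps=70.0, reference_tflops=14.0, target_tflops=7.0)
    print(f"Estimated: {fps:.1f} fps")
    print("Meets hypothetical requirement:", meets_requirement(fps, 10.0))
```

An estimate like this only indicates the order of magnitude; the actual frame rate should be measured on the chosen server hardware.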

Network training

The network must be trained because our application targets specific types of objects and materials.

Based on Alexey’s recommendation, we would need 1000 classifications per object for a high level of accuracy. However, since our detection will occur in a controlled environment, we will use a pre-annotated dataset for each object, with 50 classifications per detected object. Given the nature of our application, we expect this to be enough for high-precision detections.
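To check whether the dataset reaches the per-object target, the annotations can be counted per class. The sketch below assumes the labels are stored in Darknet's YOLO text format (one .txt file per image, one line per box: class_id x_center y_center width height) and that the directory contains only label files. The function names and the default target of 50 are illustrative.

```python
from collections import Counter
from pathlib import Path


def count_annotations(label_dir: str) -> Counter:
    """Count bounding boxes per class ID across all .txt label files."""
    counts = Counter()
    for label_file in Path(label_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            fields = line.split()
            # A valid label line has: class_id x_center y_center width height
            if len(fields) == 5:
                counts[int(fields[0])] += 1
    return counts


def classes_below_target(counts: Counter, num_classes: int, target: int = 50) -> list:
    """Return the class IDs that have fewer than `target` annotations."""
    return [c for c in range(num_classes) if counts[c] < target]
```

Running classes_below_target after each acquisition session shows which objects still need more captures.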

The annotated dataset will be created with the CVAT tool.

GitHub - openvinotoolkit/cvat: Powerful and efficient Computer Vision Annotation Tool (CVAT)
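Once the annotations are exported, Darknet training expects text files listing the image paths used for training and validation. The sketch below shows one possible way to produce them. The .jpg extension, the 80/20 split, the fixed seed and the function name are all assumptions made for illustration.

```python
import random
from pathlib import Path


def split_dataset(image_dir, train_list, valid_list, valid_fraction=0.2, seed=0):
    """Write image paths to train/validation list files and return their sizes."""
    images = sorted(str(p) for p in Path(image_dir).glob("*.jpg"))
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(images)
    n_valid = int(len(images) * valid_fraction)
    valid, train = images[:n_valid], images[n_valid:]
    Path(train_list).write_text("\n".join(train) + "\n")
    Path(valid_list).write_text("\n".join(valid) + "\n")
    return len(train), len(valid)
```

The resulting files would then be referenced from the train and valid entries of the Darknet .data file.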

Example

Code snippet for image detection (from the AlexeyAB/darknet GitHub repository):

import cv2
import darknet  # Python wrapper (darknet.py) from the AlexeyAB/darknet repository


def image_detection(image_path, network, class_names, class_colors, thresh):
    # Darknet doesn't accept numpy images.
    # Create a Darknet image sized to the network input to hold the frame.
    width = darknet.network_width(network)
    height = darknet.network_height(network)
    darknet_image = darknet.make_image(width, height, 3)

    # OpenCV loads images as BGR; convert to RGB and resize to the network input size.
    image = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_resized = cv2.resize(image_rgb, (width, height),
                               interpolation=cv2.INTER_LINEAR)

    # Copy the pixels into the Darknet image, run detection, then free the buffer.
    darknet.copy_image_from_bytes(darknet_image, image_resized.tobytes())
    detections = darknet.detect_image(network, class_names, darknet_image, thresh=thresh)
    darknet.free_image(darknet_image)
    # Draw the boxes, then swap the channels back to BGR for OpenCV display.
    image = darknet.draw_boxes(detections, image_resized, class_colors)
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB), detections
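Since the recycler needs a single waste type per item, the list of detections has to be reduced to one result. The sketch below assumes each detection is a tuple of (label, confidence, (center_x, center_y, width, height)), with the confidence given either as a number or as a numeric string. This layout should be verified against the darknet.py version in use. The helper names and the example labels are hypothetical.

```python
def top_detection(detections):
    """Return the (label, confidence, bbox) tuple with the highest confidence, or None."""
    if not detections:
        return None
    # Confidence may arrive as a string percentage, so compare as float.
    return max(detections, key=lambda d: float(d[1]))


def bbox_to_corners(bbox):
    """Convert (center_x, center_y, width, height) to (left, top, right, bottom)."""
    cx, cy, w, h = bbox
    return (round(cx - w / 2), round(cy - h / 2),
            round(cx + w / 2), round(cy + h / 2))


if __name__ == "__main__":
    # Hypothetical detections for illustration only.
    sample = [("glass", "45.2", (10.0, 10.0, 4.0, 4.0)),
              ("plastic", "91.5", (100.0, 80.0, 40.0, 20.0))]
    best = top_detection(sample)
    print(best[0], bbox_to_corners(best[2]))
```

An empty result (None) can be treated as "no waste detected" and handled by one of the other detection approaches.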

A video showing detection, for reference:

https://www.youtube.com/watch?v=SK5VxQdWDcg