INTRODUCTION

Unmanned Aerial Vehicles (UAVs) are increasingly used in crop cultivation, supporting both crop monitoring and precision treatments. Advances in deep learning and machine vision enable the integration of these technologies, enhancing drone autonomy. However, challenges such as complex backgrounds, varying object scales, diverse viewing angles, and diversified crop management systems hinder efficient detection. Although most treatments in orchard cultivation are performed when trees have developed leaves, growers also devote considerable attention to protecting trees during the leafless phase, which yields significant long-term benefits. One-stage algorithms of the YOLO (You Only Look Once) series, based on convolutional neural network architectures, have become a leading trend in object detection and image segmentation in recent years. The objective of this study was to evaluate the effectiveness of the YOLO model in detecting pear trees in the leafless phase using data collected by drones.

MATERIAL AND METHODS

The data were collected by flying along the tree rows of a pear orchard of the Lukasówka variety grafted on the Pigwa S1 rootstock. To increase the diversity of the training data, images were captured under varying lighting conditions using two DJI drone models at flight altitudes of 5 m and 6 m relative to the takeoff point. To enable efficient, large-scale batch processing of the images, a Python-based program, PiranhaPix [1], was developed in the PyCharm environment using the OpenCV (cv2) library. After the images were resized to a resolution of 640 × 480 while preserving the original 4:3 aspect ratio, the LabelImg annotation tool was used to label them in two distinct ways: either as individual pear trees or as entire visible rows of trees, and the two annotation sets were then used to train two separate models. The YOLOv11s [2] models were trained in a Google Colab environment [3] on an NVIDIA A100 40 GB Tensor Core GPU, using data augmentation and an initial learning rate of 0.001. Training was performed for 300 epochs with a batch size of 16. The dataset, consisting of 1400 images, was divided into training (80%), validation (10%), and test (10%) sets.

PiranhaPix interface
Labeling individual trees in LabelImg
Labeling tree rows in LabelImg
Experimental pear orchard
View from the DJI Mavic 3 Multispectral drone at a flight altitude of 5 m
View from the DJI Mavic 3 Multispectral drone at a flight altitude of 6 m
View from the DJI Mini 3 Pro drone at a flight altitude of 5 m
View from the DJI Mini 3 Pro drone at a flight altitude of 6 m
DJI Mavic 3 Multispectral drone
DJI Mini 3 Pro drone

The performance of the trained models was evaluated using Precision, Recall, mAP@50, and mAP@50:95.

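These standard detection metrics can be sketched as follows. The functions below are illustrative helpers, not the actual evaluation code of the YOLO framework: a prediction counts as a true positive when its IoU (intersection over union) with a ground-truth box exceeds a threshold of 0.50 for mAP@50, while mAP@50:95 averages the result over thresholds from 0.50 to 0.95 in steps of 0.05.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted boxes that match a ground-truth tree."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth trees that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def iou(box_a, box_b) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0


# Example: 90 correct detections, 10 false positives, 30 missed trees
print(precision(90, 10))  # 0.9
print(recall(90, 30))     # 0.75
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, below the 0.50 TP threshold
```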
RESULTS

Performance metrics on the test set

Performance metrics on the validation set during model training

Images labeled by the trained models

CONCLUSION

The obtained results confirm the high efficacy of the YOLOv11s model, built upon a convolutional neural network architecture, in detecting pear trees in drone-acquired imagery. In both labeling approaches (individual trees and entire rows), the trained models achieved high Precision, Recall, and mAP@50 values. The most pronounced variability emerged in mAP@50:95, indicating the need for further optimization under more stringent IoU thresholds. Looking ahead, extending the analysis to other fruit tree species and incorporating complementary deep learning techniques will be crucial for enhancing the system's flexibility and robustness under diverse data conditions.

REFERENCES

1. https://github.com/kamilczynski/PiranhaPix

2. https://docs.ultralytics.com/models/yolo11/#supported-tasks-and-modes

3. https://github.com/kamilczynski/Detection-of-pear-trees-using-convolutional-neural-network-based-on-data-collected-by-drones