    Citation: BAO Wenxia, XIE Wenjie, HU Gensheng, YANG Xianjun, SU Biaobiao. Wheat ear counting method in UAV images based on TPH-YOLO[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(1): 155-161. DOI: 10.11975/j.issn.1002-6819.202210020

    Wheat ear counting method in UAV images based on TPH-YOLO

      Abstract: Optical sensors mounted on unmanned aerial vehicles (UAVs) have become a cost-effective way to capture crop images, supporting yield prediction and field management in modern agriculture. Wheat ear counting in UAV images nevertheless remains challenging because the ears are densely distributed, heavily overlapped, and set against complex background information. In this study, a wheat ear detection model, TPH-YOLO (YOLO with transformer prediction heads), was designed to improve the accuracy of wheat ear counting in UAV images. First, the Retinex algorithm was used to enhance the wheat ear images collected by the UAV, reducing the influence of uneven illumination on image quality. Second, a coordinate attention (CA) mechanism was added to the backbone network of YOLOv5, so that the model refines its features, focuses on wheat ear information, and suppresses interference from background factors such as wheat stalks and leaves. Third, the original prediction heads of YOLOv5 were replaced with transformer prediction heads (TPH), whose multi-head attention allows wheat ears to be located accurately in high-density scenes. Finally, a transfer learning strategy was adopted to improve the generalization ability and detection accuracy of the model: the network was pre-trained on a wheat image dataset collected in the field and then fine-tuned on the wheat image dataset collected by the UAV, on which the experiments were conducted. Performance was evaluated with three indicators: precision, recall, and average precision (AP). The precision, recall, and AP of the proposed model were 87.2%, 84.1%, and 88.8%, respectively; its AP was 4.1 percentage points higher than that of the original YOLOv5, and its performance exceeded that of the SSD, Faster R-CNN, CenterNet, and YOLOv5 detection models. In addition, comparative experiments were carried out on the public Global Wheat Head Detection (GWHD) dataset, whose wheat samples are diverse and typical. Compared with SSD, Faster R-CNN, CenterNet, and YOLOv5, the AP of the proposed model increased by 11.1, 5.4, 6.9, and 3.3 percentage points, respectively, further verifying the reliability and effectiveness of the proposed method. These results can provide support for wheat yield prediction.
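The abstract states that a Retinex algorithm is applied to reduce the effect of uneven illumination on the UAV images. As an illustration only, the following is a minimal single-scale Retinex (SSR) sketch using OpenCV and NumPy; the function name, the Gaussian scale sigma=80, and the per-channel contrast stretch are assumptions, since the abstract does not specify which Retinex variant or parameters the authors used.

```python
import cv2
import numpy as np

def single_scale_retinex(image_bgr: np.ndarray, sigma: float = 80.0) -> np.ndarray:
    """Single-scale Retinex: log(image) minus log of a Gaussian-blurred illumination estimate."""
    img = image_bgr.astype(np.float32) + 1.0                 # offset to avoid log(0)
    illumination = cv2.GaussianBlur(img, (0, 0), sigma)      # smooth estimate of illumination
    reflectance = np.log(img) - np.log(illumination)
    # Stretch the reflectance back to 0-255 per channel before feeding it to the detector.
    out = np.zeros_like(reflectance)
    for c in range(reflectance.shape[2]):
        ch = reflectance[:, :, c]
        out[:, :, c] = (ch - ch.min()) / (ch.max() - ch.min() + 1e-6) * 255.0
    return out.astype(np.uint8)

# Usage sketch: enhanced = single_scale_retinex(cv2.imread("uav_wheat.jpg"))
```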
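The abstract also describes adding a coordinate attention (CA) mechanism to the YOLOv5 backbone. Below is a minimal PyTorch sketch of a standard coordinate attention block (after Hou et al., 2021) to make the idea concrete; the reduction ratio and activation choice are assumptions, and the abstract does not say at which backbone stages the block is inserted.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention: global pooling is factorised into horizontal and
    vertical strips so the attention weights retain positional information."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                        # pool along width  -> (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # pool along height -> (n, c, w, 1)
        y = torch.cat([x_h, x_w], dim=2)                         # (n, c, h + w, 1)
        y = self.act(self.bn1(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                    # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * a_h * a_w                                     # reweight features per position
```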
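Finally, the abstract reports precision, recall, and AP for ear counting. The sketch below shows one common way such numbers can be derived from detector output: the ear count is simply the number of retained detections, and precision/recall follow from greedy IoU matching against ground-truth boxes. The IoU threshold of 0.5, the data layout, and the function names are assumptions for illustration, not the authors' exact evaluation code.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def count_and_score(detections, ground_truth, iou_thr=0.5):
    """detections: list of {"box": [x1,y1,x2,y2], "score": float}; ground_truth: list of boxes.
    Returns (ear count, precision, recall) using greedy highest-score-first matching."""
    detections = sorted(detections, key=lambda d: d["score"], reverse=True)
    matched, tp = set(), 0
    for det in detections:
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            overlap = iou(det["box"], gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(detections) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp + 1e-9)
    recall = tp / (tp + fn + 1e-9)
    return len(detections), precision, recall
```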
