Comparative analysis of machine learning classification algorithms for predicting olive anthracnose disease

Klimentia Kottaridi, Anna Milionis, Vasilis Demopoulos, Vasileios Nikolaidis, Polina C. Tsalgatidou, Athanasios Tsafouros, Anastasios Kotsiras, Alexandros Vithoulkas

Abstract


Olive anthracnose (OA) is the most damaging fungal disease of the olive tree worldwide. In the context of integrated pest management, predictive models could support early diagnosis and control. In the current study, a dataset of 58 cases from 5 locations and 12 olive cultivars was used to study the relationship between OA incidence (OAI) and 35 heterogeneous variables. These variables include orchard characteristics, olive fruit parameters, foliar and soil nutrients, soil parameters and soil texture classes. The Random Forest-Recursive Feature Elimination with Cross Validation (RF-RFECV) feature selection method identified Location, water content, P, Ca, Mg, exchangeable Mg, trace Zn and trace Cu as possible new indicators associated with OAI. The objective of this study was to investigate whether these variables have predictive value for OAI. Six machine learning classification algorithms, namely decision tree (DT), gradient boosting (GB), logistic regression (LR), random forest (RF), k-nearest neighbors (KNN) and support vector machine (SVM), were developed to predict conditions leading to OAI > 0% and to OAI > 10%. Grid search was used to optimize the model hyperparameters. The final models were evaluated on several standard metrics: accuracy, sensitivity, specificity and ROC AUC score. For predicting the occurrence of OA disease (OAI > 0%), GB outperformed the other models, with an accuracy of 86.7%, a sensitivity of 100%, a specificity of 75% and a ROC AUC score of 93%. For predicting the spread of the disease (OAI > 10%), DT stood out, with an accuracy of 86.7%, a sensitivity of 81.8%, a specificity of 100% and a ROC AUC score of 91%.
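As a reading aid, the reported accuracy, sensitivity and specificity can be related to confusion-matrix counts. The sketch below is illustrative only: the abstract does not state the size of the evaluation set or the confusion matrices, so the counts used here (15 evaluated cases per task) are hypothetical values chosen because they reproduce the reported percentages. The class name ConfusionMatrix and the variable names are likewise assumptions, not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier; 'positive' means OAI above the threshold."""
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    tn: int  # negative cases predicted negative
    fp: int  # negative cases predicted positive

    @property
    def accuracy(self) -> float:
        # Share of all cases that were classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    @property
    def sensitivity(self) -> float:
        # Share of truly positive cases that were detected (true positive rate).
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of truly negative cases that were recognised (true negative rate).
        return self.tn / (self.tn + self.fp)


# ASSUMPTION: hypothetical counts, not reported in the abstract. They are one
# possible set of 15 cases per task that matches the published percentages.
gb_occurrence = ConfusionMatrix(tp=7, fn=0, tn=6, fp=2)  # GB, OAI > 0%
dt_spread = ConfusionMatrix(tp=9, fn=2, tn=4, fp=0)      # DT, OAI > 10%

for label, cm in [("GB, OAI > 0%", gb_occurrence), ("DT, OAI > 10%", dt_spread)]:
    print(
        f"{label}: accuracy={cm.accuracy:.1%}, "
        f"sensitivity={cm.sensitivity:.1%}, specificity={cm.specificity:.1%}"
    )
```

Under these assumed counts, both models misclassify 2 of 15 cases, but in opposite directions: the GB errors would be false alarms, while the DT errors would be missed cases. The ROC AUC score cannot be recovered from a single confusion matrix, because it depends on the ranking of the predicted probabilities across all thresholds.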


Keywords


olive anthracnose; machine learning; forecast models; classification algorithms; soil nutrients

DOI: https://doi.org/10.32629/jai.v7i5.1466

Copyright (c) 2024 Klimentia Kottaridi, Anna Milionis, Vasilis Demopoulos, Vasileios Nikolaidis, Polina C. Tsalgatidou, Athanasios Tsafouros, Anastasios Kotsiras, Alexandros Vithoulkas

License URL: https://creativecommons.org/licenses/by-nc/4.0/