Machine learning and experimental design for optimizing nitrogen-rich extract from cassava leaves via liquid hot water extraction
Abstract
Cassava leaves are a significant source of nitrogen; however, the severity of the physicochemical extraction processes negatively affects nitrogen release. The objective of this study was to enhance nitrogen-rich extract recovery from cassava leaves through a comparative analysis of various experimental designs and machine learning (ML) techniques. Using the Plackett–Burman design, central composite design, and response surface methodology, the optimal extraction conditions were established: 20 min extraction time, 40% solid loading, and 150 mL extraction volume. The predicted amino nitrogen content reached 209 mg of N, showing a 6% deviation from the experimentally measured value. ML models—specifically, the support vector machine with a radial basis function kernel and random forest (RF)—were subsequently employed to refine the extraction conditions. The RF model showed a 6.6% deviation from the actual value, while both models identified the positive impact of increased solid loading on the total nitrogen recovery. These findings suggest that ML approaches offer promising potential for maximizing the amino nitrogen yield from cassava leaves.
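The design-and-validation workflow summarized in the abstract can be illustrated with a minimal sketch. It assumes a rotatable central composite design in coded units for three factors (extraction time, solid loading, extraction volume) and assumes that "deviation" means the absolute difference relative to the measured value. The function names, the number of center points, and the numbers passed to `percent_deviation` are hypothetical and are not taken from the study.

```python
from itertools import product


def central_composite_design(k, n_center=1):
    """Return the coded runs of a rotatable central composite design.

    k        -- number of factors (3 here: time, solid loading, volume)
    n_center -- number of replicated center points (assumed, not from the study)
    """
    # Axial distance that makes the design rotatable: alpha = (2^k)^(1/4)
    alpha = (2 ** k) ** 0.25

    # Two-level full factorial corner points at coded levels -1 and +1
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]

    # Axial (star) points: one factor at +/- alpha, all others at the center
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)

    # Replicated center points, used to estimate pure error
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def percent_deviation(predicted, measured):
    """Absolute deviation of a prediction, as a percentage of the measured value."""
    return abs(predicted - measured) / measured * 100.0


runs = central_composite_design(3, n_center=6)
print(len(runs))  # 8 factorial + 6 axial + 6 center runs

# Hypothetical values, only to show how a deviation figure is computed
print(round(percent_deviation(106.0, 100.0), 2))
```

Each coded run would then be mapped to real units (minutes, % solid loading, mL) using the factor ranges chosen by the experimenter, and the measured responses fitted with a second-order model or an ML regressor.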
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.