Ecology, Environment and Conservation Paper


Vol 31, Issue 2, 2025

Page Number: 630-636

INTEGRATING GENOMIC ANALYSIS AND ACTIVE LEARNING FOR OPTIMIZED SURFACTIN PRODUCTION IN CALIDIFONTIBACILLUS ERZURUMENSIS

Dholiya K. and Pandya N.

Abstract

This study presents an alternative approach to enhancing surfactin production in Calidifontibacillus erzurumensis by integrating genomic insights with data-driven optimization. The organism's genome was analysed using Gapseq to reconstruct its metabolic network, identify essential nutrient sources, and design an optimized growth medium. Comparative testing with the standard Landy medium, supplemented with additional components, demonstrated significant improvements in surfactin yield. A Plackett-Burman design then identified nitrogen source and glucose as the key factors influencing production. To fine-tune these two factors, we applied an active learning strategy using a Latin Hypercube Design (LHD). By modeling the surfactin production data with heteroskedastic Gaussian Process Regression (GPR) and optimizing the q-Noisy Expected Improvement (q-NEI) acquisition function, we identified seven refined medium combinations for testing. This iterative process increased surfactin production from 1.10 g/L to 2.67 g/L. These findings demonstrate how integrating genomic tools with data-driven approaches can effectively optimize secondary metabolite production.
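
As an illustration of the Latin Hypercube Design step only, the following minimal sketch draws a space-filling set of candidate media over the two factors named above, glucose and nitrogen source. It is not the authors' implementation: the function name latin_hypercube, the concentration ranges in BOUNDS, and the choice of 10 points are assumptions made for this example, and the GPR and q-NEI steps are not shown.

```python
import random

def latin_hypercube(n_points, bounds, seed=0):
    """Return n_points samples in which each factor's range is split into
    n_points equal strata and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of the n_points strata of [0, 1).
        unit = [(i + rng.random()) / n_points for i in range(n_points)]
        # Shuffle so that strata are paired at random across factors.
        rng.shuffle(unit)
        # Rescale from [0, 1) to the factor's own range.
        columns.append([low + u * (high - low) for u in unit])
    # Transpose the per-factor columns into one tuple per candidate medium.
    return list(zip(*columns))

# Hypothetical ranges in g/L (assumed for illustration, not from the paper).
BOUNDS = [(10.0, 40.0), (1.0, 10.0)]  # (glucose, nitrogen source)

design = latin_hypercube(10, BOUNDS)
for glucose, nitrogen in design:
    print(f"glucose={glucose:.1f} g/L, nitrogen source={nitrogen:.1f} g/L")
```

Each row is one candidate medium. Because every stratum of every factor is sampled exactly once, a small number of runs still covers the full range of both factors, which is what makes such a design a reasonable starting point for fitting a surrogate model.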