Data-driven protein engineering: learning the sequence-function mapping from experimental data
ChEMS Seminar
Dr. Philip Romero
Postdoctoral Fellow
University of California, San Francisco
Proteins are amazingly diverse molecules that are capable of performing a wide variety of chemical and biological tasks. Such versatility presents tremendous opportunities for solving challenging human problems that range from medicine and agriculture to environmental protection and industrial chemistry. Despite this great potential, our ability to design proteins with tailor-made functions has been impeded by our limited understanding of these complex molecules.
Rational protein engineering relies on accurate models that relate a protein's sequence to its function. However, many molecular properties are extremely difficult to model because they may be poorly understood or involve subtle, possibly dynamic, structural changes. In this talk, I will present an alternative modeling approach in which statistical models are used to learn the relationship between protein sequence and function from experimental data. These data-driven methods are able to implicitly capture the numerous and possibly unknown factors that shape the sequence-function mapping. Using these models, I will describe an adaptive protein design algorithm that can efficiently identify optimized protein sequences. I will finish by describing my current work in high-throughput experimentation and how new technologies are being used to generate protein sequence-function data sets of unprecedented scale.
Bio:
Phil Romero is currently a postdoctoral fellow in Adam Abate's lab at UCSF where he is developing microfluidic technologies for protein engineering. He obtained his B.S.E. and M.S. degrees from Tulane University in Biomedical Engineering and Molecular Biology, respectively. As a graduate student at Caltech, he worked in Frances Arnold's laboratory, where he engineered proteins for a variety of applications including medical imaging, cancer therapeutics, and biofuel production. His thesis research focused on developing new statistical methods that can learn the relationship between protein sequence and function from experimental data.