Amine transaminases (ATAs) are powerful biocatalysts for the stereoselective synthesis of chiral amines. However, wild-type ATAs usually have pH optima at slightly alkaline values and exhibit low catalytic activity under physiological conditions. For efficient asymmetric synthesis, ATAs are commonly combined with lactate dehydrogenase (LDH, optimal pH: 7.5) and glucose dehydrogenase (GDH, optimal pH: 7.75) to shift the equilibrium towards the target chiral amine, so the pH optima of the three enzymes should match. Based on a protein structure alignment, variants of (R)-selective transaminases were rationally designed, produced in E. coli, purified and characterized biochemically. This led to the discovery of variant E49Q of the ATA from Aspergillus fumigatus, whose pH optimum was shifted from pH 8.5 to 7.5. At pH 7.5, this variant also had a twofold higher specific activity than the wild-type protein. A possible mechanism for this shift of the optimal pH is proposed. Asymmetric synthesis of (R)-1-phenylethylamine from acetophenone in combination with LDH and GDH confirmed that variant E49Q outperforms the wild-type enzyme at pH 7.5.
Amine transaminases (ATAs) are powerful biocatalysts for the stereoselective synthesis of chiral amines. Machine learning is a promising approach for protein engineering, but activity prediction models for ATAs remain elusive because high-quality training data are difficult to obtain. We therefore first created variants of the ATA from Ruegeria sp. (3FCR) with improved catalytic activity (up to 2000-fold) as well as reversed stereoselectivity through structure-dependent rational design, collecting a high-quality dataset in the process. Subsequently, we designed a modified one-hot code to describe steric and electronic effects of substrates and residues within ATAs. Finally, we built a gradient boosting regression tree predictor for catalytic activity and stereoselectivity and applied it to the data-driven design of optimized variants, which showed improved activity (up to 3-fold compared to the best variants previously identified). We also demonstrated that the model can predict the catalytic activity of ATA variants of another origin after retraining with a small set of additional data.
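The two modelling steps named in this abstract, a one-hot style encoding extended with physicochemical information and a gradient boosting regression tree predictor, can be illustrated with a minimal sketch. It is not the authors' implementation. The descriptor used here (a single formal-charge value per residue), the two-position variant strings, and the activity values are all hypothetical and chosen only to make the example run. The booster uses depth-1 trees (stumps) and squared-error loss for brevity.

```python
# Toy sketch only: encoding scheme, positions and activity values are
# illustrative assumptions, not data or methods from the article.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Formal side-chain charge near neutral pH; all other residues are treated as 0.
CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0}


def encode_variant(residues):
    """Encode the residues at the mutated positions as one-hot plus charge."""
    features = []
    for aa in residues:
        features.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
        features.append(CHARGE.get(aa, 0.0))  # extra "electronic" feature
    return features


def fit_stump(X, residuals):
    """Find the single feature/threshold split with the lowest squared error."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), 0, 0.0, mean, mean)  # fallback: constant prediction
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = sum((r - lmean) ** 2 for r in left)
            sse += sum((r - rmean) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_gbrt(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


def mse(model, X, y):
    return sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


# Hypothetical variants (residues at two positions) and made-up log-activities.
variants = ["YE", "FE", "YQ", "FQ", "WK", "YK"]
y = [0.0, 1.2, 0.8, 2.1, -0.5, 0.3]
X = [encode_variant(v) for v in variants]

model = fit_gbrt(X, y)
baseline = (sum(y) / len(y), [], 0.1)  # mean-only model for comparison
print("baseline MSE:", mse(baseline, X, y))
print("boosted MSE: ", mse(model, X, y))
print("prediction for unseen variant 'WQ':", predict(model, encode_variant("WQ")))
```

In practice one would likely use an established library implementation of gradient boosting and a richer descriptor set; the point of the sketch is only that appending physicochemical features to the one-hot vector lets the trees split on properties shared across residues, not just on residue identity.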
Protein engineering is essential for altering the substrate scope, catalytic activity and selectivity of enzymes for applications in biocatalysis. However, traditional approaches such as directed evolution and rational design are challenged by the need to experimentally screen a large protein mutation space. Machine learning methods allow protein fitness landscapes to be approximated and catalytic patterns to be identified from limited experimental data, thus providing a new avenue to guide protein engineering campaigns. In this concept article, we review machine learning models developed to assess enzyme-substrate-catalysis performance relationships with the aim of improving enzymes through data-driven protein engineering. Furthermore, we discuss the prospects for future development of this field to provide additional strategies and tools for achieving desired activities and selectivities.