Increased availability of predicted protein complexes

A set of high-quality structures of core eukaryotic protein complexes has recently been generated by the Baker lab at the University of Washington. These complexes were produced using a combination of RoseTTAFold and AlphaFold, and they complement the models in the AlphaFold Protein Structure Database, which are currently limited to monomeric protein structures. This dataset of around 1000 structural models has been deposited in ModelArchive, where users can freely access the full set of complexes.

These complexes are also accessible through the 3D-Beacons Network, which includes the data via ModelArchive. The 3D-Beacons Network brings together experimentally determined and predicted protein structure models and related data from several providers and makes them freely available through a single, dedicated programmatic access point. Providers include PDBe and AlphaFold DB, and the addition of ModelArchive now extends the network further.
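As a rough illustration of what programmatic access could look like, the Python sketch below builds a per-protein summary request and filters the returned models by provider. The base URL, the `/uniprot/summary/{accession}.json` path, the response keys (`structures`, `summary`, `provider`, `model_identifier`) and the provider names are assumptions based on the public 3D-Beacons API as we understand it, not details stated in this article, so please check the current API documentation before relying on them. The accession and model identifiers in the sample are placeholders.

```python
import json
from typing import Any, Dict, List
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the 3D-Beacons hub API (verify against the API docs).
BASE_URL = "https://www.ebi.ac.uk/pdbe/pdbe-kb/3dbeacons/api"


def summary_url(uniprot_accession: str) -> str:
    """Build the (assumed) summary endpoint URL for one UniProt accession."""
    return f"{BASE_URL}/uniprot/summary/{quote(uniprot_accession)}.json"


def fetch_summary(uniprot_accession: str) -> Dict[str, Any]:
    """Download and decode the summary JSON. Requires network access."""
    with urlopen(summary_url(uniprot_accession), timeout=30) as response:
        return json.load(response)


def models_from_provider(summary: Dict[str, Any], provider: str) -> List[Dict[str, Any]]:
    """Return the per-model summaries whose provider matches (case-insensitive)."""
    matches = []
    for entry in summary.get("structures", []):
        info = entry.get("summary", {})
        if info.get("provider", "").lower() == provider.lower():
            matches.append(info)
    return matches


# Hand-written sample in the assumed response shape, so the filter can be
# tried offline. The identifiers are placeholders, not real entries.
SAMPLE = {
    "structures": [
        {"summary": {"model_identifier": "example-model-1", "provider": "PDBe"}},
        {"summary": {"model_identifier": "example-model-2", "provider": "ModelArchive"}},
    ]
}

if __name__ == "__main__":
    for model in models_from_provider(SAMPLE, "ModelArchive"):
        print(model["model_identifier"])
```

To query the live service, `fetch_summary` can be called with a UniProt accession in place of the offline `SAMPLE`, provided the assumed endpoint and field names hold.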

ModelArchive is a repository of theoretical structural models, developed by the Protein Structure Bioinformatics Group at the Swiss Institute of Bioinformatics (SIB). This archive allows deposition of structures that do not conform to the requirements for submission to the PDB archive. These models are stored using ModelCIF, an extension to the existing PDBx/mmCIF dictionary that ensures a consistent data format for representing and archiving computed structure models.

Accessing protein complexes through PDBe-KB

Thanks to the addition of ModelArchive to the 3D-Beacons Network, these high-quality protein complexes are now displayed on the PDBe-KB aggregated views of proteins, which bring together all the available data on experimental macromolecular structures in the PDB. To provide greater context for structural data beyond what is archived in the PDB, these aggregated views also provide links to structures from related data resources through the 3D-Beacons Network, and now include links to these predicted complexes.

The predicted models for complexes can be accessed from the ‘Structures’ tab on the PDBe-KB pages, for example on the page for TFIID subunit 5 from S. cerevisiae. The ProtVista sequence viewer on this page displays all available structures in the PDB, as well as structures from other resources under the ‘other structures’ section, which can be expanded and scrolled through to display the available data. These structures can be displayed directly in the Mol* 3D viewer on the page by clicking on the grey bar; alternatively, links to the data resource pages are provided on the left-hand side. The video below displays the process to view these data on the PDBe-KB aggregated views of proteins.

For more information on the PDBe-KB project, visit pdbekb.org. If you have any comments or queries about PDBe-KB pages, then please contact us through the ‘Feedback’ tab from any page.
