With her PhD in her pocket, Erica Valentini is now ready to move onto the next stage of her career. But not before she has made sure the product of her PhD project – the Small Angle Scattering Biological Data Bank or SASBDB for short – is in good hands. “The project really was my baby!” she says with a wide grin. “Now it’s ready for the real world and is learning to walk!”
As a tool for gleaning information about the 3D atomic structure of proteins, Small Angle Scattering (SAS) is gaining in popularity and importance. As life scientists realise its potential, especially in combination with other structural biology approaches such as crystallography, the amount of data being produced is increasing. The SASBDB is a repository for SAS data and models, and is presently the world’s largest database for user-friendly storage and searching of SAS X-ray (SAXS) and Neutron (SANS) data.
“This is a really useful and much-needed resource for the SAS community,” explains Dmitri Svergun, head of the SAXS group in Hamburg and Valentini’s PhD supervisor. “With an increasing amount of SAS data becoming available, the need for a comprehensive repository has become quite urgent,” he adds. “We are pleased we could address this need and can now present the database to the SAS community.” Prior to the SASBDB, several SAXS models – often submitted alongside crystallographic data – were stored in the worldwide Protein Data Bank (wwPDB). Recognising that they had neither the expertise nor the resources to adequately handle and curate the SAXS data, the wwPDB established a task force to draw up guidelines for a dedicated SAS data repository.
A few years on, and scientists can now use the SASBDB to access and download data related to a SAXS/SANS experiment, rather than just viewing an image of the model or scattering curve in a publication. Currently, the SASBDB is receiving new entries every few weeks, and slowly but surely the database is growing. “We are asking many of our collaborators to deposit their structures in the data bank and we hope that it will become standard practice, just like the PDB is standard for crystallographic data,” explains Valentini. The SASBDB is now ready to receive the SAXS models that were previously stored in the wwPDB. “We have started doing tests and writing scripts to import the data,” Valentini says. “We already have about 190 entries, and we are gearing up to take another 50 or so.”
“Increasingly, techniques other than crystallography, NMR and electron microscopy are being used by structural biologists to study complex biological systems,” says Gerard Kleywegt, who heads the PDBe (Protein Data Bank in Europe) hosted at EMBL-EBI. “Hybrid methods, where multiple techniques are used, are becoming more and more common.” Kleywegt welcomes the launch of the SASBDB and the collaboration with wwPDB: “Having a major standard repository for each of these techniques is absolutely vital for scientists worldwide – as time goes on and methods develop, it is crucial that we have access to comprehensive raw data from all SAS experiments so that these can always be referred to, reinterpreted and reanalysed.” Unless depositors tell the SASBDB team otherwise, entries will be published six months after submission.
Remarkably, the database stemmed from a small side project, and put Valentini on the right path after a stumbling start to her PhD. “Our collaborators were starting a database for storing experimental data from different techniques, including SAXS, and asked for our contribution,” she explains. Svergun asked if she would like to be involved by providing them with some SAXS data. Having studied databases as part of her Master’s degree, Valentini knew that in order to store data you first need to understand the structure behind it. “I drew a schema of what the database should look like – when Dmitri saw that I understood these things, we started to consider whether we could do something ourselves,” she says. “I feel really lucky with my PhD – Dmitri really understood my strengths and pushed me in the right direction.” Now in the process of making plans to leave Hamburg, Valentini has had to step back from the project and hand the reins to colleagues within the SAXS group. “It’s hard to let go of your baby,” she smiles, “but I am really satisfied with how it has all worked out, and I will keep on watching it grow!”
SASBDB facts and figures
– SASBDB is a searchable, curated repository of freely accessible and downloadable experimental data, which are deposited together with the relevant experimental conditions, sample details, derived models and their fits.
– SASBDB currently contains almost 200 experimental data sets and nearly 300 models.
– The database includes 17 entries from well-characterised, highly purified proteins: examples of good data that can be used for teaching, learning and programming purposes.
– Almost 3000 new users from across the globe have accessed the website since its launch in August 2014.
– More than 700 entries have been downloaded since its launch.
– The SASBDB is maintained by the Biological Small Angle Scattering Group, EMBL Hamburg.
– For questions and feedback, contact: firstname.lastname@example.org.
– Follow @svergungroup on Twitter for SASBDB news and updates.