Rolf Apweiler: what I’ve learned

From student helper to EMBL-EBI Director, Rolf Apweiler has shaped the journey of EMBL and bioinformatics for over four decades

EMBL-EBI Director Rolf Apweiler shares his thoughts on the eve of his retirement. Credit: Jeff Dowling/EMBL-EBI

As a biology student in the 1980s, Rolf Apweiler applied for a student helper role at EMBL Heidelberg. Little did he know it was the beginning of an illustrious career in a field that was about to take off – bioinformatics.

We caught up with Rolf in the run-up to his retirement from EMBL to capture some of his memories and highlights, and his predictions for how open data and AI are changing the field. 

“I came to Heidelberg to study biology in 1984, and I had to work to support my young family,” remembered Apweiler. “At the beginning, I worked in factories during the school holidays, but one day I saw an advert for a student helper post at EMBL. The pay was better – 12 Deutsche Marks per hour. The job requirements included solid knowledge of biology, fluent English, and computer skills. Naturally, I applied – despite English being my worst subject and never having touched a computer. I phoned up, and to my surprise, I got the job. I was really proud of myself until I learned I had been the only person brave enough to apply!”  

Rolf with his wife Therese and youngest child Jens, in the summer of 1991 in Heidelberg. Credit: From the personal archives of Rolf Apweiler

Apweiler began curating data for the Swiss-Prot project, which later evolved into UniProt, the world’s leading resource of protein sequence and functional information. UniProt is jointly run by EMBL, the Swiss Institute of Bioinformatics (SIB), and the Protein Information Resource (PIR) in the USA, and is used by millions of scientists worldwide. Apweiler would read scientific papers and summarise important information to annotate the data in Swiss-Prot. Over time, he took on more responsibilities and became the Founder and Principal Investigator of UniProt. 

Alongside his EMBL work, Apweiler continued his studies at the University of Heidelberg, earning his undergraduate and PhD degrees. 

Bioinformatics in the age of dial-up internet

“Back then, the data volumes were much smaller, and to find literature on a protein or gene of interest, you had to spend hours or entire days digging through the library. Nowadays, researchers can do all of this and more online in an instant,” Apweiler said.

EMBL alumni Patricia Kahn and Graham Cameron, whom Apweiler worked alongside, were instrumental in persuading editors of scientific journals that the nucleotide sequences from papers should be sent to EMBL. These sequences were added to EMBL’s Data Library, the institute’s first public data resource, established in 1980. Believe it or not, sequences initially had to be typed into the database manually.

“My EMBL start coincided with the early days of the internet. You could connect to something called BITNET and send an email to someone in the US, but it took two hours to arrive. Similarly, to get the data to our users, we sent magnetic tapes through the post. I reckon we had a couple thousand users back then. Things are completely different now in scale, speed, and complexity. These days, EMBL-EBI’s data resources get over 120 million web requests every day, from 40 million IP addresses annually. The field has completely exploded during my time,” said Apweiler. 

“Rolf doesn’t saunter into a room, he bounces into a room, and this is incredibly energising for everyone.”
Nicky Mulder, H3ABioNet Principal Investigator and Head of Computational Biology at the University of Cape Town

Setting up EMBL’s European Bioinformatics Institute

In 1994, Apweiler was among the seven colleagues who moved from EMBL Heidelberg to Hinxton, UK, to set up EMBL-EBI. The new institute would share a campus with the Wellcome Sanger Institute, which at the time was doing much of the DNA sequencing work for the Human Genome Project. “When we moved to Hinxton, our colleague Peter Stoehr put the computer running the Oracle Database still under a VMS operating system in the trunk of his car and drove it over to the UK. We must have had about 80GB of data back then. Nowadays, EMBL-EBI holds approximately half an exabyte of disk space,” Apweiler said.

EMBL-EBI started off with seven staff members working in portakabins. Pictured is Peter Stoehr, former EMBL-EBI Head of IT and later Head of the Literature Service, installing EMBL-EBI’s first official signage (circa 1994). Credit: EMBL Archive

Every 18 months or so, Apweiler and colleagues saw the data double in size, a pace comparable to Moore’s Law – the observation that the number of transistors on a microchip doubles approximately every two years. “Of course, we couldn’t double the number of staff as quickly, so we had to find ways to improve productivity, storage, and tools. I don’t think this challenge will ever go away, but being able to scale our resources is one of the big success stories of EMBL, in my view,” Apweiler said.

Rolf and his team in 1999. Credit: Claire O’Donovan

Growth, excitement, and innovation 

Much like bioinformatics, Apweiler’s career was on an upward trajectory. His major contributions to the field of proteomics were recognised by the Human Proteome Organization (HUPO) with its Distinguished Achievement Award in Proteomics in 2004, and in 2007, he was elected President of HUPO. A few years later, in 2012, he was elected as a member of EMBO, and in 2015, he became an International Society for Computational Biology (ISCB) fellow.

By this time, EMBL-EBI had grown from seven members of staff to around 600. In 2015, following Professor Dame Janet Thornton’s tenure at the helm of EMBL-EBI, Rolf Apweiler and Ewan Birney became Joint Directors of the institute.

Left to right: Rolf Apweiler, Janet Thornton and Ewan Birney at Thornton’s retirement from her role as Director of EMBL-EBI. Each ‘volume’ of this impressive cake represents one of Thornton’s 15 years in the post. Can you spot the Rolf, Ewan and Janet figurines holding up the stack?  Credit: Robert Slowley

As technologies improved and the volumes and complexity of data increased further, Apweiler and Birney played essential roles in the development of data standards for proteomics and genomics. “Having shared taxonomies and data standards is incredibly important because it makes the data open for everyone,” Apweiler said. “In my view, open data and open science create the ideal conditions for collaboration and innovation. This is an argument we have to keep making loud and clear, so it never gets taken for granted.” 

EMBL and EMBL-EBI leadership, past and present, at EMBL-EBI’s 20th anniversary. Left to right: Rolf Apweiler, Janet Thornton, Edith Heard, Michael Ashburner, Ewan Birney, Graham Cameron. Credit: Phil Mynott

Bringing academia, industry, and funders together

Apweiler also played a crucial role in developing Open Targets, a unique public-private partnership to improve how scientists systematically identify and prioritise drug targets. Open Targets has been running for over a decade and has been a tremendous success. Nine out of 10 drug discovery programmes fail, usually after several years of work and millions of pounds of investment. But studies show that when a drug target is supported by genetic evidence, the likelihood that the drug reaches the market doubles. Open Targets aims to make this evidence available in the public domain and ultimately drive up the success rate for clinical trials.

“Rolf has always wanted to influence and change the environment where he is, and to drive change.”

As part of his mission to secure long-term, sustainable funding for public biodata resources, Apweiler was among the initiators of the Global Biodata Coalition, which brings together funders to help them coordinate and collaborate on the management and growth of biodata infrastructure worldwide. This approach helps ensure that biodata resources remain freely available to researchers everywhere.

During the COVID-19 pandemic, Apweiler was a leading voice for open data sharing, global coordination, and collaboration. He spearheaded EMBL-EBI’s efforts to develop the European COVID-19 Data Platform, with support from the European Union. Apweiler advised European governments, including the then German Chancellor Angela Merkel, and advocated for data-driven responses to keep people safe and slow the spread of the virus.

Below are a few of Apweiler’s reflections on his career so far, and predictions for the future, in his own words. 

Life at EMBL

In the beginning, I was too shy to take educational opportunities at EMBL. It took a while to understand that this is what EMBL does: giving learning opportunities to young people.

We never fell into the trap of thinking that the tools we developed were the best. The nature of EMBL has always been very collaborative, so we always tried to use the best thing available. I think this openness is part of EMBL’s success story.

In the 80s and 90s, EMBL Heidelberg was one of the founding places for bioinformatics. A lot of the PhDs from that time became world-renowned bioinformaticians.

Visits to the Red Lion pub in Hinxton have been a part of EMBL-EBI life since the institute’s inception. Pictured: Apweiler and colleagues enjoying an after-work tipple. Credit: Personal archives of Rolf Apweiler

The importance of open data

There are always pressures to patent sequences and put up paywalls, but they can really damage scientific progress. Look at the Human Genome Project. If the data hadn’t been made open, genomics would never have exploded like it did.

Having open dialogue between academia, industry, government, and charities is crucial. We’ve seen with Open Targets that as long as people buy into the concept of making the data open, what we can achieve by working together dwarfs individual efforts.

I disagree with the idea that industry should be treated differently from academic users. We want to embrace the private sector as a user community, to make sure data have the highest societal impact. Ultimately, it’s private companies that bring new products to the market, so we need them.

Science is for everyone and belongs to everyone. It is important to emphasise and propagate this crucial fact. 

Rolf Apweiler and UniProt Curator Hema Bye-A-Jee trying out a public engagement activity explaining data curation and bioinformatics to non-scientists. Credit: Phil Mynott

How AI will change science and society

AI will transform the way we work. The importance of well-annotated experimental data in science is higher than ever before. If you want to have good AI predictions, you need good training data. The best example is the Nobel-Prize-winning AlphaFold AI, which wouldn’t have been possible without public data resources, including PDB, UniProt, MGnify, and so on.

EMBL-EBI also uses AI tools to speed up data annotation. Of course, AI-powered annotations are still predictions and can contain errors, but this is where the role of curators becomes even more important, because they have the expertise and experience to evaluate AI annotations. This increases productivity without diminishing the importance of our specialist curators.

I think AI will be revolutionary in the health sector, but the adoption will be slower than in research – and rightly so – because of the highly regulated environment and sensitive nature of the data.

EMBL-EBI and public data resources are foundational for the AI revolution in the life sciences. Without high-quality, well-annotated data, there is no AI.

Find out more about Rolf’s career and the history of EMBL-EBI in the video interview below, conducted by Angus Lamond and edited by Ruairi McEvoy.


