How EMBL alumnus Angus Lamond is driving forward new ways to explore the proteome
“Big data wasn’t a term I’d heard of ten years ago,” says Angus Lamond, an EMBL alumnus who is taking a new approach to proteomics research at the University of Dundee in Scotland. “In that short time, technology accelerated so quickly that the development of user-friendly tools for biologists to manage and analyse their data lagged behind.”
Lamond, a biochemist who was a group leader in the Gene Expression Programme at EMBL’s Heidelberg site from 1987 to 1995, has emerged as a champion of user experience (UX)-driven design. Accordingly, he has structured his laboratory to keep experimentalists and commercially trained computer scientists working hand-in-glove. This has been particularly valuable in the development of the group’s suite of data analysis, management and sharing tools, called PepTracker.
From genes to proteins
“We now have access to incredibly powerful, sophisticated technology to explore the proteome, which is the major driver of cell phenotype and provides profound insights into mechanisms of regulation and disease,” explains Lamond. “Proteomic analysis lets us see how changes in gene expression cascade through to changes in protein expression – for example when different types of T cells differentiate, or when healthy cells transform into cancer cells. We want to help researchers interact easily with such large-scale, complex datasets, so they can focus on designing and interpreting their experiments, without needing specialised training.”
PepTracker is particularly effective in the way it links with other online resources, such as UniProt. The PepTracker Encyclopedia of Protein Dynamics tool provides an open-access graphical interface for exploring the very large, complex datasets produced in proteomics and transcriptomics experiments. It is accessible on both desktop and mobile devices.
“When we started about nine years ago, our aim was to optimise data management, visualisation and analysis,” says Lamond. “Our original database was built by Yasmeen Ahmad, a computer science undergraduate who learned biology through sheer immersion, working closely with experimental scientists in our group.
“The PepTracker project has continued this interdisciplinary approach and has been successful, largely because we’ve brought on board people with expertise in data science in the commercial world. That lets us harness the remarkable advances in business intelligence technology and translate it to advance data analysis in biology. Both rely on rich, precise, standardised metadata to carry out analytical processes that are fundamentally similar.”
Lamond’s PepTracker team is building new graph-database solutions, integrating them with interactive visualisation tools to make it easier for biologists to explore and interpret the vast, complex datasets arising from large-scale ‘omics’ experiments.
“We’re trying to push the boundaries of scale and complexity, enabling work we couldn’t have done even six months ago,” says Lamond. “Computationally, we design everything with metadata in mind, and pay massive attention to both interface design and user interaction with the data. Ultimately, we’d like biologists to be able to do big-data analysis on whatever device works best for them – tablet, desktop or mobile – and to know they will be able to get their hands on the information they need immediately.”
Angus Lamond is a former member of the Scientific Advisory Board for the PRIDE proteomics database at EMBL-EBI. He also spoke at the EMBL-Wellcome Genome Campus conference Proteomics in Cell Biology and Disease Mechanisms, which took place at EMBL’s Advanced Training Centre on 7 September.