Two EMBL researchers are exploring new ways to filter out noise and get to the data they need
BY EDWARD DADSWELL AND MARGAUX PHARES
Somehow, the multitudes of highly specialised cells in our bodies arise from just a single cell. It’s a process that might seem far removed from the changes that occur in the body during a disease. But for John Marioni and Oliver Stegle, research group leaders at EMBL-EBI, one principle ties these processes together.
“In effect, we’re trying to understand how cells make decisions,” says Marioni. “During development, there’s a process by which a cell decides to become a particular cell type. In a similar way, you can think of disease as a cell deciding to do something it shouldn’t. If we can understand how cells make these decisions, we’ll learn more about normal development and potentially gain deeper insight into what’s going on in disease.”
Silencing the cell cycle
To understand how these decisions are made, Marioni and Stegle use single-cell RNA sequencing, a technique that provides highly detailed information about the genes that are active in a cell. By studying the pattern of gene activity in many cells, they can identify different cell types. They can then start figuring out which genes are responsible for pushing a cell towards a particular fate.
Unfortunately, there’s a problem: noise. “When you say ‘cell types,’ there’s this notion that they’re rather distinct, but in practice it’s more of a continuum,” Stegle explains. “There’s variation between cells of different types, but also between cells of the same type. Sometimes that variation is biologically interesting, but often it’s a source of noise in the data – obscuring the information you really care about.”
Marioni and Stegle recently joined forces to tackle one source of noise in the data: the cell cycle. This is the repeating process in which cells copy their DNA and then divide to produce two daughter cells. As you might expect, a cell’s gene activity is heavily influenced by where it is in the cell cycle. This is undoubtedly a biologically interesting process, but it’s not always the one that researchers like Marioni and Stegle are interested in. That’s when it becomes noise.
To cut through the noise of the cell cycle, Stegle, Marioni, and their collaborators developed a new computational approach: the single-cell latent variable model (scLVM). By analysing a small number of key genes in the cell division process, they could classify each cell according to its stage in the cell cycle. They could then cancel out the effect of the cell cycle using sophisticated statistical methods in the scLVM. This makes it possible to infer ‘corrected’ gene-expression levels for a wider range of genes.
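For readers who want a feel for the idea, here is a minimal sketch in Python. It is not the scLVM itself: the statistics behind the real method are more sophisticated, and this example swaps in two simpler, standard tools as stand-ins. It summarises a handful of marker genes into one hidden "cell-cycle" score per cell (the first principal component), then uses least squares to subtract that score's contribution from every gene. The function name, the simulated data, and the choice of ten marker genes are all assumptions made for illustration.

```python
import numpy as np


def regress_out_latent_factor(expr, marker_idx):
    """Illustrative stand-in for cell-cycle correction (not the scLVM).

    expr       : (cells x genes) array of log-expression values
    marker_idx : column indices of genes assumed to track the cell cycle
    Returns the per-cell latent factor and the corrected expression matrix.
    """
    # Centre each gene so the factor captures variation, not average level.
    centred = expr - expr.mean(axis=0)

    # Latent factor: first principal component of the marker genes,
    # giving one score per cell.
    u, s, _ = np.linalg.svd(centred[:, marker_idx], full_matrices=False)
    factor = u[:, 0] * s[0]

    # Least-squares slope of every gene on the factor.
    beta = factor @ centred / (factor @ factor)

    # Subtract the fitted cell-cycle contribution; gene means are kept.
    corrected = expr - np.outer(factor, beta)
    return factor, corrected


# Simulated example: 200 cells, 50 genes, one hidden cell-cycle state.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 50
cycle = rng.normal(size=n_cells)       # hidden state, unknown in practice
loadings = rng.normal(size=n_genes)    # how strongly each gene follows it
loadings[:10] = 2.0                    # first 10 genes act as clear markers
noise = 0.1 * rng.normal(size=(n_cells, n_genes))
expr = 5.0 + np.outer(cycle, loadings) + noise

factor, corrected = regress_out_latent_factor(expr, np.arange(10))
print("variance before:", expr.var(axis=0).mean())
print("variance after: ", corrected.var(axis=0).mean())
```

In this toy setting almost all of the variation comes from the simulated cell cycle, so the corrected matrix has far less variance than the raw one. In real data, whatever variation remains after correction is where differences between cell types would be sought.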
With this corrected gene-expression data, the team could compare gene activity in different cells more accurately. The result: the detection of cells at different stages of differentiation towards an immune cell type called a T helper cell. This provides insight into basic biology and a potential avenue for disease research.
Developing methods to remove the effect of the cell cycle is just one important step towards obtaining high-quality data from single-cell sequencing.
“One of the limitations of our model is that you have to know the source of noise you’re trying to remove,” says Marioni. “The cell cycle is an important one, but there are others. The way we handle cells in the lab, for example, could increase the expression of stress-related genes. We want to take out the step where you decide in advance which noisy factors you want to remove.”
“That will make a huge difference,” Stegle adds, “allowing us to cut out noise and let the meaningful biological signals come through.”