We are EMBL: Alana Sousa on data and machine learning, from young stars to crystals
EMBL Grenoble Data Scientist for AI Alana Sousa talks about how she went from analysing star formation to improving a machine learning tool for crystal identification in structural biology
An astrophysicist by training, Alana Sousa is currently a Data Scientist for AI in the Marquez Team at EMBL Grenoble, working on machine learning for crystallisation processes. We caught up with her to learn more about her career path, how she went from astrophysics to applying artificial intelligence methods to crystallisation, and what she enjoys besides science.
Can you tell me more about your academic background?

I did all my studies in Brazil, where I’m from. My first contact with astrophysics was during my undergraduate studies in physics. This was also when I started coding and building my computational foundations. After obtaining my PhD in physics, with an emphasis on astrophysics, at the Federal University of Minas Gerais, I spent several years as a postdoctoral fellow in Brazil and France. In France, I worked for six years at the Institute for Planetary Sciences and Astrophysics, located here in Grenoble.
Throughout those years, I worked in the field of star formation, studying young stars. My goal was to trace their evolution into early stages of planetary system formation, combining theoretical insights with observational data from different telescopes and satellites. All of these experiences gave me the opportunity to grow as a researcher.
What made you decide to specialise in AI and machine learning?
Since completing my PhD, I have been working with data, but using a classical approach focused on observational data from stars. After my last postdoctoral fellowship, I decided to take a six-month training programme dedicated to AI and machine learning. I wanted to align my experience with current market needs and open myself up to opportunities in other fields of research. And that is how an astrophysicist ends up in a structural biology lab!
Can you explain your current project at EMBL in a few sentences?
I am working in the Marquez Team at EMBL Grenoble, which operates the High-throughput Crystallography Laboratory (HTX), where they grow protein crystals. This is a challenging process that requires performing a large number of experiments, and each experiment involves repeated imaging over weeks – sometimes up to three months – to capture different phases. This generates millions of images annually, making manual annotation and visual inspection impractical.
The team has already developed an AI model called AXIS that works well at classifying these images as ‘crystal’ or ‘no crystal’. But this model was trained on a small dataset with manual annotations. Therefore, my project here is to apply self-supervised learning to unlabelled images to improve the model and check whether we can get better classification results. Additionally, we are building a multiclass model that can identify not only the presence of crystals, but also other important stages of the crystallisation process.
What excites you most about the possibilities of AI in research?
I believe we should see AI as a helpful tool to accelerate research and avoid bottlenecks. By automating repetitive tasks like image inspection, it frees up our time to focus on the data analysis.
At what age did you decide you wanted to be a scientist, and what triggered that?
I do not know exactly at what age it started, because when I was a child, I did not even know that it was possible to be a scientist: it seemed like a completely different world! But, together with my brother, I always wanted to know the ‘why’ behind everything, like “why do ants walk in a line?” We often looked at the sky and asked questions. I was always doing small experiments to understand things. Later on, I decided to study physics because I liked it in school, and one day, I realised I was a scientist.

Besides science, what are your passions?
I like many things; each year I discover a new passion! Lately, I have been enjoying spending time with my 3D printer, and because I like to paint, I use it mostly to print figurines from movies and TV shows and hand-paint them. I also like cooking, taking care of plants, hunting mushrooms in the mountains (it is an addiction!), and dancing, especially forró.