Meet the Team: An Interview with Sari Ward, Head of Curation at COSMIC
At COSMIC, our mission goes far beyond collating data. It is about uniting the vast and complex body of cancer genomics knowledge across industries to enable researchers, clinicians, and patients to better understand the disease. Behind every COSMIC dataset and module, there are dedicated people ensuring exemplary quality.
In this series, we go behind the scenes at COSMIC, highlighting the people who are responsible for building and evolving COSMIC. Here, we sit down with Sari Ward, Head of Curation, to learn more about her journey into the world of curation, what drives her passion, and how she and her team are shaping the future of the COSMIC Knowledgebase. From her path into the field to the challenges and rewards of curating one of the world’s leading cancer mutation databases, Sari shares unique insights into the work that makes COSMIC possible.
Thanks to the carefully curated data in the COSMIC Knowledgebase, our users find it much easier to analyse variants of interest. This valuable resource can save them anywhere from a few hours to several days, making research processes more efficient and less stressful.
Can you tell us a bit about your career path and how you came to lead the curation team at COSMIC?
I have a degree in biochemistry, and in my early years I worked in cancer research and cancer diagnostics. Then I spent 11 years at Pfizer R&D working on a number of other diseases. During that time, I moved from the laboratory to computational biology. I got to work on drug repurposing, novel drug target mining, and supporting the therapeutic areas with biological and safety questions, many involving genomics data. As a next step in my career and development, because I was passionate about making data available and, most importantly, accessible, I turned my interest to data curation. After many years curating somatic mutations at COSMIC and working on other projects, such as our clinical vocabularies, I was offered the opportunity to lead the curation team.
What does curation at COSMIC actually involve? Can you describe the process from source to database?
COSMIC curation sources scientific publications and other high-quality datasets where patient tumours have been sequenced for mutations. We extract the key data, such as tumour type, mutations and patient-related metadata, standardise it and contextualise it in our knowledgebase, where the rest of the world can access it. The job is very satisfying because you can see the results of your labour every day. It is also frustrating when you see how poorly data is organised at the time of its publication. Part of our work revolves around future data standards: we work with authors, publishers, other databases, diagnostic companies, and healthcare organisations to create unified vocabularies, data formats, and processes that enable the effective exchange of data to improve cancer diagnostics and care for every patient.
How do you manage the challenges of curating data from an ever-expanding scientific literature?
We have to be selective in what we curate. Although our goal is to curate every somatic mutation out there, we strategically prioritise the data that is in high demand by our users. We still need more data in every tumour type and in as many ethnic groups as possible to support statistically robust data analysis. We continuously try to boost our curation efficiency. We invest in new tools, improve our vocabularies and use advanced computational systems to help us separate the wheat from the chaff.
How does curation contribute to the real-world impact of COSMIC for researchers, clinicians, and pharmaceutical companies?
COSMIC is a compilation of 20 years of work by an expert team. We collate, extract and standardise disparate and messy data, turning it into a much cleaner and more streamlined format for downstream users. Without us, researchers and clinicians would face a desperate job trying to gather data for their work without sufficient time or training to do so. Data contains inconsistencies and mistakes that we curators are trained to spot and eliminate. We often collaborate with the scientists who produced the data to clarify the inconsistencies. We unify terminology, which brings essentially similar samples together, boosting the power of the existing data. By incorporating smaller and larger studies in our database and re-sharing them in a wider context under strict quality control, we essentially help the world to recycle and reuse data that is time-consuming to produce.
What makes curated data so important in the cancer genomics field?
By aggregating disparate data and serving it to researchers and clinicians in a standardised and interoperable format, we further fuel their research and speed up the development of new and more effective treatments for cancer patients. Cancer genomics is making discoveries every day. Yet sometimes it seems to make cancer research more complicated. One of our missions is to make sense of that complexity by presenting our data in different ways, on our website and in our downloads. Since our curators know the data well, they are also in the ideal position to interpret it. Where the data gets too big and complex for the human brain, clean, curated data is ideal training material for AI to aid scientists in further discoveries.
What are the biggest challenges facing the curation field, and how is COSMIC adapting?
The more we know about cancer, the more complex it seems to get. We need to look ahead, adapt and build flexible curation systems that are fit for future data. Cheaper sequencing is also making the curated datasets bigger. Our curation system needs to be able to handle that scale. As more data is curated computationally, we need to ensure that quality is maintained. Whilst computation removes the chance of manual error, it introduces systemic errors when software does not handle exceptions well. We built our system to allow manual intervention and plenty of QC. Precision oncology, which we anticipated 20 years ago when COSMIC was set up, is finally becoming a reality in the clinic. The demand for our data is growing fast, and there is an urgent need for COSMIC to respond to it.
What makes a great curator, and what qualities do you look for when building your team?
COSMIC curators are like artisans of data. They are highly trained for their job but also need to possess a creative and philosophical mindset. Having a background in the wet lab or research is essential. Great attention to detail and persistence are important. Our curators love and care about data. It is satisfying to know that our work directly benefits patients. I think it is important to have a diverse team, to encourage open debate and to consider new ideas. You also need to be humble, own up to your mistakes and ask for help. This ensures the quality of data that we pride ourselves on. I like using every curator's experience and strengths for the benefit of COSMIC, not just for the curation of data.
Looking ahead, what excites you about the future of scientific curation?
I like the fact that curation as a field has gained recognition and has established itself in the landscape of science and data. Curation is truly an interdisciplinary field and gives curators exciting job opportunities and a chance to do something truly useful. In a world that is driven by data, there is a growing need for clean datasets and for an understanding of how data needs to be organised. Advances in AI and ML are going to be of great help for curation in the field of somatic mutations, and the opportunities for curation are enormous. We are only beginning to realise personalised cancer care for patients. It is a privilege to be working at the cutting edge of science and healthcare and to use all the skills that I have gained during my career in science.
We hope you’ve enjoyed getting to know Sari and gaining a deeper appreciation for the people and stories behind COSMIC’s data. This interview is just the beginning. Throughout this series, we’ll continue to introduce you to the curators, developers, and commercial team who keep COSMIC evolving and ensure it remains an invaluable resource for the global cancer research community. Stay tuned for more conversations, and don’t forget to follow us on LinkedIn or subscribe to our newsletter to keep up with the latest updates.