Hi Isidro! Tell us a bit about yourself for those who don’t know you.
I joined EMBL-EBI in 2019 as a research group leader. My team focuses on the development of computational tools to understand the molecular alterations underpinning cancer, with a focus on the analysis of somatic mutations using sequencing data.
I am also one of the scientific organisers of this year’s EMBL-EBI Cancer genomics training course.
What is your research focus, and how long have you worked in your scientific field?
I obtained my PhD at the Pasteur Institute in 2015 before completing postdoctoral training at Harvard Medical School, under the supervision of Prof. Peter Park, and at the University of Cambridge, under the supervision of Prof. Andreas Bender. My expertise includes biology, genomics and statistical modelling.
Tell us more about the Cancer genomics course you’re involved in. What advice would you give to anyone thinking of applying?
Do it! The course provides an exciting opportunity to learn about the latest approaches for cancer genome analysis, with relevance for both research and clinical applications.
That sounds great! How do you think training has influenced or assisted your own career?
Training at the MSc/PhD level is fundamental to cementing basic knowledge in the field. This includes both reading the literature and hands-on training to understand the particularities and limitations of the algorithms we use, all of which helps you acquire the expertise needed to drive research projects in a rigorous manner.
Thanks Isidro, we can’t wait to hear you talk at the course this June.
On the occasion of World Cancer Day (4 February), we meet two of the trainers of the virtual EMBL Course: Cancer Genomics (17 – 21 May 2021) – Tobias Rausch and Alexey Larionov.
Tobias Rausch (TR) received his PhD in “Computational Biology and Scientific Computing” at the International Max Planck Research School in 2009. He then started to work at the European Molecular Biology Laboratory (EMBL) as a bioinformatician. His primary research interests are population and cancer genomics, structural variant discovery and omics computational methods development (https://github.com/tobiasrausch).
Initially educated as a clinical oncologist in Russia, Alexey Larionov (AL) switched to experimental oncology upon completion of his PhD. He first worked as a postdoctoral researcher at Edinburgh University, studying the transcriptomics of breast cancer with a focus on markers and mechanisms of endocrine response and resistance. Working with data-rich methods (qPCR, microarrays, NGS), he became interested in data analysis and switched to bioinformatics. Since completing his MSc in Applied Bioinformatics, Alexey has worked as a bioinformatician at Cambridge University, focusing on NGS data analysis and heritable predisposition to cancer. See http://larionov.co.uk for more details.
What is your research focus?
TR: Computational genomics.
AL: Heritable predisposition to cancer.
Why did you choose to become a scientist?
TR: When I started at EMBL I saw myself as a software engineer who loves to design, develop and implement algorithms to solve data analysis problems. With the advent of high-throughput sequencing, this engineering background gave me a competitive edge as a data scientist, and that’s how it happened!
AL: It was interesting…
Where do you see this field heading in the future?
TR: Nowadays cancer genomics is a data-driven team science, but it is a long way from obtaining data to obtaining insight. In the age of analytics we all have to wrap our heads around multi-domain data with spatio-temporal resolution, ideally in real-time.
AL: I assume that the question is about translational cancer research in general. I expect that in the near future the field needs better integration of different types of biological data and better collection of relevant clinical data.
How has training influenced your career?
TR: I think training is essential to get you started. Training is like a kind person who takes your hand and guides you through unknown territory. It goes along with mentorship and I was lucky enough to have good training and good mentorship already as a student.
AL: Since my initial clinical and bioinformatics degrees, cancer research has changed so much that I would not be able to even understand current papers if I hadn’t taken regular in-depth training in different aspects of computing and bioinformatics.
How has cancer research changed over the years?
TR: I hope I am still too young to answer that :-). I leave that question for Bert Vogelstein or Robert A. Weinberg.
AL: Cancer research has become much more complex and powerful because of the development of new methods; specifically significant progress in bioinformatics, sequencing and human genomics.
Which methods and new technologies will be addressed in the course?
TR: We try to give an overview of how high-throughput sequencing can be applied in cancer genomics. We cover a range of technologies (short-read and long-read sequencing), data types (RNA-Seq, DNA-Seq and ATAC-Seq) and data modalities (bulk and single-cell sequencing), and last but not least – we take a deep dive into cancer genomics data analysis.
AL: In my sections of the course, I will discuss established methods for the analysis of bulk RNA sequencing, focusing on differential gene expression. Then I will touch on the new methods being developed for the analysis of long-read RNA sequencing.
What learning outcomes should participants expect to take home after the course?
TR: To come back to my previous answer: I hope after the course, cancer genomics won’t be an unknown territory anymore for the participants. I hope we pave the way and then it’s up to the students to make something out of it.
AL: In my section of the course, participants will learn:
1) Bioinformatics algorithms and tools for QC, alignment, and gene expression measurement in bulk short-read RNA-sequencing data.
2) Current approaches to analysis of long-read RNA-seq data, comparing the Oxford Nanopore and PacBio sequencing technologies.
The 4th EMBL Conference: Cancer Genomics (4 – 6 November 2019) brought together over 240 scientists in the field of cancer research to present the latest findings in cancer functional genomics, systems biology, cancer immunogenomics and epigenomics, as well as their translation and clinical impact.
A total of 123 posters were presented across the two poster sessions, two of which were selected as winners by popular vote.
Infinite sites violations during tumour evolution reveal local mutational determinants
The infinite sites model of molecular evolution requires that every base in the genome is mutated at most once. It is a cornerstone of (tumour) phylogenetic analysis, and is often implied when calling, phasing and interpreting variants or studying the mutational landscape as a whole. It is unclear, however, whether this assumption holds in practice for bulk tumour samples. Here we provide frameworks to model and detect infinite sites violations, identifying 24,459 in total, including 6 candidate biallelic driver events, in 700 bulk tumour samples (26.3%) from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes project. Violations generally occur at mutational hotspots, and their frequency and type can be accurately predicted from the overall mutation spectrum. In melanoma, their local sequence context shows that not only ETS but also NFAT-family transcription factor binding creates hotspots for UV-induced cyclobutane pyrimidine dimer formation. In colorectal adenocarcinoma, violations reveal hypermutable special cases of the trinucleotide mutational contexts identified in POLE-mutant tumours. Taken together, we reveal that the infinite sites model breaks down at the bulk level for a considerable fraction of tumours. These results warrant a careful evaluation of current pipelines relying on the validity of the infinite sites assumption, especially when scaling up to larger sets of mutations and lineages in the future.
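To make the definition concrete, here is a minimal Python sketch of the simplest kind of check the abstract alludes to: flagging positions within one sample that carry more than one distinct alternate allele, which cannot happen if every base is mutated at most once. This is not the authors' framework, which is not described here in enough detail to reproduce. The function name `find_divergent_sites`, the tuple layout of the variant calls, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def find_divergent_sites(variants):
    """Flag sites that carry more than one distinct alternate allele.

    `variants` is an iterable of (sample, chrom, pos, ref, alt) tuples.
    This layout is a hypothetical one chosen for the example, not a
    standard file format. Returns a dict mapping (sample, chrom, pos)
    to the sorted list of alternate alleles seen at that site.
    """
    alts_by_site = defaultdict(set)
    for sample, chrom, pos, ref, alt in variants:
        alts_by_site[(sample, chrom, pos)].add(alt)
    # Under the infinite sites model each site has at most one alt allele,
    # so any site with two or more is a candidate violation.
    return {
        site: sorted(alts)
        for site, alts in alts_by_site.items()
        if len(alts) > 1
    }


# Toy calls (invented for illustration): tumour_A has both C>T and C>A
# at chr1:1000, so that site is flagged; the other sites are not.
calls = [
    ("tumour_A", "chr1", 1000, "C", "T"),
    ("tumour_A", "chr1", 1000, "C", "A"),
    ("tumour_A", "chr2", 500, "G", "A"),
    ("tumour_B", "chr1", 1000, "C", "T"),
]
violations = find_divergent_sites(calls)
print(violations)
```

This naive check only catches cases where the two hits produce different alleles. Two independent hits that produce the same allele at one position look like a single variant in a flat call list, so detecting them needs additional evidence beyond what this sketch uses.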