by Nicolas Descostes
In recent years, the term big data has invaded the media and become familiar to the public. From finance to social networks, data are collected to infer trends and sometimes to manipulate opinions, as has been observed during recent elections. However, the public is less aware of the big data revolution that is occurring in biology. In this post, I would like to begin by explaining how big data is used in biology, and more specifically in genomics, and end by sharing some thoughts on how big data is currently shaping research.
In the early 2000s, J. Craig Venter and the International Human Genome Sequencing Consortium raced to publish the first sequencing data of the human genome. The race produced two articles, published in Science and Nature, but more importantly it started an intellectual revolution, opening new possibilities to explore the 3-billion-base-pair DNA sequence of the human genome in its entirety. Although sequencing had been in use long before, sequencing the human genome opened the door to sequencing the genomes of many other species.