Compared with physics or mathematics, biology rarely appears in job descriptions for data scientists. Yet biologists have quite a few advantages when it comes to becoming great data scientists. Prominent data scientists such as Chris Wiggins, David Heckerman, and even Robert Gentleman, a co-creator of the R language, all went through years of training in biomedical science.
Biologists have a deep understanding of biologically inspired algorithms. Beyond the well-known neural networks and genetic algorithms, biologists have a good grasp of the abstract concepts in machine learning and artificial intelligence. The central dogma shows that every biological process runs on a universal cellular machinery (DNA to RNA to protein). This idea of using many simple agents to achieve complex tasks relates to many key concepts in data science, including the one learning algorithm hypothesis (whose supporters include my favorite MIT professor, Marvin Minsky), ensemble models, and even swarm robotics.
The ability to abstract ideas from complex biological processes prepares biologists to solve data science problems. Biologists see the full life cycle of data science: they collect data, manage data, identify problems, and answer questions with statistics and models. In natural science, biologists often start from scratch, without knowing in advance what data to collect or even what problem to define. This is similar to identifying a new customer demand from big data when customers themselves don’t know what they want. Additionally, the scale of data in population genomics is comparable to that in the business world.
Statistical methods (both frequentist and Bayesian) and many advanced techniques, such as Markov chain Monte Carlo (MCMC) and classification models, have been widely applied in many sub-disciplines of biology. It is exhilarating to see so many biologists start to gravitate towards data science. Step up your game, biologists! You make organic data scientists.