• Pharma and Life Sciences Analytics
  • Experfy Editor
  • JUN 25, 2014

The Intersection of Genomics and Big Data

Genomics is the study of the complete genetic material of organisms. The field encompasses the “sequencing, mapping, and analysis of a wide range of RNA and DNA codes” collected from living organisms across the entire biological hierarchy.

In recent years, much of the focus has been on determining the entire DNA sequence of humans in order to understand the genetics of the human body. The primary objective of this intense study is to demystify the correlation between genes and heredity, more specifically “heritable traits,” so that this knowledge can be put to use in disease prevention and cure.

The vast amount of data that genome sequencing, mapping, and analysis produce has pushed the field to embrace big data technologies. According to healthcare market analyst Dr. Bonnie Feldman, “Genomics produces huge volumes of data; each human genome has 20,000-25,000 genes comprised of 3 million [sic; the actual figure is about 3 billion] base pairs. This amounts to 100 gigabytes of data, equivalent to 102,400 photos.”

Routine procedures in genomics can therefore easily produce petabytes of data, with further growth likely once the genes are analyzed.
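As a rough check on these figures, the sketch below reproduces the arithmetic. It assumes binary units (1 GB taken as 1,024 MB) and roughly 1 MB per photo, which is what the quoted 102,400 figure implies but is not stated in the quote; the function names are illustrative only.

```python
# Back-of-the-envelope storage arithmetic for the figures quoted above.
# Assumptions (not from the article): binary units and about 1 MiB per photo.

GIB_PER_GENOME = 100   # "100 gigabytes of data" per human genome, as quoted
MIB_PER_GIB = 1024     # binary units: 1 GiB = 1024 MiB
MIB_PER_PHOTO = 1      # assumed photo size implied by the 102,400 figure


def photos_per_genome(gib_per_genome=GIB_PER_GENOME, mib_per_photo=MIB_PER_PHOTO):
    """Number of photos that occupy the same space as one genome."""
    return gib_per_genome * MIB_PER_GIB // mib_per_photo


def genomes_per_pebibyte(gib_per_genome=GIB_PER_GENOME):
    """Number of genomes needed to fill one pebibyte of storage."""
    gib_per_pib = 1024 * 1024  # 1 PiB = 1024 TiB = 1,048,576 GiB
    return gib_per_pib / gib_per_genome


print(photos_per_genome())     # 102400
print(genomes_per_pebibyte())  # 10485.76
```

Under these assumptions, roughly ten thousand sequenced genomes already add up to a petabyte, before any downstream analysis output is stored.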

Angelina Jolie’s double mastectomy not only fueled widespread public interest in the power of gene science to predict diseases, but also raised public awareness of the use of genetic data in preventing diseases in the future. The importance of big data analytics in genomics lies in its ability to accumulate and analyze gene-related information that can be converted into highly valuable medical insights for disease prevention and cure.

Ongoing research on the human genome

The study of the human genome dates back to 1990, when the effort was launched; it later pioneered the automated gene-sequencing process. Supported by strong technological advances, Genome-Wide Association Studies (GWAS) have broadened the scope of that initial study to explore hidden “connections between genes and diseases.”

Without going into the scientific details, which you can find on the GWAS site, it is useful to note that about 1,600 genome studies have established the connection between 2,000 gene associations and more than 300 common human disease traits.

The current applications of GWAS are:

  • Predictive models to identify patients who may be considered “high-risk” for a particular disease, such as Type 1 diabetes.
  • Classifying disease subtypes for guided clinical trials or targeted treatments of diseases like cancer.
  • High-quality information for screening drug candidates for toxicity and efficacy before clinical trials.

Genome study is essential for predictive and preventive medicine

Genome study, the study of an individual human genome, is the foundation of both predictive and preventive medicine. With a patient’s genetic data, physicians and researchers can get a clearer grasp of the diseases or medical condition of an individual. The other expectation from genome study is that it will enable completely individualized treatments for patients, based on their own genome data.

The benefit of big data: Drastic cost reduction in genome study

Just a few years ago, the cost of human genome sequencing was $95 million; today it may be less than $40 million, and this cost is expected to fall further. Both the research community and the industry have been working toward making genome sequencing affordable and accessible to the general public.

Today, sequencing an individual human genome costs $5,000. To better understand the role of big data in genomics, we recommend this video: “Genomics, big data, bioinformatics, and the tools necessary to move personalized medicine.”
