To fully realise the potential gains of human genome diversity and its relationship to human disease, biorepositories must play an integral role. Through optimal utilisation of such repositories, comprehensive human phenotype data can be united with corresponding genomic and proteomic samples to shed new light on the underlying basis of human disease and offer new hope for a better understanding of how to diagnose and treat patients.
The first complete human genome sequence was published in 2003 and laid the foundation for a more comprehensive understanding of human disease. When this sequence was published, many felt as though the era of personalised medicine was around the corner and that pharmaceutical companies should embrace this new knowledge to better design personalised therapies which could treat and potentially even prevent human disease. Yet those closest to the Human Genome Project recognised that this initial landmark was simply the beginning of years of science and innovation required to fully unlock the secrets of the genome.
Today, we have learned much from this one human genome sequence, but we have also seen, with the subsequent completion of two additional genomes, those of Dr James Watson and Dr Craig Venter, that our genomes are incredibly diverse and contain significantly more variation than expected, both at the single nucleotide level and at the structural level. More recently, new knowledge has been gained by publication of the first complete tumour genome sequence from an acute myeloid leukaemia patient. Our challenge today is to uncover the full diversity of the human genome and utilise this comprehensive knowledge to better diagnose, treat and perhaps even prevent disease. Key to gaining this knowledge is fully realising the potential of clinical trials and of the biological and genomic samples collected during these trials.
• Will pharmaceutical companies be poised to achieve these goals?
• What does your company’s biorepository hold today?
• Will you be positioned for the innovative technologies and emerging scientific discoveries occurring at lightning speed across the globe?
My hope is to provide you with additional impetus to ensure the ethical collection of genomic samples, as well as their safe, secure storage, to allow the full investigation of genomic diversity and to gain the knowledge that may finally usher in the new era of individualised therapies.
When I became a founding member of the pharmacogenomics group at Pfizer Global Research and Development in 1996, our group recognised the increasing knowledge of genome diversity which was emerging from the Human Genome Project. Yet at that time, studies investigating the diversity of the genome and the relation of this diversity to disease and drug response involved limited sets of genetic markers in candidate genes. These studies were constrained in two ways: by limited knowledge of common genetic differences, referred to as Single Nucleotide Polymorphisms (SNPs), and by the lack of technology to analyse these differences across hundreds of thousands of patients. Genotyping technologies involved standard PCR reactions followed by restriction fragment length polymorphism detection. In concert with the Human Genome Project, genotyping technologies were rapidly emerging to drive scale up and cost down. Innovative technologies emerged and quickly outpaced our imagination. We realised that the fundamental enabler in genotype/phenotype studies was the availability of large numbers of well-phenotyped patient samples to provide a whole new opportunity for merging human genomics and drug discovery and development.
To fully leverage the new science and technology, a key strategic imperative of the group was the establishment of a biorepository to serve as the underlying foundation for pharmacogenomic investigations. Key to this development was ensuring the appropriate standard operating procedures for the collection of clinical samples, including informed consent documentation, educational materials for institutional review boards, effective integration of sample collection (predominantly whole blood for DNA isolation) into clinical trial protocols, training for colleagues and physicians involved in clinical trials and effective laboratory mechanisms for storage and processing of these large numbers of samples. A considerable amount was invested, with clarity of vision for the future knowledge to be gained as technologies emerged. Looking back on these early days, I credit the visionary leaders who could look into the future and imagine the potential unlocked by a biorepository containing hundreds of thousands of biological specimens.
An early study showed that the commitment to a central biorepository was the correct one by demonstrating the importance of stored clinical samples. Prior to fully implementing the biorepository, we performed a retrospective collection of samples from patients enrolled in clinical trials in which an adverse event was seen in a small fraction of patients. The challenge of reconnecting with study sites several years after the clinical trials were completed, and of obtaining appropriate informed consent from affected patients and control patients, proved far more complex and expensive than imagined. We wound up with only a subset of the affected patients and a reasonable number of controls, and we lost precious time during the cumbersome process. In the end, the limited data set proved interesting but far from conclusive. This study remained a powerful memory when Pfizer officially opened the doors of the world’s first -80°C automated biorepository on a hot summer day in 2006.
As we were building the infrastructure, the pace of technology development quickly outpaced our imagination. In fact, when one considers a scientific need or challenge, the scientific community quickly delivers, often ahead of the expected timeline for technology delivery. When one looks back over the last ten years and views the success of global initiatives such as the pharmaceutical companies' combined efforts in The SNP Consortium (TSC), Phases I, II and now III of the public HapMap initiative and the recent Genetic Association Information Network (GAIN), the pace of genome knowledge has been remarkable. Yet this is simply the foundation required to fully understand the intimate relationship of an individual’s genome, the diversity contained within, the consequences of this diversity, the interaction with the individual’s environment and finally the successful integration of this knowledge to better diagnose and treat disease. With each step we get a bit closer to the knowledge of genomic factors which contribute to disease, but we still have much to learn. Again, possessing the genomic samples which will allow this knowledge to be fully investigated is critical.
The Cancer Genome Atlas (TCGA) provides an excellent demonstration of the importance of appropriately collected and stored human biospecimens. Launched in late 2005, the TCGA seeks to investigate the genetic differences seen in tumours obtained from three distinct cancers: brain, lung and ovarian. Stored biological samples from patients were intended for the pilot study, in which a full genomic characterisation of the tumours was planned, including both extensive sequencing of candidate genes to catalogue the somatic changes present in patients' tumours and comprehensive gene expression profiling to provide further insight into genes whose regulation may be altered. The initial publication emerging from the pilot study describes the investigation of gene sequence information in 206 glioblastoma patients and revealed new insight into genes and gene pathways involved in this devastating disease. Much has been learned from these initial pilot studies, including the need for biological specimens which have been collected appropriately to ensure that the genomic material present in the samples is preserved effectively. Many samples collected for the purpose of genomic studies proved to contain genomic material of insufficient quality for sequencing or expression analyses. The pilot project provided key lessons and has led to a very extensive review of the process for sample collection and curation to maintain the high quality of specimens required for such analyses. The National Cancer Institute officially created the Office of Biorepositories and Biospecimen Research at the same time the TCGA was launched to “guide, coordinate, and develop the Institute’s biospecimen resources and capabilities” and has developed comprehensive resources for physicians and researchers to facilitate their efforts in such collections.
Finally, we have seen a whole new invigoration of the DNA sequencing field with the launch of the next-generation sequencing technologies as well as the newest addition to this innovative field: direct single molecule sequencing, such as that of Helicos BioSciences, my newest venture into the pursuit of individualised approaches to healthcare. Given the remarkable trajectory of the throughput and scale with which we can measure the true biology of the genome, the pace of technology innovation will continue in earnest. It is my belief that in the next two to three years, the genomic samples contained within your growing biorepositories will hold a wealth of new knowledge, gained by directly interrogating the genome sequence, the expressed genome and the translated genome, to allow a full and complete picture of the biology of health and disease. You might imagine that once again, as in our initial search for a comprehensive map of common variation, we will arrive at a time when complete genome sequencing becomes cost-effective and statistical methods allow the complex analyses required to relate rare and common SNPs, as well as structural variation, in a comprehensive view of the personal genome and its intimate relationship with environment and resulting phenotypes. I look forward to participating in the science and innovation required to fully unlock the secrets of the genome and to the opportunity this knowledge heralds for patients across the globe.
(1) International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409: 860-921.
(2) Wheeler DA, Srinivasan M, Egholm M, Shen Y et al (2008) The complete genome of an individual by massively parallel DNA sequencing. Nature 452:
(3) Venter genome
(4) Ley TJ, Mardis ER, Ding L, Fulton B et al (2008) DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456: 66-72.
(5) Mank-Seymour A, Richmond J, Wood L, Reynolds J, Warnes G, Milos P and Thompson J (2006) Association of Torsades de Pointes with novel and known SNPs in LQTS genes. American Heart Journal 152: 1116-1122.
(6) The Cancer Genome Atlas Research Network (2008) Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature