HGP-Write has been envisioned as an effort to synthesize one nucleotide at a time to piece together the entire human genome. What could go wrong?
On June 2, 2016, scientists from multiple academic institutions in the US published a perspective in the journal Science proposing a second human genome project, called Human Genome Project-Write (HGP-Write). To understand the true potential of this initiative, we need to go back in history and learn about the original Human Genome Project (referred to as HGP-Read by the HGP-Write team) and whether that project has fulfilled its promise.
The idea of reading the exact order of individual nucleotides (the building blocks of DNA) across the 23 pairs of human chromosomes (collectively known as the human genome) took firm shape around the mid-1980s during multiple meetings, workshops and conferences. Charles DeLisi’s essay (paywall) provides an excellent historical account of some of the events. The credit goes to a group of scientists involved in the original human genome project (i.e., HGP-Read) who possessed the farsightedness and ingenuity to see the project through and realize its true potential. HGP-Read, which culminated in the draft human genome sequence, was at the time the single largest research undertaking in the biological sciences. The project took nearly $3 billion and 13 years to complete, resulted in some landmark publications and commentaries, and was hailed as a cornerstone of modern science.
Fifteen years have passed since HGP-Read. Significantly, its underpinning technology has moved beyond its dependence on the pioneering achievements of one of the greatest scientists of our time, Fred Sanger. The field of sequencing has grown by leaps and bounds; improvements in several areas, like sequencing chemistry, read throughput and assay parallelization – along with automation in generating and reading the templates – have resulted in the availability of whole genome sequences of more than 16,000 organisms to date. While HGP-Read used automated, low-throughput capillary Sanger sequencers, the sequence information for most whole genomes available today comes from now-generation sequencing (NGS) instruments – popularly called ‘next-generation sequencers’, where the word next is a misnomer.
NGS instruments use chemistry with modified DNA polymerase (the enzyme that copies DNA) to read with minimum errors and maximum fidelity; readers (cameras that read one nucleotide at a time) to capture images faster and with precision; and analytical pipelines (algorithms, computational and statistical tools) to analyse and visualise data. Processes run in parallel in these high-throughput DNA sequencing instruments to produce millions of small stretches of DNA (usually less than 250 nucleotides) that are stitched together using computational algorithms to produce sequences of complete genomes. So with the availability of faster instruments and inexpensive chemistry for generating sequence data on high-throughput sequencing instruments, the challenge has rapidly shifted from ‘data generation’ to ‘data management and analysis’.
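To give a feel for how short stretches of DNA can be stitched together computationally, here is a minimal Python sketch of a greedy overlap merge. It is a toy illustration only, not the method used by any particular sequencing pipeline: the function names, the three example reads and the minimum overlap of three nucleotides are all assumptions made for this example, and real assemblers must also cope with sequencing errors, repeated sequences and millions of reads.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap.

    Stops when one sequence is left or no pair overlaps by `min_len`.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three hypothetical overlapping reads from a 15-nucleotide "genome"
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

The greedy choice (always merge the largest overlap first) is simple to follow but can go wrong when a genome contains repeats, which is one reason the analysis side of sequencing has become the harder problem.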
Before we return to HGP-Write, it’s important to examine whether HGP-Read has fulfilled its potential and resulted in a positive payoff, or whether it was all a big fishing expedition that took funding away from individual, hypothesis-driven research, as some of its critics had feared. The short answer is that HGP-Read, which produced a draft human genome sequence in 2001 and a near-complete version in 2003, was a resounding success. The Battelle Technology Partnership Practice has estimated the economic benefit of HGP-Read and suggested that, among other outputs, the US economy received a return of $141 for every $1 invested by the US government. And beyond economics, the availability of the human genome sequence has aided our understanding of the function of many human genes; the discovery of new genes linked to human characteristics; studies on genetic diversity between humans and other hominins, apes and primates; studies on genes related to intelligence, cognitive functions and speech; and finally a better comprehension of the characteristics linked to being human.
Perhaps the best examples of the usefulness of the human genome sequence are in the realm of disease-gene discovery for many monogenic disorders and personalized medicine in oncology. In the last five years, sequencing thousands of cancer genomes has resulted in the discovery of driver mutations, leading to the possibility of developing new drugs together with companion diagnostic tests, and of better clinical trial designs that select a sub-group of patients harboring specific genetic changes for a specific drug. One significant example is the cancer genome sequencing efforts that resulted in the discovery of mutations in the IDH1/2 genes in secondary glioblastomas, high-grade astrocytomas and oligodendrogliomas, and acute myelogenous leukaemias, which provide us with opportunities to develop new drugs by exploring the link between metabolism and cancer.
Now, to HGP-Write. HGP-Write is an international, multi-disciplinary open research project that plans to build genomes from scratch. Specifically, it has been envisioned as an effort to synthesize one nucleotide at a time to piece together the entire human genome. The project anticipates that such an effort will result in innovation in genome engineering and technology, reducing the cost of building larger genomes, including human genomes, in cell-lines.
Although HGP-Write is audacious in its goals, the process of synthesizing genes is not new. George Church, a professor of genetics at Harvard University and a leader of HGP-Write, has already produced several genes synthetically in his lab for the common bacterium Escherichia coli to fight phages (viruses that infect bacteria). Another leader of HGP-Write, Jef Boeke of New York University, heads the project to create a synthetic genome of baker’s yeast (Saccharomyces cerevisiae), which is about 300 times smaller than the human genome.
Going by what HGP-Read achieved in terms of spurring innovation in instrumentation, chemistry, data parallelisation, algorithms and analytical pipeline development for sequence analysis and interpretation, it is not hard to imagine that HGP-Write will take that further. It may be too early to project the actual outcomes – but HGP-Write presents a platform to innovate on genome technology, engineering, analysis and synthetic biology. Some of the potential benefits of the project, according to HGP-Write, are creating cell-lines resistant to multiple pathogens, producing safe organs for xenotransplantation, producing resistant cell-lines for cancer research and creating a homozygous, baseline human genome bearing the most common human alleles. All of these tools would serve as essential assets to the wider research community.
Despite its promise, the project has drawn some criticism. The Science perspective on HGP-Write was the outcome of a closed-door meeting held at Harvard University this year, and it drew criticism from Stanford University synthetic biologist Drew Endy and bioethicist Laurie Zoloth of Northwestern University. Both argued that such a meeting should have included experts from a variety of disciplines, like theology, philosophy and ethics, right from the beginning. Others argued that creating something as fundamental as a human genome must be well thought-through and well-debated before the project starts, as it will have far-reaching consequences. For its part, the Science perspective on HGP-Write does address this, stating:
“We will enable broad public discourse on HGP-Write; having such conversations well in advance of project implementation will guide emerging capabilities in science and contribute to societal decision-making. Through open and ongoing dialogue, common goals can be identified. Informed consent must take local and regional values into account and enable true decision-making on particularly sensitive use of cells and DNA from certain sources. Finally, the highest biosafety standards should guide project work, and safety for lab workers, research participants, and ecosystems should pervade the design process.”
Essentially, Endy and Zoloth were advocating for more than implementing a program like the Ethical, Legal, and Social Implications (ELSI) program founded alongside HGP-Read. They wanted the active involvement of experts from a wider set of disciplines even before such a project was contemplated, and they also criticised the fact that it was a closed-door meeting. For his part, Church defended the need for privacy by citing journal embargo policies and the difficulty of freely discussing many aspects of synthetic biology in open meetings (where media and the general public are present).
Church, a long-time proponent of genome technology who has been involved with most developments in genome sequencing, analysis and perturbation over the last two decades, argues in favor of HGP-Write on its merits. Despite the real “ethical, legal and social” issues involved, he and others maintain that the project intends not to create actual humans but to create genes and organisms that could help cells resist deadly foreign pathogens and help fight diseases like cancer. Although HGP-Write intends to work on the human genome, the project does discuss the possibility of including other organisms – such as mice, pigs, fruit flies (Drosophila melanogaster), nematodes (Caenorhabditis elegans), common experimental plants (e.g., Arabidopsis thaliana) and baker’s yeast (S. cerevisiae) – in the effort.
Despite its potential, HGP-Write is not going to be easy and is not going to proceed without failures. In fact, HGP-Write acknowledges the ambitiousness of the project and states that the cost of synthesizing a full human genome today would be higher than the cost of the original human genome project itself. However, it does anticipate a sharp drop in price resulting from innovation and new technology development in DNA synthesis and sequencing. Per HGP-Write, new technology may have brought the cost of sequencing genomes down a million-fold in 15 years, but the real innovation of editing, engineering and synthesizing human and other genomes will have greater benefits.
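To put that million-fold figure in perspective, the short Python calculation below works out how often the cost of sequencing would have had to halve to fall a million-fold in 15 years. It is a back-of-the-envelope sketch that assumes a smooth, steady exponential decline, which is a simplification; the function name is made up for this example.

```python
import math


def halving_time_months(fold_drop, years):
    """Months per halving, assuming a steady exponential decline (a simplification)."""
    halvings = math.log2(fold_drop)  # number of times the cost halves
    return years * 12 / halvings


months = halving_time_months(1_000_000, 15)
print(f"{months:.1f} months per halving")  # 9.0 months per halving
```

A million-fold drop is roughly 20 successive halvings, so under this assumption the cost halved about every nine months.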
Whether some like it or not, it’s just a matter of time (and price) before humans are able to stitch together genomes synthetically. It’s a question not of how but of when. Looking back to the historic announcement at the White House on June 26, 2000 – when President Bill Clinton, the UK’s prime minister Tony Blair, and Drs. Francis Collins and Craig Venter (heads of the public and private human genome sequencing initiatives respectively) announced the completion of the first rough draft of the human genome – we have come a long way. The Human Genome Project was one of the very few scientific projects that achieved more than it envisioned, was completed ahead of schedule and spent less than originally planned (the original proposal was to sequence the human genome in 15 years using $3 billion, but it delivered in 13 years, spending $2.7 billion).
HGP-Write too can achieve similar success and can bring in innovations in genome sequencing, engineering and synthetic biology that can have far-reaching beneficial consequences in the fields of medicine and agriculture. However, for the project to get there, it must discuss the ethical, societal and legal issues proactively, openly and widely – and put safeguards in place against any future misuse.