Journal of the History of Biology (2014) 47:495–496 DOI 10.1007/s10739-014-9389-9
© Springer Science+Business Media Dordrecht 2014
Book Review
Hallam Stevens, Life Out of Sequence: A Data-Driven History of Bioinformatics (Chicago: University of Chicago Press, 2013), 304 pp., 19 halftones, 3 line drawings, $90 (cloth), $30 (paper).

Life Out of Sequence is one of a handful of new books on the role of computers in biology. Yet as the title suggests, this is a book on data and data-driven biology rather than on computers. Based on both archival research and extensive field work at the Broad Institute at MIT and Harvard and at the European Bioinformatics Institute at Hinxton in the UK, Hallam Stevens sets out to answer the questions: What is data-driven biology? What do biologists do when they sit in front of their computer screens, and how has this way of doing biology become so predominant in twenty-first-century biology? The book tackles these questions in six well-articulated and engagingly written chapters.

The book starts with a brief history of the development of the computer as a machine for data analysis and management in the context of code breaking, bomb development, operational research and business practices in the 1940s–1960s. Computers were most successful in biology, Stevens suggests in chapter 2, where biology fitted the mold of data analysis and management. The accumulation of sequence data in the context of the human genome and other sequencing projects from the late 1980s provided fertile ground for the computer to make a decisive impact. In turn, the computer transformed biology materially and epistemically, with computer workstations taking the place of wet labs and biological research becoming data-driven. Data, including sequencing data, Stevens argues, only exist in computers, and computers are nothing other than data processors. The rest of the book examines these claims and the new work practices of computerized biology further.

Chapters 3 and 4 provide a detailed account of how biological data is produced.
The reader is invited to follow how biological material fed into sequencing machines is transformed into digital data, and how the data is structured by ontologies, incorporated into databases, made universal and shared. The sequencing facilities of the Broad Institute especially exemplify the new organizational mode of “lean” (that is, computer
managed) production of sequence data, for which the Toyota production plants provide the model.

Chapter 5 highlights the centrality of databases in structuring the virtual workspace of data-driven biology. It provides a brief history of biological databases and analyzes how biologists use the databases to produce biological knowledge by comparing data and looking for new patterns. Databases, Stevens insists, are not just collections of data but structured entities that facilitate and constrain how biologists think. At the same time he maintains the idea that data-driven biology is “hypothesis free.”

In the final and perhaps most illuminating chapter, Stevens points to the role of computer-generated images in visualizing data and making them amenable to examination and manipulation, while at the same time providing a link between the digital and the natural world. Bioinformatics, we learn, increasingly relies on images rather than on text like the letter code of DNA. Despite its reliance on visual representations, biological research done on databases is viewed as theoretical work. As such it is struggling to gain recognition in a field traditionally dominated by experimental approaches.

In the conclusion Stevens points to possible future developments, including the introduction of a “wet web” which works like a human brain and eventually may lead to an “erasure of the boundary between the biological and the informatic” (p. 219). Stevens is careful to point out that many biologists resist embracing data-driven biology. He nonetheless seems to suggest that there is an irreversible momentum that draws biology in this direction and that the time will come when life will be computed “out of sequence.” Whether these predictions hold true, or whether this is just a view successfully propagated by the bioinformatics centers whose raison d’être rests on this presupposition, only history will be able to tell.
Meanwhile, Life Out of Sequence provides a vivid and thought-provoking introduction to the working practices of computerized biology. The book is essential reading for anyone interested in the development of bioinformatics and in current discussions on data-driven science.

Soraya de Chadarevian
University of California, Los Angeles