Reaping Knowledge from a Deluge of Data

Gene sequencing data is now accumulating at a phenomenal rate, and that rate is growing faster than Moore's law.

This makes computer storage, management and analysis the bottlenecks in reaping knowledge from all that data. In the December 1, 2011, New York Times, Michael Schatz of the Laufer Center and Cold Spring Harbor Laboratory (CSHL) explained that the world's annual production of DNA sequencing data is now 13 quadrillion DNA bases, enough to fill a stack of DVDs two miles high.
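
The DVD comparison can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the article gives the 13 quadrillion bases, while the storage cost of one byte per base, the 4.7 GB single-layer disc capacity and the 1.2 mm disc thickness are assumptions made here for illustration. Under those assumptions the stack comes out at roughly two miles.

```python
# Back-of-the-envelope check of the "stack of DVDs two miles high" figure.
# Only BASES_PER_YEAR comes from the article; every other constant is an
# assumption chosen for illustration.

BASES_PER_YEAR = 13e15        # 13 quadrillion DNA bases (from the article)
BYTES_PER_BASE = 1.0          # assumption: one byte (one text character) per base
DVD_CAPACITY_BYTES = 4.7e9    # assumption: single-layer DVD, 4.7 GB
DVD_THICKNESS_M = 1.2e-3      # assumption: bare disc, 1.2 mm thick, no case
METERS_PER_MILE = 1609.344


def dvd_stack_height_miles(bases,
                           bytes_per_base=BYTES_PER_BASE,
                           dvd_capacity_bytes=DVD_CAPACITY_BYTES,
                           dvd_thickness_m=DVD_THICKNESS_M):
    """Return the height, in miles, of the DVD stack needed to hold `bases`."""
    total_bytes = bases * bytes_per_base
    dvd_count = total_bytes / dvd_capacity_bytes
    return dvd_count * dvd_thickness_m / METERS_PER_MILE


if __name__ == "__main__":
    height = dvd_stack_height_miles(BASES_PER_YEAR)
    print(f"Estimated stack height: {height:.1f} miles")
```

The result is sensitive to the encoding assumption: packing each base into two bits instead of a full byte would shrink the stack to a quarter of that height, while storing quality scores alongside the bases would make it taller.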

However, Schatz said, "Google has enough capacity to do all of genomics in a day." In fact, Google has invested in the bioinformatics company DNAnexus. Schatz aims to apply Google techniques to genomics data, while Senator Charles E. Schumer prods Google toward more cooperative efforts with CSHL.