Monday, September 23, 2013

Single-cell genomics: taking noise into account

Technical variation versus average read counts
Reprinted by permission from Macmillan Publishers Ltd: Nat Methods, advance online (doi:10.1038/nmeth.2645)
Sequencing throughput and amplification strategies have improved to the point where single-cell sequencing has become feasible. A recent review in Nat Rev Genet covers the progress in single-cell genomics and some of its potential applications, and is worth a read. However, the required amplification steps are likely to introduce significant technical variation when starting from such small amounts of material. A group of investigators from EMBL-Heidelberg, EMBL-EBI and the Sanger looked at this problem and developed an approach to quantify and account for this technical variability. The method is described in a paper that is now in press and makes use of spike-ins to estimate technical variation across a range of mean expression strengths (see Figure). As with most of these short communications, a lot of the work is in the supplementary materials, including a detailed R workflow description that should allow anyone to recreate the main figures from the paper.
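To make the idea a bit more concrete, here is a minimal sketch of how spike-ins could be used to estimate technical noise as a function of mean expression. This is not the paper's workflow (that is the R code in the supplement). It rests on several assumptions of mine: the squared coefficient of variation (CV²) of a spike-in is modelled as a0 + a1 / mean, the counts are simulated from a gamma-Poisson model rather than taken from real data, an ordinary least-squares fit stands in for whatever fitting procedure the authors use, and all function names are hypothetical.

```python
import numpy as np


def simulate_spike_ins(true_means, n_cells, dispersion, rng):
    """Simulate spike-in counts (rows = spike-ins, columns = cells).

    Assumption: counts are gamma-Poisson distributed, so that
    CV^2 = 1 / mean + dispersion for each spike-in.
    """
    true_means = np.asarray(true_means, dtype=float)
    # Gamma-distributed rates with mean `true_means` and CV^2 = dispersion.
    rates = rng.gamma(
        shape=1.0 / dispersion,
        scale=true_means[:, None] * dispersion,
        size=(true_means.size, n_cells),
    )
    return rng.poisson(rates)


def cv2_vs_mean(counts):
    """Return the per-row mean and squared coefficient of variation."""
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=1)
    var = counts.var(axis=1, ddof=1)
    return mean, var / mean**2


def fit_technical_noise(mean, cv2):
    """Least-squares fit of cv2 ~ a0 + a1 / mean; returns (a0, a1)."""
    design = np.column_stack([np.ones_like(mean), 1.0 / mean])
    (a0, a1), *_ = np.linalg.lstsq(design, cv2, rcond=None)
    return float(a0), float(a1)


# Example: 40 simulated spike-ins spanning three orders of magnitude.
rng = np.random.default_rng(0)
spike_means = np.logspace(0.5, 3.5, 40)
counts = simulate_spike_ins(spike_means, n_cells=500, dispersion=0.2, rng=rng)
mean, cv2 = cv2_vs_mean(counts)
a0, a1 = fit_technical_noise(mean, cv2)
print(f"fitted a0 = {a0:.3f}, a1 = {a1:.3f}")
```

With a fit like this in hand, a gene whose observed CV² sits well above a0 + a1 / mean at its expression level would be a candidate for genuine biological variability, while genes close to the curve are consistent with technical noise alone. A real analysis would also need normalisation between cells and a proper statistical test, both of which this sketch leaves out.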

This paper is a starting point for more to come. It is focused on the method, and there are clearly many biological findings still to be made from these data. More broadly, the Sanger and EMBL-EBI have recently set up a joint single-cell genomics centre to acquire and develop the required technology. From the EBI side this is headed by Sarah Teichmann (also affiliated with the Sanger) and John Marioni. Unfortunately for my interests in post-translational regulation, single-cell proteomics is still lagging far behind. The CyTOF comes closest but still requires antibodies for detection.