Friday, 24 June 2016

I don't want to leave Europe

Brexit sucks...probably. The issue is we don't really know what the vote means, or even if we'll actually leave the European Union in the next couple of years at all. However, one thing cannot be ignored, and that is the two-fingered salute to our European colleagues from the 52% of the UK voting population that got out of bed on a rainy Thursday.

I am privileged to work in one of the UK's top cancer institutes at the top UK University: the Cancer Research UK Cambridge Institute, a department of the University of Cambridge. The Institute, like the University, is an international one with people from all across the globe. Many of the staff in my lab have come from outside the UK and they are all great people to work with. I dislike the idea that these people feel insecure about their future because our politicians have done such a crap job of governing the country.

I'd like to keep the international feel, so if you're still thinking that working in the UK would be good for you (and you're a genomics whiz) then why not check out the job ad for a new Genomics Core Deputy Manager. We're expanding the lab and putting lots of effort into a single-cell genomics service (10X Genomics and Fluidigm C1 right now). I'm looking for a senior scientist of any nationality to help lead the team, with NGS experience and ideas about single-cell genomics.

You can get more information about the lab on our website. You can get more information about the role, and apply, on the University of Cambridge website.

Friday, 17 June 2016

Come and work in my lab...

I've just readvertised for someone to join my lab as the new Genomics Core Deputy Manager. We're expanding the Genomics core and building single-cell genomics capabilities (we currently have both 10X Genomics and Fluidigm C1). I'm looking for a senior scientist who can help lead the team, who has significant experience of NGS methods and applications, and who ideally has an understanding of the challenges single-cell genomics presents. You'll be hands-on, helping to define the single-cell genomics services we offer and building these over the next 12-18 months.

You'll have a real opportunity to make a contribution to the science in our institute and drive single-cell genomics research. The Cancer Research UK Cambridge Institute is a great place to work. It's a department of the University of Cambridge and is one of Europe's top cancer research institutes. We are situated on the Addenbrooke's Biomedical Campus, and are part of both the University of Cambridge School of Clinical Medicine and the Cambridge Cancer Centre. The Institute's focus is high-quality basic and translational cancer research and we have an excellent track record in cancer genomics (1, 2, 3). The majority of data generated by the Genomics Core facility is Next Generation Sequencing, and we support researchers at the Cambridge Institute, as well as nine other University Institutes and Departments within our NGS collaboration.

You can get more information about the lab on our website. You can get more information about the role, and apply, on the University of Cambridge website.

Tuesday, 14 June 2016

SPRI alternatives for NGS: survey results

Everyone loves bead cleanups, and it appears that almost everyone (85%) who read my recent post about SPRI alternatives loves Agencourt AMPureXP. I'd asked readers to take a survey asking if they used AMPure XP, a commercial alternative, or a home-brew version - the results are below.

I was surprised to see more home-brew responses than commercial alternatives, but this could simply reflect the attitudes of people reading CoreGenomics.

Friday, 10 June 2016

CNV, RNA, ChIP and cfDNA sequencing for £10 per sample

Copy-number analysis is a useful tool for many researchers and we use it a lot for analysis of tumour samples. In the past this was done using SNP arrays e.g. Affymetrix SNP6.0 in METABRIC, but today we're generally using low-coverage whole genome sequencing and tools like qDNAseq. I've posted before about our use of low-coverage WGS in our exome pipeline. Most recently we've got groups doing low-coverage WGS on large numbers of samples purely for copy-number analysis.
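For readers who have not seen it done, the core idea behind read-depth copy-number analysis from low-coverage WGS can be sketched in a few lines of Python. This is only an illustration and not what qDNAseq does (real tools also correct for GC content and mappability, and segment the profile). The function names, the bin size and the pseudocount are assumptions made for the example.

```python
import math
from statistics import median


def bin_counts(read_starts, chrom_length, bin_size):
    """Count reads per fixed-width bin along one chromosome.

    read_starts are 0-based alignment start positions (hypothetical input).
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def log2_ratios(counts, pseudocount=0.5):
    """Express each bin as a log2 ratio against the median bin count.

    The pseudocount (an arbitrary choice here) avoids log2(0) for empty bins.
    """
    med = median(counts)
    return [math.log2((c + pseudocount) / (med + pseudocount)) for c in counts]


if __name__ == "__main__":
    # Toy data: three 10 bp bins with 3, 2 and 6 reads respectively.
    starts = [0, 1, 2, 10, 11, 20, 21, 22, 23, 24, 25]
    counts = bin_counts(starts, chrom_length=30, bin_size=10)
    print(counts)
    print([round(r, 2) for r in log2_ratios(counts)])
```

Bins with a ratio above zero hint at a gain and bins below zero at a loss, relative to the sample median. In practice the bins are tens of kilobases wide so that even low coverage gives enough reads per bin.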

Low-coverage WGS makes CNV-seq fast and cheap, but a recent Genome Research paper suggests some great methodological improvements to push costs down to very low levels: SMASH, a fragmentation and sequencing method for genomic copy number analysis

WGS and SMASH generate highly concordant CNV calls

Thursday, 9 June 2016

Proteomics is starting to rock too...

I don’t usually read Proteomics papers but have been thinking about how we might combine single cell genome and transcriptome sequencing - with Fluidigm’s Helios (CyToF) and have been trying to get more acquainted with Proteomics methods. In doing so I found this excellent paper: Proteomic Biomarker Discovery in 1000 Human Plasma Samples with Mass Spectrometry. The paper is probably a tour-de-force of Proteomics, even if the published results were not stunning, but not being a Proteomecist I’m not sure I’m qualified to say that. It is obvious that the group working on this put a large amount of effort into experimental design ahead of completing the mass-spec work.

Figure from Cominetti et al 2016

Any large project needs to consider design very carefully from considering what factors might need to be controlled for, to deciding what controls to use. The experimental design for the Human proteome paper is illustrated above; they used a controlled-randomised plate layout to remove plate confounding effects for sample origin, gender, age, ethnicity, BMI, blood pressure, glycemic indices, and clinical biochem.

Reducing mass-spec variability with tandem-mass-tags: The key to making the data comparable across what was over 300 mass-spec runs was the use of tandem-mass tags purchased from Thermo Scientific (Rockford, IL, USA). These add specific masses to all proteins in a sample, allowing multiplexing of up to 24 samples per run. With a carefully designed experiment it is possible to reduce the impact of run-to-run variability. Much in the same way as we designed projects using multi-sample microarrays, the experimental groups are balanced across mass-spec runs. I’ve learnt a lot more about tandem-mass tagging in Proteomics over the last 18 months after hearing about the tech in an internal seminar. It seems that this approach is going to allow Proteomics researchers to take advantage of the statistical tools developed for gene expression array analysis.

The group used a pair of control samples in each run, further reducing the impact of technical variability. 304 TMT 6-plex mass-spec runs were performed, with each 6-plex containing two standards and four samples. 1000 patient plasma samples were processed in 19x 96-well plates over a period of just 15 weeks. All sample handling was tracked, although they did not describe their tracking or whether they used a LIMS. The paper is a great example of careful experimental design and I thought it was one well worth sharing outside the Proteomics community.
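To make the idea of balancing experimental groups across runs concrete, here is a minimal Python sketch. It is my own illustration, not the layout procedure used in the paper: it balances on a single grouping factor only, whereas the authors controlled for many. The function name, the "STD_A"/"STD_B" labels and the fixed seed are assumptions for the example. The only details taken from the paper are two standards plus four samples per 6-plex.

```python
import random
from collections import defaultdict


def assign_balanced_runs(samples, samples_per_run=4, seed=0):
    """Spread groups evenly across multiplexed runs.

    samples: list of (sample_id, group) tuples (hypothetical input format).
    Returns a list of runs; each run is two standards followed by samples.
    """
    rng = random.Random(seed)

    # Collect sample ids per group and shuffle within each group.
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)
    for ids in by_group.values():
        rng.shuffle(ids)

    # Interleave the groups so that consecutive samples rotate through them.
    queues = [by_group[g] for g in sorted(by_group)]
    interleaved = []
    while any(queues):
        for queue in queues:
            if queue:
                interleaved.append(queue.pop())

    # Cut the interleaved list into runs and randomise order within each run.
    runs = []
    for start in range(0, len(interleaved), samples_per_run):
        block = interleaved[start:start + samples_per_run]
        rng.shuffle(block)
        runs.append(["STD_A", "STD_B"] + block)
    return runs


if __name__ == "__main__":
    toy = [(f"S{i}", "case" if i % 2 else "control") for i in range(8)]
    for run in assign_balanced_runs(toy):
        print(run)
```

With two equally sized groups each run ends up with two samples from each, so a bad run affects both groups equally rather than being confounded with one of them. A real design would stratify on several factors at once and would probably randomise the channel given to the standards too.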

Between 150 and 200 proteins were identified and the authors argue strongly that this was only possible because of the use of TMTs. Label-free mass-spec approaches would have introduced more variability and taken significantly longer (38 weeks by their estimation). However, after crunching the numbers, only two proteins in Human Plasma had significant correlation with BMI. Both were shown to be associated with obesity.

NGS experimental design: We're lucky to have such a large number of sample barcodes available for NGS experiments. We can usually fit the whole experiment into one library prep plate and a single sequencing pool and remove almost all the confounding technical issues. However this does not mean we should skip careful design of NGS experiments. Taking a little time to discuss the major question(s) being asked, the samples available and the methods we'll use in both wet and dry labs is time very well spent. 

Tuesday, 31 May 2016

London Calling: nanopores updated

I did not arrive at London Calling until late on Thursday afternoon but what an excellent start I was given: Clive Brown’s update - where ONT are today, and what’s coming tomorrow! The Thursday afternoon, evening and Friday sessions created a palpable buzz among the attendees. It is easy to get carried away with predictions about how soon ONT will become actual competition for Illumina, but it is not clear that this is ONT's main goal. More exciting is the push into uncharted territory, bringing sequencing to individuals and taking it to Mars. I'm excited; but my core facility manager head says, "hold on, you're not getting rid of those HiSeqs just yet"!

Monday, 30 May 2016

Oxford Nanopore's Direct RNA-seq - the killer app for bacterial genomics?

I'll finish my write up of the ONT London Calling event after the bank-holiday weekend but I wanted to get this post out as I think we came up with a killer app for MinION, or even SmidgION on your iPhone: direct RNA-seq from bacterial 16S rRNA. This was an idea I proposed in response to an answer about what one of the speakers was doing with Nanopores (they were working on 16S amplicons). In the ensuing discussion we decided that sequencing the ribosomal RNA directly (see Clive's announcement below) would allow interrogation of phylogenetic relationships and environmental diversity, quickly, and with close to zero bias from sample extraction and library prep.

Monday, 23 May 2016

Increased read duplication on patterned flowcells - understanding the impact of Exclusion Amplification

Next-generation sequencing is fantastic technology and its use has revolutionised our understanding of biology, but it is not perfect. Multiple issues occur in every lab, from sample extraction through to the actual sequencing. Not all of these are well enough understood to be safely ignored, and in this post I'm going to talk about one that I'm trying to better understand right now: duplication of sequences in datasets, and in particular Exclusion Amplification duplication on HiSeq 4000.

Sources of read duplicates in Illumina data - courtesy Illumina 2016

Thursday, 19 May 2016

Happy 10th birthday NGS!

NGS is 10...according to the latest Nature Reviews Genetics: Coming of age: ten years of next-generation sequencing technologies. Just by chance I was asked to give a talk to explain how Illumina sequencing works in a technology seminar series being delivered by the Core Heads at the CRUK Cambridge institute, and as part of that I uploaded a slide-deck and created some animations for Twitter to explain how clustering works...I hope you like it.

Illumina paired-end dual-index clustering and sequencing
and here is a "slow-mo" version for people who could not keep up with the frame-rate!

Wednesday, 18 May 2016

How will Foundation's recent patent announcement affect the cancer testing environment?

Foundation Medicine were yesterday granted US Patent 9,340,830 "Optimization of Multigene Analysis of Tumor Samples". This is likely to stir up the can of worms that is tumour testing by NGS and is another patent in a complex landscape. The claims basically cover WGS library prep, exome sequencing, alignment and variant calling. It covers all sorts of mutation calling including SNVs at low freq (5%) and mid-freq (10%) or above, SNPs to assess CNV & LOH, fusions and other structural variants, as well as pharmacogenomic SNPs. It also includes in the test a DNA fingerprint.

Michael Pellini, Foundation's CEO, appears to be using some very positive language in describing the award of this patent. He said "we do not intend to block the use of methods covered by the patent in patient testing that may be offered by others". But how much of the patent claim is truly novel and might stand up in court remains to be seen. The basic idea of exome sequencing patients is old hat and Foundation were certainly not the first people to be doing this. The SNP ID of patients is an idea even I'd proposed over four years ago (here & here). But if Foundation's patent makes it harder for others to clamp down on competition, that can only be a good thing.