Tuesday, 1 September 2015

S is for Sequencer: the new instrument from Ion Torrent

Today Ion Torrent launched their newest sequencers: the S5 ($65k) and the S5 XL ($150k, which includes more compute). Ion Torrent was exciting technology when we first heard about it and I'm disappointed it never quite lived up to the promise, and that it never competed against Illumina in the space I cared about. The new systems do not change my disappointment, but they may increase the competition for Illumina in the amplicon sequencing space. In my lab we're still running lots of amplicons using Fluidigm and HiSeq and I don't think we'll be buying an S5 anytime soon. However, labs looking to set up clinical (or other) tests on standard panels might see the S5 as a realistic alternative to a MiSeq. But Ion need to deliver on this visibly and believably: I know lots of labs that have stopped using their PGMs, although I know a few that have stopped using their MiSeqs too!

The YouTube video (above) offers a very marketing-heavy overview (where is the S5 vs MiSeq video?). Ion have focused very clearly on amplicons and delivering results as quickly as possible at reasonable cost. The S5 XL includes additional compute for processing to speed up this data delivery in an automated manner.

Like the PGM and Proton before it, the S5 offers multiple chip configurations that allow you to run fewer samples without sacrificing per-sample costs (too badly). The 520/530 chips are equivalent to the PGM, offering 200-400bp runs with 5 or 20M reads per run. The 540 chip only offers 200bp runs but with 80M reads. All have 2 to 4 hour run times, with analysis completed in as little as 5 hours.

S5 and AmpliSeq: An open letter from Mark Gardner (Ion's GM) at the Behind the Bench blog explains the desire to continue the democratisation of sequencing. The aim was to make targeted NGS easy for anyone to work with, and to deliver the best value benchtop sequencer. As the S5 system is very obviously targeted to amplicons I thought I'd highlight what Ion say in their literature about the numbers of samples per run: using the Ion AmpliSeq Cancer Hotspot Panel v2 (50 genes), 16, 48 or 96 samples can be run on 520, 530 or 540 chips respectively.
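
As a rough sanity check on those numbers, the short sketch below divides the quoted reads per run by the quoted samples per run. It assumes the 520 chip is the 5M-read chip and the 530 the 20M-read chip (the order in which they are listed above), and it assumes perfectly even pooling, which no real run achieves. The names `CHIPS` and `reads_per_sample` are just for illustration.

```python
# Reads per run (millions) and samples per run, as quoted in the post.
# Pairing 520 with 5M and 530 with 20M is an assumption based on listing order.
CHIPS = {
    "520": {"reads_m": 5, "samples": 16},
    "530": {"reads_m": 20, "samples": 48},
    "540": {"reads_m": 80, "samples": 96},
}


def reads_per_sample(reads_m, samples):
    """Average reads per sample, assuming perfectly even pooling."""
    return reads_m * 1_000_000 / samples


for chip, spec in CHIPS.items():
    print(f"{chip}: ~{reads_per_sample(**spec):,.0f} reads per sample")
```

On these assumptions the per-sample depth rises with chip size, from roughly 0.3M reads on the 520 to roughly 0.8M on the 540.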

You'll need the Ion Chef in your lab to go from DNA to data with less than 45 minutes of hands-on time; otherwise you'll have to manually prepare libraries and templates for chip loading.

Data analysis can be completed automatically within 5 hours.

This does offer a tempting solution for people sequencing panels, but it may be limited compared with what else is realistically possible on other (Illumina) sequencers. The low cost is going to be attractive, but only to a point. If Ion can deliver a truly end-to-end solution then they may be onto a winner.

AllSeq got there first with their digging around on Google, and Keith at Omics Omics got a sneak peek, "Ion have identified a profitable market segment...fast, targeted sequencing -- and is going full bore in this area". Keith points out that the new chips are not compatible with PGM or Proton, "the concept that all upgrades are encased in the consumable is long dead". He goes into some depth about the competition for these new instruments: MiSeq (takes four hours to deliver 1x36bp); GnuBio, Genia/Roche and QIAGEN, which have gone all but silent (expect more from Qiagen very soon); and ONT's MinION (possibly).

See it yourself on the Ion Torrent world tour. The UK Genome Sciences meeting does not feature on the list of upcoming conferences where Ion will present data - I am sure there's still room. You can catch them in London on the 10th.

I'll be interested to start talking with users at AGBT in February. Whether the S series will take off where PGM and Proton did not may partly depend on whether PGM and Proton users get a good trade-in.
Sign up for Ion World Tour location near you >

PS: If I wanted to trade-in a HiSeq 2000 for amplicon work what would Ion offer me?

Tuesday, 25 August 2015

Normalisation made simple...please?

We do lots of DNA, RNA and NGS library normalisations, mostly by hand after calculating dilutions in Excel. It works well but is cumbersome and prone to error. We're fortunate to have an Agilent Bravo robot and are now automating more of this with excellent results. The most obvious impact has been on NGS library pooling: below is our most recent run, where the variability on all but 9 libraries is pretty much perfect.

However, automation is a pain and it would be great if there were a solution for normal labs. We've discussed an idea with pipette companies in the past but no-one appears to offer anything yet, so I thought I'd suggest it here as well. If you've seen something on the market please do comment and let me know (Sartorius are almost there with the eLine pipette; see link below).

If you like the idea, follow up with your favourite pipette rep and let's see if someone can make one!

The background to the idea: normalisation can be done by combining a standard volume of sample with a known volume of water or diluent. Any lab trying to do this manually is almost certainly taking concentration measurements from PicoGreen or qPCR and working out how much water to add to each well of a 96-well plate with sample already added.

The idea: We use repeat pipettes for many applications and programmable versions do exist. Repeat pipettes are great for adding the same volume of a master mix to all wells of a plate and several companies make them. We use the slightly older Eppendorf Stream but Gilson, Sartorius and others exist.

What's missing is a programmable repeat pipette that can accept a TXT file. Uploading a TXT file, after calculating the amount of water to add to each well and specifying whether to pipette across rows or down columns, would allow the user to complete normalisation in as few as 9 pipetting steps: 8 x 12-channel 5ul transfers from the source plate to a normalisation plate, and 1 x automatic repeat pipette dispense. "Simples" as the Meerkat says.
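
To make the idea concrete, here is a minimal sketch of how such a TXT file could be generated. No pipette accepts this format as far as I know, so the tab-separated "well, volume" layout, the function names and the 5 ng/ul target in the example are all hypothetical. The water volume comes from the usual C1V1 = C2V2 dilution sum, with the 5ul sample volume taken from the transfers described above.

```python
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]   # plate rows A-H
COLS = range(1, 13)          # plate columns 1-12


def water_volume(conc, target, sample_ul=5.0):
    """Diluent (ul) to add to sample_ul of sample at conc to reach target.

    From C1*V1 = C2*V2. Samples already at or below target get no water,
    since they cannot be concentrated by adding diluent.
    """
    if conc <= target:
        return 0.0
    return sample_ul * (conc / target - 1.0)


def well_order(by="rows"):
    """All 96 well names, ordered across rows or down columns."""
    if by == "rows":
        return [f"{r}{c}" for r in ROWS for c in COLS]
    if by == "columns":
        return [f"{r}{c}" for c in COLS for r in ROWS]
    raise ValueError("by must be 'rows' or 'columns'")


def dispense_lines(concs, target, by="rows", sample_ul=5.0):
    """Hypothetical TXT content: one 'well<TAB>volume' line per measured well."""
    return [
        f"{well}\t{water_volume(concs[well], target, sample_ul):.1f}"
        for well in well_order(by)
        if well in concs
    ]


# Example: three wells measured in ng/ul, normalised to 5 ng/ul.
concs = {"A1": 20.0, "A2": 10.0, "B1": 4.0}
print("\n".join(dispense_lines(concs, target=5.0, by="columns")))
```

Wells that are already below the target get a zero dispense here; a real implementation would probably want to flag those rather than silently skip them.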

Friday, 21 August 2015

Reproducibility of RNA-seq

The GEUVADIS Consortium published a study looking at reproducibility in RNA-seq in Nature Biotech in 2013 and I only just saw the paper. I thought that readers of this blog would be interested so take a look: Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories.

465 lymphoblastoid cell lines were prepped and sequenced in seven EU labs. TruSeq mRNA seq libraries were sequenced using PE75-100 reads. Each lab ran 50-100 samples but 5 of these were run across all labs. The results were good; in the paper they report very low levels of technical variation, "smaller than the already limited biological variation" and suggest that issues like variation in insert size or GC content between labs can be corrected for. They concluded that large RNA-seq projects could be distributed across labs "given proper standardisation and randomisation procedures", which would have seemed almost impossible in the days of microarrays. "All participating laboratories (must) use the exact same protocols"!

The figure above shows how variable libraries were for (a) total number of reads obtained per sample, (b) mean Q score, (c) mean Q30 length, (d) %duplication, (e) %mapped, (f) %aligned to exons.

A major objective of the study was to evaluate the feasibility of sharing RNA-sequencing among different laboratories in large consortia projects and the paper clearly demonstrates that RNA-seq can be highly comparable across labs. As long as samples can be adequately randomised across participating labs then statistical models can be built that will allow smaller technical effects to be removed. This is good for people planning really massive projects. But technology marches on and the HiSeq 4000 can now deliver a project of 500 RNA-seq samples in a single run, making the need to distribute the effort much less. Whilst the labs in this study have done a great job, they obviously had to work closely to agree on the protocols used in the whole process, and this communication can be harder than sending everything to one place to get the job done.

They ran k-mer profiling, which appears to be a useful QC that does not require alignment to a reference genome. I'll look at whether this is something we should be doing here.
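
For anyone unfamiliar with the idea, the sketch below shows one generic way to build and compare k-mer profiles straight from reads. It is not the method used in the paper; the function names, the choice of k and the distance measure are all assumptions for illustration.

```python
from collections import Counter


def kmer_profile(reads, k=5):
    """Relative frequency of each k-mer across the reads (no reference needed)."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:      # skip k-mers containing uncalled bases
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p, q):
    """Sum of absolute frequency differences: 0 = identical, 2 = no shared k-mers."""
    return sum(abs(p.get(kmer, 0.0) - q.get(kmer, 0.0)) for kmer in set(p) | set(q))


# Toy example: two small "libraries" compared without any alignment.
lib_a = kmer_profile(["ACGTACGTAC", "CGTACGTACG"], k=3)
lib_b = kmer_profile(["ACGTACGTAC", "TTTTTTTTTT"], k=3)
print(round(profile_distance(lib_a, lib_b), 3))
```

A library whose profile sits far from the rest of its batch would be worth a closer look before spending time on alignment.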

Making RNA-seq cost-effective: This project over-sequenced very significantly, and two technical challenges were part of the problem. The first was the use of PE75 vs SE50 sequencing; for RNA-seq differential gene expression several papers have demonstrated that anything longer than SE50 is a waste of time and money. The second issue is one we still wrestle with: sample balance in a pool. In the paper each lab chose how best to pool and sequence samples to get the required minimum of 10M mapped reads per sample, or around 5000M reads in total. The variation in read numbers per sample was huge, averaging 58M rather than the 10M required. The authors explained that this "was partly due to differences in the number of samples per lane and partly due to difficulties with equimolar pooling". This over-sequencing resulted in 26,000M reads being generated! As the study used PE75 sequencing (£1200 per lane in my lab), this makes the bill for sequencing about £100,000! We'd typically use SE50 for RNA-seq (£600 per lane in my lab) and if 10M per sample could be obtained this would cost only £10,000.
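
The arithmetic behind those two bills can be sketched as below. The post does not state a yield per lane, so the 300M reads per lane used here is an assumption, back-calculated from the quoted totals; the read totals and per-lane prices are the ones given above.

```python
import math

READS_PER_LANE_M = 300  # assumption: not stated in the post, back-calculated


def sequencing_cost(total_reads_m, price_per_lane, reads_per_lane_m=READS_PER_LANE_M):
    """Return (whole lanes needed, total cost in pounds)."""
    lanes = math.ceil(total_reads_m / reads_per_lane_m)
    return lanes, lanes * price_per_lane


# As sequenced: 26,000M reads on PE75 at 1200 pounds per lane.
print(sequencing_cost(26_000, 1200))

# As required: about 5000M reads on SE50 at 600 pounds per lane.
print(sequencing_cost(5_000, 600))
```

With that assumed yield the two cases come out at 87 lanes and 17 lanes, which matches the roughly tenfold difference in cost.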

I'd agree that a big challenge has been to get the balance of samples in a pool as close to 1:1 as possible; we're getting there with our most recent RNA-seq runs, but this is by no means solved. However, we always run the entire experiment as one pool (assuming max 96 samples) and run as many lanes as we need. This allows us to keep over-sequencing to a minimum and keep costs down! In figure A from the paper (above) the read distributions are multimodal for all labs, suggesting real issues in quantification and pooling. The paper describes issues with different insert sizes, which we almost never see with TruSeq RNA (it's always 270bp).

Batch effects revealed: I really liked figure 3d (below) which shows how each of the technical factors considered adds to the variation in the samples. Quite clearly biology is the biggest cause of sample variation, with only RNA extraction batch, library prep date and sample index being factors that may require more careful investigation.

Saturday, 25 July 2015

PhD done...time for a holiday

Core Genomics is now on holiday for two weeks after I finally graduated from my PhD yesterday, 20 years to the day since I finished my BSc. Better late than never! Back in mid-August folks.

Thursday, 23 July 2015

Cell-free DNA trisomy 21 tests kick ass

NIPT for Down's Syndrome and other chromosomal abnormalities is taking off. A colleague of mine recently had an Ariosa test, paid for privately, and reported real satisfaction with the process. Lyn Chitty at UCL Institute of Child Health and Great Ormond Street Hospital recently reported on a model-based analysis of NHS costs and outcomes of NIPT for Down's syndrome. This suggested that NIPT was cost effective if offered at around £50 per test (compare this to the £500-£1000 charged privately). NIPT is not my area of expertise but I've been watching it as technological developments have often been a little in advance of cell-free DNA work in cancer.

Possibly millions of tests have now been performed. NIPT is being rolled out to patients across the globe at an amazing rate compared to the introduction of other diagnostic tests, and the NHS is getting in on the game. The number of companies offering tests is growing and so is the number of lawsuits. Most recently Illumina filed a new patent infringement suit against Ariosa claiming their Harmony NIPT test infringes a patent for “Multiplex Nucleic Acid Reactions” (one of the patent holders is ex-Illumina, and was an author on the Ariosa paper discussed below). NIPT commonly tests for trisomy 21 (Down’s Syndrome), trisomy 18 (Edwards’ Syndrome) and trisomy 13 (Patau’s Syndrome). Most tests are NGS based, but Ariosa's test is array based. You can get an NGS-based NIPT test from ThisIsMy for just £320; tests in North America are as low as £200. Tests are available from: Ariosa Harmony, BGI NIFTY, Genesis Genetics Serenity (Verifi), Illumina Verifi, Natera Panorama, Premaitha IONA, Sequenom MATERNIT21.

What do you want to be when you grow up?

The MRC have a nice career mapping tool: Interactive career framework which allows biomedical researchers to navigate through different options to see how they might get where they want to.

I'd like to think of myself as a technology Specialist Director: "an individual with technical expertise / specialist skills useful beyond their own specific group" - what are you?

Wednesday, 15 July 2015

How should I store my NGS data: disc, tape or tube

Genomics has recently been singled out as one of the largest data headaches we face. As we move to sequencing people multiple times, start newborn genome sequencing programs and increase our use of consumer genomics, the amount of data goes up and up. Our GA1 generated 1Gb of data in about 11 days. Today our HiSeq 2500 puts out 1TB in 6 days.
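
To put a number on that jump, the quick sum below compares output per day for the two instruments. It reads the HiSeq figure as 1,000Gb of sequence produced over six days, which is an assumption about how the numbers above should be interpreted.

```python
def gb_per_day(gigabases, days):
    """Average sequence output per day of run time."""
    return gigabases / days


ga1 = gb_per_day(1, 11)          # GA1: 1Gb in about 11 days
hiseq = gb_per_day(1000, 6)      # HiSeq 2500: assumed 1,000Gb in 6 days
fold_increase = hiseq / ga1

print(f"GA1: {ga1:.2f} Gb/day, HiSeq 2500: {hiseq:.0f} Gb/day")
print(f"Roughly {fold_increase:,.0f}-fold more data per day")
```

On that reading, daily output has grown by a factor of well over a thousand between the two machines.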

We're currently storing our data on disc for up to six months. After this we either delete it or archive it onto tape (although I've no idea if we ever try to get it back off the tapes). A while back people used to talk about the storage being more expensive than a rerun, and I wonder if we are getting even closer to that point, especially if you try to grab the data off a tape in a secure storage facility.

I've always liked the idea of storing libraries and we have all 10,000 that we've run safely stored at -80C. These tubes take minimum space and most could be rerun today at a fraction of the cost from a few years ago. I am now wondering if we should go for an even greener solution and start the long term storage on Whatman cards (available from CamLab and others). A small number of cards could store almost everything we've ever run!

Is anyone doing this?

Tuesday, 14 July 2015

An example of how fast NGS develops

Illumina have discontinued version 1 of the NextSeq chemistry. Launched in January of last year, the NextSeq was a revolutionary new sequencer, although not everyone was an immediate fan. The V2 chemistry was launched just before AGBT and the data certainly looked a lot closer to the quality we expected from the longer-lived 4-colour SBS chemistry. The V1 discontinuation notice arrived in my inbox today, just 18 months after the NextSeq launch.

That's not much longer than the shelf-life of a kit!

Monday, 13 July 2015

Your genome for under £2000

Illumina have a new offer on their Understand Your Genome (UYG) program that means you can get your genome sequenced, analysed and clinically interpreted for under £2000.


Interested? Then there are a few requirements, mainly that you give informed consent and get a doctor's prescription for the test. Your DNA is sent to Illumina's own Clinical Services Laboratory, CLIA-certified since 2009. The results will be reported to you on the first day of the ASHG meeting in Baltimore. Samples need to be with Illumina by July 31st, giving them 67 days for sequencing and analysis.

You'll get back results on 12 genes important in pharmacogenomics, and hundreds of genes implicated in human disease. However, you'll need to discuss any "medically significant results" with your GP, and you can ask not to receive some data back.

Sounds like a pretty good bargain given you'd need to sequence 50+ genomes to get close to the $1000 genome from an X Ten provider. I'm not sure if you'll find out how much Neanderthal you're carrying around.

PS: If anyone fancies crowd-sourcing a Hadfieldome drop me a line, or my PayPal account is...

Thursday, 9 July 2015

Exciting developments in Pancreatic Cancer

A paper just published in Nature Communications describes a molecular analysis of Pancreatic Cancer by tumour exome and ctDNA targeted sequencing. The results showed enrichment of mutations in known PaCa-associated genes, and identified clinically actionable mutations in over a third of patients.