Thursday, 17 November 2016

MinION: 500kb reads and counting

A couple of Tweets today point to the amazing read lengths Oxford Nanopore's MinION sequencer is capable of generating - over 400kb!

Dominik Handler Tweeted a plot showing the read-length distribution from a run. In replies following the Tweet he describes the DNA handling as involving "no tricks, just very careful DNA isolation and no, really no pipetting (ok 2x pipetting required)".


and Martin Smith Tweeted an even longer read, almost 500kb in length...


Exactly how easily we'll all see similar read lengths is unclear, but it is going to be hugely dependent on the sample, and probably on having "green fingers" as well.

Here's Dominik's gel...


Wednesday, 9 November 2016

Unintended consequences of NGS-based NIPT?

The UK recently approved an NIPT test to screen high-risk pregnancies for foetal trisomy 21, 13, or 18 after the current primary screening test, and in place of amniocentesis (following on from the results of the RAPID study). I am 100% in favour of this kind of testing and 100% in favour of individuals, or couples, making the choice of what to do with the results. But what are the consequences of this kind of testing, and where do we go in a world where cfDNA foetal genomes are possible?


I decided to write this post after watching "A World Without Down's Syndrome?", a documentary on BBC2 that was presented by Sally Phillips (of Bridget Jones fame), mother to Olly who has Down's syndrome. She presented a programme where the case for the test was made (just), but the programme was very clearly pro-Down's, although not quite to the point of being anti-choice.

Friday, 21 October 2016

Does the world have too many HiSeq X Tens?

Illumina stock dropped 25%, hammered by the stock market after the company's recent announcement that Q3 revenues would be 3.4% lower than expected at just $607 million. This makes Illumina a much more attractive acquisition target (although I doubt this summer's rumours of a Thermo bid had any substance), and also makes a lot of people ask the question "why?"

The reason given for the shortfall was "a larger than anticipated year-over-year decline in high-throughput sequencing instruments" i.e. Illumina sold fewer sequencers than it expected to. It is difficult to turn these revenue figures and statements into the number of HiSeq 2500s, 4000s or Xs that Illumina missed its internal forecasts by, but according to Francis de Souza Illumina "closed one less X deal than anticipated" - although he did not say if this was an X5, X10 or X30! Perhaps more telling was that de Souza was quoted saying that "[Illumina was not counting on a continuing increase in new sequencer sales]"...so is the market full to bursting?



Controlling for bisulfite conversion efficiency with a 1% Lambda spike-in

The use of DNA methylation analysis by NGS has become a standard tool in many labs. In a project design discussion we had today somebody mentioned a control for bisulfite conversion efficiency that I'd missed; as it's such a simple one I thought I'd briefly mention it here. In their PLoS Genet 2013 paper, Shirane et al from Kyushu University spiked in unmethylated lambda phage DNA (Promega) to control for conversion efficiency, and to check that the C/T conversion rate was greater than 99%.






The bisulfite conversion of cytosine bases to uracils, by deamination of unmethylated cytosine (as shown above), is the gold standard for methylation analysis.
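
As a rough illustration of how the lambda check works: the spiked-in lambda DNA is unmethylated, so every cytosine in it should be read as a T after conversion, and the conversion rate is simply the fraction of lambda cytosine calls that were read as T. The sketch below assumes you already have counts of converted (T) and unconverted (C) calls at lambda cytosine positions from your own aligner or methylation caller. The function names and the example counts are made up for illustration and are not taken from Shirane et al.

```python
def conversion_rate(converted: int, unconverted: int) -> float:
    """Fraction of lambda cytosine calls read as T (i.e. converted).

    converted   -- calls at lambda cytosine positions read as T
    unconverted -- calls at lambda cytosine positions still read as C
    """
    total = converted + unconverted
    if total == 0:
        raise ValueError("no lambda cytosine positions were covered")
    return converted / total


def passes_conversion_check(converted: int, unconverted: int,
                            threshold: float = 0.99) -> bool:
    """True if the conversion rate is greater than the threshold (99% by default)."""
    return conversion_rate(converted, unconverted) > threshold


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    rate = conversion_rate(converted=99_500, unconverted=500)
    print(f"lambda conversion rate: {rate:.2%}")
    print("passes 99% check:", passes_conversion_check(99_500, 500))
```

Because the spike-in is only about 1% of the library, it is worth checking that enough lambda cytosines were covered before trusting the estimate.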

Monday, 17 October 2016

SIRVs: RNA-seq controls from @Lexogen

This article was commissioned by Lexogen GmbH.

My lab has been performing RNA-seq for many years, and is currently building new services around single-cell RNA-seq. Fluidigm’s C1, academic efforts such as Drop-seq and inDrop, and commercial platforms from 10X Genomics, Dolomite Bio, Wafergen, Illumina/BioRad, RainDance and others make establishing the technology in your lab relatively simple. However, the data being generated can be difficult to analyse, and so we’ve been looking carefully at the controls we use, or should be using, for single-cell and standard RNA-seq experiments. The three options I’m considering are the Lexogen SIRVs (Spike-In RNA Variants), SEQUINs, and the ERCC 2.0 (External RNA Controls Consortium) controls. All are based on synthetically produced RNAs that aim to mimic complexities of the transcriptome: Lexogen’s SIRVs are the only controls that are currently available commercially; ERCC 2.0 is a developing standard (Lexogen is one of the groups contributing to the discussion); and SEQUINs for RNA and DNA were only recently published in Nature Methods.

You can win a free lane of HiSeq 2500 sequencing of your own RNA-seq libraries (with SIRVs of course) by applying for the Lexogen Research Award.


Lexogen’s SIRVs are probably the most complex controls available on the market today as they are designed to assess alternative splicing, alternative transcription start and end sites, overlapping genes, and antisense transcription. They consist of seven artificial genes, in-vitro transcribed as multiple (6-18) isoforms to generate a total of 69 transcripts. Each has a 5’ triphosphate and a 30nt poly(A)-tail, enabling both mRNA-seq and total RNA-seq methods. Transcripts vary from 191 to 2528nt long and have variable (30-50%) GC-content.



Want to know more? Lexogen are hosting a webinar to describe SIRVs in more detail on October 19th: Controlling RNA-seq experiments using spike-in RNA variants. They have also uploaded a manuscript to BioRxiv that describes the evaluation of SIRVs and provides links to the underlying RNA-seq data. If you are a bioinformatician you might want to download this data set and evaluate the SIRV reads yourself. Or read about how SIRVs are being used in single-cell RNA-seq in the latest paper from Sarah Teichmann’s group at EBI/Sanger.



Before diving into a more in-depth description of the Lexogen SIRVs, and how we might be using them in our standard and/or single-cell RNA-seq studies, I thought I’d start with a bit of a historical overview of how RNA controls came about...and that means going back to the days when microarrays were the tool of choice and NGS had yet to be invented!

Friday, 14 October 2016

Batch effects in scRNA-seq: to E or not to E(RCC spike-in)

At the recent Wellcome Trust conference on Single Cell Genomics (Twitter #scgen16) there was a great talk (her slides are online) from Stephanie Hicks in the @irrizarry group (Department of Biostatistics and Computational Biology at Dana-Farber Cancer Institute). Stephanie was talking about the recent work she's been doing looking at batch effects in single-cell data, all of which you can read about in her paper on BioRxiv: On the widespread and critical impact of systematic bias and batch effects in single-cell RNA-Seq data. You can also read about this paper over at NextGenSeek.

Adapted from Figure 1 in Hicks et al.

Tuesday, 11 October 2016

Clinical trials using ctDNA

DeciBio have a great interactive Tableau dashboard which you can use to browse and filter their analysis of 97 “laboratory biomarker analysis” immuno-oncology clinical trials; see: Diagnostic Biomarkers for Cancer Immunotherapy – Moving Beyond PD-L1. The raw data comes from ClinicalTrials.gov, where you can specify a "ctDNA" search and get back 50 trials, 40 of which are open.

Two of these trials are happening in the UK. Investigators at The Royal Marsden are looking to measure the presence or absence of ctDNA post CRT in EMVI-positive rectal cancer. And AstraZeneca are looking at ctDNA as a secondary outcome to obtain a preliminary assessment of the safety and efficacy of AZD0156, and its activity in tumours, by evaluation of the total amount of ctDNA.

You can also specify your own search terms and get back lists of trials from OpenTrials, which went live very recently. The Marsden's ctDNA trial above is currently listed.

You can use the DeciBio dashboard on their site. In the example below I filtered for trials using ctDNA analysis and came up with 7 results:

  1. Dabrafenib and Trametinib Followed by Ipilimumab and Nivolumab or Ipilimumab and Nivolumab Followed by Dabrafenib and Trametinib in Treating Patients With Stage III-IV BRAFV600 Melanoma
  2. Nivolumab in Eliminating Minimal Residual Disease and Preventing Relapse in Patients With Acute Myeloid Leukemia in Remission After Chemotherapy
  3. Nivolumab and Ipilimumab in Treating Patients With Advanced HIV Associated Solid Tumors
  4. Entinostat, Nivolumab, and Ipilimumab in Treating Patients With Solid Tumors That Are Metastatic or Cannot Be Removed by Surgery or Locally Advanced or Metastatic HER2-Negative Breast Cancer
  5. Nivolumab in Treating Patients With HTLV-Associated T-Cell Leukemia/Lymphoma
  6. Tremelimumab and Durvalumab With or Without Radiation Therapy in Patients With Relapsed Small Cell Lung Cancer
  7. Pembrolizumab, Letrozole, and Palbociclib in Treating Patients With Stage IV Estrogen Receptor Positive Breast Cancer With Stable Disease That Has Not Responded to Letrozole and Palbociclib


Thanks to DeciBio's Andrew Aijian for the analysis, dashboard and commentary. And to OpenTrials for making this kind of data open and accessible.

Friday, 7 October 2016

Index mis-assignment to Illumina's PhiX control

Multiplexing is the default option for most of the work being carried out in my lab, and it is one of the reasons Illumina has been so successful. Rather than the one-sample-per-lane we used to run when a GA1 generated only a few million reads per lane, we can now run a 24-sample RNA-seq experiment in one HiSeq 4000 lane and expect to get back 10-20M reads per sample. For almost anything other than genomes, multiplexed sequencing is the norm.

But index sequencing can go wrong, and this can and does happen even before anything gets on the sequencer. We noticed that PhiX has been turning up in demultiplexed sample FASTQ files. PhiX does not carry a sample index, so something is going wrong! What's happening? Is this a problem for indexing and multiplexing in general on NGS platforms? These are the questions I have been digging into recently, after our move from HiSeq 2500 to HiSeq 4000. In this post I'll describe what we've seen with mis-assignment of sample indexes to PhiX. And I'll review some of the literature that clearly pointed out the issue - in particular I'll refer to Jeff Hussmann's PhD thesis from 2015.

The problem of index mis-assignment to PhiX can be safely ignored, or easily fixed (so you could stop reading now). But understanding it has made me realise that index mis-assignment between samples is an issue we do not know enough about - and that the tools we're using may not be quite up to the job (but I'll not cover this in depth in this post).
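
One simple way to put a number on this is to ask what fraction of all the PhiX reads in a lane ended up in sample FASTQ files rather than in the undetermined reads. The sketch below assumes you have already aligned each demultiplexed sample to the PhiX genome and counted the hits. The function name, sample names and counts are all hypothetical and only there to illustrate the arithmetic.

```python
from typing import Dict


def phix_misassignment_rate(phix_by_sample: Dict[str, int],
                            phix_total: int) -> float:
    """Fraction of a lane's PhiX reads that were assigned to a sample index.

    phix_by_sample -- PhiX-aligned read count found in each sample's FASTQ
    phix_total     -- all PhiX-aligned reads in the lane (samples + undetermined)
    """
    if phix_total <= 0:
        raise ValueError("phix_total must be a positive read count")
    misassigned = sum(phix_by_sample.values())
    if misassigned > phix_total:
        raise ValueError("per-sample PhiX counts exceed the lane total")
    return misassigned / phix_total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    counts = {"sample_A": 120, "sample_B": 80}
    rate = phix_misassignment_rate(counts, phix_total=1_000_000)
    print(f"PhiX reads assigned to a sample index: {rate:.4%}")
```

This only measures mis-assignment to PhiX; it says nothing directly about reads swapping between samples.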


Tuesday, 20 September 2016

The future of Illumina according to @chrissyfarr

In yesterday's Fast Company piece Christina Farr (on Twitter) gives a very nice write-up of Illumina's history and where they are going with respect to bringing DNA sequencing into the clinic. I really liked the piece and wanted to share my thoughts after reading it with Core-Genomics readers.


Friday, 16 September 2016

Reporting on Fluidigm's single-cell user meeting at the Sanger Institute

The genomics community is pushing ahead fast on single-cell analysis methods as these are revolutionising how we approach biological questions. Unfortunately my registration went in too late for the meeting running at the Sanger Institute this week (follow #SCG16 on Twitter), but the Fluidigm pre-meeting was a great opportunity to hear what people are doing with their tech. And it should be a great opportunity to pick other users' brains about their challenges with all single-cell methods.



Imaging mass-cytometry: the most exciting thing to happen in 'omics?

Mark Unger (Fluidigm VP of R&D) started the meeting off by asking the audience to consider the two axes of single-cell analysis: 1) the number of cells being analysed, and 2) what questions you can ask of those cells (mRNA-seq is only one assay) - proteomics, epigenetics, SNPs, CNVs, etc.

Right now Fluidigm has the highest number of applications that can be run on single cells, with multiple Fluidigm and/or user-developed protocols on the Fluidigm Open App website; 10X Genomics only have single-cell 3' mRNA-seq right now, as do BioRad/Illumina and Drop-seq. But I am confident other providers will expand into non-3' mRNA assays...I'd go further and say that if they don't they'll find it hard to get traction, as users are likely to require a platform that can do more than one thing.