CoreGenomics: March 2015

Tuesday 31 March 2015

Book Review: Bang Wong's "Visual Strategies for Biological Data"

I've written before about how much I liked Nature Methods “Points of View” by Bang Wong and I created a public Mendeley group so you could access the papers. I'd also said that having the articles collected together as a hard-copy version would be great.

Now available is Nature Collections: Visual Strategies for Biological Data: "this e-book collects the Points of View columns published in Nature Methods through February 2015, providing practical advice on effective strategies for visualising biological data to researchers in the biological sciences."

Enjoy.

Friday 20 March 2015

Oxford Nanopore MinION for ctDNA sequencing

A great poster at AGBT was presented by Boreal Genomics and is available on the Nanopore wiki for MAPpers. In "A nanopore liquid biopsy" Patrick Davies describes their combination of the Boreal On-Target with ONT MinION sequencing to detect mutant allele fractions in ctDNA of sub 0.1%. I spoke briefly to Andre Marziali (Boreal Founder & CSO) about the work and summarise the poster here.

Fancy working in my lab?

I've currently got three positions open in my lab and thought I'd use this blog as another way to get the message out to prospective candidates. Two people recently moved on to new jobs, one at Inivata (the first spin-out from CRUK-CI) and one at AbCam, and another person was recently promoted. We're also busy, so we're recruiting for a six-month temporary contract to help out with the sequencing services.

If you want to see what the lab does please take a look at our lab website and you may have seen us on Twitter.

The posts:

Senior Scientific Associate (Genomics) £28,695-£37,394

Research Assistant Genomics £24,775-£28,695
Research Assistant - Genomics (Fixed Term) £24,775-£28,695

The posts will all be involved in providing Next-Generation Sequencing and library preparation services; including nucleic acid and library quant (with KAPA); setting up, monitoring and troubleshooting Illumina HiSeq, NextSeq and MiSeq sequencers; and library prep using a diversity of methods, such as Exome-seq, ChIP-seq, and RNA-seq - we do a lot of RNA-seq and Exomes. The senior post will be responsible for the day-to-day operational management of the NGS service, and will work alongside their counterpart running the library prep services.

The Genomics core has been operational for 8 years and we've focused on NGS for 7 of those; it is an experienced lab doing exciting work with a diverse set of users from across Cambridge, although our primary focus is on Cancer Research methods for scientists at the CRUK funded Cambridge Institute.

Please follow the links for details on applications rather than contacting me directly.

Thanks.

James.

PS: Closing date for all posts is 27 March 2015.

Friday 13 March 2015

A better way to sequence exomes?

I caught up with a new company on the target capture scene, Directed Genomics, at AGBT. Their approach is based on a simple idea: if you want to sequence exomes, why not capture only exons?

Most exome-seq methods (Illumina, Agilent, Nimblegen) use oligo-baits to pull down adapter-ligated fragment libraries, with fragments of 200-300bp. As exons are only around 170bp long (80–85% of human exons are less than 200bp; Zhu et al & Sakharkar et al) we sequence lots of near- or off-target bases. These can be used (cnvOffSeq for instance), but are to some degree wasted sequencing.
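To put rough numbers on that waste, here is a minimal sketch of the arithmetic. It assumes the exon sits entirely inside the captured fragment and that the whole fragment is sequenced, so it gives an upper bound on the exonic fraction; the function name is my own and not from any exome-capture tool.

```python
def on_target_fraction(exon_bp: int, fragment_bp: int) -> float:
    """Upper bound on the fraction of a fragment's bases that are exonic.

    Assumes the exon lies wholly within the fragment and that the
    fragment is sequenced end to end.
    """
    return min(exon_bp, fragment_bp) / fragment_bp


if __name__ == "__main__":
    # Fragment sizes quoted in the post for standard exome capture.
    for fragment_bp in (200, 250, 300):
        frac = on_target_fraction(170, fragment_bp)
        print(f"{fragment_bp}bp fragment: at most {frac:.0%} exonic")
```

With a 170bp exon, a 200bp fragment is at best 85% exonic, and the figure falls as fragments get longer.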

The Directed Genomics approach: similar to other exome capture companies, Directed Genomics also uses probe hybridisation to targeted regions and/or exons, but applies this in a very different manner than we’re used to with standard exome capture. Two methods are presented in their recent posters; the first uses two probes, one at each end of the exon; the second uses a single probe hyb and random 5’ end to create molecularly identifiable libraries. Current plans appear to be for custom panels, but hopefully they'll build out to a whole exome panel over time.

Directed Genomics workflows

1: In their dual-probe method a short 50bp biotinylated oligo probe is hybridised to fragmented gDNA at the 3’ end of an exon, the sequence upstream of this is then enzymatically digested and the 3’ hairpin adapter ligated. Next a second 50bp probe is hybridised to the 5’ end of the exon, the 5’ end is blunted and a 5’ adapter is ligated. Rather cleverly the hairpin adapter ligated at the 3’ end of the target links the target to the probe, allowing for a heat step in the second probe hybridisation without losing the target. Finally the 3’ hairpin is cleaved, releasing products for PCR amplification and sequencing that contain only targeted exonic sequences. On-target rates of 97% were reported in their AGBT poster.

2: In their single-probe method a short 50bp probe is hybridised to fragmented gDNA at the 3’ end of an exon, the sequence upstream of this is then enzymatically digested and the 3’ adapter ligated. The probe is then extended to create the complementary strand and a 5’ adapter is ligated to the blunt end. This creates a library with random 5’ ends enabling a duplicate filtering step, unlike PCR approaches.

The protocols are both same-day, taking 6-8 hours with around 1.5 hours hands-on time (according to the posters). Both allow some, or all, of the off-target sequence to be removed, reducing the amount of sequencing wasted. However the variation in exon length means that some sequence is inevitably lost.

Molecular IDs in cell free DNA: Their single-probe method creates libraries with in-built molecular ID. The random nature of the 5’ end should allow removal of all PCR duplication, without affecting biological duplication too much. Adding a molecular identifier to the 3’ probe would increase this even further; and also bring molecular ID to the dual-probe method.
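As a rough illustration of how a random 5’ end can act as a molecular ID, the sketch below collapses reads that share both a capture probe and a 5’ start coordinate. This is my own toy example and not Directed Genomics' pipeline; the `Read` fields and the probe labels are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    probe_id: str    # hypothetical label for the 3' capture probe
    five_prime: int  # coordinate of the random 5' end
    sequence: str


def collapse_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep one read per (probe, 5' end) family.

    Reads sharing both values are assumed to be PCR copies of a single
    original molecule; the first read seen represents the family.
    """
    families = defaultdict(list)
    for read in reads:
        families[(read.probe_id, read.five_prime)].append(read)
    return [family[0] for family in families.values()]


if __name__ == "__main__":
    reads = [
        Read("probe_1", 1000, "ACGT"),
        Read("probe_1", 1000, "ACGT"),  # same 5' end: treated as a PCR duplicate
        Read("probe_1", 1017, "ACGA"),  # different 5' end: a distinct molecule
    ]
    print(len(collapse_duplicates(reads)))
```

Two independent molecules can share a 5’ end by chance, so this will occasionally collapse genuine biological duplicates, which is why an extra identifier on the 3’ probe would help.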

These molecular IDs are likely to become increasingly important in methods to call low-frequency mutations in cell-free DNA applications, particularly ctDNA. Current methods make use of deep-sequencing to call mutations just below 1% MAF (mutant allele freq). However simply sequencing deeper may not be enough to get under 0.1%. A MAF of 0.1% would require sequencing to >10,000x to have enough mutant allele reads; and PCR, clustering and sequencing errors all make the detection harder.
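The depth argument can be checked with some simple arithmetic. This is a minimal sketch that treats each read as an independent draw with probability equal to the MAF and ignores PCR, clustering and sequencing errors entirely, so real-world performance will be worse; the function names are my own.

```python
import math


def expected_mutant_reads(depth: int, maf: float) -> float:
    """Mean number of mutant-allele reads at a given depth and MAF."""
    return depth * maf


def prob_at_least(k: int, depth: int, maf: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, maf), with no error model."""
    p_fewer = sum(
        math.comb(depth, i) * maf**i * (1 - maf) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    maf = 0.001  # 0.1% mutant allele frequency
    for depth in (1_000, 10_000, 50_000):
        mean = expected_mutant_reads(depth, maf)
        p5 = prob_at_least(5, depth, maf)
        print(f"{depth}x: ~{mean:.0f} mutant reads expected, P(>=5 reads) = {p5:.3f}")
```

At 1,000x a 0.1% variant yields about one mutant read on average, and at 10,000x about ten, which is the reasoning behind the >10,000x figure above.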

Adding a molecular identifier should allow us to develop better statistical methods to call lower and lower MAF. Ultimately we aim to get to a point where we are restricted more by the presence of mutant alleles in a sample than by the technology used to capture and sequence them.

Directed Genomics and cell free DNA: The AGBT poster contained results from the Horizon Diagnostics Multiplex Reference Standard (link). Correlations of observed vs expected allele frequencies were >0.91. This is one of the first methods that can target mutant alleles with a single oligo, as compared to the two used for PCR amplicon sequencing, e.g. TAM-seq. It should mean an increase in sensitivity as more ctDNA molecules can be captured and amplified.

Directed Genomics expects to be launching later in 2015.

Thursday 12 March 2015

Combining high-throughput CRISPR with in silico cancer drug development

In my last post I wrote about a computational screen of TCGA data and its use in repurposing approved drugs and/or finding new drug candidates for cancer patients. The work demonstrated the possibilities for finding novel treatments, but I also pointed to a cautionary Vemurafenib study that showed poor performance repurposing the drug in Colorectal cancer. As it becomes easier to identify novel therapies in a high-throughput manner, we need to develop methods to test these that are equally high-throughput. CRISPR knock-out or mutation of cancer drivers in multiple cancer cell lines or in tumour xenografts is one possibility - but most groups have carried out only a handful of knock-out or genome editing experiments.

In silico prescription of cancer drugs is likely to benefit patients - and can only get better

A fantastic paper just out in Cancer Cell: In Silico Prescription of Anticancer Drugs to Cohorts of 28 Tumor Types Reveals Targeting Opportunities from Nuria Lopez-Bigas's BioMedical Genomics lab in Barcelona.

They have developed an in silico prescription strategy by identifying the driver events in TCGA data, collating data on therapeutic drugs that target driver genes, and connecting patients with driver mutations to potential therapies (see figure 1 from their paper below).

40% (1635) of patients benefit from in silico prescription and repurposing of FDA-approved drugs
33.1% (1346) of additional patients benefit from in silico prescription and repurposing of drugs currently in clinical trials
39% of patients could benefit from novel combination therapies

Figure 1 from Rubio-Perez et al (Cancer Cell 2015)

To identify the driving events they took data from 28 cancers studied as part of TCGA and analysed all somatic SNVs, InDels, CNVs, fusions and RNA-seq differential gene expression. They found well over 400 genes that drive tumorigenesis via mutations, CNAs or gene fusions (data available at IntOGen http://www.intogen.org; Gonzalez-Perez et al., 2013a). Many of these driver events are loss-of-function mutations that could be druggable, are present in many samples, and are not in well-established cancer genes. 25 driver events occur in at least 5% of tumours of at least one cancer type.
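The prescription step (driver events per patient, drugs per driver gene, then a join between the two) can be sketched roughly as below. This is only a toy illustration and not the authors' or IntOGen's pipeline: the patient IDs, `GENE_X` and `drug_x` are hypothetical placeholders, and the only real pairing used is BRAF with vemurafenib, which is discussed in this post.

```python
from typing import Dict, List, Set


def prescribe(
    patient_drivers: Dict[str, Set[str]],
    drug_targets: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Map each patient to the drugs that target one of their driver genes."""
    return {
        patient: sorted(
            {drug for gene in genes for drug in drug_targets.get(gene, set())}
        )
        for patient, genes in patient_drivers.items()
    }


def fraction_with_option(prescriptions: Dict[str, List[str]]) -> float:
    """Fraction of patients with at least one candidate drug."""
    if not prescriptions:
        return 0.0
    return sum(1 for drugs in prescriptions.values() if drugs) / len(prescriptions)


if __name__ == "__main__":
    # Hypothetical cohort; gene-to-drug table is a placeholder, not curated data.
    patient_drivers = {
        "patient_1": {"BRAF"},
        "patient_2": {"GENE_X"},
        "patient_3": {"GENE_Y"},  # driver with no known drug in the table
    }
    drug_targets = {"BRAF": {"vemurafenib"}, "GENE_X": {"drug_x"}}
    result = prescribe(patient_drivers, drug_targets)
    print(result)
    print(f"{fraction_with_option(result):.0%} of patients have a candidate drug")
```

A lookup like this says nothing about whether the drug will work in a given tumour type, which is exactly the caveat raised by the Vemurafenib example.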

Understanding cancer biology is vital: Whilst exciting, the results presented need to be taken with a pinch of salt, and one worries about the headlines journalists might be using in non-scientific media!

ICGC/TCGA and other NGS-based cancer projects have discovered new insights into cancer biology. However the results need very careful evaluation (and clinical trials) before their impact can be stated, and, at least in the case of Vemurafenib, targeted therapy can fail when applied to a different cancer. Vemurafenib increases survival in up to 80% of melanoma patients with the BRAF V600E mutation (although many patients develop resistance). But when prescribed to BRAF V600E positive colorectal cancer patients only 5% responded (see Prahallad et al in Nature 2012). This was reported as being due to activation of EGFR, by inhibition of BRAF V600E, driving continued cell proliferation. EGFR is expressed at low levels in melanoma so the feedback activation is not significant. Examples like this, and we can expect more to be reported, demonstrate the need to understand cancer biology with respect to targeted therapeutics.

The future for this kind of analysis looks bright: ICGC/TCGA data sets are getting larger and richer, analysis algorithms continue to improve, data on basket trials is starting to be reported, and companies like Foundation Medicine are developing tests to report this kind of result. Hopefully this kind of analysis will be routine in just a few years' time.

Friday 6 March 2015

10X Genomics: what's the fuss over phasing?

At AGBT 2015 the big splash was clearly 10X Genomics and their new technology the GemCode "toaster"; presumably so called because of its diminutive size, and not because your microtitre plate is launched out the top nice and warm! The system is available to order now costing $75K, with a $500 per sample price. Using an input of just 1ng means users can test this even with precious clinical samples. Hopefully the improved structural variant detection 10X are promising will have a significant impact on cancer research, perhaps making translocation discovery easier.
