Welcome to the CBRG
The Computational Biology Research Group (CBRG) provides computing support for
bioinformatics analysis at the University
of Oxford. We have expertise in many aspects of bioinformatics (sequence analysis, microarrays, proteomics and integration).
We especially encourage collaborations that require writing custom software, bioinformatics tools and databases.
An account with the CBRG has many benefits and gives automatic access to a large number of molecular biology computing packages and
to numerous biological databases.
We are based at the Sir William Dunn School of Pathology and at the
Weatherall Institute of Molecular Medicine. Full details can be found on
the contact details page.
Taylor JC, Martin HC, Lise S, Broxholme J, Cazier JB, Rimmer A, Kanapin A, Lunter G, Fiddy S, Allan C, Aricescu AR, Attar M, Babbs C, Becq J, Beeson D, Bento C, Bignell P, Blair E, Buckle VJ, Bull K, Cais O, Cario H, Chapel H, Copley RR, Cornall R, Craft J, Dahan K, Davenport EE, Dendrou C, Devuyst O, Fenwick AL, Flint J, Fugger L, Gilbert RD, Goriely A, Green A, Greger IH, Grocock R, Gruszczyk AV, Hastings R, Hatton E, Higgs D, Hill A, Holmes C, Howard M, Hughes L, Humburg P, Johnson D, Karpe F, Kingsbury Z, Kini U, Knight JC, Krohn J, Lamble S, Langman C, Lonie L, Luck J, McCarthy D, McGowan SJ, McMullin MF, Miller KA, Murray L, Németh AH, Nesbit MA, Nutt D, Ormondroyd E, Oturai AB, Pagnamenta A, Patel SY, Percy M, Petousi N, Piazza P, Piret SE, Polanco-Echeverry G, Popitsch N, Powrie F, Pugh C, Quek L, Robbins PA, Robson K, Russo A, Sahgal N, van Schouwenburg PA, Schuh A, Silverman E, Simmons A, Sørensen PS, Sweeney E, Taylor J, Thakker RV, Tomlinson I, Trebes A, Twigg SR, Uhlig HH, Vyas P, Vyse T, Wall SA, Watkins H, Whyte MP, Witty L, Wright B, Yau C, Buck D, Humphray S, Ratcliffe PJ, Bell JI, Wilkie AO, Bentley D, Donnelly P, McVean G
Factors influencing success of clinical genome sequencing across a broad spectrum of disorders.
Nat Genet (2015) 47: 717-26
To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for mendelian disorders and 57% (8/14) in family trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, although only 4 were ultimately considered reportable. Our results demonstrate the value of genome sequencing for routine clinical diagnosis but also highlight many outstanding challenges.
Zhang Y, Makvandi-Nejad S, Qin L, Zhao Y, Zhang T, Wang L, Repapi E, Taylor S, McMichael A, Li N, Dong T, Wu H
Interferon-induced transmembrane protein-3 rs12252-C is associated with rapid progression of acute HIV-1 infection in Chinese MSM cohort.
AIDS (2015) 29: 889-94
The interferon-inducible transmembrane protein-3 (IFITM3) is a protein that restricts multiple pathogenic viruses such as influenza virus. The single-nucleotide polymorphism rs12252-C, which is rare in Caucasian populations, but much more common in the Han Chinese population, has been found in much higher homozygous frequency in patients with severe acute influenza. Until now, there has been no study on the effect of this genetic variant on the clinical control of other viral infections. This study aimed to investigate the impact of IFITM3-rs12252 genotypes on primary HIV-1 infection progression in an acute HIV-1-infected cohort in Beijing (PRIMO), China. We identified IFITM3-rs12252 genotypes of 178 acute HIV-1-infected patients and 196 HIV-negative candidates from the PRIMO cohort. HIV-1 viral load and CD4(+) T-cell counts were monitored at multiple time points during the first year of infection, and the association between IFITM3-rs12252 genotype and disease progression was evaluated. The current study shows that the IFITM3-rs12252 genetic variant affects the progression of HIV-1 infection, but not the acquisition. A significantly higher frequency of the CC/CT genotypes was found in rapid progressors compared to nonprogressors. Patients with CC/CT genotypes showed an elevated peak viremia level and significantly lower CD4(+) T-cell count at multiple time points during the first year of primary infection, and a significantly higher risk of rapid decline of the CD4(+) T-cell count to below 350 cells/μl. A novel association between IFITM3 gene polymorphism and rapid disease progression is reported in an acute HIV-1-infected MSM cohort in China.
French A, Yang CT, Taylor S, Watt SM, Carpenter L
Human induced pluripotent stem cell-derived B lymphocytes express sIgM and can be generated via a hemogenic endothelium intermediate.
Stem Cells Dev (2015) 24: 1082-95
The differentiation of human pluripotent stem cells to the B-cell lymphoid lineage has important clinical applications that include in vitro modeling of developmental lymphogenesis in health and disease. Here, we first demonstrate the capacity of human induced pluripotent stem cells (hiPSCs) to differentiate into CD144(+)CD73(-)CD43/CD235a(-) cells, characterized as hemogenic endothelium, and show that this population is capable of differentiating to CD10(+)CD19(+) B lymphocytes. We also demonstrate that B lymphocytes generated from hiPSCs are able to undergo full VDJ rearrangement and express surface IgM (sIgM(+)), thus representing an immature B-cell subset. Efficiency of sIgM expression on the hiPSC-derived B lymphocytes (~5% of CD19(+) cells) was comparable with B lymphocytes generated from human umbilical cord blood (UCB) hematopoietic progenitor cells. Importantly, when assessed by global transcriptional profiling, hiPSC-derived B-cells show a very high level of similarity when compared with their UCB-derived counterparts, such that from more than 47,000 different transcripts, only 45 were significantly different (with a criteria adjusted P value P<0.05, log FC >1.5 or 2.8-fold). This represents a unique in vitro model to delineate critical events during lymphogenesis in development and lymphoid diseases such as acute lymphocytic leukemia.
Babbs C, Lloyd D, Pagnamenta AT, Twigg SR, Green J, McGowan SJ, Mirza G, Naples R, Sharma VP, Volpi EV, Buckle VJ, Wall SA, Knight SJ, Parr JR, Wilkie AO
De novo and rare inherited mutations implicate the transcriptional coregulator TCF20/SPBP in autism spectrum disorder.
J Med Genet (2014) 51: 737-47
Autism spectrum disorders (ASDs) are common and have a strong genetic basis, yet the cause of ~70-80% ASDs remains unknown. By clinical cytogenetic testing, we identified a family in which two brothers had ASD, mild intellectual disability and a chromosome 22 pericentric inversion, not detected in either parent, indicating de novo mutation with parental germinal mosaicism. We hypothesised that the rearrangement was causative of their ASD and localised the chromosome 22 breakpoints. The rearrangement was characterised using fluorescence in situ hybridisation, Southern blotting, inverse PCR and dideoxy-sequencing. Open reading frames and intron/exon boundaries of the two physically disrupted genes identified, TCF20 and TNRC6B, were sequenced in 342 families (260 multiplex and 82 simplex) ascertained by the International Molecular Genetic Study of Autism Consortium (IMGSAC). IMGSAC family screening identified a de novo missense mutation of TCF20 in a single case and significant association of a different missense mutation of TCF20 with ASD in three further families. Through exome sequencing in another project, we independently identified a de novo frameshifting mutation of TCF20 in a woman with ASD and moderate intellectual disability. We did not identify a significant association of TNRC6B mutations with ASD. TCF20 encodes a transcriptional coregulator (also termed SPBP) that is structurally and functionally related to RAI1, the critical dosage-sensitive protein implicated in the behavioural phenotypes of the Smith-Magenis and Potocki-Lupski 17p11.2 deletion/duplication syndromes, in which ASD is frequently diagnosed. This study provides the first evidence that mutations in TCF20 are also associated with ASD.
Bassett AR, Azzam G, Wheatley L, Tibbit C, Rajakumar T, McGowan S, Stanger N, Ewels PA, Taylor S, Ponting CP, Liu JL, Sauka-Spengler T, Fulga TA
Understanding functional miRNA-target interactions in vivo by site-specific genome engineering.
Nat Commun (2014) 5: 4640
MicroRNA (miRNA) target recognition is largely dictated by short 'seed' sequences, and single miRNAs therefore have the potential to regulate a large number of genes. Understanding the contribution of specific miRNA-target interactions to the regulation of biological processes in vivo remains challenging. Here we use transcription activator-like effector nuclease (TALEN) and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 technologies to interrogate the functional relevance of predicted miRNA response elements (MREs) to post-transcriptional silencing in zebrafish and Drosophila. We also demonstrate an effective strategy that uses CRISPR-mediated homology-directed repair with short oligonucleotide donors for the assessment of MRE activity in human cells. These methods facilitate analysis of the direct phenotypic consequences resulting from blocking specific miRNA-MRE interactions at any point during development.
Armitage AE, Stacey AR, Giannoulatou E, Marshall E, Sturges P, Chatha K, Smith NM, Huang X, Xu X, Pasricha SR, Li N, Wu H, Webster C, Prentice AM, Pellegrino P, Williams I, Norris PJ, Drakesmith H, Borrow P
Distinct patterns of hepcidin and iron regulation during HIV-1, HBV, and HCV infections.
Proc Natl Acad Sci U S A (2014) 111: 12187-92
During HIV type-1 (HIV-1), hepatitis C virus (HCV), and hepatitis B virus (HBV) infections, altered iron balance correlates with morbidity. The liver-produced hormone hepcidin dictates systemic iron homeostasis. We measured hepcidin, iron parameters, cytokines, and inflammatory markers in three cohorts: plasma donors who developed acute HIV-1, HBV, or HCV viremia during the course of donations; HIV-1-positive individuals progressing from early to chronic infection; and chronically HIV-1-infected individuals (receiving antiretroviral therapy or untreated). Hepcidin increased and plasma iron decreased during acute HIV-1 infection, as viremia was initially detected. In patients transitioning from early to chronic HIV-1 infection, hepcidin in the first 60 d of infection positively correlated with the later plasma viral load set-point. Hepcidin remained elevated in individuals with untreated chronic HIV-1 infection and in subjects on ART. In contrast to HIV-1, there was no evidence of hepcidin up-regulation or hypoferremia during the primary viremic phases of HCV or HBV infection; serum iron marginally increased during acute HBV infection. In conclusion, hepcidin induction is part of the pathogenically important systemic inflammatory cascade triggered during HIV-1 infection and may contribute to the establishment and maintenance of viral set-point, which is a strong predictor of progression to AIDS and death. However, distinct patterns of hepcidin and iron regulation occur during different viral infections that have particular tissue tropisms and elicit different systemic inflammatory responses. The hypoferremia of acute infection is therefore a pathogen-specific, not universal, phenomenon.
Taylor S, Noble R
HTML5 PivotViewer: high-throughput visualization and querying of image data on the web.
Bioinformatics (2014) 30: 2691-2
Visualization and analysis of large numbers of biological images has generated a bottleneck in research. We present HTML5 PivotViewer, a novel, open source, platform-independent viewer making use of the latest web technologies that allows seamless access to images and associated metadata for each image. This provides a powerful method to allow end users to mine their data. Documentation, examples and links to the software are available from http://www.cbrg.ox.ac.uk/data/pivotviewer/. The software is licensed under GPLv2.
Woll PS, Kjällquist U, Chowdhury O, Doolittle H, Wedge DC, Thongjuea S, Erlandsson R, Ngara M, Anderson K, Deng Q, Mead AJ, Stenson L, Giustacchini A, Duarte S, Giannoulatou E, Taylor S, Karimi M, Scharenberg C, Mortera-Blanco T, Macaulay IC, Clark SA, Dybedal I, Josefsen D, Fenaux P, Hokland P, Holm MS, Cazzola M, Malcovati L, Tauro S, Bowen D, Boultwood J, Pellagatti A, Pimanda JE, Unnikrishnan A, Vyas P, Göhring G, Schlegelberger B, Tobiasson M, Kvalheim G, Constantinescu SN, Nerlov C, Nilsson L, Campbell PJ, Sandberg R, Papaemmanuil E, Hellström-Lindberg E, Linnarsson S, Jacobsen SE
Myelodysplastic syndromes are propagated by rare and distinct human cancer stem cells in vivo.
Cancer Cell (2014) 25: 794-808
Evidence for distinct human cancer stem cells (CSCs) remains contentious and the degree to which different cancer cells contribute to propagating malignancies in patients remains unexplored. In low- to intermediate-risk myelodysplastic syndromes (MDS), we establish the existence of rare multipotent MDS stem cells (MDS-SCs), and their hierarchical relationship to lineage-restricted MDS progenitors. All identified somatically acquired genetic lesions were backtracked to distinct MDS-SCs, establishing their distinct MDS-propagating function in vivo. In isolated del(5q)-MDS, acquisition of del(5q) preceded diverse recurrent driver mutations. Sequential analysis in del(5q)-MDS revealed genetic evolution in MDS-SCs and MDS-progenitors prior to leukemic transformation. These findings provide definitive evidence for rare human MDS-SCs in vivo, with extensive implications for the targeting of the cells required and sufficient for MDS-propagation.
Clynes D, Jelinska C, Xella B, Ayyub H, Taylor S, Mitson M, Bachrati CZ, Higgs DR, Gibbons RJ
ATRX dysfunction induces replication defects in primary mouse cells.
PLoS One (2014) 9: e92915
The chromatin remodeling protein ATRX, which targets tandem repetitive DNA, has been shown to be required for expression of the alpha globin genes, for proliferation of a variety of cellular progenitors, for chromosome congression and for the maintenance of telomeres. Mutations in ATRX have recently been identified in tumours which maintain their telomeres by a telomerase independent pathway involving homologous recombination thought to be triggered by DNA damage. It is as yet unknown whether there is a central underlying mechanism associated with ATRX dysfunction which can explain the numerous cellular phenomena observed. There is, however, growing evidence for its role in the replication of various repetitive DNA templates which are thought to have a propensity to form secondary structures. Using a mouse knockout model we demonstrate that ATRX plays a direct role in facilitating DNA replication. Ablation of ATRX alone, although leading to a DNA damage response at telomeres, is not sufficient to trigger the alternative lengthening of telomere pathway in mouse embryonic stem cells.
Hay AS, Pieper B, Cooke E, Mandáková T, Cartolano M, Tattersall AD, Ioio RD, McGowan SJ, Barkoulas M, Galinha C, Rast MI, Hofhuis H, Then C, Plieske J, Ganal M, Mott R, Martinez-Garcia JF, Carine MA, Scotland RW, Gan X, Filatov DA, Lysak MA, Tsiantis M
Cardamine hirsuta: a versatile genetic system for comparative studies.
Plant J (2014) 78: 1-15
A major goal in biology is to identify the genetic basis for phenotypic diversity. This goal underpins research in areas as diverse as evolutionary biology, plant breeding and human genetics. A limitation for this research is no longer the availability of sequence information but the development of functional genetic tools to understand the link between changes in sequence and phenotype. Here we describe Cardamine hirsuta, a close relative of the reference plant Arabidopsis thaliana, as an experimental system in which genetic and transgenic approaches can be deployed effectively for comparative studies. We present high-resolution genetic and cytogenetic maps for C. hirsuta and show that the genome structure of C. hirsuta closely resembles the eight chromosomes of the ancestral crucifer karyotype and provides a good reference point for comparative genome studies across the Brassicaceae. We compared morphological and physiological traits between C. hirsuta and A. thaliana and analysed natural variation in stamen number in which lateral stamen loss is a species characteristic of C. hirsuta. We constructed a set of recombinant inbred lines and detected eight quantitative trait loci that can explain stamen number variation in this population. We found clear phylogeographic structure to the genetic variation in C. hirsuta, thus providing a context within which to address questions about evolutionary changes that link genotype with phenotype and the environment.
Gutowska-Owsiak D, Selvakumar TA, Salimi M, Taylor S, Ogg GS
Histamine enhances keratinocyte-mediated resolution of inflammation by promoting wound healing and response to infection.
Clin Exp Dermatol (2014) 39: 187-95
The role of the epidermis in the immune response is well known. While multiple cytokines are implicated in keratinocyte-mediated infection clearance and wound healing, little is known about the involvement of keratinocytes in promoting resolution of inflammation. We aimed to assess the effects of histamine stimulation on keratinocyte function. We performed a combined microarray/Gene Ontology analysis of histamine-stimulated keratinocytes. Functional changes were tested by apoptosis assessment and scratch assays. Histamine receptor involvement was also assessed by blocking wound closure with specific antagonists. Histamine treatment had extensive effects on keratinocytes, including effects on proinflammatory responses and cellular functions promoting wound healing. At the functional level, there was reduced apoptosis and enhancement of wound healing in vitro. At the receptor level, we identified involvement of all keratinocyte-expressed histamine receptors (HRHs), with HRH1 blockage resulting in the most prominent effect. Histamine activates wound healing and infection clearance-related functions of keratinocytes. While enhancement of histamine-mediated wound healing is mediated predominantly via the HRH1 receptor, other keratinocyte-expressed receptors are also involved. These effects could promote resolution of skin inflammation caused by infection or superficial injury.
Hughes JR, Roberts N, McGowan S, Hay D, Giannoulatou E, Lynch M, De Gobbi M, Taylor S, Gibbons R, Higgs DR
Analysis of hundreds of cis-regulatory landscapes at high resolution in a single, high-throughput experiment.
Nat Genet (2014) 46: 205-12
Gene expression during development and differentiation is regulated in a cell- and stage-specific manner by complex networks of intergenic and intragenic cis-regulatory elements whose numbers and representation in the genome far exceed those of structural genes. Using chromosome conformation capture, it is now possible to analyze in detail the interaction between enhancers, silencers, boundary elements and promoters at individual loci, but these techniques are not readily scalable. Here we present a high-throughput approach (Capture-C) to analyze cis interactions, interrogating hundreds of specific interactions at high resolution in a single experiment. We show how this approach will facilitate detailed, genome-wide analysis to elucidate the general principles by which cis-acting sequences control gene expression. In addition, we show how Capture-C will expedite identification of the target genes and functional effects of SNPs that are associated with complex diseases, which most frequently lie in intergenic cis-acting regulatory elements.
Favaro FP, Alvizi L, Zechi-Ceide RM, Bertola D, Felix TM, de Souza J, Raskin S, Twigg SR, Weiner AM, Armas P, Margarit E, Calcaterra NB, Andersen GR, McGowan SJ, Wilkie AO, Richieri-Costa A, de Almeida ML, Passos-Bueno MR
A noncoding expansion in EIF4A3 causes Richieri-Costa-Pereira syndrome, a craniofacial disorder associated with limb defects.
Am J Hum Genet (2014) 94: 120-8
Richieri-Costa-Pereira syndrome is an autosomal-recessive acrofacial dysostosis characterized by mandibular median cleft associated with other craniofacial anomalies and severe limb defects. Learning and language disabilities are also prevalent. We mapped the mutated gene to a 122 kb region at 17q25.3 through identity-by-descent analysis in 17 genealogies. Sequencing strategies identified an expansion of a region with several repeats of 18- or 20-nucleotide motifs in the 5' untranslated region (5' UTR) of EIF4A3, which contained from 14 to 16 repeats in the affected individuals and from 3 to 12 repeats in 520 healthy individuals. A missense substitution of a highly conserved residue likely to affect the interaction of eIF4AIII with the UPF3B subunit of the exon junction complex in trans with an expanded allele was found in an unrelated individual with an atypical presentation, thus expanding mutational mechanisms and phenotypic diversity of RCPS. EIF4A3 transcript abundance was reduced in both white blood cells and mesenchymal cells of RCPS-affected individuals as compared to controls. Notably, targeting the orthologous eif4a3 in zebrafish led to underdevelopment of several craniofacial cartilage and bone structures, in agreement with the craniofacial alterations seen in RCPS. Our data thus suggest that RCPS is caused by mutations in EIF4A3 and show that EIF4A3, a gene involved in RNA metabolism, plays a role in mandible, laryngeal, and limb morphogenesis.
Pan X, Huang LC, Dong T, Peng Y, Cerundolo V, McGowan S, Ogg G
Combinatorial HLA-peptide bead libraries for high throughput identification of CD8⁺ T cell specificity.
J Immunol Methods (2014) 403: 72-8
Comprehensive antigenic characterization of a T cell population of unknown specificity is challenging. Existing MHC class I expression systems are limited by the practical difficulty of probing cell populations with an MHC class I peptide library and the cross-reactivity of T cells that are able to recognise many variants of an index peptide. Using emulsion PCR and emulsion in vitro transcription/translation of a random library of peptides conjugated to CD8-null HLA-A*0201 on beads, we probed HLA-A*0201-restricted T cells with specificity for influenza, CMV and EBV. We observed significant enrichment for sequences containing HLA-A2 anchors and correct viral fragments for all T cell populations. HLA bead display provides a novel approach to identify the specificity of T cells.
Swiers G, Baumann C, O'Rourke J, Giannoulatou E, Taylor S, Joshi A, Moignard V, Pina C, Bee T, Kokkaliaris KD, Yoshimoto M, Yoder MC, Frampton J, Schroeder T, Enver T, Göttgens B, de Bruijn MF
Early dynamic fate changes in haemogenic endothelium characterized at the single-cell level.
Nat Commun (2013) 4: 2924
Haematopoietic stem cells (HSCs) are the founding cells of the adult haematopoietic system, born during ontogeny from a specialized subset of endothelium, the haemogenic endothelium (HE) via an endothelial-to-haematopoietic transition (EHT). Although recently imaged in real time, the underlying mechanism of EHT is still poorly understood. We have generated a Runx1 +23 enhancer-reporter transgenic mouse (23GFP) for the prospective isolation of HE throughout embryonic development. Here we perform functional analysis of over 1,800 and transcriptional analysis of 268 single 23GFP(+) HE cells to explore the onset of EHT at the single-cell level. We show that initiation of the haematopoietic programme occurs in cells still embedded in the endothelial layer, and is accompanied by a previously unrecognized early loss of endothelial potential before HSCs emerge. Our data therefore provide important insights on the timeline of early haematopoietic commitment.