Welcome to the CBRG
The Computational Biology Research Group (CBRG) provides computing support for bioinformatics analysis at the University of Oxford.
We have expertise in many aspects of bioinformatics (sequence analysis, microarrays, proteomics and integration).
We especially encourage collaborations that require writing custom software, bioinformatics tools and databases.
An account with the CBRG has many benefits and gives automatic access to a large number of molecular biology computing packages and
to numerous biological databases.
We are based at the Sir William Dunn School of Pathology and at the
Weatherall Institute of Molecular Medicine. Full details can be found on
the contact details page.
Bioinformatics analysis tools online:
A wide range of bioinformatics programs are available online via EMBOSS Explorer.
You will need a molbiol username and password to use these tools.
Other web tools are also available, including BLAST (see full list).
Hay AS, Pieper B, Cooke E, Mandáková T, Cartolano M, Tattersall AD, Ioio RD, McGowan SJ, Barkoulas M, Galinha C, Rast MI, Hofhuis H, Then C, Plieske J, Ganal M, Mott R, Martinez-Garcia JF, Carine MA, Scotland RW, Gan X, Filatov DA, Lysak MA, Tsiantis M
Cardamine hirsuta: a versatile genetic system for comparative studies.
Plant J (2014) :
A major goal in biology is to identify the genetic basis for phenotypic diversity. This goal underpins research in areas as diverse as evolutionary biology, plant breeding and human genetics. A limitation for this research is no longer the availability of sequence information but the development of functional genetic tools to understand the link between changes in sequence and phenotype. Here we describe Cardamine hirsuta, a close relative of the reference plant Arabidopsis thaliana, as an experimental system where genetic and transgenic approaches can be effectively deployed for comparative studies. We present high-resolution genetic and cytogenetic maps for C. hirsuta and show that the genome structure of C. hirsuta closely resembles the eight chromosomes of the Ancestral Crucifer Karyotype and provides a good reference point for comparative genome studies across the Brassicaceae. We compare morphological and physiological traits between C. hirsuta and A. thaliana and analyse natural variation in stamen number where lateral stamen loss is a species characteristic of C. hirsuta. We construct a set of Recombinant Inbred Lines and detect eight quantitative trait loci that can explain stamen number variation in this population. We find clear phylogeographic structure to the genetic variation in C. hirsuta, thus providing a context within which to address questions about evolutionary change that link genotype with phenotype and environment.
Gutowska-Owsiak D, Selvakumar TA, Salimi M, Taylor S, Ogg GS
Histamine enhances keratinocyte-mediated resolution of inflammation by promoting wound healing and response to infection.
Clin Exp Dermatol (2014) 39: 187-95
The role of the epidermis in the immune response is well known. While multiple cytokines are implicated in keratinocyte-mediated infection clearance and wound healing, little is known about the involvement of keratinocytes in promoting resolution of inflammation. To assess effects of histamine stimulation on keratinocyte function. We performed a combined microarray/Gene Ontology analysis of histamine-stimulated keratinocytes. Functional changes were tested by apoptosis assessment and scratch assays. Histamine receptor involvement was also assessed by blocking wound closure with specific antagonists. Histamine treatment had extensive effects on keratinocytes, including effects on proinflammatory responses and cellular functions promoting wound healing. At the functional level, there was reduced apoptosis and enhancement of wound healing in vitro. At the receptor level, we identified involvement of all keratinocyte-expressed histamine receptors (HRHs), with HRH1 blockage resulting in the most prominent effect. Histamine activates wound healing and infection clearance-related functions of keratinocytes. While enhancement of histamine-mediated wound healing is mediated predominantly via the HRH1 receptor, other keratinocyte-expressed receptors are also involved. These effects could promote resolution of skin inflammation caused by infection or superficial injury.
Hughes JR, Roberts N, McGowan S, Hay D, Giannoulatou E, Lynch M, De Gobbi M, Taylor S, Gibbons R, Higgs DR
Analysis of hundreds of cis-regulatory landscapes at high resolution in a single, high-throughput experiment.
Nat Genet (2014) 46: 205-12
Gene expression during development and differentiation is regulated in a cell- and stage-specific manner by complex networks of intergenic and intragenic cis-regulatory elements whose numbers and representation in the genome far exceed those of structural genes. Using chromosome conformation capture, it is now possible to analyze in detail the interaction between enhancers, silencers, boundary elements and promoters at individual loci, but these techniques are not readily scalable. Here we present a high-throughput approach (Capture-C) to analyze cis interactions, interrogating hundreds of specific interactions at high resolution in a single experiment. We show how this approach will facilitate detailed, genome-wide analysis to elucidate the general principles by which cis-acting sequences control gene expression. In addition, we show how Capture-C will expedite identification of the target genes and functional effects of SNPs that are associated with complex diseases, which most frequently lie in intergenic cis-acting regulatory elements.
Favaro FP, Alvizi L, Zechi-Ceide RM, Bertola D, Felix TM, de Souza J, Raskin S, Twigg SR, Weiner AM, Armas P, Margarit E, Calcaterra NB, Andersen GR, McGowan SJ, Wilkie AO, Richieri-Costa A, de Almeida ML, Passos-Bueno MR
A noncoding expansion in EIF4A3 causes Richieri-Costa-Pereira syndrome, a craniofacial disorder associated with limb defects.
Am J Hum Genet (2014) 94: 120-8
Richieri-Costa-Pereira syndrome is an autosomal-recessive acrofacial dysostosis characterized by mandibular median cleft associated with other craniofacial anomalies and severe limb defects. Learning and language disabilities are also prevalent. We mapped the mutated gene to a 122 kb region at 17q25.3 through identity-by-descent analysis in 17 genealogies. Sequencing strategies identified an expansion of a region with several repeats of 18- or 20-nucleotide motifs in the 5' untranslated region (5' UTR) of EIF4A3, which contained from 14 to 16 repeats in the affected individuals and from 3 to 12 repeats in 520 healthy individuals. A missense substitution of a highly conserved residue likely to affect the interaction of eIF4AIII with the UPF3B subunit of the exon junction complex in trans with an expanded allele was found in an unrelated individual with an atypical presentation, thus expanding mutational mechanisms and phenotypic diversity of RCPS. EIF4A3 transcript abundance was reduced in both white blood cells and mesenchymal cells of RCPS-affected individuals as compared to controls. Notably, targeting the orthologous eif4a3 in zebrafish led to underdevelopment of several craniofacial cartilage and bone structures, in agreement with the craniofacial alterations seen in RCPS. Our data thus suggest that RCPS is caused by mutations in EIF4A3 and show that EIF4A3, a gene involved in RNA metabolism, plays a role in mandible, laryngeal, and limb morphogenesis.
Pan X, Huang LC, Dong T, Peng Y, Cerundolo V, McGowan S, Ogg G
Combinatorial HLA-peptide bead libraries for high throughput identification of CD8(+) T cell specificity.
J Immunol Methods (2014) 403: 72-8
Comprehensive antigenic characterization of a T cell population of unknown specificity is challenging. Existing MHC class I expression systems are limited by the practical difficulty of probing cell populations with an MHC class I peptide library and the cross-reactivity of T cells that are able to recognise many variants of an index peptide. Using emulsion PCR and emulsion in vitro transcription/translation of a random library of peptides conjugated to CD8-null HLA-A*0201 on beads, we probed HLA-A*0201-restricted T cells with specificity for influenza, CMV and EBV. We observed significant enrichment for sequences containing HLA-A2 anchors and correct viral fragments for all T cell populations. HLA bead display provides a novel approach to identify the specificity of T cells.
Swiers G, Baumann C, O'Rourke J, Giannoulatou E, Taylor S, Joshi A, Moignard V, Pina C, Bee T, Kokkaliaris KD, Yoshimoto M, Yoder MC, Frampton J, Schroeder T, Enver T, Göttgens B, de Bruijn MF
Early dynamic fate changes in haemogenic endothelium characterized at the single-cell level.
Nat Commun (2013) 4: 2924
Haematopoietic stem cells (HSCs) are the founding cells of the adult haematopoietic system, born during ontogeny from a specialized subset of endothelium, the haemogenic endothelium (HE) via an endothelial-to-haematopoietic transition (EHT). Although recently imaged in real time, the underlying mechanism of EHT is still poorly understood. We have generated a Runx1 +23 enhancer-reporter transgenic mouse (23GFP) for the prospective isolation of HE throughout embryonic development. Here we perform functional analysis of over 1,800 and transcriptional analysis of 268 single 23GFP(+) HE cells to explore the onset of EHT at the single-cell level. We show that initiation of the haematopoietic programme occurs in cells still embedded in the endothelial layer, and is accompanied by a previously unrecognized early loss of endothelial potential before HSCs emerge. Our data therefore provide important insights on the timeline of early haematopoietic commitment.
Giannoulatou E, McVean G, Taylor IB, McGowan SJ, Maher GJ, Iqbal Z, Pfeifer SP, Turner I, Burkitt Wright EM, Shorto J, Itani A, Turner K, Gregory L, Buck D, Rajpert-De Meyts E, Looijenga LH, Kerr B, Wilkie AO, Goriely A
Contributions of intrinsic mutation rate and selfish selection to levels of de novo HRAS mutations in the paternal germline.
Proc Natl Acad Sci U S A (2013) 110: 20152-7
The RAS proto-oncogene Harvey rat sarcoma viral oncogene homolog (HRAS) encodes a small GTPase that transduces signals from cell surface receptors to intracellular effectors to control cellular behavior. Although somatic HRAS mutations have been described in many cancers, germline mutations cause Costello syndrome (CS), a congenital disorder associated with predisposition to malignancy. Based on the epidemiology of CS and the occurrence of HRAS mutations in spermatocytic seminoma, we proposed that activating HRAS mutations become enriched in sperm through a process akin to tumorigenesis, termed selfish spermatogonial selection. To test this hypothesis, we quantified the levels, in blood and sperm samples, of HRAS mutations at the p.G12 codon and compared the results to changes at the p.A11 codon, at which activating mutations do not occur. The data strongly support the role of selection in determining HRAS mutation levels in sperm, and hence the occurrence of CS, but we also found differences from the mutation pattern in tumorigenesis. First, the relative prevalence of mutations in sperm correlates weakly with their in vitro activating properties and occurrence in cancers. Second, specific tandem base substitutions (predominantly GC>TT/AA) occur in sperm but not in cancers; genomewide analysis showed that this same mutation is also overrepresented in constitutional pathogenic and polymorphic variants, suggesting a heightened vulnerability to these mutations in the germline. We developed a statistical model to show how both intrinsic mutation rate and selfish selection contribute to the mutational burden borne by the paternal germline.
Salek M, McGowan S, Trudgian DC, Dushek O, de Wet B, Efstathiou G, Acuto O
Quantitative phosphoproteome analysis unveils LAT as a modulator of CD3ζ and ZAP-70 tyrosine phosphorylation.
PLoS One (2013) 8: e77423
Signaling through the T cell receptor (TCR) initiates adaptive immunity and its perturbation may result in autoimmunity. The plasma membrane scaffolding protein LAT acts as a central organizer of the TCR signaling machinery to activate many functional pathways. LAT-deficient mice develop an autoimmune syndrome but the mechanism of this pathology is unknown. In this work we have compared global dynamics of TCR signaling by MS-based quantitative phosphoproteomics in LAT-sufficient and LAT-defective Jurkat T cells. Surprisingly, we found that many TCR-induced phosphorylation events persist in the absence of LAT, despite ERK and PLCγ1 phosphorylation being repressed. Most importantly, the absence of LAT resulted in augmented and persistent tyrosine phosphorylation of CD3ζ and ZAP70. This indicates that the LAT signaling hub is also implicated in negative feedback signals to modulate upstream phosphorylation events. Phosphorylation kinetics data resulting from this investigation are documented in a database (phosphoTCR) accessible online. The MS data have been deposited to the ProteomeXchange with identifier PXD000341.
May G, Soneji S, Tipping AJ, Teles J, McGowan SJ, Wu M, Guo Y, Fugazza C, Brown J, Karlsson G, Pina C, Olariu V, Taylor S, Tenen DG, Peterson C, Enver T
Dynamic analysis of gene expression and genome-wide transcription factor binding during lineage specification of multipotent progenitors.
Cell Stem Cell (2013) 13: 754-68
We used the paradigmatic GATA-PU.1 axis to explore, at the systems level, dynamic relationships between transcription factor (TF) binding and global gene expression programs as multipotent cells differentiate. We combined global ChIP-seq of GATA1, GATA2, and PU.1 with expression profiling during differentiation to erythroid and neutrophil lineages. Our analysis reveals (1) differential complexity of sequence motifs bound by GATA1, GATA2, and PU.1; (2) the scope and interplay of GATA1 and GATA2 programs within, and during transitions between, different cell compartments, and the extent of their hard-wiring by DNA motifs; (3) the potential to predict gene expression trajectories based on global associations between TF-binding data and target gene expression; and (4) how dynamic modeling of DNA-binding and gene expression data can be used to infer regulatory logic of TF circuitry. This rubric exemplifies the utility of this cross-platform resource for deconvoluting the complexity of transcriptional programs controlling stem/progenitor cell fate in hematopoiesis.
Roberts I, Alford K, Hall G, Juban G, Richmond H, Norton A, Vallance G, Perkins K, Marchi E, McGowan S, Roy A, Cowan G, Anthony M, Gupta A, Ho J, Uthaya S, Curley A, Rasiah SV, Watts T, Nicholl R, Bedford-Russell A, Blumberg R, Thomas A, Gibson B, Halsey C, Lee PW, Godambe S, Sweeney C, Bhatnagar N, Goriely A, Campbell P, Vyas P
GATA1-mutant clones are frequent and often unsuspected in babies with Down syndrome: identification of a population at risk of leukemia.
Blood (2013) 122: 3908-17
Transient abnormal myelopoiesis (TAM), a preleukemic disorder unique to neonates with Down syndrome (DS), may transform to childhood acute myeloid leukemia (ML-DS). Acquired GATA1 mutations are present in both TAM and ML-DS. Current definitions of TAM specify neither the percentage of blasts nor the role of GATA1 mutation analysis. To define TAM, we prospectively analyzed clinical findings, blood counts and smears, and GATA1 mutation status in 200 DS neonates. All DS neonates had multiple blood count and smear abnormalities. Surprisingly, 195 of 200 (97.5%) had circulating blasts. GATA1 mutations were detected by Sanger sequencing/denaturing high performance liquid chromatography (Ss/DHPLC) in 17 of 200 (8.5%), all with blasts >10%. Furthermore, low-abundance GATA1 mutant clones were detected by targeted next-generation resequencing (NGS) in 18 of 88 (20.4%; sensitivity ~0.3%) DS neonates without Ss/DHPLC-detectable GATA1 mutations. No clinical or hematologic features distinguished these 18 neonates. We suggest the term "silent TAM" for neonates with DS with GATA1 mutations detectable only by NGS. To identify all babies at risk of ML-DS, we suggest GATA1 mutation and blood count and smear analyses should be performed in DS neonates. Ss/DHPLC can be used for initial screening, but where GATA1 mutations are undetectable by Ss/DHPLC, NGS-based methods can identify neonates with small GATA1 mutant clones.
Brackley CA, Taylor S, Papantonis A, Cook PR, Marenduzzo D
Nonspecific bridging-induced attraction drives clustering of DNA-binding proteins and genome organization.
Proc Natl Acad Sci U S A (2013) 110: E3605-11
Molecular dynamics simulations are used to model proteins that diffuse to DNA, bind, and dissociate; in the absence of any explicit interaction between proteins, or between templates, binding spontaneously induces local DNA compaction and protein aggregation. Small bivalent proteins form into rows [as on binding of the bacterial histone-like nucleoid-structuring protein (H-NS)], large proteins into quasi-spherical aggregates (as on nanoparticle binding), and cylinders with eight binding sites (representing octameric nucleosomal cores) into irregularly folded clusters (like those seen in nucleosomal strings). Binding of RNA polymerase II and a transcription factor (NFκB) to the appropriate sites on four human chromosomes generates protein clusters analogous to transcription factories, multiscale loops, and intrachromosomal contacts that mimic those found in vivo. We suggest that this emergent behavior of clustering is driven by an entropic bridging-induced attraction that minimizes bending and looping penalties in the template.
McGowan SJ, Hughes JR, Han ZP, Taylor S
MIG: Multi-Image Genome viewer.
Bioinformatics (2013) 29: 2477-8
Multi-Image Genome (MIG) viewer is a web-based application for visualizing, querying and filtering many thousands of genome browser regions as well as for exporting the data in a variety of formats. This methodology has been used successfully to analyze ChIP-Seq data and RNA-Seq data and to detect somatic mutations in genome resequencing projects. MIG is available at https://mig.molbiol.ox.ac.uk/mig/
Babbs C, Roberts NA, Sanchez-Pulido L, McGowan SJ, Ahmed MR, Brown JM, Sabry MA, Bentley DR, McVean GA, Donnelly P, Gileadi O, Ponting CP, Higgs DR, Buckle VJ
Homozygous mutations in a predicted endonuclease are a novel cause of congenital dyserythropoietic anemia type I.
Haematologica (2013) 98: 1383-7
The congenital dyserythropoietic anemias are a heterogeneous group of rare disorders primarily affecting erythropoiesis with characteristic morphological abnormalities and a block in erythroid maturation. Mutations in the CDAN1 gene, which encodes Codanin-1, underlie the majority of congenital dyserythropoietic anemia type I cases. However, no likely pathogenic CDAN1 mutation has been detected in approximately 20% of cases, suggesting the presence of at least one other locus. We used whole genome sequencing and segregation analysis to identify a homozygous T to A transversion (c.533T>A), predicted to lead to a p.L178Q missense substitution in C15ORF41, a gene of unknown function, in a consanguineous pedigree of Middle-Eastern origin. Sequencing C15ORF41 in other CDAN1 mutation-negative congenital dyserythropoietic anemia type I pedigrees identified a homozygous transition (c.281A>G), predicted to lead to a p.Y94C substitution, in two further pedigrees of SouthEast Asian origin. The haplotype surrounding the c.281A>G change suggests a founder effect for this mutation in Pakistan. Detailed sequence similarity searches indicate that C15ORF41 encodes a novel restriction endonuclease that is a member of the Holliday junction resolvase family of proteins.
Hughes JR, Lower KM, Dunham I, Taylor S, De Gobbi M, Sloane-Stanley JA, McGowan S, Ragoussis J, Vernimmen D, Gibbons RJ, Higgs DR
High-resolution analysis of cis-acting regulatory networks at the α-globin locus.
Philos Trans R Soc Lond B Biol Sci (2013) 368: 20120361
We have combined the circular chromosome conformation capture protocol with high-throughput, genome-wide sequence analysis to characterize the cis-acting regulatory network at a single locus. In contrast to methods which identify large interacting regions (10-1000 kb), the 4C approach provides a comprehensive, high-resolution analysis of a specific locus with the aim of defining, in detail, the cis-regulatory elements controlling a single gene or gene cluster. Using the human α-globin locus as a model, we detected all known local and long-range interactions with this gene cluster. In addition, we identified two interactions with genes located 300 kb (NME4) and 625 kb (FAM173a) from the α-globin cluster.
Cossins J, Belaya K, Hicks D, Salih MA, Finlayson S, Carboni N, Liu WW, Maxwell S, Zoltowska K, Farsani GT, Laval S, Seidhamed MZ, Donnelly P, Bentley D, McGowan SJ, Müller J, Palace J, Lochmüller H, Beeson D
Congenital myasthenic syndromes due to mutations in ALG2 and ALG14.
Brain (2013) 136: 944-56
Congenital myasthenic syndromes are a heterogeneous group of inherited disorders that arise from impaired signal transmission at the neuromuscular synapse. They are characterized by fatigable muscle weakness. We performed linkage analysis, whole-exome and whole-genome sequencing to determine the underlying defect in patients with an inherited limb-girdle pattern of myasthenic weakness. We identify ALG14 and ALG2 as novel genes in which mutations cause a congenital myasthenic syndrome. Through analogy with yeast, ALG14 is thought to form a multiglycosyltransferase complex with ALG13 and DPAGT1 that catalyses the first two committed steps of asparagine-linked protein glycosylation. We show that ALG14 is concentrated at the muscle motor endplates and small interfering RNA silencing of ALG14 results in reduced cell-surface expression of muscle acetylcholine receptor expressed in human embryonic kidney 293 cells. ALG2 is an alpha-1,3-mannosyltransferase that also catalyses early steps in the asparagine-linked glycosylation pathway. Mutations were identified in two kinships, with mutation ALG2p.Val68Gly found to severely reduce ALG2 expression both in patient muscle, and in cell cultures. Identification of DPAGT1, ALG14 and ALG2 mutations as a cause of congenital myasthenic syndrome underscores the importance of asparagine-linked protein glycosylation for proper functioning of the neuromuscular junction. These syndromes form part of the wider spectrum of congenital disorders of glycosylation caused by impaired asparagine-linked glycosylation. It is likely that further genes encoding components of this pathway will be associated with congenital myasthenic syndromes or impaired neuromuscular transmission as part of a more severe multisystem disorder. Our findings suggest that treatment with cholinesterase inhibitors may improve muscle function in many of the congenital disorders of glycosylation.