Software
The Bioinformatics team develops software for high-throughput analysis, quality control and visualisation of biological data relevant to our work at the CSC. All software developed is publicly available through our GitHub site or from online, peer-reviewed repositories. We currently maintain three analysis pipelines (Basecalling, ChIP-seq and RNA-seq), available on our GitHub site, which allow rapid and reproducible processing and initial analysis of these common data types. We also develop and maintain four Bioconductor packages within the team: ChIPQC, tracktables, soGGi and triform.
ChIPQC (Bioconductor package)
Developed by Tom Carroll at the MRC Clinical Sciences Centre, with Rory Stark at CRUK, ChIPQC provides a set of tools for evaluating the quality of ChIP-seq, MNase-seq and ChIP-exo data.
Bioconductor site: Link
Bioconductor talk/course on ChIPQC: Link
Related paper: Link
tracktables (Bioconductor package)
Developed by Tom Carroll at the MRC Clinical Sciences Centre, tracktables provides tools to create dynamic HTML reports, linked to the IGV genome browser, from BAM, bigWig and interval files as well as from many Bioconductor objects.
Bioconductor site: Link
GitHub site: Link
soGGi (Bioconductor package)
Developed by Tom Carroll at the MRC Clinical Sciences Centre, soGGi (summarising over grouped genomic intervals) provides tools to summarise and visualise signal, motifs and conservation over genomic ranges using ggplot2.
Bioconductor site: Link
GitHub site: Link
triform (Bioconductor package - Maintainer)
Currently maintained by Tom Carroll at the MRC Clinical Sciences Centre, triform identifies punctate peaks by investigating the clustering of ChIP-seq signal.
Bioconductor site: Link
Infrastructure
In addition to the publicly available software it creates and maintains, the MRC CSC Bioinformatics team develops pipelines and infrastructure for the high-throughput processing, quality control and initial analysis of biological data generated within the MRC CSC.
- Basecalling - Automated basecalling, demultiplexing, FastQC and simple statistics
- ChIP-seq - Automated alignment, ChIPQC, bigWigs, peak calling, motif identification and differential binding analysis
- RNA-seq - Automated alignment, RNAseq-QC, bigWigs, counting in genes, differential expression analysis