Visualisation of Next Generation Data for Hepatitis C Virus
A retrospective ten-year study of the temporal dynamics of quasispecies evolution, maintenance, extinction and expansion by next generation sequencing (NGS) revealed remarkable shifts in complexity and diversity. The balance between dominant and subdominant lineages varied over time: what began as a complex multi-lineage infection became a single monophyletic lineage over the ten-year period. Next generation sequencing by its very nature generates huge volumes of raw data, which makes direct visualisation extremely difficult, and data that cannot be visualised directly risk losing their relevance. Making large data sets more accessible, and therefore more relevant, through data visualisation can help bring science alive for professional and lay audiences alike by translating complex findings into accessible formats.
A challenge with this volume of data, however, is to visualise it in a context that relates it to the original in situ architecture. In the case of HCV, this means placing NGS data onto the liver, not arbitrarily but keeping the visualisation as realistic as possible by reference to the published literature, e.g., the data of Kandathil et al. (PMID: 23973767). The aim of this project is to make our HCV NGS data more visual while keeping the depth of the data readily available.
We present a three-dimensional (3D) format for the visualisation of NGS data. We have modelled the NGS data onto a volume model of the liver, using colour coding to identify the different quasispecies lineages in the NGS data from a genotype 4a infection followed over ten years. Using a randomised modelling format and a 3D scaffold-type representation of the liver, we populate different layers of the liver with hues and voxels representing not only haplotype frequency but also clonal density. Through computer simulations, we map the real dynamics of change over the ten-year study period onto a 3D mesh of the liver. We additionally map into this model the output of predator-prey dynamics describing humoral immune-mediated selection pressure on the various quasispecies lineages.
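The randomised population of a liver volume with lineage-coloured voxels could be sketched roughly as follows. This is an illustrative sketch only, not the pipeline used in the study: the ellipsoid stands in for a real liver volume model, and the function names, lineage names, frequencies and infected fraction are all hypothetical placeholders rather than study data.

```python
import random
from collections import Counter


def ellipsoid_voxels(nx, ny, nz):
    """Return the (x, y, z) voxels inside an ellipsoid inscribed in the grid.

    Assumption: the ellipsoid is only a placeholder for a real liver volume.
    """
    cx, cy, cz = (nx - 1) / 2, (ny - 1) / 2, (nz - 1) / 2
    rx, ry, rz = nx / 2, ny / 2, nz / 2
    return [
        (x, y, z)
        for x in range(nx)
        for y in range(ny)
        for z in range(nz)
        if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 + ((z - cz) / rz) ** 2 <= 1.0
    ]


def assign_lineages(voxels, frequencies, infected_fraction, seed=0):
    """Randomly label a fraction of voxels with lineages, weighted by frequency.

    voxels            -- list of (x, y, z) tuples making up the volume
    frequencies       -- dict mapping lineage name to haplotype frequency
    infected_fraction -- share of voxels to populate (hypothetical parameter)
    """
    rng = random.Random(seed)  # seeded so the layout is reproducible
    n_infected = round(len(voxels) * infected_fraction)
    infected = rng.sample(voxels, n_infected)  # distinct voxels, chosen at random
    names = list(frequencies)
    weights = [frequencies[name] for name in names]
    labels = rng.choices(names, weights=weights, k=n_infected)
    return dict(zip(infected, labels))


# Hypothetical frequencies: a multi-lineage start and a single-lineage end point.
time_points = {
    "year_0": {"lineage_A": 0.5, "lineage_B": 0.3, "lineage_C": 0.2},
    "year_10": {"lineage_A": 1.0},
}

liver = ellipsoid_voxels(20, 20, 20)
for label, freqs in time_points.items():
    voxel_map = assign_lineages(liver, freqs, infected_fraction=0.2)
    print(label, Counter(voxel_map.values()))
```

Each returned mapping associates a voxel coordinate with a lineage label, which a 3D renderer could then translate into a colour. Clonal density and the layering of the liver are not modelled here; they would need additional spatial rules on top of the purely random placement shown.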