Obtaining regulatory approval for a new drug is not only costly but can also take up to 10 years. The process is difficult and fails around 90% of the time. Genetic evidence can dramatically increase the success rate, even double it. However, it is extremely difficult to understand the genetic basis of complex disorders such as Parkinson’s disease and diabetes. In Parkinson’s, for example, 70% of the ‘genetic load’ (the proportion of an illness that is due to genetic rather than environmental factors) is unexplained. Conventional analysis generates limited information and insight, often identifying just one significant genetic variable. The onset of complex disorders is thought to depend on the interaction of many variables, which these traditional methods do not examine.
What did the project do?
A research partnership between the Centre, the small drug discovery company C4X Discovery Limited, and Clive Bowman at the Oxford University Mathematics Institute aims to shorten this drug identification stage. It does so by increasing the speed at which large patient datasets can be analysed and by analysing multiple genetic variables at a time, revealing the genes that are significant in specific disorders and the interactions between them. Taxonomy3, the software developed, uses unique mathematics to help fill the heritability gap, which should lead to novel genetic insights and therefore to the identification of highly valuable drug targets with an increased success rate.
The mathematical method is based on the ‘individualised divergences’ theory developed in 2005, a non-linear transformation that the Mathematics Institute has now made applicable to real patient and control datasets. The process turns any data type into a numerical measure and contrasts it across groups, so that the genes of patients with complex disorders can be compared with those of healthy individuals according to factors such as physical characteristics, gender or age.
The Centre provides the computing expertise that allows thousands of variables in large datasets to be analysed at a time. The software, which uses CUDA code to harness the power of GPU computing, identifies significant variables and patterns between patients. One such analysis compared 1 million genetic variables in 51 patients with drug-induced liver injury with those of 282 controls (healthy individuals), matched by country and gender. This research identified a particular gene that is central to drug-induced hepatotoxicity (liver damage) in mice, confirming that genes identified by the new method provide a significant opportunity for drug development.
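To make the shape of such a case-control comparison concrete, here is a minimal Python sketch. It is not Taxonomy3's individualised-divergences method, which the story does not describe in enough detail to reproduce. It substitutes a conventional single-variant allele-count chi-square test, purely to illustrate that each genetic variable can be scored independently of the others, which is the kind of workload that maps well onto GPUs. The function names and the toy genotype data are hypothetical.

```python
def allele_chi_square(case_counts, control_counts):
    """2x2 chi-square statistic for one variant (illustrative stand-in only).

    Each argument is a list of minor-allele counts (0, 1 or 2), one per subject.
    """
    a = sum(case_counts)                   # minor alleles among cases
    b = 2 * len(case_counts) - a           # major alleles among cases
    c = sum(control_counts)                # minor alleles among controls
    d = 2 * len(control_counts) - c        # major alleles among controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # Variant does not vary in this sample, so there is nothing to contrast.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def scan_variants(cases, controls):
    """Score every variant independently.

    cases, controls: lists of per-subject rows, each row holding one
    minor-allele count per variant. No variant's score depends on another's,
    so the loop over variants could be distributed across GPU threads.
    """
    n_variants = len(cases[0])
    return [
        allele_chi_square([row[j] for row in cases],
                          [row[j] for row in controls])
        for j in range(n_variants)
    ]


# Hypothetical toy data: 2 cases, 2 controls, 3 variants.
cases = [[2, 0, 1], [2, 0, 1]]
controls = [[0, 0, 1], [0, 0, 1]]
scores = scan_variants(cases, controls)
# Variant indices ordered from strongest to weakest case-control contrast.
ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
```

In this toy input only the first variant differs between the groups, so it ranks first. A single-variant scan like this is the type of conventional analysis that the story says often finds just one significant variable; it does not capture interactions between variables.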
Dr Wes Armour of the e-Research Centre said: “This work has demonstrated the applicability of GPUs to targeted drug discovery, showing that computational pipelines used for this work can be accelerated by over two orders of magnitude. This provides not only cost savings, but allows for detailed study of complex disorders, for example the detailed study of large patient datasets, and has provided valuable insights into the complex nature of Parkinson’s disease.”
What was the impact of the project?
“Taxonomy3 is a promising tool to help us understand complex disorders such as Parkinson’s disease, and derive innovative drug targets with direct genetic support. We are now scaling up this important work to fuel the C4XD drug discovery pipeline, and have already hired a team of analysts for this purpose.”
Olivier Delrieu, VP of Clinical Development & Mathematics at C4X Discovery Ltd
Making analysis much faster and less costly means that it is now possible to run larger datasets, such as those relating to Parkinson’s disease. The records of 1,705 UK-based patients with this disorder were analysed along with those of 3,000 healthy controls, using Wellcome Trust data that is in the public domain. Prior classical analysis of the disease identified just two significant genes. Taxonomy3 revealed thirteen, plus three different genetic subtypes, each with specific genes driving the same disorder. Using this information, drug targets can now be identified for each of the subgroups.
Scaling up this important work creates further jobs for analysts. More significantly, moving the analyses from CPUs to much faster GPUs allows much larger datasets to be studied, enabling greater understanding of complex disorders and more accurate results. It also reduces execution time by over two orders of magnitude, which lowers costs and shortens the time taken to get scientific results. For example, running analyses on 4,000 subjects delivers a 13-fold cost reduction.
Looking ahead, Dr Armour said: “The next steps for this work are to increase the computational efficiency of some of our GPU algorithms and to move more of the processing pipeline from CPU to GPU compute devices. This will allow C4X to analyse much larger datasets with a greater number of variables; this in turn will provide more accurate results and allow for the most complex disorders to be studied using these techniques.”
Story courtesy of the Oxford e-Research Centre