Big data for understanding small molecules: a case study
25 Aug 2020
As big data continues to transform the industry, must we remain so reluctant to share data? In her article ‘Collaborate to accumulate’, updated and published in the July 2020 issue of Laboratory News, Katharine Briggs looked at the benefits, challenges and considerations surrounding the sharing of proprietary data. Here, she presents a companion case study looking at the eTOX consortium.
Maximising the accessibility of pharmaceutical data will become increasingly important as in silico systems move towards the prediction of more complex phenomena for which datasets of an appropriate size, quality and coverage are limited.
In a survey by the Publishing Research Consortium in 2010, access to ‘datasets, data models, algorithms and programs’ was ranked as important or highly important by 62% of the 3823 respondents, whereas only 38% graded these as very or fairly easy to access.
Driven by the increased recognition of the importance of in silico systems, the eTOX consortium was a seven-year public-private partnership within the framework of the European Innovative Medicines Initiative. The project aimed to develop innovative in silico strategies and novel software tools to better predict the toxicological profiles of small molecules in the early stages of the drug development pipeline.
The backbone of the project was a database hosted and curated by Lhasa Limited, which acted as the honest broker for the project. The database consisted of pre-clinical toxicity data for drug compounds or candidates, extracted from previously unpublished legacy reports from 13 European pharmaceutical companies. It was enhanced by the incorporation of publicly available, high-quality toxicology data, which was being collected by the European Bioinformatics Institute, and it also incorporated the RepDose database donated by Fraunhofer.
Honest broker
Pharmaceutical companies vary in whether they consider data on marketed drugs to be sensitive. The sensitivity of data can also change as a result of the repurposing of drugs and drug candidates. One of the eTOX project participants was able to elaborate a procedure for obtaining general permission for full or restricted sharing, depending on the status of the compound, i.e. whether it was marketed, terminated, under current development (excluding new formulations, new indications or combinations of marketed drugs) or subject to product liability claims.
Responsibility for deciding if data can be shared is often delegated to legal and IP departments. The disadvantage of this is that they only see the risks and not the benefits of data sharing and, being risk averse, say no by default. In addition, the utility of the data can be difficult to demonstrate ahead of the data being donated. The eTOX project participants highlighted the need for a summary of the project which could be shared with upper management and the departments involved in granting authorisation, in order to increase publicity and to facilitate decision-making.
eTOX in action
Early eTOX use cases included the investigation of the relevance of specific histopathology findings (confirmed to be target-related and species-specific), identification of potential target-related effects (leading to inclusion of specific target organs in early in vivo studies), and the implementation of a framework of four key approaches (similarity of structure, pharmacology or adverse effects, and use of in silico prediction) as part of an early small molecule drug development pipeline.
The eTOX project has now ended, but its legacy has led to the formation of eTOXsys, a software solution that can deliver improved early drug candidate safety assessment through access to proprietary toxicology data and predictive models.
Author info: Katharine Briggs is Research Leader at Lhasa Limited, a not-for-profit organisation and educational charity that facilitates collaborative data sharing projects in the pharmaceutical, cosmetics and chemistry-related industries. lhasalimited.org