Artificial Intelligence

Bridging the pharma sector

Surabhi Johari, Institute of Management Sciences School of Biosciences

Artificial Intelligence (AI) has provoked a digital revolution. This next-generation technology, with its advanced capabilities, is bridging the pharmaceutical industry by designing new algorithms and tools for drug discovery. AI has empowered healthcare researchers with drug repurposing, translational medicine, and biomarker development. There is a need to showcase AI innovation in healthcare, as it will help to solve real-world problems more competently.

There is a need to combine biological information with computational methods for extracting relevant and appropriate genes from the thousands of genes measured. Artificial Intelligence (AI) has been applied in the drug discovery field for decades. Today, conventional machine-learning modelling has evolved into a variety of new methods, such as combi-QSAR and hybrid QSAR, and remains a popular approach for studying different drug-related topics. Various drugs on the market and/or in clinical trials have been designed by computational methods. As a recently developed machine intelligence method, AI is being explored for its potential use in the new big-data era of drug discovery. With more data becoming available and new approaches being developed, AI will enable a major Computer-Aided Drug Design (CADD) approach in the near future.

Pharmacologists have demonstrated that Deep Neural Networks (DNN) trained on large data sets of transcriptional response can classify various drugs into therapeutic categories based solely on their transcriptional profiles. The study used the LINCS Project's perturbation samples of 678 drugs across the A549, MCF-7, and PC-3 cell lines, linked to 12 therapeutic use categories derived from MeSH. The DNN was trained on both gene-level transcriptomic data and transcriptomic data processed using a pathway activation scoring algorithm.

The DNN achieved high classification accuracy and convincingly outperformed the Support Vector Machine (SVM) model on every multi-class classification problem, at both the pathway level and the gene level. Models based on pathway-level data, however, performed much better than those based on gene-level data.

This is the first demonstration of a deep learning neural net trained on transcriptomic data to recognise the pharmacological properties of multiple drugs across different biological systems and conditions. The use of deep neural net confusion matrices for drug repositioning is also proposed. This work is a proof of principle for applying deep learning to drug discovery and development.
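The repositioning idea can be illustrated with a confusion matrix of true versus predicted therapeutic categories: a drug that a classifier assigns to a category other than its labelled one is a candidate worth examining for a new use. The following is a minimal sketch only; the drug names, category labels, and function names are hypothetical placeholders and are not data or code from the study.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true category, predicted category) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def repositioning_candidates(drugs, true_labels, predicted_labels):
    """Return (drug, true, predicted) for drugs placed in a different category."""
    return [
        (drug, true, pred)
        for drug, true, pred in zip(drugs, true_labels, predicted_labels)
        if true != pred
    ]


# Hypothetical toy data, invented for illustration (not LINCS data).
drugs = ["drug_a", "drug_b", "drug_c", "drug_d"]
true_labels = ["cardiovascular", "antineoplastic", "cardiovascular", "cns"]
predicted_labels = ["cardiovascular", "antineoplastic", "antineoplastic", "cns"]

matrix = confusion_matrix(true_labels, predicted_labels)
candidates = repositioning_candidates(drugs, true_labels, predicted_labels)

print(matrix)
print(candidates)
```

Off-diagonal cells of the matrix show which categories the classifier confuses, and the candidate list names the individual drugs behind those cells.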

On the ChEMBL database, the database most widely used by chemists, the performance of deep learning has been compared with that of seven target prediction methods, including two commercial predictors, three predictors deployed by pharma, and machine learning methods that could scale to the Kaggle dataset. ChEMBL has 13 million compound descriptors, 1.3 million compounds, and 5,000 drug targets, compared to the Kaggle dataset with 11,000 descriptors, 164,000 compounds, and 15 drug targets. Deep learning outperformed all other methods with respect to the area under the ROC curve and was significantly better than all commercial products. Deep learning surpasses the threshold needed to make virtual compound screening possible and has the potential to become a standard tool in industrial drug design.
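For readers unfamiliar with the metric, the area under the ROC curve can be read as the probability that a randomly chosen active compound is scored higher than a randomly chosen inactive one. The sketch below computes it directly from that definition; the labels and scores are invented for illustration, and the function name `roc_auc` is an assumption of this example rather than part of any benchmark code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for an active compound, 0 for an inactive one.
    scores: predicted activity scores (higher means more likely active).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # active ranked above inactive
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for five compounds against one target.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of actives from inactives. The pairwise loop is quadratic, so it suits small examples rather than screening-scale data.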

Studies have shown how the problem of predicting molecular properties can be solved by recursive neural network approaches. However, molecules are typically described by undirected cyclic graphs, whereas recursive approaches typically use directed acyclic graphs. Methods are being developed to address this discrepancy by considering an ensemble of recursive neural networks associated with all possible vertex-centred acyclic orientations of the molecular graph. One advantage of this approach is that it relies only minimally on the identification of suitable molecular descriptors, because suitable representations are learned automatically from the data. Several variants of this approach have been applied to the problem of predicting aqueous solubility and tested on four benchmark data sets. Experimental results show that, according to several evaluation metrics, the performance of the deep learning methods matches or exceeds that of other state-of-the-art methods, and they expose the fundamental limitations arising from training sets that are too noisy or too small. A Web-based predictor, AquaSol, is available online through the ChemDB portal, together with additional material.
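The notion of a vertex-centred acyclic orientation can be made concrete with a small sketch: pick one atom as the root, measure every other atom's distance to it in bonds, and direct each bond towards the atom that is closer to the root. Repeating this for every atom gives one directed acyclic graph per atom. The code below is an illustration under stated assumptions, not the published method: the function names are hypothetical, and bonds joining two atoms at the same distance from the root are simply left out here.

```python
from collections import deque


def bfs_distances(adjacency, root):
    """Shortest-path distance (in bonds) from root to every reachable atom."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist


def orient_towards(adjacency, root):
    """Directed edges (source, target), each pointing one step closer to root.

    Assumption of this sketch: bonds between atoms that are equally far
    from the root are omitted, so every edge strictly decreases distance
    and the result cannot contain a cycle.
    """
    dist = bfs_distances(adjacency, root)
    edges = set()
    for atom, neighbours in adjacency.items():
        for neighbour in neighbours:
            if atom in dist and neighbour in dist:
                if dist[atom] == dist[neighbour] + 1:
                    edges.add((atom, neighbour))
    return edges


def all_orientations(adjacency):
    """One acyclic orientation of the molecular graph per atom."""
    return {root: orient_towards(adjacency, root) for root in adjacency}


# Hypothetical example: a four-membered ring of atoms 0-1-2-3-0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
orientations = all_orientations(ring)
print(orientations[0])
```

In an ensemble approach, each of these rooted graphs would be processed by a recursive network and the per-root outputs combined into a single molecular representation.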

Ramsundar et al. (2015) collected large amounts of publicly available data to create a dataset of almost 40 million measurements across more than 200 biological targets, which they used to train multitask neural architectures at scale. Through a series of empirical studies, their investigation examines several aspects of the multitask framework and reports some interesting results:

Massive multitask networks achieve significantly better predictive accuracies than single-task methods.

The predictive power of multitask networks improves as additional tasks and data are added.

Total data volume and total number of tasks both contribute significantly to multitask enhancement.

Multitask networks provide limited transferability to non-training tasks.

These results underline the need for more data sharing and algorithmic innovation to speed up the process of drug discovery.

One of the bottlenecks in biopharmaceutical innovation is biological screening, for which virtual screening has been an attractive solution. Molecular Dynamics-based Virtual Screening (MDVS) can significantly improve the performance of a virtual screening campaign. Reducing the cost of high-performance computing (HPC) and increasing computational speed will significantly accelerate biomedical innovation. There are no disease-modifying drugs against, for example, Alzheimer’s, osteoarthritis, metabolic syndromes, and important cancers; MDVS campaigns may shorten the time needed for preclinical studies of disease-modifying and anti-pandemic drugs.

MDVS involves validating the interactions of thousands or millions of individual compounds against a drug target. The task is parallelisable by dividing the large number of compound-protein complexes into smaller computational packages, which are distributed to GPU-CPU units for calculation. MDVS does not require massive memory, but it does demand greater HPC power. Many supercomputers adopt GPUs as coprocessors to reduce manufacturing costs and increase calculation speed. However, the architecture that combines CPUs and GPUs is one of the key factors in computing performance. Another key factor is the application programme. Today, more and more MD simulation programmes offer parallel or GPU versions, but they exploit HPC power in different ways, and consequently their performance varies. HPC technology may continue to obey Moore’s law, but the updating of application programmes can be a bottleneck. Therefore, it is time to invest more in developing HPC-based application programmes while building supercomputers with higher Linpack Benchmark scores.
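The package-based parallelisation described above can be sketched in a few lines. This is a simplified illustration only: worker threads stand in for GPU-CPU units, `score_package` is a hypothetical placeholder where a real campaign would launch a molecular dynamics engine, and the compound identifiers and returned scores are invented.

```python
from concurrent.futures import ThreadPoolExecutor


def make_packages(compounds, package_size):
    """Split the compound list into consecutive packages of at most package_size."""
    return [
        compounds[i:i + package_size]
        for i in range(0, len(compounds), package_size)
    ]


def score_package(package):
    """Hypothetical stand-in for simulating each compound-target complex.

    Returns a dummy score (the length of the identifier) so the sketch
    runs without any simulation software.
    """
    return [(compound, float(len(compound))) for compound in package]


def run_campaign(compounds, package_size=2, workers=4):
    """Distribute packages to workers and gather per-compound results in order."""
    packages = make_packages(compounds, package_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the submitted packages.
        package_results = list(pool.map(score_package, packages))
    return [item for result in package_results for item in result]


# Hypothetical compound identifiers.
results = run_campaign(["c1", "cmp2", "x"])
print(results)
```

Because each package is independent, memory use per worker stays small and throughput scales with the number of workers, which matches the observation that MDVS needs compute power rather than massive memory.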

Today, HPC is a vital part of cancer research, but it is at a difficult stage, with requirements shifting from simply increasing compute capability in flops to aggregating multiple databases, improving memory bandwidth, using diverse hardware, and enhancing software efficiency. The emergence of rapid and relatively economical genome sequencing has given scientists and caregivers the ability to study the links between genes and cancer at the level of individual patients, and to use massive genomics and other biological databases to better understand patient genotypes and their implications for disease and treatment.

The National Cancer Institute (NCI) is also supplying more than 20 types of data that feed into a neural network model, which then constructs a relationship between drug dosages and patient response. The data sources include genetic sequences, gene expression profiles, proteomics, and metabolomics. The NCI also holds a large historical database of medical images going back 10 to 15 years, which the neural network can relate to molecular profile data for cancer. The objective is to predict a molecular profile from medical image data, on the basis of which specific drugs can be prescribed to achieve a positive patient outcome.

In 2017, researchers from North Carolina State University established that ML techniques and molecular dynamics simulations could be merged to create precise computer prediction models. The researchers claimed that these ‘hyperpredictive’ models could be used to predict whether a new chemical compound had the properties of a suitable drug candidate. As the drug development process tends to be costly and time consuming, this combination of molecular dynamics and ML serves as a way to reduce the number of chemical compounds considered as probable drug candidates. To that end, the researchers used computer models that could predict the interaction of a chemical compound with a biological target of interest. When attempting to narrow a field of 200 analogues down to 10, as is commonly the case in drug development, the modelling technique must be extremely accurate, and current techniques are not reliable enough.

They have also demonstrated how information about the way a specific compound moves in the binding pocket of a protein can be incorporated into ML-based prediction models. Current methods use the two-dimensional structures of molecules for drug discovery, but in reality 3D structures have proved to be more effective. Advances in computing and technology now allow researchers to run complex simulations quite easily that would otherwise have taken months.

However you look at it, AI will unquestionably change every part of our lives. Importantly, executives across the pharma industry are looking for ways to use AI in their line of business, including healthcare (or the biotech industry, to be exact). Moreover, major pharmaceutical players are already getting their feet wet in the world of AI.

In fact, most of the 10 so-called Big Pharma companies (namely Novartis, Roche, Pfizer, Merck, AstraZeneca, GlaxoSmithKline, Sanofi, AbbVie, Bristol-Myers Squibb and Johnson & Johnson) have either explicitly partnered with AI companies or acquired AI technologies to exploit the opportunities AI brings to the table.

References

ChemDB portal. http://cdb.ics.uci.edu/
Artificial Intelligence swarms drug discovery. Is India catching up too? https://analyticsindiamag.com/artificialintelligence-swarms-drug-discoveryindia-catching/
Top 5 AI Companies In India Creating Solutions For Pharma Sector. https://analyticsindiamag.com/top-5-ai-companies-inindia-creating-solutions-forpharma-sector/

--Issue 35--

Author Bio

Surabhi Johari

Surabhi Johari is a Professor in the Institute of Management Sciences at Delhi NCR. She is a member of ABPIO Bioinformatics Platforms. Her research focuses on applying bioinformatics to pharmacology, genes of interest, and proteins. She found a home in academia at the intersection of microbiology and medicine.