Hyperlipidemia has been shown to lead directly to diseases such as cardiovascular disease, cancer, type 2 diabetes and Alzheimer’s disease. This happens because lipids act as signalling molecules in many biochemical pathways inside our body. In this work, we took a systems view of the pathways targeted by lipid-lowering drugs to determine the driver nodes, that is, the nodes that can control the pathway network. Next, a Random Forest machine learning classifier was applied to the approved drugs interacting with the identified driver nodes to select drugs that can be repurposed for Hyperlipidemia-induced complex diseases.
Hyperlipidemia and its associated diseases
Common health problems such as obesity, type 2 diabetes, hypertension and heart problems are caused by the combined effect of multiple genes, environmental factors and lifestyle habits, and are therefore called complex diseases. A major risk factor for these diseases is hyperlipidemia, a complex disorder in which blood lipid levels are raised. Hyperlipidemia may occur because of genetic and non-genetic factors. Lipids have a crucial role in health and disease: they are not just the energy storehouse of the body, but are also involved in signalling in a large number of interconnected biochemical pathways. Each pathway consists of a number of proteins that together perform a function in the body. An imbalance in lipid levels can therefore disturb the signalling patterns in the body, which is why the lipid profile forms a critical part of a routine health check-up.
High levels of LDL or “bad” cholesterol increase the risk of cardiovascular disease, asthma, Alzheimer’s, autoimmune and kidney diseases. Even during the outbreak of COVID-19, high lipid levels post-recovery have been noted and may lead to liver injury. In fact, since cholesterol assists viral entry into human cells, cholesterol-lowering agents help fight the virus. Lipid-lowering drugs such as statins lower the risk of heart disease and liver injury after COVID-19.
Network Systems Biology and machine learning for finding new drugs
Network biology provides a powerful platform for integrating different types of biological data in a single framework for the study of complex diseases. Network measures are increasingly being used in the study of complex diseases and in finding new drug targets.
Even today, it takes a long time for a new drug to reach the market. Drug repurposing, on the other hand, finds new uses for existing approved drugs. As the safety of these drugs has already been established, they can be commercialised for the treatment of another disease with far less delay. Advances in artificial intelligence and machine learning techniques can help in drug repurposing.
A network view of Hyperlipidemia
When we looked at the 35 approved drugs for hyperlipidemia treatment and their associated biochemical pathways, we found that the pathways were interconnected. This is not really surprising, as a set of related proteins in one or more pathways is responsible for a biological function. So, we built a pathway network and added the approved lipid-lowering drugs to it. This drug-target network (DTN) had 452 proteins, 35 approved Hyperlipidemia drugs and 12,410 edges, and the drugs interacted with only 34.7% of the network proteins. In network control theory, driver nodes are the key elements of the network that drive communication and signalling. To identify the driver nodes, a directed DTN (DDTN) was needed. The information about the direction of the network is encoded in the biochemical pathways of an organism. So, we identified the main signalling pathways in our DTN and merged 34 such pathways to develop a DDTN.
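The coverage figure quoted above (the share of network proteins that interact with at least one drug) can be illustrated with a minimal sketch. The protein and drug names below are hypothetical placeholders, not data from the study, and the function name `drug_target_coverage` is introduced here only for illustration.

```python
def drug_target_coverage(proteins, drug_targets):
    """Return the fraction of network proteins hit by at least one drug."""
    targeted = set()
    for targets in drug_targets.values():
        targeted.update(targets)
    # Ignore any target that is not part of the pathway network.
    targeted &= set(proteins)
    return len(targeted) / len(proteins)


# Hypothetical toy network: five proteins, two drugs.
toy_proteins = ["P1", "P2", "P3", "P4", "P5"]
toy_drug_targets = {"drug_x": ["P1", "P2"], "drug_y": ["P2"]}

coverage = drug_target_coverage(toy_proteins, toy_drug_targets)
print(f"{coverage:.1%} of proteins are targeted")
```

In this toy case two of the five proteins are targeted; a low coverage value of this kind is what motivates looking beyond the current drug targets for other controlling nodes.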
Next, Cytoscape was used to find 78 driver nodes. The majority of driver nodes were found to be encoded by non-essential genes and were associated with one or more disease traits. We also found that these driver nodes were either successful drug targets or targets under clinical trials. So, we proposed that the drugs associated with the driver nodes in our DDTN can be repurposed to treat the complex diseases associated with Hyperlipidemia. But there were 130 such drugs, and the best ones needed to be identified, for which we applied machine learning.
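The article does not state which algorithm was used to find the driver nodes. One widely used formulation in network control theory takes the driver nodes to be the nodes left unmatched by a maximum matching of the directed network. The sketch below assumes that formulation and uses a simple augmenting-path search; the node names are hypothetical placeholders and the function `find_driver_nodes` is introduced here only for illustration.

```python
from collections import defaultdict


def find_driver_nodes(nodes, edges):
    """Return the nodes left unmatched by a maximum matching.

    Each directed edge (u, v) may be used to match u to v. A node that no
    matched edge points to has no controlling upstream neighbour and is
    treated as a driver node (assumed maximum-matching formulation).
    """
    successors = defaultdict(list)
    for source, target in edges:
        successors[source].append(target)

    matched_from = {}  # target node -> source node matched to it

    def try_match(source, visited):
        # Augmenting-path search (Kuhn's algorithm) for bipartite matching.
        for target in successors[source]:
            if target in visited:
                continue
            visited.add(target)
            if target not in matched_from or try_match(matched_from[target], visited):
                matched_from[target] = source
                return True
        return False

    for source in nodes:
        try_match(source, set())

    drivers = [node for node in nodes if node not in matched_from]
    # A perfectly matched network still needs one input signal.
    return drivers if drivers else [nodes[0]]


# Hypothetical toy pathway: P1 activates P2, which activates P3 and P4.
toy_nodes = ["P1", "P2", "P3", "P4"]
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P2", "P4")]
drivers = find_driver_nodes(toy_nodes, toy_edges)
print(drivers)  # ['P1', 'P4']
```

P2 can pass its signal on to only one of its two downstream proteins in the matching, so the other one needs its own input. Note that a maximum matching is generally not unique, so the set of driver nodes can differ between runs or tools even though their number stays the same.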
A machine learning approach for finding new drugs for Hyperlipidemia-induced complex diseases
To apply a machine learning classifier in our study, we needed a positive set, a negative set and a test set of drugs. The positive set consisted of drugs that are approved or under investigation for Hyperlipidemia or associated diseases, whereas the negative set consisted of drugs that increase lipid levels as a side effect. The positive set was retrieved from DrugBank, while the negative set was obtained through a literature search. Finally, the test set, on which predictions were made, consisted of all the approved drugs that were associated with driver nodes. For each set, 1445 molecular descriptors such as atom count, bond count, carbon types, hydrogen bond acceptor count and hydrogen bond donor count were calculated.
With this dataset, we trained and applied a random forest classifier to narrow down the number of potential drugs that can be repurposed for Hyperlipidemia. The model was further subjected to five-fold cross-validation to check the accuracy of its predictions. Candidates with a prediction score ≥ 0.65, as assigned by the random forest classifier, were selected. Next, we searched the literature to see whether any direct or indirect role of the selected drugs in lipid lowering or lipid metabolism had been reported.
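The two bookkeeping steps described above, splitting the training drugs into five folds and keeping candidates that score at least 0.65, can be sketched as follows. This is not the classifier itself. Whether the original folds were stratified by class is not stated, so stratification is an assumption here, and the drug names and scores are hypothetical placeholders. Only the class sizes (50 and 84) and the 0.65 threshold come from the article.

```python
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into n_folds groups with similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(set(labels)):
        indices = [i for i, value in enumerate(labels) if value == label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def select_candidates(scores, threshold=0.65):
    """Keep the drugs whose prediction score meets the threshold."""
    return sorted(name for name, score in scores.items() if score >= threshold)


# Class sizes from the article: 50 lipid-lowering (1), 84 lipid-raising (0).
labels = [1] * 50 + [0] * 84
folds = stratified_folds(labels)
print([len(fold) for fold in folds])

# Hypothetical scores for placeholder drug names, for illustration only.
example_scores = {"drug_a": 0.81, "drug_b": 0.65, "drug_c": 0.40}
selected = select_candidates(example_scores)
print(selected)
```

In each round of cross-validation, one fold is held out for evaluation and the model is trained on the remaining four, so every drug in the training data is used for testing exactly once.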
Our positive, negative and test sets consisted of 50 lipid-lowering, 84 lipid-raising and 130 approved drugs respectively. The model showed an average accuracy of 76.8% during five-fold cross-validation. Further, the precision and recall for the positive predictions were 0.92 and 0.72 respectively. The area under the Receiver Operating Characteristic (ROC) curve was 0.79 ± 0.06, indicating that our method could recognise the drugs and separate them from the non-drugs.
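For readers unfamiliar with these measures, the sketch below shows how accuracy, precision and recall are computed from true and predicted labels. The labels are a made-up toy example, not the study's predictions, and the function name `classification_metrics` is introduced here only for illustration.

```python
def classification_metrics(true_labels, predicted_labels, positive=1):
    """Compute accuracy, precision and recall for the positive class."""
    tp = fp = fn = tn = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if prediction == positive and truth == positive:
            tp += 1  # lipid-lowering drug correctly recognised
        elif prediction == positive:
            fp += 1  # other drug wrongly called lipid-lowering
        elif truth == positive:
            fn += 1  # lipid-lowering drug that was missed
        else:
            tn += 1  # other drug correctly rejected
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels: 1 = lipid-lowering, 0 = lipid-raising.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(toy_truth, toy_predicted)
print(metrics)
```

Read this way, a precision of 0.92 means that most drugs flagged as positive really belonged to the positive set, while a recall of 0.72 means that roughly a quarter of the positive-set drugs were missed.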
Novel drugs to be repurposed
Based on our integrated approach, nine drugs were predicted that can be repurposed for Hyperlipidemia and its associated diseases. These included:
Overall, through our work, we have shown how combining systems biology with machine learning can help in understanding and treating complex lifestyle diseases.
Note: This article is based on our recent work published in 2021 (reference 4). The complete text can be found here: https://www.sciencedirect.com/science/article/abs/pii/S1476927121000724.