Sponsor: Beijing University of Chinese Medicine
ISSN 1006-2157 CN 11-3574/R

JOURNAL OF BEIJING UNIVERSITY OF TRADITIONAL CHINESE MEDICINE, 2019, Vol. 42, Issue (1): 30-36. doi: 10.3969/j.issn.1006-2157.2019.01.006

• Chinese Medicinal Pharmacology •

Screening anti-fibrosis Chinese medicinal compounds based on machine learning

Wang Xiting, Li Yu#, Zhang Lan, Liu Meng, Li Cheng, Yang Qiushi, Hang Xiaoyi, Liu Yi   

  1. School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
  • Received:2018-06-06 Online:2019-01-30 Published:2019-03-01
  • Contact: Li Yu, female, PhD, Researcher, Master's Supervisor. Research interests: prevention and treatment of fibrosis with Chinese medicine. E-mail: liyubeijing1973@163.com
  • Supported by:
    General Program of the National Natural Science Foundation of China (No. 81573716)

Abstract: Objective To establish a new virtual screening model for predicting Chinese medicinal compounds with anti-fibrosis effects, and to verify its predictive performance. Methods Random forest (RF) and gradient boosting decision tree (GBDT) algorithms were used for dimension reduction and feature optimization of molecular fingerprints. A hybrid feature optimization-machine learning model was established, in which the optimized features were input into logistic regression (LR) and artificial neural network (ANN) algorithms for training. Precision, recall and F1 value were used to evaluate the performance of the various model combinations, and the virtual screening predictive model was selected according to this evaluation. To further verify its predictive efficiency, the predictions of anti-fibrosis activity from the virtual screening predictive model were compared with those from a molecular docking model. Results The RF model had a precision of 0.76, a recall of 0.75 and an F1 value of 0.74 (AUC=0.818). The GBDT model had a precision of 0.76, a recall of 0.74 and an F1 value of 0.72 (AUC=0.829). The ANN model had a precision of 0.75, a recall of 0.75 and an F1 value of 0.75 (AUC=0.802). The RF+LR model had a precision of 0.77, a recall of 0.76 and an F1 value of 0.75 (AUC=0.840). The RF+LR model had a precision of 0.74, a recall of 0.84 and an F1 value of 0.79 (AUC=0.850). The GBDT+LR model had a precision of 0.80, a recall of 0.80 and an F1 value of 0.79 (AUC=0.872). The GBDT+ANN model had a precision of 0.73, a recall of 0.91 and an F1 value of 0.81 (AUC=0.837).
The molecular docking results for Chinese medicinal compounds including curcumin, glycyrrhizic acid, hydroxysafflor yellow A, emodin and gypenoside were in accordance with the predictions of the virtual screening predictive model. Conclusion The model based on RF+LR performs better than the models established with the other methods. Comparison with the molecular docking model shows that the virtual screening predictive model performs well in predicting Chinese medicinal compounds. The method supports high-throughput screening and can compensate for the limited compound-screening efficiency of molecular docking. It provides a new approach to virtual screening prediction of Chinese medicinal compounds with anti-fibrosis effects.
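
The abstract evaluates each model by precision, recall and F1 value. As a minimal sketch of how these three quantities relate for a single positive class, the following standard-library Python computes them from true and predicted labels. The function names and the toy labels are hypothetical and are not taken from the article, and the abstract does not say whether its figures are per-class or averaged across classes, so the harmonic-mean identity shown here need not reproduce every reported row exactly.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical labels: 1 = active (anti-fibrosis), 0 = inactive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")

# Harmonic mean of the precision and recall reported for GBDT+ANN.
print(round(f1_from(0.73, 0.91), 2))
```

A model with high recall but lower precision, such as the reported GBDT+ANN combination, retrieves most active compounds at the cost of more false positives; F1 summarizes that trade-off in one number.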

Key words: organ fibrosis, machine learning, molecular fingerprinting, Chinese medicinal compound screening

CLC Number: 

  • R285.5