Sponsor: Beijing University of Chinese Medicine
ISSN 1006-2157 CN 11-3574/R

JOURNAL OF BEIJING UNIVERSITY OF TRADITIONAL CHINESE MEDICINE, 2021, Vol. 44, Issue (6): 538-543. doi: 10.3969/j.issn.1006-2157.2021.06.008

• TCM Informatics •

A model for diagnosing TCM cold and heat patterns based on random forest algorithm*

Shu Chenjie, Liang Hao, Wang Yun#   

  1. School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 102488, China
  • Received: 2020-11-21 Online: 2021-06-30 Published: 2021-06-25
  • Contact: Prof. Wang Yun, Ph.D., Doctoral Supervisor. School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 102488. E-mail: wangyun@bucm.edu.cn
  • Supported by:
    National Natural Science Foundation of China (No.81973495)

Abstract: Objective To construct a symptom-based model for diagnosing cold and heat patterns, providing a basis for standardizing cold-heat pattern identification. Methods Symptoms related to the “cold” and “heat” patterns were selected from a pattern element-symptom data table constructed from “Study on Pattern Standardization and Pattern Identification System”. The top 15 symptoms were selected through random forest feature screening. The dataset was randomly split into a training set and a test set at a ratio of 7:3. After the data were resampled, random forest models for the cold and heat patterns were constructed with the best parameters. The models were then evaluated using metrics including area under the ROC curve (AUC), sensitivity, and specificity. Results The key characteristic variables of cold patterns were tight floating pulse, aversion to cold, absence of sweating, white tongue coating, pain relieved with warmth, cold pain, pale tongue, aversion to cold with fever, absence of thirst, body pain, headache, greasy coating, poor appetite, loose stool, and cold limbs. The cold-pattern model had an AUC of 0.912, a specificity of 0.89, and a sensitivity of 0.80. The key characteristic variables of heat patterns were yellow coating, thirst, slippery rapid pulse, fever, high fever, rapid pulse, dark urine, red tongue, wiry rapid pulse, bitter taste in the mouth, greasy coating, crimson tongue, brown urine, vexation, and headache. The heat-pattern model had an AUC of 0.891, a specificity of 0.85, and a sensitivity of 0.86. Conclusion Models built with variable screening and the random forest algorithm can identify cold and heat patterns with satisfactory classification performance, and could serve as an indirect means of standardizing cold and heat pattern identification.
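The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. The labels and scores below are invented toy values, not data from the study. The function names and the 0.5 decision threshold are assumptions made for illustration only, since the abstract does not state how predictions were thresholded.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = pattern present, 0 = pattern absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 on the model's predicted probability.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together.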

Key words: random forest algorithm, model for diagnosis, pattern elements

CLC Number: 

  • R241.3