Treatment determination based on syndrome differentiation is the key to Chinese medicine. A feasible way of improving clinical therapeutic effectiveness is to differentiate syndrome classifications correctly from the clinical manifestations. In this paper, a novel data mining method based on manifold ranking (MR) is proposed to explore the relation between syndromes and symptoms in viral hepatitis. Because MR can use both symptom data with expert differentiation and symptom data without expert differentiation in the task of syndrome classification, the clinical information available for modeling syndrome features is greatly enlarged, which improves the precision of syndrome classification. In addition, the proposed method avoids two disadvantages of previous methods: the assumption of a linear relation in the clinical data and the assumption of mutually exclusive symptoms among different syndromes. It can also help exploit the latent relation between syndromes and symptoms more effectively. According to the experimental results and the clinical experts, the method achieves better syndrome classification performance.
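The idea of ranking labeled and unlabeled symptom records together can be illustrated with a minimal sketch. The abstract does not give the formulation used, so the code below assumes the common closed-form manifold ranking, F = (I - alpha S)^(-1) Y with a symmetrically normalized Gaussian affinity matrix S. The function name `manifold_ranking`, the parameters `alpha` and `sigma`, and the toy symptom vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np


def manifold_ranking(X, Y, alpha=0.9, sigma=1.0):
    """Closed-form manifold ranking (assumed formulation, not the paper's code).

    X: (n, d) symptom vectors for all patients, labeled and unlabeled.
    Y: (n, k) initial scores; Y[i, j] = 1 if an expert assigned syndrome j
       to patient i, and the row is all zeros for unlabeled patients.
    Returns an (n, k) matrix of ranking scores, one column per syndrome.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]

    # Gaussian affinity between every pair of patients, with no self-loops.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    degree = np.maximum(W.sum(axis=1), 1e-12)  # guard against isolated points
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Solve (I - alpha * S) F = Y rather than forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - alpha * S, Y)


# Made-up toy data: 6 patients, 4 binary symptoms, 2 syndromes.
X = np.array([
    [1, 1, 0, 0],  # patient 0: expert-labeled as syndrome 0
    [1, 1, 1, 0],  # patient 1: unlabeled
    [1, 0, 0, 0],  # patient 2: unlabeled
    [0, 0, 1, 1],  # patient 3: expert-labeled as syndrome 1
    [0, 1, 1, 1],  # patient 4: unlabeled
    [0, 0, 0, 1],  # patient 5: unlabeled
])
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0

scores = manifold_ranking(X, Y)
predicted = scores.argmax(axis=1)  # highest-scoring syndrome per patient
print(predicted)
```

Because each syndrome gets its own score column, a patient may score highly on more than one syndrome and symptoms need not be exclusive to a single syndrome. Taking the `argmax` here is only a convenience for the toy example.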
OBJECTIVE: To help researchers select appropriate data mining models that provide better evidence for the clinical practice of Traditional Chinese Medicine (TCM) diagnosis and therapy. METHODS: Clinical issues addressed with data mining models were comprehensively summarized under four significant elements of clinical studies: symptoms, symptom patterns, herbs, and efficacy. Existing problems were further generalized to determine the factors that affect the performance of data mining models, e.g., data type, samples, parameters, and variable labels. Taking these factors into account, the features of TCM clinical data were compared with regard to statistical characteristics and informatics properties. The data mining models were compared at the same time in terms of their conditions of application and suitable scope. RESULTS: The main application problems were data types inconsistent with, and samples too small for, the data mining models used, which led to inappropriate or even erroneous results. The features of the models, i.e., advantages, disadvantages, suitable data types, data mining tasks, and the TCM issues addressed, were summarized and compared. CONCLUSION: By considering the specific features of different data mining models, clinicians can select suitable models to resolve TCM problems.